Sample records for ccl3l gene cluster

  1. Variations in CCL3L gene cluster sequence and non-specific gene copy numbers

    Directory of Open Access Journals (Sweden)

    Edberg Jeffrey C


    Full Text Available. Abstract. Background: Copy number variations (CNVs) of the gene CC chemokine ligand 3-like 1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings: Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. Using MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR-based CNV assays are not specific for CCL3L1 alone. The four available assays measured median copies of 2 and 3-4 in European and African American individuals, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting discordant individual calls and inconsistencies between the assays and the gene coverage expected from the known sequence. Conclusions: This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine the CNV of the three genes separately would make it possible to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.
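
    The specificity check described in this abstract (does a primer match more than one paralogue?) can be sketched in a few lines. This is only an illustration under stated assumptions: the sequences and gene names below are invented placeholders, not real CCL3/CCL3L sequences, and exact substring matching stands in for the MEGA 4 and BioEdit alignments the authors actually used.

```python
# Toy illustration: which paralogues does a primer match exactly?
# All sequences are invented placeholders (assumption), not real gene sequences.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def genes_matched(primer: str, genes: dict[str, str]) -> list[str]:
    """Names of genes that contain the primer on either strand (exact match)."""
    fwd = primer.upper()
    rev = reverse_complement(fwd)
    return [name for name, seq in genes.items()
            if fwd in seq.upper() or rev in seq.upper()]


genes = {
    "geneA": "ATGCAGGTCTCCACTGCTGCC",
    "geneB": "ATGCAGGTCTCCACTGCTGCC",  # identical paralogue
    "geneC": "ATGCAGGTCTCCACAGCTGCC",  # one mismatch inside the primer site
    "geneD": "ATGAAGCTCTGCGTGACTGTC",  # unrelated sequence
}
primer = "GTCTCCACTGCT"

hits = genes_matched(primer, genes)
# An assay is gene-specific only if its primer matches exactly one gene.
specific = len(hits) == 1
```

    A primer that hits two or more paralogues, as in this toy case, would amplify a mixture of genes, which is the kind of non-specificity the abstract reports.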

  2. CCL3L gene copy number and survival in an HIV-1 infected Zimbabwean population

    DEFF Research Database (Denmark)

    Larsen, Margit Hørup; Wegner, Lise Thørner; Zinyama, Rutendo;


    The C-C motif chemokine ligand 3-like (CCL3L) protein is a potent chemoattractant which, by binding to C-C chemokine receptor type 5 (CCR5), inhibits human immunodeficiency virus (HIV) entry. Copy number variation (CNV) of the CCL3L has been shown to be associated with HIV susceptibility and progression to AIDS, but these results have been inconsistent. We examined a Zimbabwean study population for an association of CCL3L CNV with HIV status, progression (CD4 T-cells and viral load), and survival. Another aim was to investigate the possible effects of CCL3L CNV on CCL3 protein concentration... A treatment-naïve cohort, which included 153 HIV infected and 159 HIV uninfected individuals, was followed for up to 4.3 years. The CNV of the CCL3L was determined by duplex real-time polymerase chain reaction. We found no association between four CCL3L CNV strata and HIV status (P=0.7), CD4 T-cell count (P=0...
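
    The record does not say how cycle-threshold (Ct) values from the duplex real-time PCR were converted into copy numbers. One commonly used approach is the comparative Ct calculation against a calibrator sample of known copy number; the sketch below assumes that approach, assumes 100% amplification efficiency, and uses made-up Ct values and a hypothetical function name.

```python
def copy_number_from_ct(ct_target: float, ct_ref: float,
                        cal_ct_target: float, cal_ct_ref: float,
                        cal_copies: float = 2.0) -> float:
    """Estimate gene copy number with the comparative Ct method.

    ct_target / ct_ref: Ct of the variable gene and of a fixed-copy reference
    gene in the test sample (measured in the same duplex well).
    cal_*: the same two values for a calibrator sample whose copy number
    (cal_copies) is assumed to be known.
    Assumes perfect doubling per cycle (efficiency = 100%).
    """
    delta_sample = ct_target - ct_ref
    delta_calibrator = cal_ct_target - cal_ct_ref
    return cal_copies * 2.0 ** (-(delta_sample - delta_calibrator))


# Made-up example: the target crosses threshold one cycle earlier than in the
# calibrator, i.e. twice as much template, so twice the calibrator's copies.
estimate = copy_number_from_ct(ct_target=24.0, ct_ref=25.0,
                               cal_ct_target=25.0, cal_ct_ref=25.0)
```

    In practice the raw estimate is usually rounded to the nearest integer copy number, and real assays correct for amplification efficiency.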

  3. Polymorphisms of CCL3L1/CCR5 genes and recurrence of hepatitis B in liver transplant recipients

    Institute of Scientific and Technical Information of China (English)

    Hong Li; Hai-Yang Xie; Lin Zhou; Wei-Lin Wang; Ting-Bo Liang; Min Zhang; Shu-Sen Zheng


    BACKGROUND: The genetic diversity of chemokines and chemokine receptors has been associated with the outcome of hepatitis B virus infection. The aim of this study was to evaluate whether the copy number variation in the CCL3L1 gene and the polymorphisms of CCR5Δ32 and CCR5-2459A→G (rs1799987) are associated with recurrent hepatitis B in liver transplantation for hepatitis B virus infection-related end-stage liver disease. METHODS: A total of 185 transplant recipients were enrolled in this study. The genomic DNA was extracted from whole blood, the copy number of the CCL3L1 gene was determined by a quantitative real-time PCR-based assay, CCR5Δ32 was detected by a sizing PCR method, and a single-nucleotide polymorphism in CCR5-2459 was detected by restriction fragment length polymorphism PCR. RESULTS: No CCR5Δ32 mutation was detected in any of the individuals from China. Neither copy number variation nor polymorphism in CCR5-2459 was associated with post-transplant re-infection with hepatitis B virus. However, patients with fewer copies (... CONCLUSION: Patients possessing the compound decreased functional genotype of both CCL3L1 and CCR5 genes might be more likely to have recurrence of hepatitis B after transplantation.

  4. CCL3L1 gene copy number in individuals with and without HIV-associated neurocognitive disorder

    Directory of Open Access Journals (Sweden)

    Brown A


    Full Text Available. Amanda Brown1, Ned Sacktor1, Karen Marder2, Bruce Cohen3, Giovanni Schifitto4, Richard L Skolasky1, Jason Creighton1, Liping Guo1, Justin C McArthur1. 1Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD; 2Department of Neurology, Psychiatry, Sergievsky Center and Taub Institute on Alzheimer's Disease and the Aging Brain, New York Presbyterian Hospital, Columbia University College of Physicians and Surgeons, New York, NY; 3Department of Neurology, Northwestern University Feinberg School of Medicine, Chicago, IL; 4Department of Neurology, University of Rochester, School of Medicine and Dentistry, Rochester, NY, USA. Background: CCL3L1 copy number variation has been implicated as a marker for susceptibility and immunity to human immunodeficiency virus (HIV-1) infection and its pathogenic sequelae. Some of these findings have been confirmed in several, but not all, subsequent independent cohort studies. A three-fold risk for the development of HIV-associated dementia was reported in individuals possessing a CCL3L1 copy number below the ethnic group median combined with a detrimental CCR5 genotype. With the availability of antiretroviral therapy since 1996, there has been a significant decline in HIV-associated dementia, and milder forms of HIV-associated neurocognitive impairment (HAND) are now most prevalent. Moreover, patients are living longer with HIV-1 infection and it is recognized that aging may be a contributory factor to the development of cognitive disorder. Thus, the need for biomarkers that can be used in clinical practice to identify and provide optimal treatment for those at increased risk for HAND is great. HAND affects 20%–30% of HIV-infected individuals, and several genetic loci which have been shown to confer susceptibility to HIV infection may also modulate the development of neurocognitive disorder. The aim of this study was to determine whether CCL3L1 chemokine gene copy number in self-defined ethnic

  5. Copy number variation in chemokine superfamily: the complex scene of CCL3L-CCL4L genes in health and disease. (United States)

    Colobran, R; Pedrosa, E; Carretero-Iglesia, L; Juan, M


    Genome copy number changes (copy number variations: CNVs) include inherited, de novo and somatically acquired deviations from a diploid state within a particular chromosomal segment. CNVs are frequent in higher eukaryotes and associated with a substantial portion of inherited and acquired risk for various human diseases. CNVs are distributed widely in the genomes of apparently healthy individuals and thus constitute significant amounts of population-based genomic variation. Human CNV loci are enriched for immune genes and one of the most striking examples of CNV in humans involves a genomic region containing the chemokine genes CCL3L and CCL4L. The CCL3L-CCL4L copy number variable region (CNVR) shows extensive architectural complexity, with smaller CNVs within the larger ones and with interindividual variation in breakpoints. Furthermore, the individual genes embedded in this CNVR account for an additional level of genetic and mRNA complexity: CCL4L1 and CCL4L2 have identical exonic sequences but produce a different pattern of mRNAs. CCL3L2 was considered previously as a CCL3L1 pseudogene, but is actually transcribed. Since 2005, CCL3L-CCL4L CNV has been associated extensively with various human immunodeficiency virus-related outcomes, but some recent studies called these associations into question. This controversy may be due in part to the differences in alternative methods for quantifying gene copy number and differentiating the individual genes. This review summarizes and discusses the current knowledge about CCL3L-CCL4L CNV and points out that elucidating their complete phenotypic impact requires dissecting the combinatorial genomic complexity posed by various proportions of distinct CCL3L and CCL4L genes among individuals.

  6. Characterization of copy number variants for CCL3L1 gene in rheumatoid arthritis for French trio families and Tunisian cases and controls. (United States)

    Ben Kilani, Mohamed Sahbi; Achour, Yosser; Perea, Javier; Cornelis, François; Bardin, Thomas; Chaudru, Valérie; Maalej, Abdellatif; Petit-Teixeira, Elisabeth


    Analyses of copy number variants (CNVs) for candidate genes in complex diseases are currently a promising research field. CNVs of the C-C chemokine ligand 3-like 1 (CCL3L1) gene are candidate genomic factors in rheumatoid arthritis (RA). We investigated the association of CCL3L1 CNVs with RA using a case-control study in Tunisians and a transmission analysis in French trio families. The relative copy number (rCN) of the CCL3L1 gene was quantified by droplet digital PCR (ddPCR) in 100 French trio families (RA patients and their two parents) and in 166 RA cases and 102 healthy controls from Tunisia. We calculated odds ratios (ORs) to assess the association of CCL3L1 CNVs with RA. The rCN values identified ranged from 0 to 4 in the French population and from 0 to 7 in the Tunisian population. A significant difference was observed in the distribution of these rCNs between the two populations (p = 2.34 × 10^-10), as well as between French and Tunisian RA patients (p = 2.83 × 10^-5). CNV transmission in French RA trios allowed the characterization of genotypes with tandem duplication and triplication on the same chromosome. RA association tests highlighted a protective effect of rCN = 5 for the CCL3L1 gene in the Tunisian population (OR = 0.056; 95% CI [0.01-0.46]). Characterization of CCL3L1 CNVs with ddPCR methodology highlighted specific CN genotypes in a French family sample. A copy number polymorphism of an RA candidate gene was quantified, and its significant association with RA was revealed in a Tunisian sample.
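
    The abstract reports relative copy numbers from droplet digital PCR without giving the arithmetic. A minimal sketch of the usual ddPCR calculation follows, assuming Poisson-distributed template molecules across droplets and a reference gene present at two copies per diploid genome. The droplet counts and function names are hypothetical and not taken from the study.

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Poisson-corrected mean number of template copies per droplet.

    If a fraction p of droplets is positive, the mean occupancy is
    lambda = -ln(1 - p), because P(droplet is empty) = exp(-lambda).
    """
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)


def relative_copy_number(target_pos: int, target_total: int,
                         ref_pos: int, ref_total: int,
                         ref_copies: int = 2) -> float:
    """rCN = ref_copies * (target concentration / reference concentration)."""
    return (ref_copies
            * copies_per_droplet(target_pos, target_total)
            / copies_per_droplet(ref_pos, ref_total))


# Hypothetical counts: 20% of droplets positive for the reference gene and
# 36% positive for the target gene, out of 10,000 droplets each.
rcn = relative_copy_number(target_pos=3600, target_total=10_000,
                           ref_pos=2000, ref_total=10_000)
```

    The Poisson correction matters because a droplet holding two template molecules still counts as a single positive; simply dividing the positive counts (3600/2000) would understate the ratio.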

  7. The CCL3L1-CCR5 genotype influences the development of AIDS, but not HIV susceptibility or the response to HAART

    Energy Technology Data Exchange (ETDEWEB)

    Bhattacharya, Tanmoy [Los Alamos National Laboratory]; Stanton, Jennifer [NORTHWESTERN UNIV]; Kim, Eun-Young [NORTHWESTERN UNIV]; Kunstman, Kevin [NORTHWESTERN UNIV]; Phair, John [NORTHWESTERN UNIV]; Jacobson, Lisa P [JOHNS HOPKINS UNIV]; Wolinsky, Steven M [NORTHWESTERN UNIV]


    A selective advantage against infectious diseases such as HIV/AIDS is associated with differences in the genes relevant to immunity and virus replication. The CC chemokine receptor 5 (CCR5), the principal coreceptor for HIV, and its chemokine ligands, including CCL3L1, influence the susceptibility of CD4+ target cells to infection. The CCL3L1 gene is in a region of segmental duplication on the q-arm of human chromosome 17. Increased numbers of CCL3L1 gene copies that affect the gene expression phenotype might have substantial protective effects. Here we show that the population-specific CCL3L1 gene copy number and the CCR5 Δ32 protein-inactivating deletion that categorizes the CCL3L1-CCR5 genotype do not influence HIV/AIDS susceptibility or the robustness of immune recovery after the initiation of highly active antiretroviral therapy (HAART).

  8. CCL3L1-CCR5 genotype influences durability of immune recovery during antiretroviral therapy of HIV-1-infected individuals. (United States)

    Ahuja, Sunil K; Kulkarni, Hemant; Catano, Gabriel; Agan, Brian K; Camargo, Jose F; He, Weijing; O'Connell, Robert J; Marconi, Vincent C; Delmar, Judith; Eron, Joseph; Clark, Robert A; Frost, Simon; Martin, Jeffrey; Ahuja, Seema S; Deeks, Steven G; Little, Susan; Richman, Douglas; Hecht, Frederick M; Dolan, Matthew J


    The basis for the extensive variability seen in the reconstitution of CD4+ T cell counts in HIV-infected individuals receiving highly active antiretroviral therapy (HAART) is not fully known. Here, we show that variations in CCL3L1 gene dose and CCR5 genotype, but not major histocompatibility complex HLA alleles, influence immune reconstitution, especially when HAART is initiated at ... CCR5 genotypes favoring CD4+ T cell recovery are similar to those that blunted CD4+ T cell depletion during the time before HAART became available (pre-HAART era), suggesting that a common CCL3L1-CCR5 genetic pathway regulates the balance between pathogenic and reparative processes from early in the disease course. Hence, CCL3L1-CCR5 variations influence HIV pathogenesis even in the presence of HAART and, therefore, may prospectively identify subjects in whom earlier initiation of therapy is more likely to mitigate immunologic failure despite viral suppression by HAART. Furthermore, as reconstitution of CD4+ cells during HAART is more sensitive to CCL3L1 dose than to CCR5 genotypes, CCL3L1 analogs might be efficacious in supporting immunological reconstitution.

  9. CCL3L1 copy number variation and susceptibility to HIV-1 infection: a meta-analysis.

    Directory of Open Access Journals (Sweden)

    SiJie Liu

    Full Text Available. BACKGROUND: Although several studies have investigated whether CCL3L1 copy number variation (CNV) influences the risk of HIV-1 infection, there are still no clear conclusions. Therefore, we performed a meta-analysis using two models to generate a more robust estimate of the association between CCL3L1 CNV and susceptibility to HIV-1 infection. METHODS: We divided the cases and controls into two groups: individuals with a CCL3L1 gene copy number (GCN) above the population-specific median copy number (PMN) and individuals with a CCL3L1 GCN below the PMN. Odds ratios (ORs) with 95% confidence intervals (95% CIs) were given for the main analysis. We also conducted stratified analyses by ethnicity, age group and sample size. Relevant literature was searched through PubMed and ISI Web of Knowledge up to March 2010. RESULTS: In total, 9 studies with 2434 cases and 4029 controls were included. ORs for the main analysis were 1.35 (95% CI, 1.02-1.78; model: GCN ≤ PMN vs. GCN > PMN) and 1.70 (95% CI, 1.30-2.23; model: GCN < PMN vs. GCN ≥ PMN), respectively. In the stratified analyses, statistically significant results were also detected in some subgroups. CONCLUSIONS: Our analyses indicate that CCL3L1 CNV is associated with susceptibility to HIV-1 infection. A lower copy number is associated with an increased risk of HIV-1 infection, while a higher copy number is associated with reduced risk for acquiring HIV-1.
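
    The two pooled odds ratios above are reported with 95% confidence intervals but without test statistics. As a worked illustration, the standard error, z statistic and two-sided p-value can be recovered from a reported OR and its CI, assuming the interval is a symmetric normal (Wald-type) interval on the log-odds scale. The function name is hypothetical, and this calculation is not stated in the abstract.

```python
import math


def z_and_p_from_ci(odds_ratio: float, lower: float, upper: float,
                    z_crit: float = 1.96) -> tuple[float, float, float]:
    """Recover (SE of log OR, z statistic, two-sided p) from an OR and 95% CI.

    Assumes the CI was built as exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal p-value: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Values copied from the RESULTS sentence of the abstract.
se1, z1, p1 = z_and_p_from_ci(1.35, 1.02, 1.78)  # GCN <= PMN vs. GCN > PMN
se2, z2, p2 = z_and_p_from_ci(1.70, 1.30, 2.23)  # GCN < PMN vs. GCN >= PMN
```

    Both intervals exclude 1, so both p-values fall below 0.05, consistent with the stated conclusion; the second model gives the stronger signal.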

  10. Human CC ligand 3-like protein 1 (CCL3L1) fusion protein expression and function assay

    Institute of Scientific and Technical Information of China (English)

    徐斌; 石英; 李俊红; 张薇; 赵国庆; 陈德喜; 吴昊


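    Objective: To express human chemokine CCL3L1 as fusion proteins in prokaryotic and eukaryotic systems and to analyze their activity after purification. Methods: Human CCL3L1 cDNA was cloned and two CCL3L1 expression vectors were constructed, yielding two CCL3L1 fusion proteins: a GST-CCL3L1 fusion protein expressed in E. coli BL21 and a His-CCL3L1 fusion protein expressed in Drosophila S2 cells. A pcDNA3.1-flag-CCR5 expression vector was also cloned, and a cell line stably expressing flag-CCR5 was cultured for the activity analysis of human chemokine CCL3L1. Results: The prokaryotic expression vector pGEX-4T and the eukaryotic expression vector pMT/BiP/V5-His for human chemokine CCL3L1 fusion proteins were successfully constructed. Immunoprecipitation and Western blot analysis showed that the His-CCL3L1 protein was dose-dependent at concentrations from 1 nmol/L to 50 nmol/L, with no dose dependence from 50 nmol/L to 100 nmol/L. Purified His-CCL3L1 protein bound specifically to the CCR5 receptor. Conclusion: The fusion proteins GST-CCL3L1 and His-CCL3L1 were successfully expressed. His-CCL3L1 expressed in Drosophila cells has the same biological activity as native CCL3L1, providing baseline data for the further preparation of monoclonal and polyclonal antibodies against CCL3L1 and for studying the mechanism by which CCL3L1 affects HIV-1 infection.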

  11. Association of CCR2-CCR5 haplotypes and CCL3L1 copy number with Kawasaki Disease, coronary artery lesions, and IVIG responses in Japanese children.

    Directory of Open Access Journals (Sweden)

    Manju Mamtani

    Full Text Available. BACKGROUND: The etiology of Kawasaki Disease (KD) is enigmatic, although an infectious cause is suspected. Polymorphisms in CC chemokine receptor 5 (CCR5) and/or its potent ligand CCL3L1 influence KD susceptibility in US, European and Korean populations. However, the influence of these variations on KD susceptibility, coronary artery lesions (CAL) and response to intravenous immunoglobulin (IVIG) in Japanese children, who have the highest incidence of KD, is unknown. METHODOLOGY/PRINCIPAL FINDINGS: We used unconditional logistic regression analyses to determine the associations of the copy number of the CCL3L1 gene-containing duplication and CCR2-CCR5 haplotypes in 133 Japanese KD cases [33 with CAL and 25 with resistance to IVIG] and 312 Japanese controls without a history of KD. We observed that deviation from the population average of four CCL3L1 copies (i.e., fewer or more than four copies) was associated with an increased risk of KD and IVIG resistance (adjusted odds ratio (OR)=2.25, p=0.004 and OR=6.26, p=0.089, respectively). Heterozygosity for the CCR5 HHF*2 haplotype was associated with a reduced risk of both IVIG resistance (OR=0.21, p=0.026) and CAL development (OR=0.44, p=0.071). CONCLUSIONS/SIGNIFICANCE: The CCL3L1-CCR5 axis may play an important role in KD pathogenesis. In addition to clinical and laboratory parameters, genetic markers may also predict risk of CAL and resistance to IVIG.

  12. Accuracy and differential bias in copy number measurement of CCL3L1 in association studies with three auto-immune disorders

    Directory of Open Access Journals (Sweden)

    Carpenter Danielle


    Full Text Available. Abstract. Background: Copy number variation (CNV) contributes to the variation observed between individuals and can influence human disease progression, but the accurate measurement of individual copy numbers is technically challenging. In the work presented here we describe a modification to a previously described paralogue ratio test (PRT) method for genotyping the CCL3L1/CCL4L1 copy variable region, which we use to ascertain CCL3L1/CCL4L1 copy number in 1581 European samples. As the products of CCL3L1 and CCL4L1 potentially play a role in autoimmunity, we performed case-control association studies with Crohn's disease, rheumatoid arthritis and psoriasis clinical cohorts. Results: We evaluate the PRT methodology used, paying particular attention to accuracy and precision, and highlight the problems of differential bias in copy number measurements. Our PRT methods for measuring copy number were of sufficient precision to detect very slight but systematic differential bias between results from case and control DNA samples in one study. We find no evidence for an association between CCL3L1 copy number and Crohn's disease, rheumatoid arthritis or psoriasis. Conclusions: Differential bias of this small magnitude, but applied systematically across large numbers of samples, would create a serious risk of false positive associations in copy number if measured using methods of lower precision, or methods relying on single uncorroborated measurements. In this study the small differential bias detected by PRT in one sample set was resolved by a simple pre-treatment by restriction enzyme digestion.

  13. FunGeneClusterS

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla; Brandl, Julian; Andersen, Mikael Rørdam


    Secondary metabolites of fungi are receiving an increasing amount of interest due to their prolific bioactivities and the fact that fungal biosynthesis of secondary metabolites often occurs from co-regulated and co-located gene clusters. This makes the gene clusters attractive for synthetic biology and industrial biotechnology applications. We have previously published a method for accurate prediction of clusters from genome and transcriptome data, which could also suggest cross-chemistry; however, this method was limited both in the number of parameters which could be adjusted as well as in user...

  14. Minimum Information about a Biosynthetic Gene cluster

    NARCIS (Netherlands)

    Medema, M.H.; Kottmann, Renzo; Yilmaz, Pelin; Cummings, Matthew; Biggins, J.B.; Blin, Kai; Bruijn, De Irene; Chooi, Yit Heng; Claesen, Jan; Coates, R.C.; Cruz-Morales, Pablo; Duddela, Srikanth; Düsterhus, Stephanie; Edwards, Daniel J.; Fewer, David P.; Garg, Neha; Geiger, Christoph; Gomez-Escribano, Juan Pablo; Greule, Anja; Hadjithomas, Michalis; Haines, Anthony S.; Helfrich, Eric J.N.; Hillwig, Matthew L.; Ishida, Keishi; Jones, Adam C.; Jones, Carla S.; Jungmann, Katrin; Kegler, Carsten; Kim, Hyun Uk; Kötter, Peter; Krug, Daniel; Masschelein, Joleen; Melnik, Alexey V.; Mantovani, Simone M.; Monroe, Emily A.; Moore, Marcus; Moss, Nathan; Nützmann, Hans Wilhelm; Pan, Guohui; Pati, Amrita; Petras, Daniel; Reen, F.J.; Rosconi, Federico; Rui, Zhe; Tian, Zhenhua; Tobias, Nicholas J.; Tsunematsu, Yuta; Wiemann, Philipp; Wyckoff, Elizabeth; Yan, Xiaohui; Yim, Grace; Yu, Fengan; Xie, Yunchang; Aigle, Bertrand; Apel, Alexander K.; Balibar, Carl J.; Balskus, Emily P.; Barona-Gómez, Francisco; Bechthold, Andreas; Bode, Helge B.; Borriss, Rainer; Brady, Sean F.; Brakhage, Axel A.; Caffrey, Patrick; Cheng, Yi Qiang; Clardy, Jon; Cox, Russell J.; Mot, De René; Donadio, Stefano; Donia, Mohamed S.; Donk, Van Der Wilfred A.; Dorrestein, Pieter C.; Doyle, Sean; Driessen, Arnold J.M.; Ehling-Schulz, Monika; Entian, Karl Dieter; Fischbach, Michael A.; Gerwick, Lena; Gerwick, William H.; Gross, Harald; Gust, Bertolt; Hertweck, Christian; Höfte, Monica; Jensen, Susan E.; Ju, Jianhua; Katz, Leonard; Kaysser, Leonard; Klassen, Jonathan L.; Keller, Nancy P.; Kormanec, Jan; Kuipers, Oscar P.; Kuzuyama, Tomohisa; Kyrpides, Nikos C.; Kwon, Hyung Jin; Lautru, Sylvie; Lavigne, Rob; Lee, Chia Y.; Linquan, Bai; Liu, Xinyu; Liu, Wen; Luzhetskyy, Andriy; Mahmud, Taifo; Mast, Yvonne; Méndez, Carmen; Metsä-Ketelä, Mikko; Micklefield, Jason; Mitchell, Douglas A.; Moore, Bradley S.; Moreira, Leonilde M.; Müller, Rolf; Neilan, Brett A.; Nett, Markus; Nielsen, Jens; O'Gara, Fergal; Oikawa, Hideaki; 
Osbourn, Anne; Osburne, Marcia S.; Ostash, Bohdan; Payne, Shelley M.; Pernodet, Jean Luc; Petricek, Miroslav; Piel, Jörn; Ploux, Olivier; Raaijmakers, Jos M.; Salas, José A.; Schmitt, Esther K.; Scott, Barry; Seipke, Ryan F.; Shen, Ben; Sherman, David H.; Sivonen, Kaarina; Smanski, Michael J.; Sosio, Margherita; Stegmann, Evi; Süssmuth, Roderich D.; Tahlan, Kapil; Thomas, Christopher M.; Tang, Yi; Truman, Andrew W.; Viaud, Muriel; Walton, Jonathan D.; Walsh, Christopher T.; Weber, Tilmann; Wezel, Van Gilles P.; Wilkinson, Barrie; Willey, Joanne M.; Wohlleben, Wolfgang; Wright, Gerard D.; Ziemert, Nadine; Zhang, Changsheng; Zotchev, Sergey B.; Breitling, Rainer; Takano, Eriko; Glöckner, Frank Oliver


    A wide variety of enzymatic pathways that produce specialized metabolites in bacteria, fungi and plants are known to be encoded in biosynthetic gene clusters. Information about these clusters, pathways and metabolites is currently dispersed throughout the literature, making it difficult to exploi

  15. Evolution of homeobox gene clusters in animals: the Giga-cluster and primary versus secondary clustering.

    Directory of Open Access Journals (Sweden)

    David Ellard Keith Ferrier


    Full Text Available. The Hox gene cluster has been a major focus in evolutionary developmental biology. This is because of its key role in patterning animal development and widespread examples of changes in Hox genes being linked to the evolution of animal body plans and morphologies. Also, the distinctive organisation of the Hox genes into genomic clusters in which the order of the genes along the chromosome corresponds to the order of their activity along the embryo, or during a developmental process, has been a further source of great interest. This is known as colinearity, and it provides a clear link between genome organisation and the regulation of genes during development, with distinctive changes marking evolutionary transitions. The Hox genes are not alone, however. The homeobox genes are a large super-class, of which the Hox genes are only a small subset, and an ever-increasing number of further gene clusters besides the Hox are being discovered. This is of great interest because of the potential for such gene clusters to help understand major evolutionary transitions, both in terms of changes to development and morphology as well as evolution of genome organisation. However, there is uncertainty in our understanding of homeobox gene cluster evolution at present. This relates to our still rudimentary understanding of the dynamics of genome rearrangements and evolution over the evolutionary timescales being considered when we compare lineages from across the animal kingdom. A major goal is to deduce whether particular instances of clustering are primary (conserved from ancient ancestral clusters) or secondary (reassortment of genes into clusters in lineage-specific fashion). The following summary of the various instances of homeobox gene clusters in animals, and the hypotheses about their evolution, provides a framework for the future resolution of this uncertainty.

  16. Filtering Genes for Cluster and Network Analysis

    Directory of Open Access Journals (Sweden)

    Parkhomenko Elena


    Full Text Available. Abstract. Background: Prior to cluster analysis or genetic network analysis it is customary to filter, or remove, genes considered to be irrelevant from the set of genes to be analyzed. Often genes whose variation across samples is less than an arbitrary threshold value are deleted. This can improve interpretability and reduce bias. Results: This paper introduces modular models for representing network structure in order to study the relative effects of different filtering methods. We show that cluster analysis and principal components are strongly affected by filtering. Filtering methods intended specifically for cluster and network analysis are introduced and compared by simulating modular networks with known statistical properties. To study more realistic situations, we analyze simulated "real" data based on well-characterized E. coli and S. cerevisiae regulatory networks. Conclusion: The methods introduced apply very generally, to any similarity matrix describing gene expression. One of the proposed methods, SUMCOV, performed well for all models simulated.
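
    The conventional filter mentioned in the Background (dropping genes whose variation across samples falls below an arbitrary threshold) can be sketched as follows. This shows only that baseline filter, not SUMCOV or the other methods the paper proposes; the gene names, expression values and threshold are made up for illustration.

```python
from statistics import variance


def filter_by_variance(expression: dict[str, list[float]],
                       threshold: float) -> dict[str, list[float]]:
    """Keep genes whose sample variance across samples is >= threshold."""
    return {gene: values for gene, values in expression.items()
            if variance(values) >= threshold}


# Hypothetical expression matrix: gene -> expression level in four samples.
expression = {
    "g1": [1.0, 1.1, 0.9, 1.0],  # nearly flat: filtered out
    "g2": [0.2, 2.5, 0.1, 3.0],  # strongly varying: kept
    "g3": [5.0, 5.0, 5.0, 5.0],  # constant: filtered out
}
kept = filter_by_variance(expression, threshold=0.5)
```

    The choice of threshold is arbitrary here, which is the concern the abstract raises: downstream cluster and principal component results can change substantially with it.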

  17. Lateral transfer of the lux gene cluster. (United States)

    Kasai, Sabu; Okada, Kazuhisa; Hoshino, Akinori; Iida, Tetsuya; Honda, Takeshi


    The lux operon is an uncommon gene cluster. To find the pathway through which the operon has been transferred, we sequenced the operon and both flanking regions in four typical luminous species. In Vibrio cholerae NCIMB 41, a five-gene cluster, most genes of which were highly similar to orthologues present in Gram-positive bacteria, along with the lux operon, is inserted between VC1560 and VC1563, on chromosome 1. Because this entire five-gene cluster is present in Photorhabdus luminescens TT01, about 1.5 Mbp upstream of the operon, we deduced that the operon and the gene cluster were transferred from V. cholerae to an ancestor of Pr. luminescens. Because in both V. fischeri and Shewanella hanedai, luxR and luxI were found just upstream of the operon, we concluded that the operon was transferred from either species to the other. Because most of the genes flanking the operon were highly similar to orthologues present on chromosome 2 of vibrios, we speculated that the operon of most species is located on this chromosome. The undigested genomic DNAs of five luminous species were analysed by pulsed-field gel electrophoresis and Southern hybridization. In all the species except V. cholerae, the operons are located on chromosome 2.

  18. Cluster Analysis of Gene Expression Data

    CERN Document Server

    Domany, E


    The expression levels of many thousands of genes can be measured simultaneously by DNA microarrays (chips). This novel experimental tool has revolutionized research in molecular biology and generated considerable excitement. A typical experiment uses a few tens of such chips, each dedicated to a single sample, such as tissue extracted from a particular tumor. The results of such an experiment contain several hundred thousand numbers that come in the form of a table of several thousand rows (one for each gene) and 50 - 100 columns (one for each sample). We developed a clustering methodology to mine such data. In this review I provide a very basic introduction to the subject, aimed at a physics audience with no prior knowledge of either gene expression or clustering methods. I explain what genes are, what gene expression is and how it is measured by DNA chips. Next I explain what is meant by "clustering" and how we analyze the massive amounts of data from such experiments, and present results obtained from a...
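
    To make the genes-by-samples table concrete, here is a minimal sketch that clusters the rows (genes) of a tiny made-up table. Plain k-means is used purely as a familiar stand-in; the abstract does not name the author's own clustering methodology, and this is not it. The data and the deterministic initialisation (first k rows as centroids) are assumptions made for the example.

```python
def squared_distance(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows: list[list[float]], k: int, iterations: int = 20) -> list[int]:
    """Cluster rows with basic k-means; returns one cluster label per row."""
    centroids = [list(row) for row in rows[:k]]  # deterministic initialisation
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each gene goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(row, centroids[c]))
                  for row in rows]
        # Update step: each centroid becomes the mean profile of its members.
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Rows are genes, columns are samples (hypothetical values).
table = [
    [1.0, 1.2, 0.9, 5.0, 5.1],  # gene 0: high in the last two samples
    [0.1, 0.2, 0.1, 0.3, 0.2],  # gene 1: uniformly low
    [1.1, 1.0, 1.0, 4.8, 5.2],  # gene 2: same pattern as gene 0
    [0.2, 0.1, 0.2, 0.2, 0.3],  # gene 3: same pattern as gene 1
]
labels = kmeans(table, k=2)
```

    Transposing the table and clustering the columns instead would group samples (for example, tumors) rather than genes.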

  19. Gene Expression Data Knowledge Discovery using Global and Local Clustering

    CERN Document Server

    H, Swathi


    To understand complex biological systems, the research community has produced a huge corpus of gene expression data. A large number of clustering approaches have been proposed for the analysis of gene expression data. However, extracting important biological knowledge remains difficult. To address this task, clustering techniques are used. In this paper, a hybrid hierarchical k-means algorithm is used for clustering and biclustering gene expression data. Biclustering and clustering algorithms are used to discover both local and global clustering structure. A validation technique, Figure of Merit, is used to determine the quality of the clustering results. Appropriate knowledge is mined from the clusters by embedding a BLAST similarity search program into the clustering and biclustering process. ...

  20. Gene ordering in partitive clustering using microarray expressions. (United States)

    Ray, Shubhra Sankar; Bandyopadhyay, Sanghamitra; Pal, Sankar K


    A central step in the analysis of gene expression data is the identification of groups of genes that exhibit similar expression patterns. Clustering and ordering the genes using gene expression data into homogeneous groups was shown to be useful in functional annotation, tissue classification, regulatory motif identification, and other applications. Although there is a rich literature on gene ordering in the hierarchical clustering framework for gene expression analysis, there is, to the best knowledge of the authors, no work addressing and evaluating the importance of gene ordering in the partitive clustering framework. Outside the framework of hierarchical clustering, different gene ordering algorithms are applied on the whole data set, and the domain of partitive clustering is still unexplored with gene ordering approaches. A new hybrid method is proposed for ordering genes in each of the clusters obtained from a partitive clustering solution, using microarray gene expressions. Two existing algorithms for optimally ordering cities in the travelling salesman problem (TSP), namely FRAG_GALK and Concorde, are hybridized individually with a self-organizing map to show the importance of gene ordering in the partitive clustering framework. We validated our hybrid approach using yeast and fibroblast data and showed that our approach improves the result quality of the partitive clustering solution by identifying subclusters within big clusters, grouping functionally correlated genes within clusters, minimizing the summation of gene expression distances, and maximizing the biological gene ordering using MIPS categorization. Moreover, the new hybrid approach finds comparable or sometimes superior biological gene order in less computation time than that obtained by optimal leaf ordering in a hierarchical clustering solution.
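
    The idea of treating gene ordering inside a cluster as a travelling salesman problem can be illustrated with a small sketch. A greedy nearest-neighbour heuristic is swapped in here for simplicity; it is not FRAG_GALK or Concorde, and it does not guarantee an optimal tour. The expression profiles and gene names are hypothetical.

```python
import math


def euclidean(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbour_order(profiles: dict[str, list[float]],
                            start: str) -> list[str]:
    """Greedy ordering: repeatedly append the unvisited gene closest to the last one."""
    order = [start]
    remaining = set(profiles) - {start}
    while remaining:
        last = profiles[order[-1]]
        # sorted() makes tie-breaking deterministic.
        nxt = min(sorted(remaining), key=lambda g: euclidean(last, profiles[g]))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def path_length(order: list[str], profiles: dict[str, list[float]]) -> float:
    """Summed distance between consecutive genes: the quantity being minimised."""
    return sum(euclidean(profiles[a], profiles[b])
               for a, b in zip(order, order[1:]))


# One cluster of four genes with hypothetical two-sample profiles.
cluster = {"g1": [0.0, 0.0], "g2": [3.0, 0.0], "g3": [1.0, 0.0], "g4": [2.0, 0.0]}
order = nearest_neighbour_order(cluster, start="g1")
ordered_cost = path_length(order, cluster)
input_cost = path_length(list(cluster), cluster)
```

    On this toy cluster the greedy ordering halves the summed distance relative to the input order, which mirrors the "minimization of summation of gene expression distances" criterion named in the abstract.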

  1. Gene ordering in partitive clustering using microarray expressions

    Indian Academy of Sciences (India)

    Shubhra Sankar Ray; Sanghamitra Bandyopadhyay; Sankar K Pal


    A central step in the analysis of gene expression data is the identification of groups of genes that exhibit similar expression patterns. Clustering and ordering the genes using gene expression data into homogeneous groups was shown to be useful in functional annotation, tissue classification, regulatory motif identification, and other applications. Although there is a rich literature on gene ordering in the hierarchical clustering framework for gene expression analysis, there is no work addressing and evaluating the importance of gene ordering in the partitive clustering framework, to the best knowledge of the authors. Outside the framework of hierarchical clustering, different gene ordering algorithms are applied on the whole data set, and the domain of partitive clustering is still unexplored with gene ordering approaches. A new hybrid method is proposed for ordering genes in each of the clusters obtained from a partitive clustering solution, using microarray gene expressions. Two existing algorithms for optimally ordering cities in the travelling salesman problem (TSP), namely, FRAG_GALK and Concorde, are hybridized individually with a self-organizing map to show the importance of gene ordering in the partitive clustering framework. We validated our hybrid approach using yeast and fibroblast data and showed that our approach improves the result quality of the partitive clustering solution by identifying subclusters within big clusters, grouping functionally correlated genes within clusters, minimizing the summation of gene expression distances, and maximizing biological gene ordering using MIPS categorization. Moreover, the new hybrid approach finds comparable or sometimes superior biological gene order in less computation time than those obtained by optimal leaf ordering in a hierarchical clustering solution.

  2. The rise of operon-like gene clusters in plants. (United States)

    Boycheva, Svetlana; Daviet, Laurent; Wolfender, Jean-Luc; Fitzpatrick, Teresa B


    Gene clusters are common features of prokaryotic genomes also present in eukaryotes. Most clustered genes known are involved in the biosynthesis of secondary metabolites. Although horizontal gene transfer is a primary source of prokaryotic gene cluster (operon) formation and has been reported to occur in eukaryotes, the predominant source of cluster formation in eukaryotes appears to arise de novo or through gene duplication followed by neo- and sub-functionalization or translocation. Here we aim to provide an overview of the current knowledge and open questions related to plant gene cluster functioning, assembly, and regulation. We also present potential research approaches and point out the benefits of a better understanding of gene clusters in plants for both fundamental and applied plant science.

  3. Genetic characteristics of vancomycin resistance gene cluster in Enterococcus spp. (United States)

    Chunhui, Chen; Xiaogang, Xu


    Vancomycin-resistant enterococci have become important nosocomial pathogens since they were discovered in the late 1980s. The products encoded by the vancomycin resistance gene clusters in enterococci catalyze the synthesis of peptidoglycan precursors with low affinity for glycopeptide antibiotics, including vancomycin and teicoplanin, and thereby lead to resistance. These vancomycin resistance gene clusters are classified into nine types according to their gene sequences and organization, or into D-Ala:D-Lac (VanA, VanB, VanD and VanM) and D-Ala:D-Ser (VanC, VanE, VanG, VanL and VanN) ligase gene clusters based on the differences of their encoded ligases. Moreover, these gene clusters are characterized by their different resistance levels and infection models. In this review, we summarize the classification, gene organization and infection model of the vancomycin resistance gene clusters in Enterococcus spp.

  4. Diversity and evolution of MicroRNA gene clusters

    Institute of Scientific and Technical Information of China (English)

    ZHANG YanFeng; ZHANG Rui; SU Bing


    microRNA (miRNA) gene clusters are a group of miRNA genes clustered within a proximal distance on a chromosome. Although a large number of miRNA clusters have been uncovered in animal and plant genomes, the functional consequences of this arrangement are still poorly understood. Located in a polycistron, the coexpressed miRNA clusters are pivotal in coordinately regulating multiple processes, including embryonic development, cell cycles and cell differentiation. In this review, based on recent progress, we discuss the genomic diversity of miRNA gene clusters, the coordination of expression and function of the clustered miRNAs, and the evolutionarily adaptive processes with gain and loss of the clustering miRNA genes mediated by duplication and transposition events.

  5. Diversity and evolution of MicroRNA gene clusters

    Institute of Scientific and Technical Information of China (English)


    microRNA (miRNA) gene clusters are a group of miRNA genes clustered within a proximal distance on a chromosome. Although a large number of miRNA clusters have been uncovered in animal and plant genomes, the functional consequences of this arrangement are still poorly understood. Located in a polycistron, the coexpressed miRNA clusters are pivotal in coordinately regulating multiple processes, including embryonic development, cell cycles and cell differentiation. In this review, based on recent progress, we discuss the genomic diversity of miRNA gene clusters, the coordination of expression and function of the clustered miRNAs, and the evolutionarily adaptive processes with gain and loss of the clustering miRNA genes mediated by duplication and transposition events.


    Directory of Open Access Journals (Sweden)



    Full Text Available Microarray technology has now made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. However, the high dimensionality of gene expression data makes them difficult to analyze. Many clustering algorithms are available. In this paper we first briefly introduce the concepts of microarray technology and discuss the basic elements of clustering on gene expression data. Then we introduce rough clustering and explore its advantage over strict and fuzzy clustering. We also explain why rough clustering is preferred over other conventional methods by presenting a survey of a few clustering algorithms based on rough set theory for gene expression data. We conclude by stating that this area is a promising research field for the research community.

  7. Computing gene expression data with a knowledge-based gene clustering approach. (United States)

    Rosa, Bruce A; Oh, Sookyung; Montgomery, Beronda L; Chen, Jin; Qin, Wensheng


    Computational analysis methods for gene expression data gathered in microarray experiments can be used to identify the functions of previously unstudied genes. While obtaining the expression data is not a difficult task, interpreting and extracting the information from the datasets is challenging. In this study, a knowledge-based approach which identifies and saves important functional genes before filtering based on variability and fold change differences was utilized to study light regulation. Two clustering methods were used to cluster the filtered datasets, and clusters containing a key light regulatory gene were located. The common genes to both of these clusters were identified, and the genes in the common cluster were ranked based on their coexpression to the key gene. This process was repeated for 11 key genes in 3 treatment combinations. The initial filtering method reduced the dataset size from 22,814 probes to an average of 1134 genes, and the resulting common cluster lists contained an average of only 14 genes. These common cluster lists scored higher gene enrichment scores than two individual clustering methods. In addition, the filtering method increased the proportion of light responsive genes in the dataset from 1.8% to 15.2%, and the cluster lists increased this proportion to 18.4%. The relatively short length of these common cluster lists compared to gene groups generated through typical clustering methods or coexpression networks narrows the search for novel functional genes while increasing the likelihood that they are biologically relevant.
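
    The common-cluster step described above (find the cluster containing the key gene under each of two clustering methods, keep the genes shared by both, and rank them by coexpression with the key gene) can be sketched in a few lines. This is an illustration under stated assumptions: Pearson correlation is assumed as the coexpression measure, and the function names and the gene-to-label dictionary layout are hypothetical rather than taken from the study.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 if either profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def common_cluster(
    key_gene: str,
    clustering_a: Dict[str, int],
    clustering_b: Dict[str, int],
    expression: Dict[str, Sequence[float]],
) -> List[str]:
    """Genes sharing the key gene's cluster in both clusterings, best first."""
    in_a = {g for g, c in clustering_a.items() if c == clustering_a[key_gene]}
    in_b = {g for g, c in clustering_b.items() if c == clustering_b[key_gene]}
    shared = (in_a & in_b) - {key_gene}
    key_profile = expression[key_gene]
    # Rank by correlation with the key gene; break ties by gene name.
    return sorted(shared, key=lambda g: (-pearson(expression[g], key_profile), g))
```

    Because only genes supported by both clusterings survive the intersection, the resulting list is short by construction, which matches the narrowing effect the abstract reports.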

  8. Detecting Sequence Homology at the Gene Cluster Level with MultiGeneBlast

    NARCIS (Netherlands)

    Medema, Marnix H.; Takano, Eriko; Breitling, Rainer; Nowick, Katja


    The genes encoding many biomolecular systems and pathways are genomically organized in operons or gene clusters. With MultiGeneBlast, we provide a user-friendly and effective tool to perform homology searches with operons or gene clusters as basic units, instead of single genes. The contextualizatio

  9. A Nomadic Subtelomeric Disease Resistance Gene Cluster in Common Bean (United States)

    The B4 resistance (R)-gene cluster, located in subtelomeric region of chromosome 4, is one of the largest clusters known in common bean (Phaseolus vulgaris, Pv). We sequenced 650 kb spanning this locus and annotated 97 genes, 26 of which correspond to Coiled-coil-Nucleotide-Binding-Site-Leucine-Rich...

  10. Simultaneous clustering of multiple gene expression and physical interaction datasets.

    Directory of Open Access Journals (Sweden)

    Manikandan Narayanan


    Full Text Available Many genome-wide datasets are routinely generated to study different aspects of biological systems, but integrating them to obtain a coherent view of the underlying biology remains a challenge. We propose simultaneous clustering of multiple networks as a framework to integrate large-scale datasets on the interactions among and activities of cellular components. Specifically, we develop an algorithm JointCluster that finds sets of genes that cluster well in multiple networks of interest, such as coexpression networks summarizing correlations among the expression profiles of genes and physical networks describing protein-protein and protein-DNA interactions among genes or gene-products. Our algorithm provides an efficient solution to a well-defined problem of jointly clustering networks, using techniques that permit certain theoretical guarantees on the quality of the detected clustering relative to the optimal clustering. These guarantees coupled with an effective scaling heuristic and the flexibility to handle multiple heterogeneous networks make our method JointCluster an advance over earlier approaches. Simulation results showed JointCluster to be more robust than alternate methods in recovering clusters implanted in networks with high false positive rates. In systematic evaluation of JointCluster and some earlier approaches for combined analysis of the yeast physical network and two gene expression datasets under glucose and ethanol growth conditions, JointCluster discovers clusters that are more consistently enriched for various reference classes capturing different aspects of yeast biology or yield better coverage of the analysed genes. These robust clusters, which are supported across multiple genomic datasets and diverse reference classes, agree with known biology of yeast under these growth conditions, elucidate the genetic control of coordinated transcription, and enable functional predictions for a number of uncharacterized genes.

  11. Super-paramagnetic clustering of yeast gene expression profiles

    CERN Document Server

    Getz, G; Domany, E; Zhang, M Q


    High-density DNA arrays, used to monitor gene expression at a genomic scale, have produced vast amounts of information which require the development of efficient computational methods to analyze them. The important first step is to extract the fundamental patterns of gene expression inherent in the data. This paper describes the application of a novel clustering algorithm, Super-Paramagnetic Clustering (SPC) to analysis of gene expression profiles that were generated recently during a study of the yeast cell cycle. SPC was used to organize genes into biologically relevant clusters that are suggestive for their co-regulation. Some of the advantages of SPC are its robustness against noise and initialization, a clear signature of cluster formation and splitting, and an unsupervised self-organized determination of the number of clusters at each resolution. Our analysis revealed interesting correlated behavior of several groups of genes which has not been previously identified.

  12. Super-paramagnetic clustering of yeast gene expression profiles (United States)

    Getz, G.; Levine, E.; Domany, E.; Zhang, M. Q.


    High-density DNA arrays, used to monitor gene expression at a genomic scale, have produced vast amounts of information which require the development of efficient computational methods to analyze them. The important first step is to extract the fundamental patterns of gene expression inherent in the data. This paper describes the application of a novel clustering algorithm, super-paramagnetic clustering (SPC) to analysis of gene expression profiles that were generated recently during a study of the yeast cell cycle. SPC was used to organize genes into biologically relevant clusters that are suggestive for their co-regulation. Some of the advantages of SPC are its robustness against noise and initialization, a clear signature of cluster formation and splitting, and an unsupervised self-organized determination of the number of clusters at each resolution. Our analysis revealed interesting correlated behavior of several groups of genes which has not been previously identified.

  13. Bioinformatics Prediction of Polyketide Synthase Gene Clusters from Mycosphaerella fijiensis.

    Directory of Open Access Journals (Sweden)

    Roslyn D Noar

    Full Text Available Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides; however, three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (<60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that

  14. Some statistical properties of gene expression clustering for array data

    DEFF Research Database (Denmark)

    Abreu, G C G; Pinheiro, A; Drummond, R D;


    DNA arrays have been a rich source of data for the study of genomic expression of a wide variety of biological systems. Gene clustering is one of the paradigms quite used to assess the significance of a gene (or group of genes). However, most of the gene clustering techniques are applied to cDNA array data without a corresponding statistical error measure. We propose an easy-to-implement and simple-to-use technique that uses bootstrap re-sampling to evaluate the statistical error of the nodes provided by SOM-based clustering. Comparisons between SOM and parametric clustering are presented for simulated as well as for two real data sets. We also implement a bootstrap-based pre-processing procedure for SOM, that improves the false discovery ratio of differentially expressed genes. Code in Matlab is freely available, as well as some supplementary material, at the following address: https...

  15. clusterProfiler: an R package for comparing biological themes among gene clusters. (United States)

    Yu, Guangchuang; Wang, Li-Gen; Han, Yanyan; He, Qing-Yu


    Increasing quantitative data generated from transcriptomics and proteomics require integrative strategies for analysis. Here, we present an R package, clusterProfiler that automates the process of biological-term classification and the enrichment analysis of gene clusters. The analysis module and visualization module were combined into a reusable workflow. Currently, clusterProfiler supports three species, including humans, mice, and yeast. Methods provided in this package can be easily extended to other species and ontologies. The clusterProfiler package is released under Artistic-2.0 License within Bioconductor project. The source code and vignette are freely available at
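
    Enrichment analysis of a gene cluster is commonly framed as an over-representation test. The sketch below assumes a one-sided hypergeometric test and is written in Python for illustration only. It is not the clusterProfiler API (which is an R package), and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(
    cluster_size: int,
    hits_in_cluster: int,
    background_size: int,
    annotated_in_background: int,
) -> float:
    """P(X >= hits_in_cluster) when drawing cluster_size genes at random.

    X counts how many drawn genes carry the annotation of interest, given
    that annotated_in_background of the background_size genes carry it.
    """
    unannotated = background_size - annotated_in_background
    upper = min(cluster_size, annotated_in_background)
    # Count the draws with k annotated genes for every k at least as extreme.
    tail = sum(
        comb(annotated_in_background, k) * comb(unannotated, cluster_size - k)
        for k in range(hits_in_cluster, upper + 1)
    )
    return tail / comb(background_size, cluster_size)
```

    A small p-value indicates that the term appears in the cluster more often than random sampling from the background would suggest. When many terms are tested, a multiple-testing correction would normally follow.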

  16. Genomic Analyses of Bacterial Porin-Cytochrome Gene Clusters

    Directory of Open Access Journals (Sweden)

    Liang Shi


    Full Text Available The porin-cytochrome (Pcc) protein complex is responsible for trans-outer membrane electron transfer during extracellular reduction of Fe(III) by the dissimilatory metal-reducing bacterium Geobacter sulfurreducens PCA. The identified and characterized Pcc complex of G. sulfurreducens PCA consists of a porin-like outer-membrane protein, a periplasmic 8-heme c-type cytochrome (c-Cyt) and an outer-membrane 12-heme c-Cyt, and the genes encoding the Pcc proteins are clustered in the same regions of genome (i.e., the pcc gene clusters) of G. sulfurreducens PCA. A survey of additional microbial genomes has identified the pcc gene clusters in all sequenced Geobacter spp. and other bacteria from six different phyla, including Anaeromyxobacter dehalogenans 2CP-1, A. dehalogenans 2CP-C, Anaeromyxobacter sp. K, Candidatus Kuenenia stuttgartiensis, Denitrovibrio acetiphilus DSM 12809, Desulfurispirillum indicum S5, Desulfurivibrio alkaliphilus AHT2, Desulfurobacterium thermolithotrophum DSM 11699, Desulfuromonas acetoxidans DSM 684, Ignavibacterium album JCM 16511, and Thermovibrio ammonificans HB-1. The numbers of genes in the pcc gene clusters vary, ranging from two to nine. Similar to the metal-reducing (Mtr) gene clusters of other Fe(III)-reducing bacteria, such as Shewanella spp., additional genes that encode putative c-Cyts with predicted cellular localizations at the cytoplasmic membrane, periplasm and outer membrane often associate with the pcc gene clusters. This suggests that the Pcc-associated c-Cyts may be part of the pathways for extracellular electron transfer reactions. The presence of pcc gene clusters in the microorganisms that do not reduce solid-phase Fe(III) and Mn(IV) oxides, such as D. alkaliphilus AHT2 and I. album JCM 16511, also suggests that some of the pcc gene clusters may be involved in extracellular electron transfer reactions with the substrates other than Fe(III) and Mn(IV) oxides.

  17. Minimum Information about a Biosynthetic Gene cluster: commentary

    NARCIS (Netherlands)

    Medema, Marnix H; Kottmann, Renzo; Yilmaz, Pelin; Cummings, Matthew; Biggins, John B; Blin, Kai; de Bruijn, Irene; Chooi, Yit Heng; Claesen, Jan; Coates, R Cameron; Cruz-Morales, Pablo; Duddela, Srikanth; Dusterhus, Stephanie; Edwards, Daniel J; Fewer, David P; Garg, Neha; Geiger, Christoph; Gomez-Escribano, Juan Pablo; Greule, Anja; Hadjithomas, Michalis; Haines, Anthony S; Helfrich, Eric J N; Hillwig, Matthew L; Ishida, Keishi; Jones, Adam C; Jones, Carla S; Jungmann, Katrin; Kegler, Carsten; Kim, Hyun Uk; Kotter, Peter; Krug, Daniel; Masschelein, Joleen; Melnik, Alexey V; Mantovani, Simone M; Monroe, Emily A; Moore, Marcus; Moss, Nathan; Nutzmann, Hans-Wilhelm; Pan, Guohui; Pati, Amrita; Petras, Daniel; Reen, F Jerry; Rosconi, Federico; Rui, Zhe; Tian, Zhenhua; Tobias, Nicholas J; Tsunematsu, Yuta; Wiemann, Philipp; Wyckoff, Elizabeth; Yan, Xiaohui; Yim, Grace; Yu, Fengan; Xie, Yunchang; Aigle, Bertrand; Apel, Alexander K; Balibar, Carl J; Balskus, Emily P; Barona-Gomez, Francisco; Bechthold, Andreas; Bode, Helge B; Borriss, Rainer; Brady, Sean F; Brakhage, Axel A; Caffrey, Patrick; Cheng, Yi-Qiang; Clardy, Jon; Cox, Russell J; De Mot, Rene; Donadio, Stefano; Donia, Mohamed S; van der Donk, Wilfred A; Dorrestein, Pieter C; Doyle, Sean; Driessen, Arnold J M; Ehling-Schulz, Monika; Entian, Karl-Dieter; Fischbach, Michael A; Gerwick, Lena; Gerwick, William H; Gross, Harald; Gust, Bertolt; Hertweck, Christian; Hofte, Monica; Jensen, Susan E; Ju, Jianhua; Katz, Leonard; Kaysser, Leonard; Klassen, Jonathan L; Keller, Nancy P; Kormanec, Jan; Kuipers, Oscar P; Kuzuyama, Tomohisa; Kyrpides, Nikos C; Kwon, Hyung-Jin; Lautru, Sylvie; Lavigne, Rob; Lee, Chia Y; Linquan, Bai; Liu, Xinyu; Liu, Wen; Luzhetskyy, Andriy; Mahmud, Taifo; Mast, Yvonne; Mendez, Carmen; Metsa-Ketela, Mikko; Micklefield, Jason; Mitchell, Douglas A; Moore, Bradley S; Moreira, Leonilde M; Muller, Rolf; Neilan, Brett A; Nett, Markus; Nielsen, Jens; O'Gara, Fergal; Oikawa, Hideaki; Osbourn, Anne; Osburne, Marcia S; Ostash, Bohdan; Payne, Shelley M; Pernodet, Jean-Luc; Petricek, Miroslav; Piel, Jorn; Ploux, Olivier; Raaijmakers, Jos M; Salas, Jose A; Schmitt, Esther K; Scott, Barry; Seipke, Ryan F; Shen, Ben; Sherman, David H; Sivonen, Kaarina; Smanski, Michael J; Sosio, Margherita; Stegmann, Evi; Sussmuth, Roderich D; Tahlan, Kapil; Thomas, Christopher M; Tang, Yi; Truman, Andrew W; Viaud, Muriel; Walton, Jonathan D; Walsh, Christopher T; Weber, Tilmann; van Wezel, Gilles P; Wilkinson, Barrie; Willey, Joanne M; Wohlleben, Wolfgang; Wright, Gerard D; Ziemert, Nadine; Zhang, Changsheng; Zotchev, Sergey B; Breitling, Rainer; Takano, Eriko; Glockner, Frank Oliver


    A wide variety of enzymatic pathways that produce specialized metabolites in bacteria, fungi and plants are known to be encoded in biosynthetic gene clusters. Information about these clusters, pathways and metabolites is currently dispersed throughout the literature, making it difficult to exploit.

  18. Hox gene clusters in the Indonesian coelacanth, Latimeria menadoensis. (United States)

    Koh, Esther G L; Lam, Kevin; Christoffels, Alan; Erdmann, Mark V; Brenner, Sydney; Venkatesh, Byrappa


    The Hox genes encode transcription factors that play a key role in specifying body plans of metazoans. They are organized into clusters that contain up to 13 paralogue group members. The complex morphology of vertebrates has been attributed to the duplication of Hox clusters during vertebrate evolution. In contrast to the single Hox cluster in the amphioxus (Branchiostoma floridae), an invertebrate-chordate, mammals have four clusters containing 39 Hox genes. Ray-finned fishes (Actinopterygii) such as zebrafish and fugu possess more than four Hox clusters. The coelacanth occupies a basal phylogenetic position among lobe-finned fishes (Sarcopterygii), which gave rise to the tetrapod lineage. The lobe fins of sarcopterygians are considered to be the evolutionary precursors of tetrapod limbs. Thus, the characterization of Hox genes in the coelacanth should provide insights into the origin of tetrapod limbs. We have cloned the complete second exon of 33 Hox genes from the Indonesian coelacanth, Latimeria menadoensis, by extensive PCR survey and genome walking. Phylogenetic analysis shows that 32 of these genes have orthologs in the four mammalian HOX clusters, including three genes (HoxA6, D1, and D8) that are absent in ray-finned fishes. The remaining coelacanth gene is an ortholog of hoxc1 found in zebrafish but absent in mammals. Our results suggest that coelacanths have four Hox clusters bearing a gene complement more similar to mammals than to ray-finned fishes, but with an additional gene, HoxC1, which has been lost during the evolution of mammals from lobe-finned fishes.

  19. A Rough Set based Gene Expression Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    J. J. Emilyn


    Full Text Available Problem statement: Microarray technology helps in monitoring the expression levels of thousands of genes across collections of related samples. Approach: The main goal in the analysis of large and heterogeneous gene expression datasets was to identify groups of genes that get expressed in a set of experimental conditions. Results: Several clustering techniques have been proposed for identifying gene signatures and understanding their role, and many of them have been applied to gene expression data, but with partial success. The main aim of this work was to develop a clustering algorithm that would successfully identify gene patterns. The proposed novel clustering technique (RCGED) provides an efficient way of finding the hidden and unique gene expression patterns. It overcomes the restriction of one object being placed in only one cluster. Conclusion/Recommendations: The proposed algorithm is termed intelligent because it automatically determines the optimum number of clusters. The proposed algorithm was evaluated on a colon cancer dataset and the results were compared with the Rough Fuzzy K Means algorithm.

  20. Phylogeny of the Insect Homeobox Gene (Hox) Cluster

    Institute of Scientific and Technical Information of China (English)

    Sangeeta Dhawan; K. P. Gopinathan


    The homeobox (Hox) genes form an evolutionarily conserved family encoding transcription factors that play major roles in segmental identity and organ specification across species. The canonical grouping of Hox genes present in the HOM-C cluster of Drosophila or related clusters in other organisms includes eight "typical" genes, which are localized in the order labial (lab), proboscipedia (pb), Deformed (Dfd), Sex combs reduced (Scr), Antennapedia (Antp), Ultrabithorax (Ubx), abdominalA (abdA), and AbdominalB (AbdB). The members of the Hox cluster are expressed in a distinct anterior to posterior order in the embryo. Analysis of the relatedness of different members of the Hox gene cluster to each other in four evolutionarily diverse insect taxa revealed that the loci pb/Dfd and AbdB, which are farthest apart in linkage, had a high degree of evolutionary relatedness, indicating that pb/Dfd type anterior genes and AbdB are closest to the ancestral anterior and posterior Hox genes, respectively. The greater relatedness of other posterior genes Ubx and abdA to the more anterior genes such as Antp and Scr suggested that they arose by gene duplications in the more anterior members rather than the posterior AbdB.

  1. Secondary metabolic gene clusters: evolutionary toolkits for chemical innovation. (United States)

    Osbourn, Anne


    Microbes and plants produce a huge array of secondary metabolites that have important ecological functions. These molecules have long been exploited in medicine as antibiotics, anticancer and anti-infective agents and for a wide range of other applications. Gene clusters for secondary metabolic pathways are common in bacteria and filamentous fungi, and examples have now been discovered in plants. Here, current knowledge of gene clusters across the kingdoms is evaluated with the aim of trying to understand the rules behind cluster existence and evolution. Such knowledge will be crucial in learning how to activate the enormous number of 'silent' gene clusters being revealed by whole-genome sequencing and hence in making available a wealth of novel compounds for evaluation as drug leads and other bioactives. It could also facilitate the development of crop plants with enhanced pest or disease resistance, improved nutritional qualities and/or elevated levels of high-value products.

  2. Characterization of the largest effector gene cluster of Ustilago maydis. (United States)

    Brefort, Thomas; Tanaka, Shigeyuki; Neidig, Nina; Doehlemann, Gunther; Vincon, Volker; Kahmann, Regine


    In the genome of the biotrophic plant pathogen Ustilago maydis, many of the genes coding for secreted protein effectors modulating virulence are arranged in gene clusters. The vast majority of these genes encode novel proteins whose expression is coupled to plant colonization. The largest of these gene clusters, cluster 19A, encodes 24 secreted effectors. Deletion of the entire cluster results in severe attenuation of virulence. Here we present the functional analysis of this genomic region. We show that a 19A deletion mutant behaves like an endophyte, i.e. is still able to colonize plants and complete the infection cycle. However, tumors, the most conspicuous symptoms of maize smut disease, are only rarely formed and fungal biomass in infected tissue is significantly reduced. The generation and analysis of strains carrying sub-deletions identified several genes significantly contributing to tumor formation after seedling infection. Another of the effectors could be linked specifically to anthocyanin induction in the infected tissue. As the individual contributions of these genes to tumor formation were small, we studied the response of maize plants to the whole cluster mutant as well as to several individual mutants by array analysis. This revealed distinct plant responses, demonstrating that the respective effectors have discrete plant targets. We propose that the analysis of plant responses to effector mutant strains that lack a strong virulence phenotype may be a general way to visualize differences in effector function.

  3. Characterization of the largest effector gene cluster of Ustilago maydis.

    Directory of Open Access Journals (Sweden)

    Thomas Brefort


    Full Text Available In the genome of the biotrophic plant pathogen Ustilago maydis, many of the genes coding for secreted protein effectors modulating virulence are arranged in gene clusters. The vast majority of these genes encode novel proteins whose expression is coupled to plant colonization. The largest of these gene clusters, cluster 19A, encodes 24 secreted effectors. Deletion of the entire cluster results in severe attenuation of virulence. Here we present the functional analysis of this genomic region. We show that a 19A deletion mutant behaves like an endophyte, i.e. is still able to colonize plants and complete the infection cycle. However, tumors, the most conspicuous symptoms of maize smut disease, are only rarely formed and fungal biomass in infected tissue is significantly reduced. The generation and analysis of strains carrying sub-deletions identified several genes significantly contributing to tumor formation after seedling infection. Another of the effectors could be linked specifically to anthocyanin induction in the infected tissue. As the individual contributions of these genes to tumor formation were small, we studied the response of maize plants to the whole cluster mutant as well as to several individual mutants by array analysis. This revealed distinct plant responses, demonstrating that the respective effectors have discrete plant targets. We propose that the analysis of plant responses to effector mutant strains that lack a strong virulence phenotype may be a general way to visualize differences in effector function.

  4. A maize-specifically expressed gene cluster in Ustilago maydis. (United States)

    Basse, Christoph W; Kolb, Sebastian; Kahmann, Regine


    The corn pathogen Ustilago maydis requires its host plant maize for development and completion of its sexual cycle. We have identified the fungal mig2-1 gene as being specifically expressed during this biotrophic stage. Intriguingly, mig2-1 is part of a gene cluster comprising five highly homologous and similarly regulated genes designated mig2-1 to mig2-5. Deletion analysis of the mig2-1 promoter provides evidence for negative and positive regulation. The predicted polypeptides of all five genes lack significant homologies to known genes but have characteristic N-terminal secretion sequences. The secretion signals of mig2-1 and mig2-5 were shown to be functional, and secretion of a full length Mig2-1-eGFP fusion protein to the extracellular space was demonstrated. The central domains of the Mig2 proteins are highly variable whereas the C-termini are strongly conserved and share a characteristic pattern of eight cysteine residues. The mig2 gene cluster was conserved in a wide collection of U. maydis strains. Interestingly, some U. maydis isolates from South America had lost the mig2-4 gene as a result of a homologous recombination event. Furthermore, the related Ustilago scitaminea strain, which is pathogenic on sugar cane, appears to lack the mig2 cluster. We describe a model of how the mig2 cluster might have evolved and discuss its possible role in governing host interaction.

  5. Identification of nitrogen-fixing genes and gene clusters from metagenomic library of acid mine drainage.

    Directory of Open Access Journals (Sweden)

    Zhimin Dai

    Full Text Available Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community.

  6. Identification of nitrogen-fixing genes and gene clusters from metagenomic library of acid mine drainage. (United States)

    Dai, Zhimin; Guo, Xue; Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan


    Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community.

  7. Genome classification by gene distribution: An overlapping subspace clustering approach

    Directory of Open Access Journals (Sweden)

    Halgamuge Saman K


    Full Text Available Abstract Background Genomes of lower organisms have been observed with a large amount of horizontal gene transfers, which cause difficulties in their evolutionary study. Bacteriophage genomes are a typical example. One recent approach that addresses this problem is the unsupervised clustering of genomes based on gene order and genome position, which helps to reveal species relationships that may not be apparent from traditional phylogenetic methods. Results We propose the use of an overlapping subspace clustering algorithm for such genome classification problems. The advantage of subspace clustering over traditional clustering is that it can associate clusters with gene arrangement patterns, preserving genomic information in the clusters produced. Additionally, overlapping capability is desirable for the discovery of multiple conserved patterns within a single genome, such as those acquired from different species via horizontal gene transfers. The proposed method involves a novel strategy to vectorize genomes based on their gene distribution. A number of existing subspace clustering and biclustering algorithms were evaluated to identify the best framework upon which to develop our algorithm; we extended a generic subspace clustering algorithm called HARP to incorporate overlapping capability. The proposed algorithm was assessed and applied on bacteriophage genomes. The phage grouping results are consistent overall with the Phage Proteomic Tree and showed common genomic characteristics among the TP901-like, Sfi21-like and sk1-like phage groups. Among 441 phage genomes, we identified four significantly conserved distribution patterns structured by the terminase, portal, integrase, holin and lysin genes. We also observed a subgroup of Sfi21-like phages comprising a distinctive divergent genome organization and identified nine new phage members to the Sfi21-like genus: Staphylococcus 71, phiPVL108, Listeria A118, 2389, Lactobacillus phi AT3, A2

  8. Unique nucleotide polymorphism of ankyrin gene cluster in Arabidopsis

    Indian Academy of Sciences (India)

    Jianchang Du; Xingna Wang; Mingsheng Zhang; Dacheng Tian; Yong-Hua Yang


    The ankyrin (ANK) gene cluster is a part of a multigene family encoding ANK transmembrane proteins in Arabidopsis thaliana, and plays an important role in protein–protein interactions and in signal pathways. In contrast to other regions of a genome, the ANK gene cluster exhibits an extremely high level of DNA polymorphism in an ∼5-kb region, without apparent decay. Phylogenetic analysis detects two clear, deeply differentiated haplotypes (dimorphism). The divergence between haplotypes of accession Col-0 and Ler-0 (Hap-C and Hap-L) is estimated to be 10.7%, approximately equal to the 10.5% average divergence between A. thaliana and A. lyrata. Sequence comparisons for the ANK gene cluster homologues in Col-0 indicate that the members evolve independently, and that the similarity among paralogues is lower than between alleles. Very little intralocus recombination or gene conversion is detected in ANK regions. All these characteristics of the ANK gene cluster are consistent with a tandem gene duplication and birth-and-death process. The possible mechanisms for and implications of this elevated nucleotide variation are also discussed, including the suggestion of balancing selection.

  9. Accurate prediction of secondary metabolite gene clusters in filamentous fungi. (United States)

    Andersen, Mikael R; Nielsen, Jakob B; Klitgaard, Andreas; Petersen, Lene M; Zachariasen, Mia; Hansen, Tilde J; Blicher, Lene H; Gotfredsen, Charlotte H; Larsen, Thomas O; Nielsen, Kristian F; Mortensen, Uffe H


    Biosynthetic pathways of secondary metabolites from fungi are currently subject to an intense effort to elucidate the genetic basis for these compounds due to their large potential within pharmaceutics and synthetic biochemistry. The preferred method is methodical gene deletions to identify supporting enzymes for key synthases one cluster at a time. In this study, we design and apply a DNA expression array for Aspergillus nidulans in combination with legacy data to form a comprehensive gene expression compendium. We apply a guilt-by-association-based analysis to predict the extent of the biosynthetic clusters for the 58 synthases active in our set of experimental conditions. A comparison with legacy data shows the method to be accurate in 13 of 16 known clusters and nearly accurate for the remaining 3 clusters. Furthermore, we apply a data clustering approach, which identifies cross-chemistry between physically separate gene clusters (superclusters), and validate this both with legacy data and experimentally by prediction and verification of a supercluster consisting of the synthase AN1242 and the prenyltransferase AN11080, as well as identification of the product compound nidulanin A. We have used A. nidulans for our method development and validation due to the wealth of available biochemical data, but the method can be applied to any fungus with a sequenced and assembled genome, thus supporting further secondary metabolite pathway elucidation in the fungal kingdom.
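
    The guilt-by-association idea in this abstract can be illustrated with a minimal sketch: starting from a synthase gene, neighbouring genes are added to the cluster for as long as their expression profile correlates with that of the synthase. The function names, the Pearson threshold, and the toy profiles below are assumptions for illustration only; the abstract does not describe the authors' actual scoring scheme.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def extend_cluster(profiles, seed, min_r):
    """Grow a cluster left and right from the seed gene.

    profiles: expression profiles listed in chromosomal order (hypothetical input).
    seed:     index of the synthase gene.
    min_r:    minimum correlation with the seed to stay in the cluster (assumed cutoff).
    Returns the (first, last) gene indices of the predicted cluster.
    """
    left = right = seed
    while left > 0 and pearson(profiles[left - 1], profiles[seed]) >= min_r:
        left -= 1
    while right < len(profiles) - 1 and pearson(profiles[right + 1], profiles[seed]) >= min_r:
        right += 1
    return left, right


# Toy profiles for five adjacent genes across four conditions (made-up values).
toy_profiles = [
    [5, 1, 4, 2],  # not co-expressed with the seed
    [1, 2, 3, 4],  # co-expressed
    [2, 4, 6, 8],  # seed synthase
    [1, 3, 5, 8],  # co-expressed
    [4, 3, 2, 1],  # anti-correlated
]
cluster_bounds = extend_cluster(toy_profiles, seed=2, min_r=0.8)
print(cluster_bounds)
```

    On the toy data the cluster spans the three co-expressed genes around the seed. A real analysis would need many more conditions and a way to tolerate a single poorly correlated gene inside a cluster.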

  10. Identification of the Scopularide Biosynthetic Gene Cluster in Scopulariopsis brevicaulis

    Directory of Open Access Journals (Sweden)

    Mie Bech Lukassen


    Full Text Available Scopularide A is a promising potent anticancer lipopeptide isolated from a marine derived Scopulariopsis brevicaulis strain. The compound consists of a reduced carbon chain (3-hydroxy-methyldecanoyl) attached to five amino acids (glycine, l-valine, d-leucine, l-alanine, and l-phenylalanine). Using the newly sequenced S. brevicaulis genome we were able to identify the putative biosynthetic gene cluster using genetic information from the structurally related emericellamide A from Aspergillus nidulans and W493-B from Fusarium pseudograminearum. The scopularide A gene cluster includes a nonribosomal peptide synthetase (NRPS1), a polyketide synthase (PKS2), a CoA ligase, an acyltransferase, and a transcription factor. Homologous recombination was low in S. brevicaulis so the local transcription factor was integrated randomly under a constitutive promoter, which led to a three to four-fold increase in scopularide A production. This indirectly verifies the identity of the proposed biosynthetic gene cluster.

  11. Evolution and differential expression of a vertebrate vitellogenin gene cluster

    Directory of Open Access Journals (Sweden)

    Kongshaug Heidi


    Full Text Available Abstract Background The multiplicity or loss of the vitellogenin (vtg) gene family in vertebrates has been argued to have broad implications for the mode of reproduction (placental or non-placental), cleavage pattern (meroblastic or holoblastic) and character of the egg (pelagic or benthic). Earlier proposals for the existence of three forms of vertebrate vtgs present conflicting models for their origin and subsequent duplication. Results By integrating phylogenetics of novel vtg transcripts from old and modern teleosts with syntenic analyses of all available genomic variants of non-metatherian vertebrates we identify the gene orthologies between the Sarcopterygii (tetrapod branch) and Actinopterygii (fish branch). We argue that the vertebrate vtg gene cluster originated in proto-chromosome m, but that vtg genes have subsequently duplicated and rearranged following whole genome duplications. Sequencing of a novel fourth vtg transcript in labrid species, and the presence of duplicated paralogs in certain model organisms supports the notion that lineage-specific gene duplications frequently occur in teleosts. The data show that the vtg gene cluster is more conserved between acanthomorph teleosts and tetrapods, than in ostariophysan teleosts such as the zebrafish. The differential expression of the labrid vtg genes is further consistent with the notion that neofunctionalized Aa-type vtgs are important determinants of the pelagic or benthic character of the eggs in acanthomorph teleosts. Conclusion The vertebrate vtg gene cluster existed prior to the separation of Sarcopterygii from Actinopterygii >450 million years ago, a period associated with the second round of whole genome duplication. The presence of higher copy numbers in a more highly expressed subcluster is particularly prevalent in teleosts. The differential expression and latent neofunctionalization of vtg genes in acanthomorph teleosts is an adaptive feature associated with oocyte hydration

  12. An alanine tRNA gene cluster from Nephila clavipes. (United States)

    Luciano, E; Candelas, G C


    We report the sequence of a 2.3-kb genomic DNA fragment from the orb-web spider, Nephila clavipes (Nc). The fragment contains four regions of high homology to tRNA(Ala). The members of this irregularly spaced cluster of genes are oriented in the same direction and have the same anticodon (GCA), but their sequence differs at several positions. Initiation and termination signals, as well as consensus intragenic promoter sequences characteristic of tRNA genes, have been identified in all genes. tRNA(Ala) are involved in the regulation of the fibroin synthesis in the large ampullate Nc glands.

  13. Coupled Two-Way Clustering Analysis of Gene Microarray Data

    CERN Document Server

    Getz, G; Domany, E


    We present a novel coupled two-way clustering approach to gene microarray data analysis. The main idea is to identify subsets of the genes and samples, such that when one of these is used to cluster the other, stable and significant partitions emerge. The search for such subsets is a computationally complex task: we present an algorithm, based on iterative clustering, which performs such a search. This analysis is especially suitable for gene microarray data, where the contributions of a variety of biological mechanisms to the gene expression levels are entangled in a large body of experimental data. The method was applied to two gene microarray data sets, on colon cancer and leukemia. By identifying relevant subsets of the data and focusing on them we were able to discover partitions and correlations that were masked and hidden when the full dataset was used in the analysis. Some of these partitions have clear biological interpretation; others can serve to identify possible directions for future research.

  14. Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

    Energy Technology Data Exchange (ETDEWEB)

    Santini, Simona; Boore, Jeffrey L.; Meyer, Axel


    Due to their high degree of conservation, functional regions in noncoding DNA can be identified by comparing DNA sequences among evolutionarily distantly related genomes. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.

  15. Coupled two-way clustering analysis of gene microarray data (United States)

    Getz, Gad; Levine, Erel; Domany, Eytan


    We present a coupled two-way clustering approach to gene microarray data analysis. The main idea is to identify subsets of the genes and samples, such that when one of these is used to cluster the other, stable and significant partitions emerge. The search for such subsets is a computationally complex task. We present an algorithm, based on iterative clustering, that performs such a search. This analysis is especially suitable for gene microarray data, where the contributions of a variety of biological mechanisms to the gene expression levels are entangled in a large body of experimental data. The method was applied to two gene microarray data sets, on colon cancer and leukemia. By identifying relevant subsets of the data and focusing on them we were able to discover partitions and correlations that were masked and hidden when the full dataset was used in the analysis. Some of these partitions have clear biological interpretation; others can serve to identify possible directions for future research.
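
    The core idea of the abstract, that a subset of genes can reveal a sample partition hidden when all genes are used, can be illustrated with a minimal sketch of a single round. The single-linkage threshold clustering used here is a simple stand-in chosen for brevity, not the clustering algorithm of the paper, and the function names, cutoff, and toy matrix are assumptions for illustration.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def threshold_clusters(vectors, cutoff):
    """Single-linkage grouping: items closer than `cutoff` end up in one cluster."""
    labels = list(range(len(vectors)))
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if euclidean(vectors[i], vectors[j]) < cutoff:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    groups = {}
    for index, lab in enumerate(labels):
        groups.setdefault(lab, []).append(index)
    return sorted(groups.values())


def sample_vectors(matrix, gene_rows):
    """Sample profiles restricted to the chosen gene rows."""
    return [[matrix[r][c] for r in gene_rows] for c in range(len(matrix[0]))]


def one_round(matrix, cutoff):
    """Cluster samples with all genes, then again with each gene cluster."""
    all_genes = list(range(len(matrix)))
    result = {"all genes": threshold_clusters(sample_vectors(matrix, all_genes), cutoff)}
    for gene_cluster in threshold_clusters(matrix, cutoff):
        if len(gene_cluster) > 1:
            restricted = sample_vectors(matrix, gene_cluster)
            result[tuple(gene_cluster)] = threshold_clusters(restricted, cutoff)
    return result


# Toy matrix: rows are genes, columns are samples (made-up values).
# Genes 0-1 split the samples one way, genes 2-3 split them another way.
expression = [
    [0.0, 0.0, 5.0, 5.0],
    [0.0, 0.0, 5.0, 5.0],
    [0.0, 5.0, 0.0, 5.0],
    [0.0, 5.0, 0.0, 5.0],
]
partitions = one_round(expression, cutoff=3.0)
for genes, samples in partitions.items():
    print(genes, samples)
```

    With all four genes, every sample is its own cluster at this cutoff, while each gene pair yields a clean two-group split of the samples. The full method repeats this step, feeding the resulting sample clusters back in to re-cluster the genes.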

  16. Gene duplication, modularity and adaptation in the evolution of the aflatoxin gene cluster

    Directory of Open Access Journals (Sweden)

    Jakobek Judy L


    Full Text Available Abstract Background The biosynthesis of aflatoxin (AF) involves over 20 enzymatic reactions in a complex polyketide pathway that converts acetate and malonate to the intermediates sterigmatocystin (ST) and O-methylsterigmatocystin (OMST), the respective penultimate and ultimate precursors of AF. Although these precursors are chemically and structurally very similar, their accumulation differs at the species level for Aspergilli. Notable examples are A. nidulans that synthesizes only ST, A. flavus that makes predominantly AF, and A. parasiticus that generally produces either AF or OMST. Whether these differences are important in the evolutionary/ecological processes of species adaptation and diversification is unknown. Equally unknown are the specific genomic mechanisms responsible for ordering and clustering of genes in the AF pathway of Aspergillus. Results To elucidate the mechanisms that have driven formation of these clusters, we performed systematic searches of aflatoxin cluster homologs across five Aspergillus genomes. We found a high level of gene duplication and identified seven modules consisting of highly correlated gene pairs (aflA/aflB, aflR/aflS, aflX/aflY, aflF/aflE, aflT/aflQ, aflC/aflW, and aflG/aflL). With the exception of A. nomius, contrasts of mean Ka/Ks values across all cluster genes showed significant differences in selective pressure between section Flavi and non-section Flavi species. A. nomius mean Ka/Ks values were more similar to partial clusters in A. fumigatus and A. terreus. Overall, mean Ka/Ks values were significantly higher for section Flavi than for non-section Flavi species. Conclusion Our results implicate several genomic mechanisms in the evolution of ST, OMST and AF cluster genes. Gene modules may arise from duplications of a single gene, whereby the function of the pre-duplication gene is retained in the copy (aflF/aflE) or the copies may partition the ancestral function (aflA/aflB). In some gene modules, the

  17. Transcriptional analysis of exopolysaccharides biosynthesis gene clusters in Lactobacillus plantarum. (United States)

    Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia


    Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.

  18. Functional Analysis of the Fusarielin Biosynthetic Gene Cluster

    Directory of Open Access Journals (Sweden)

    Aida Droce


    Full Text Available Fusarielins are polyketides with a decalin core produced by various species of Aspergillus and Fusarium. Although the responsible gene cluster has been identified, the biosynthetic pathway remains to be elucidated. In the present study, members of the gene cluster were deleted individually in a Fusarium graminearum strain overexpressing the local transcription factor. The results suggest that a trans-acting enoyl reductase (FSL5) assists the polyketide synthase FSL1 in biosynthesis of a polyketide product, which is released by hydrolysis by a trans-acting thioesterase (FSL2). Deletion of the epimerase (FSL3) resulted in accumulation of an unstable compound, which could be the released product. A novel compound, named prefusarielin, accumulated in the deletion mutant of the cytochrome P450 monooxygenase FSL4. Unlike the known fusarielins from Fusarium, this compound does not contain oxygenized decalin rings, suggesting that FSL4 is responsible for the oxygenation.

  19. Loss of Bloom syndrome protein destabilizes human gene cluster architecture. (United States)

    Killen, Michael W; Stults, Dawn M; Adachi, Noritaka; Hanakahi, Les; Pierce, Andrew J


    Bloom syndrome confers strong predisposition to malignancy in multiple tissue types. The Bloom syndrome patient (BLM) protein defective in the disease biochemically functions as a Holliday junction dissolvase and human cells lacking functional BLM show 10-fold elevated rates of sister chromatid exchange. Collectively, these phenomena suggest that dysregulated mitotic recombination drives the genomic instability underpinning the development of cancer in these individuals. Here we use physical analysis of the highly repeated, highly self-similar human ribosomal RNA gene clusters as sentinel biomarkers for dysregulated homologous recombination to demonstrate that loss of BLM protein function causes a striking increase in spontaneous molecular level genomic restructuring. Analysis of single-cell derived sub-clonal populations from wild-type human cell lines shows that gene cluster architecture is ordinarily very faithfully preserved under mitosis, but is so unstable in cell lines derived from BLMs as to make gene cluster architecture in different sub-clonal populations essentially unrecognizable one from another. Human cells defective in a different RecQ helicase, the WRN protein involved in the premature aging Werner syndrome, do not exhibit the gene cluster instability (GCI) phenotype, indicating that the BLM protein specifically, rather than RecQ helicases generally, holds back this recombination-mediated genomic instability. An ataxia-telangiectasia defective cell line also shows elevated rDNA GCI, although not to the extent of BLM defective cells. Genomic restructuring mediated by dysregulated recombination between the abundant low-copy repeats in the human genome may prove to be an important additional mechanism of genomic instability driving the initiation and progression of human cancer.

  20. Evaluation of clustering algorithms for gene expression data using gene ontology annotations

    Institute of Scientific and Technical Information of China (English)

    MA Ning; ZHANG Zheng-guo


    Background Clustering is a useful exploratory technique for interpreting gene expression data to reveal groups of genes sharing common functional attributes. Biologists frequently face the problem of choosing an appropriate algorithm. We aimed to provide a standalone, easily accessible and biologically oriented criterion for expression data clustering evaluation. Methods An external criterion utilizing annotation based similarities between genes is proposed in this work. Gene ontology information is employed as the annotation source. Comparisons among six widely used clustering algorithms over various types of gene expression data sets were carried out based on the criterion proposed. Results The rank of these algorithms given by the criterion coincides with our common knowledge. Single-linkage has significantly poorer performance, even worse than the random algorithm. Ward's method achieves the best performance in most cases. Conclusions The criterion proposed has a strong ability to distinguish among different clustering algorithms with different distance measurements. It is also demonstrated that analyzing main contributors of the criterion may offer some guidelines in finding local compact clusters. In addition, we suggest using Ward's algorithm for gene expression data analysis.
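
    An external, annotation-based criterion of the kind described in this abstract can be sketched as follows: a clustering scores well when genes placed in the same cluster share annotation terms. The abstract does not specify the similarity measure, so the Jaccard index, the function names, and the placeholder term labels below are assumptions for illustration rather than the authors' criterion.

```python
from itertools import combinations


def jaccard(terms_a, terms_b):
    """Overlap of two annotation term sets, from 0.0 (disjoint) to 1.0 (identical)."""
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def annotation_score(clusters, annotations):
    """Mean annotation similarity over all gene pairs that share a cluster."""
    similarities = [
        jaccard(annotations[g], annotations[h])
        for cluster in clusters
        for g, h in combinations(cluster, 2)
    ]
    return sum(similarities) / len(similarities) if similarities else 0.0


# Placeholder annotations; "t1".."t4" stand in for gene ontology terms.
annotations = {
    "g1": {"t1", "t2"},
    "g2": {"t1", "t2"},
    "g3": {"t3"},
    "g4": {"t3", "t4"},
}
coherent = [["g1", "g2"], ["g3", "g4"]]
scrambled = [["g1", "g3"], ["g2", "g4"]]

coherent_score = annotation_score(coherent, annotations)
scrambled_score = annotation_score(scrambled, annotations)
print(coherent_score, scrambled_score)
```

    The functionally coherent clustering scores higher than the scrambled one, which is the property that lets such a criterion rank clustering algorithms without reference to the expression values themselves.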

  1. Cloning large natural product gene clusters from the environment: Piecing environmental DNA gene clusters back together with TAR


    Kim, Jeffrey H.; Feng, Zhiyang; Bauer, John D.; Kallifidas, Dimitris; Calle, Paula Y.; Brady, Sean F


    A single gram of soil can contain thousands of unique bacterial species, of which only a small fraction is regularly cultured in the laboratory. Although the fermentation of cultured microorganisms has provided access to numerous bioactive secondary metabolites, with these same methods it is not possible to characterize the natural products encoded by the uncultured majority. The heterologous expression of biosynthetic gene clusters cloned from DNA extracted directly from environmental sample...

  2. Metabolic diversification--independent assembly of operon-like gene clusters in different plants. (United States)

    Field, Ben; Osbourn, Anne E


    Operons are clusters of unrelated genes with related functions that are a feature of prokaryotic genomes. Here, we report on an operon-like gene cluster in the plant Arabidopsis thaliana that is required for triterpene synthesis (the thalianol pathway). The clustered genes are coexpressed, as in bacterial operons. However, despite the resemblance to a bacterial operon, this gene cluster has been assembled from plant genes by gene duplication, neofunctionalization, and genome reorganization, rather than by horizontal gene transfer from bacteria. Furthermore, recent assembly of operon-like gene clusters for triterpene synthesis has occurred independently in divergent plant lineages (Arabidopsis and oat). Thus, selection pressure may act during the formation of certain plant metabolic pathways to drive gene clustering.

  3. Engineered Streptomyces avermitilis host for heterologous expression of biosynthetic gene cluster for secondary metabolites. (United States)

    Komatsu, Mamoru; Komatsu, Kyoko; Koiwai, Hanae; Yamada, Yuuki; Kozone, Ikuko; Izumikawa, Miho; Hashimoto, Junko; Takagi, Motoki; Omura, Satoshi; Shin-ya, Kazuo; Cane, David E; Ikeda, Haruo


    An industrial microorganism, Streptomyces avermitilis, which is a producer of anthelmintic macrocyclic lactones, avermectins, has been constructed as a versatile model host for heterologous expression of genes encoding secondary metabolite biosynthesis. Twenty of the entire biosynthetic gene clusters for secondary metabolites were successively cloned and introduced into a versatile model host S. avermitilis SUKA17 or 22. Almost all S. avermitilis transformants carrying the entire gene cluster produced metabolites as a result of the expression of biosynthetic gene clusters introduced. A few transformants were unable to produce metabolites, but their production was restored by the expression of biosynthetic genes using an alternative promoter or the expression of a regulatory gene in the gene cluster that controls the expression of biosynthetic genes in the cluster using an alternative promoter. Production of metabolites in some transformants of the versatile host was higher than that of the original producers, and cryptic biosynthetic gene clusters in the original producer were also expressed in a versatile host.

  4. Global Analysis of miRNA Gene Clusters and Gene Families Reveals Dynamic and Coordinated Expression

    Directory of Open Access Journals (Sweden)

    Li Guo


    Full Text Available To further understand the potential expression relationships of miRNAs in miRNA gene clusters and gene families, a global analysis was performed in 4 paired tumor (breast cancer) and adjacent normal tissue samples using deep sequencing datasets. The compositions of miRNA gene clusters and families are not random, and clustered and homologous miRNAs may have close relationships with overlapped miRNA species. Members in the miRNA group always had various expression levels, and even some showed larger expression divergence. Despite the dynamic expression as well as individual difference, these miRNAs always indicated consistent or similar deregulation patterns. The consistent deregulation expression may contribute to dynamic and coordinated interaction between different miRNAs in regulatory network. Further, we found that those clustered or homologous miRNAs that were also identified as sense and antisense miRNAs showed larger expression divergence. miRNA gene clusters and families indicated important biological roles, and the specific distribution and expression further enrich and ensure the flexible and robust regulatory network.

  5. Data Preprocessing in Cluster Analysis of Gene Expression

    Institute of Scientific and Technical Information of China (English)

    杨春梅; 万柏坤; 高晓峰


    Considering that the DNA microarray technology has generated explosive gene expression data and that it is urgent to analyse and to visualize such massive datasets with efficient methods, we investigate the data preprocessing methods used in cluster analysis, normalization or logarithm of the matrix, by using hierarchical clustering, principal component analysis (PCA) and self-organizing maps (SOMs). The results illustrate that when using the Euclidean distance as measuring metrics, logarithm of relative expression level is the best preprocessing method, while data preprocessed by normalization cannot attain the expected results because the data structure is ruined. If there are only a few principal components, the PCA is an effective method to extract the frame structure, while SOMs are more suitable for a specific structure.
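
    One commonly cited reason why a logarithm of relative expression suits Euclidean distance can be shown with a short sketch: on the raw ratio scale a two-fold increase lies farther from the baseline than a two-fold decrease, whereas on the log scale both lie at the same distance. This is offered as general background, not as the reasoning of the abstract, and the profiles and function names are assumptions for illustration.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def log2_transform(profile):
    """Convert relative expression ratios to log2 ratios."""
    return [math.log2(value) for value in profile]


# Made-up relative expression ratios across three conditions.
baseline = [1.0, 1.0, 1.0]   # unchanged gene
doubled = [2.0, 2.0, 2.0]    # two-fold up-regulated gene
halved = [0.5, 0.5, 0.5]     # two-fold down-regulated gene

raw_up = euclidean(doubled, baseline)
raw_down = euclidean(halved, baseline)
log_up = euclidean(log2_transform(doubled), log2_transform(baseline))
log_down = euclidean(log2_transform(halved), log2_transform(baseline))

print(raw_up, raw_down)   # unequal on the raw scale
print(log_up, log_down)   # equal on the log scale
```

    Any distance-based method, including hierarchical clustering and SOMs, inherits this asymmetry when run on untransformed ratios.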

  6. Identification and structural analysis of a novel snoRNA gene cluster from Arabidopsis thaliana

    Institute of Scientific and Technical Information of China (English)


    A Z2 snoRNA gene cluster, consisting of four antisense snoRNA genes, was identified from Arabidopsis thaliana. Sequence and structural analysis showed that the Z2 snoRNA gene cluster might be transcribed as a polycistronic precursor from an upstream promoter, and that the intergenic spacers of the gene cluster encode 'hairpin' structures similar to the processing recognition signals of the polycistronic snoRNA precursors of the yeast Saccharomyces cerevisiae. The results also revealed that multiple gene copies are a common characteristic of plant snoRNA genes, and that the cluster provides a good system for further revealing the transcription and expression mechanism of plant snoRNA gene clusters.

  7. Coupled Two-Way Clustering Analysis of Breast Cancer and Colon Cancer Gene Expression Data

    CERN Document Server

    Getz, Gad; Gal, Hilah; Kela, Itai; Domany, Eytan; Notterman, Dan A.


    We present and review Coupled Two Way Clustering, a method designed to mine gene expression data. The method identifies submatrices of the total expression matrix, whose clustering analysis reveals partitions of samples (and genes) into biologically relevant classes. We demonstrate, on data from colon and breast cancer, that we are able to identify partitions that elude standard clustering analysis.

  8. Evolutionary formation of gene clusters by reorganization: the meleagrin/roquefortine paradigm in different fungi. (United States)

    Martín, Juan F; Liras, Paloma


    The biosynthesis of secondary metabolites in fungi is catalyzed by enzymes encoded by genes linked in clusters that are frequently co-regulated at the transcriptional level. Formation of gene clusters may take place by de novo assembly of genes recruited from other cellular functions, but novel gene clusters are also formed by reorganization of progenitor clusters and are distributed by horizontal gene transfer. This article reviews (i) the published information on the roquefortine/meleagrin/neoxaline gene clusters of Penicillium chrysogenum (Penicillium rubens) and the short roquefortine cluster of Penicillium roqueforti, and (ii) the correlation of the genes present in those clusters with the enzymes and metabolites derived from these pathways. The P. chrysogenum roq/mel cluster consists of seven genes and includes a gene (roqT) encoding a 12-TMS transporter protein of the MFS family. Interestingly, the orthologous P. roqueforti gene cluster has only four genes and the roqT gene is present as a residual pseudogene that encodes only small peptides. Two of the genes present in the central region of the P. chrysogenum roq/mel cluster have been lost during the evolutionary formation of the short cluster and the order of the structural genes in the cluster has been rearranged. The two lost genes encode an N1 atom hydroxylase (nox) and a roquefortine scaffold-reorganizing oxygenase (sro). As a consequence, P. roqueforti has lost the ability to convert the roquefortine-type carbon skeleton to the glandicoline/meleagrin-type scaffold and is unable to produce glandicoline B, meleagrin and neoxaline. The loss of this genetic information is not recent and probably occurred millions of years ago when a progenitor Penicillium strain adapted to life in a few rich habitats such as cheese, fermented cereal grains or silage. P. roqueforti may be considered a "domesticated" variant of a progenitor common to contemporary P. chrysogenum and related Penicillia.

  9. Functional clustering of time series gene expression data by Granger causality

    Directory of Open Access Journals (Sweden)

    Fujita André


    Background: A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results: In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions: This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them.
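
    The Granger idea mentioned above can be sketched as a pairwise, single-lag test: series x is said to Granger-cause series y if adding the past of x to a model that already contains the past of y reduces the prediction error. The sketch below is a minimal textbook version and not the set-based method of the study; the simulated series, the coefficient 0.8, the noise level and the function names are all assumptions made for illustration.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def rss(design, target):
    """Residual sum of squares of the ordinary least-squares fit target ~ design."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(design, target)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(design, target))

def granger_f(x, y):
    """F statistic for the hypothesis 'x Granger-causes y' using one lag."""
    target = y[1:]
    restricted = [[1.0, y[t - 1]] for t in range(1, len(y))]        # past of y only
    full = [[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))]    # plus past of x
    rss_r, rss_f = rss(restricted, target), rss(full, target)
    dof = len(target) - 3  # observations minus parameters in the full model
    return (rss_r - rss_f) / (rss_f / dof)

# Simulated example: y follows x with a delay of one time point.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [rng.gauss(0, 0.1)] + [0.8 * x[t - 1] + rng.gauss(0, 0.1) for t in range(1, 200)]

f_xy = granger_f(x, y)  # large: the past of x helps predict y
f_yx = granger_f(y, x)  # small: the past of y does not help predict x
```

    In a clustering setting, statistics of this kind computed for every ordered gene pair could serve as edge weights of a directed graph; how the study aggregates them over gene sets is not described in the abstract.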

  10. Gravitation field algorithm and its application in gene cluster

    Directory of Open Access Journals (Sweden)

    Zheng Ming


    Background: Searching for optima is one of the most challenging tasks in clustering genes from available experimental data or given functions. SA, GA, PSO and other similar efficient global optimization methods are used by biotechnologists. All these algorithms are based on the imitation of natural phenomena. Results: This paper proposes a novel search optimization algorithm called the Gravitation Field Algorithm (GFA), which is derived from the well-known astronomical theory of planetary formation, the Solar Nebular Disk Model (SNDM). GFA simulates the gravitation field and outperforms GA and SA in some multimodal function optimization problems. GFA can also be applied to unimodal functions. GFA clusters the dataset from the Gene Expression Omnibus well. Conclusions: The mathematical proof demonstrates that GFA converges to the global optimum with probability 1 under three conditions for mass functions of one independent variable. In addition to these results, the fundamental optimization concept in this paper is used to analyze how SA and GA affect the global search and the inherent defects in SA and GA. Some results and source code (in Matlab) are publicly available at

  11. Arrangement of the Clostridium baratii F7 toxin gene cluster with identification of a σ factor that recognizes the botulinum toxin gene cluster promoters. (United States)

    Dover, Nir; Barash, Jason R; Burke, Julianne N; Hill, Karen K; Detter, John C; Arnon, Stephen S


    Botulinum neurotoxin (BoNT) is the most poisonous substance known and its eight toxin types (A to H) are distinguished by the inability of polyclonal antibodies that neutralize one toxin type to neutralize any of the other seven toxin types. Infant botulism, an intestinal toxemia orphan disease, is the most common form of human botulism in the United States. It results from swallowed spores of Clostridium botulinum (or rarely, neurotoxigenic Clostridium butyricum or Clostridium baratii) that germinate and temporarily colonize the lumen of the large intestine, where, as vegetative cells, they produce botulinum toxin. Botulinum neurotoxin is encoded by the bont gene that is part of a toxin gene cluster that includes several accessory genes. We sequenced for the first time the complete botulinum neurotoxin gene cluster of nonproteolytic C. baratii type F7. Like the type E and the nonproteolytic type F6 botulinum toxin gene clusters, the C. baratii type F7 had an orfX toxin gene cluster that lacked the regulatory botR gene, which is found in proteolytic C. botulinum strains and codes for an alternative σ factor. In the absence of botR, we identified a putative alternative regulatory gene located upstream of the C. baratii type F7 toxin gene cluster. This putative regulatory gene codes for a predicted σ factor that contains DNA-binding-domain homologues to the DNA-binding domains both of BotR and of other members of the TcdR-related group 5 of the σ70 family that are involved in the regulation of toxin gene expression in clostridia. We showed that this TcdR-related protein in association with RNA polymerase core enzyme specifically binds to the C. baratii type F7 botulinum toxin gene cluster promoters. This TcdR-related protein may therefore be involved in regulating the expression of the genes of the botulinum toxin gene cluster in neurotoxigenic C. baratii.

  12. Sequencing, characterization, and gene expression analysis of the histidine decarboxylase gene cluster of Morganella morganii. (United States)

    Ferrario, Chiara; Borgo, Francesca; de Las Rivas, Blanca; Muñoz, Rosario; Ricci, Giovanni; Fortina, Maria Grazia


    The histidine decarboxylase gene cluster of Morganella morganii DSM30146(T) was sequenced, and four open reading frames, named hdcT1, hdc, hdcT2, and hisRS, were identified. Two putative histidine/histamine antiporters (hdcT1 and hdcT2) were located upstream and downstream of the hdc gene, which encodes a pyridoxal-P dependent histidine decarboxylase, and were followed by the hisRS gene encoding a histidyl-tRNA synthetase. This organization was comparable with the gene cluster of other known Gram-negative bacteria, particularly with that of Klebsiella oxytoca. Recombinant Escherichia coli strains harboring plasmids carrying the M. morganii hdc gene were shown to overproduce histidine decarboxylase after IPTG induction at 37 °C for 4 h. Quantitative RT-PCR experiments revealed that the hdc and hisRS genes were highly induced under acidic and histidine-rich conditions. This work represents the first description and identification of the hdc-related genes in M. morganii. Results support the hypothesis that the histidine decarboxylation reaction in this prolific histamine-producing species may play a role in acid survival. Knowledge of the role and the regulation of genes involved in histidine decarboxylation should improve the design of rational strategies to avoid toxic histamine production in foods.

  13. Recurrent adenylation domain replacement in the microcystin synthetase gene cluster

    Directory of Open Access Journals (Sweden)

    Laakso Kati


    Background: Microcystins are small cyclic heptapeptide toxins produced by a range of distantly related cyanobacteria. Microcystins are synthesized on large NRPS-PKS enzyme complexes. Many structural variants of microcystins are produced simultaneously. A recombination event between the first module of mcyB (mcyB1) and mcyC in the microcystin synthetase gene cluster is linked to the simultaneous production of microcystin variants in strains of the genus Microcystis. Results: Here we undertook a phylogenetic study to investigate the order and timing of recombination between the mcyB1 and mcyC genes in a diverse selection of microcystin-producing cyanobacteria. Our results provide support for complex evolutionary processes taking place at the mcyB1 and mcyC adenylation domains, which recognize and activate the amino acids found at X and Z positions. We find evidence for recent recombination between mcyB1 and mcyC in strains of the genera Anabaena, Microcystis, and Hapalosiphon. We also find clear evidence for independent adenylation domain conversion of mcyB1 by unrelated peptide synthetase modules in strains of the genera Nostoc and Microcystis. The recombination events replace only the adenylation domain in each case and the condensation domains of mcyB1 and mcyC are not transferred together with the adenylation domain. Our findings demonstrate that the mcyB1 and mcyC adenylation domains are recombination hotspots in the microcystin synthetase gene cluster. Conclusion: Recombination is thought to be one of the main mechanisms driving the diversification of NRPSs. However, there is very little information on how recombination takes place in nature. This study demonstrates that functional peptide synthetases are created in nature through transfer of adenylation domains without the concomitant transfer of condensation domains.

  14. Application of Multi-SOM clustering approach to macrophage gene expression analysis. (United States)

    Ghouila, Amel; Yahia, Sadok Ben; Malouche, Dhafer; Jmel, Haifa; Laouini, Dhafer; Guerfali, Fatma Z; Abdelhak, Sonia


    The production of increasingly reliable and accessible gene expression data has stimulated the development of computational tools to interpret such data and to organize them efficiently. Clustering techniques are widely recognized as useful exploratory tools for gene expression data analysis. Genes that show similar expression patterns over a wide range of experimental conditions can be clustered together. This relies on the hypothesis that genes that belong to the same cluster are coregulated and involved in related functions. Nevertheless, clustering algorithms still show limits, particularly for the estimation of the number of clusters and the interpretation of hierarchical dendrograms, which may significantly influence the outputs of the analysis process. We propose here a multi-level SOM-based clustering algorithm named Multi-SOM. Through the use of clustering validity indices, Multi-SOM overcomes the problem of estimating the number of clusters. To test the validity of the proposed clustering algorithm, we first tested it on supervised training data sets. Results were evaluated by computing the number of misclassified samples. We then used Multi-SOM for the analysis of macrophage gene expression data generated in vitro from the blood of a single individual infected with 5 different pathogens. This analysis led to the identification of sets of tightly coregulated genes across different pathogens. Gene Ontology tools were then used to estimate the biological significance of the clustering, which showed that the obtained clusters are coherent and biologically significant.

  15. Unusual Gene Order and Organization of the Sea Urchin Hox Cluster


    Richardson, Paul M.; Lucas, Susan; Cameron, R. Andrew; Rowen, Lee; Nesbitt, Ryan; Bloom, Scott; Rast, Jonathan P.; Berney, Kevin; Arenas-Mena, Cesar; Martinez, Pedro; Davidson, Eric H.; Peterson, Kevin J.; Hood, Leroy


    The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and...

  16. A hybrid distance measure for clustering expressed sequence tags originating from the same gene family.

    Directory of Open Access Journals (Sweden)

    Keng-Hoong Ng

    BACKGROUND: Clustering is a key step in the processing of Expressed Sequence Tags (ESTs). The primary goal of clustering is to put ESTs from the same transcript of a single gene into a unique cluster. Recent EST clustering algorithms mostly adopt alignment-free distance measures, which tend to yield acceptable clustering accuracies with reasonable computational time. Although these clustering methods work satisfactorily on a majority of EST datasets, they have a common weakness: they are prone to deliver unsatisfactory clustering results when dealing with ESTs from genes derived from the same family. The root cause is that the distance measures applied to them are not sensitive enough to separate these closely related genes. METHODOLOGY/PRINCIPAL FINDINGS: We propose a hybrid distance measure that combines the global and local features extracted from ESTs, with the aim of addressing the clustering problem faced by ESTs derived from the same gene family. The clustering process is implemented using the DBSCAN algorithm. We test the hybrid distance measure on ten EST datasets, and the clustering results are compared with two alignment-free EST clustering tools, i.e. wcd and PEACE. The clustering results indicate that the proposed hybrid distance measure performs relatively better (in terms of clustering accuracy) than both EST clustering tools. CONCLUSIONS/SIGNIFICANCE: The clustering results provide support for the effectiveness of the proposed hybrid distance measure in solving the clustering problem for ESTs that originate from the same gene family. The improvement of clustering accuracies on the experimental datasets has supported the claim that the sensitivity of the hybrid distance measure is sufficient to solve the clustering problem.
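
    To make the notion of an alignment-free distance concrete, the sketch below compares sequences by their k-mer (word) counts. It is a generic illustration, not the hybrid measure proposed in the paper and not the measure used by wcd or PEACE; the word length `k=3`, the function names and the toy sequences are assumptions. Sequences that differ by a single base produce a very small distance, which hints at why purely global word statistics may struggle to separate closely related family members.

```python
from collections import Counter
import math

def kmer_profile(seq, k=3):
    """Count every overlapping word of length k in the sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_distance(a, b, k=3):
    """Euclidean distance between the k-mer count vectors of two sequences."""
    pa, pb = kmer_profile(a, k), kmer_profile(b, k)
    return math.sqrt(sum((pa[w] - pb[w]) ** 2 for w in set(pa) | set(pb)))

# Hypothetical toy sequences (real ESTs are hundreds of bases long).
est_a = "ATGGCGTACG"
est_b = "ATGGCGTACC"   # differs from est_a in the last base only
est_c = "TTTTTAAAAA"   # unrelated composition

d_ab = kmer_distance(est_a, est_b)  # small: one word differs
d_ac = kmer_distance(est_a, est_c)  # large: no shared words
```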

  17. Comparisons of Graph-structure Clustering Methods for Gene Expression Data

    Institute of Scientific and Technical Information of China (English)

    Zhuo FANG; Lei LIU; Jiong YANG; Qing-Ming LUO; Yi-Xue LI


    Although many numerical clustering algorithms have been applied to gene expression data analysis, the essential step is still biological interpretation by manual inspection. The correlation between genetic co-regulation and affiliation to a common biological process is what biologists expect. Here, we introduce some clustering algorithms that are based on graph structure constituted by biological knowledge. After applying a widely used dataset, we compared the result clusters of two of these algorithms in terms of the homogeneity of clusters and coherence of annotation and matching ratio. The results show that the clusters of knowledge-guided analysis are the kernel parts of the clusters of Gene Ontology (GO)-Cluster software, which contains the genes that are most expression correlative and most consistent with biological functions. Moreover, knowledge-guided analysis seems much more applicable than GO-Cluster in a larger dataset.

  18. Genetic and bibliographic information: CCL3L1 [GenLibi

    Lifescience Database Archive (English)

    Full Text Available ctions (C02.782.815.616) > HIV Infections (C02.782.815.616.400) Virus Diseases (C02) > Sexually Transmitted Diseases (C02.800) > Transmitted Diseases, Viral (C02.800.801) > HIV Inf

  19. Selections of data preprocessing methods and similarity metrics for gene cluster analysis

    Institute of Scientific and Technical Information of China (English)

    YANG Chunmei; WAN Baikun; GAO Xiaofeng


    Clustering is one of the major exploratory techniques for gene expression data analysis. High-quality results can be obtained in cluster analysis only with suitable similarity metrics and properly preprocessed datasets. In this study, gene expression datasets with external evaluation criteria were preprocessed by normalization by line, normalization by column or base-2 logarithmic transformation, and were subsequently clustered by hierarchical clustering, k-means clustering and self-organizing maps (SOMs) with the Pearson correlation coefficient or Euclidean distance as the similarity metric. Finally, the quality of the clusters was evaluated by the adjusted Rand index. The results illustrate that k-means clustering and SOMs have distinct advantages over hierarchical clustering in gene clustering, and that SOMs are slightly better than k-means when randomly initialized. They also show that hierarchical clustering performs best with the Pearson correlation coefficient as the similarity metric and datasets normalized by line. Meanwhile, k-means clustering and SOMs produce better clusters with Euclidean distance and logarithm-transformed datasets. These results provide a valuable reference for the implementation of gene expression cluster analysis.
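
    The adjusted Rand index used above as the external quality criterion can be computed directly from the contingency table of two labelings. The following is a minimal sketch of the standard Hubert-Arabie formula; the function name and the toy labelings are illustrative, and returning 1.0 for the degenerate case of two trivial partitions is a convention chosen here.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_true)
    # Pairs of items placed together in both partitions (contingency cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs placed together within each partition separately.
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)   # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:                     # both partitions are trivial
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

known = [0, 0, 1, 1]
print(adjusted_rand_index(known, [1, 1, 0, 0]))  # 1.0 (same partition, labels swapped)
print(adjusted_rand_index(known, [0, 1, 0, 1]))  # -0.5 (worse than chance)
```

    A value of 1 means the clustering reproduces the known classes exactly, values near 0 are what random assignment would give, and negative values indicate agreement below chance.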

  20. Enzymology of aminoglycoside biosynthesis-deduction from gene clusters. (United States)

    Wehmeier, Udo F; Piepersberg, Wolfgang


    The classical aminoglycosides are, with very few exceptions, typically actinobacterial secondary metabolites with antimicrobial activities all mediated by inhibiting translation on the 30S subunit of the bacterial ribosome. Some chemically related natural products inhibit glucosidases by mimicking oligo-alpha-1,4-glucosides. The biochemistry of the aminoglycoside biosynthetic pathways is still a developing field since none of the pathways has been analyzed to completeness as yet. In this chapter we treat the enzymology of aminoglycoside biosyntheses as far as it becomes apparent from recent investigations based on the availability of DNA sequence data of biosynthetic gene clusters for all major structural classes of these bacterial metabolites. We give a more general overview of the field, including descriptions of some key enzymes in various aminoglycoside pathways, whereas Chapter 20 provides a detailed account of the better-studied enzymology thus far known for the neomycin and butirosin pathways.

  1. The gsdf gene locus harbors evolutionary conserved and clustered genes preferentially expressed in fish previtellogenic oocytes. (United States)

    Gautier, Aude; Le Gac, Florence; Lareyre, Jean-Jacques


    display a different cellular localization compared to that of the gsdf gene, indicating that the latter gene is not co-regulated. Interestingly, our study identifies new clustered genes that are specifically expressed in previtellogenic oocytes (nup54, aff1, klhl8, sdad1).

  2. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    Directory of Open Access Journals (Sweden)

    Landfors Mattias


    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that
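
    Gene selection by high standard deviation, one of the preferred strategies named above, amounts to ranking genes by their variability across samples and keeping the top ones before clustering. The sketch below assumes a small dictionary of made-up expression values and a hypothetical helper `select_by_sd`; the number of genes to keep is a free parameter that the abstract suggests should be relatively large in practice.

```python
from statistics import pstdev

def select_by_sd(expr, n_genes):
    """Return the names of the n_genes genes with the highest standard deviation."""
    ranked = sorted(expr, key=lambda gene: pstdev(expr[gene]), reverse=True)
    return ranked[:n_genes]

# Hypothetical log-ratios for three genes across four samples.
expr = {
    "g1": [1.0, 1.0, 1.0, 1.0],   # constant, carries no class information
    "g2": [0.0, 4.0, 0.0, 4.0],   # strongly variable
    "g3": [1.0, 2.0, 1.0, 2.0],   # mildly variable
}

selected = select_by_sd(expr, 2)
print(selected)  # ['g2', 'g3']
```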

  3. Physical and genetic map of the major nif gene cluster from Azotobacter vinelandii.


    Jacobson, M R; Brigle, K E; Bennett, L T; Setterquist, R A; Wilson, M S; Cash, V L; Beynon, J; Newton, W E; Dean, D R


    Determination of a 28,793-base-pair DNA sequence of a region from the Azotobacter vinelandii genome that includes and flanks the nitrogenase structural gene region was completed. This information was used to revise the previously proposed organization of the major nif cluster. The major nif cluster from A. vinelandii encodes 15 nif-specific genes whose products bear significant structural identity to the corresponding nif-specific gene products from Klebsiella pneumoniae. These genes include ...

  4. Recursive Cluster Elimination (RCE) for classification and feature selection from gene expression data

    Directory of Open Access Journals (Sweden)

    Showe Louise C


    Background: Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that are important for distinguishing the different sample classes being compared poses a challenging problem in high dimensional data analysis. We describe a new procedure for selecting significant genes as recursive cluster elimination (RCE) rather than recursive feature elimination (RFE). We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE. Results: We have developed a novel method for selecting significant genes in comparative gene expression studies. This method, which we refer to as SVM-RCE, combines K-means, a clustering method, to identify correlated gene clusters, and Support Vector Machines (SVMs), a supervised machine learning classification method, to identify and score (rank) those gene clusters for the purpose of classification. K-means is used initially to group genes into clusters. Recursive cluster elimination (RCE) is then applied to iteratively remove those clusters of genes that contribute the least to the classification performance. SVM-RCE identifies the clusters of correlated genes that are most significantly differentially expressed between the sample classes. Utilization of gene clusters, rather than individual genes, enhances the supervised classification accuracy of the same data as compared to the accuracy when either SVM or Penalized Discriminant Analysis (PDA) with recursive feature elimination (SVM-RFE and PDA-RFE) is used to remove genes based on their individual discriminant weights. Conclusion: SVM-RCE provides improved classification accuracy with complex microarray data sets when it is compared to the classification accuracy of the same datasets using either SVM-RFE or PDA-RFE. SVM-RCE identifies clusters of correlated genes that when considered together

  5. A phylogenomic gene cluster resource: The phylogenetically inferred groups (PhIGs) database

    Energy Technology Data Exchange (ETDEWEB)

    Dehal, Paramvir S.; Boore, Jeffrey L.


    We present here the PhIGs database, a phylogenomic resource for sequenced genomes. Although many methods exist for clustering gene families, very few attempt to create truly orthologous clusters sharing descent from a single ancestral gene across a range of evolutionary depths. Although these non-phylogenetic gene family clusters have been used broadly for gene annotation, errors are known to be introduced by the artifactual association of slowly evolving paralogs and lack of annotation for those more rapidly evolving. A full phylogenetic framework is necessary for accurate inference of function and for many studies that address pattern and mechanism of the evolution of the genome. The automated generation of evolutionary gene clusters, creation of gene trees, determination of orthology and paralogy relationships, and the correlation of this information with gene annotations, expression information, and genomic context is an important resource to the scientific community.

  6. Base J represses genes at the end of polycistronic gene clusters in Leishmania major by promoting RNAP II termination. (United States)

    Reynolds, David L; Hofmeister, Brigitte T; Cliffe, Laura; Siegel, T Nicolai; Anderson, Britta A; Beverley, Stephen M; Schmitz, Robert J; Sabatini, Robert


    The genomes of kinetoplastids are organized into polycistronic gene clusters that are flanked by the modified DNA base J. Previous work has established a role of base J in promoting RNA polymerase II termination in Leishmania spp. where the loss of J leads to termination defects and transcription into adjacent gene clusters. It remains unclear whether these termination defects affect gene expression and whether read through transcription is detrimental to cell growth, thus explaining the essential nature of J. We now demonstrate that reduction of base J at specific sites within polycistronic gene clusters in L. major leads to read through transcription and increased expression of downstream genes in the cluster. Interestingly, subsequent transcription into the opposing polycistronic gene cluster does not lead to downregulation of sense mRNAs. These findings indicate a conserved role for J regulating transcription termination and expression of genes within polycistronic gene clusters in trypanosomatids. In contrast to the expectations often attributed to opposing transcription, the essential nature of J in Leishmania spp. is related to its role in gene repression rather than preventing transcriptional interference resulting from read through and dual strand transcription.

  7. Identification and characterization of a novel diterpene gene cluster in Aspergillus nidulans.

    Directory of Open Access Journals (Sweden)

    Kirsi Bromann

    Fungal secondary metabolites are a rich source of medically useful compounds due to their pharmaceutical and toxic properties. Sequencing of fungal genomes has revealed numerous secondary metabolite gene clusters, yet products of many of these biosynthetic pathways are unknown since the expression of the clustered genes usually remains silent in normal laboratory conditions. Therefore, to discover new metabolites, it is important to find ways to induce the expression of genes in these otherwise silent biosynthetic clusters. We discovered a novel secondary metabolite in Aspergillus nidulans by predicting a biosynthetic gene cluster with genomic mining. A Zn(II)2Cys6-type transcription factor, PbcR, was identified, and its role as a pathway-specific activator for the predicted gene cluster was demonstrated. Overexpression of pbcR upregulated the transcription of seven genes in the identified cluster and led to the production of a diterpene compound, which was characterized with GC/MS as ent-pimara-8(14),15-diene. A change in morphology was also observed in the strains overexpressing pbcR. The activation of a cryptic gene cluster by overexpression of its putative Zn(II)2Cys6-type transcription factor led to discovery of a novel secondary metabolite in Aspergillus nidulans. Quantitative real-time PCR and DNA array analysis allowed us to predict the borders of the biosynthetic gene cluster. Furthermore, we identified a novel fungal pimaradiene cyclase gene as well as genes encoding 3-hydroxy-3-methyl-glutaryl-coenzyme A (HMG-CoA) reductase and a geranylgeranyl pyrophosphate (GGPP) synthase. None of these genes have been previously implicated in the biosynthesis of terpenes in Aspergillus nidulans. These results identify the first Aspergillus nidulans diterpene gene cluster and suggest a biosynthetic pathway for ent-pimara-8(14),15-diene.

  8. Identification and Characterization of a Novel Diterpene Gene Cluster in Aspergillus nidulans (United States)

    Bromann, Kirsi; Toivari, Mervi; Viljanen, Kaarina; Vuoristo, Anu; Ruohonen, Laura; Nakari-Setälä, Tiina


    Fungal secondary metabolites are a rich source of medically useful compounds due to their pharmaceutical and toxic properties. Sequencing of fungal genomes has revealed numerous secondary metabolite gene clusters, yet products of many of these biosynthetic pathways are unknown since the expression of the clustered genes usually remains silent in normal laboratory conditions. Therefore, to discover new metabolites, it is important to find ways to induce the expression of genes in these otherwise silent biosynthetic clusters. We discovered a novel secondary metabolite in Aspergillus nidulans by predicting a biosynthetic gene cluster with genomic mining. A Zn(II)2Cys6–type transcription factor, PbcR, was identified, and its role as a pathway-specific activator for the predicted gene cluster was demonstrated. Overexpression of pbcR upregulated the transcription of seven genes in the identified cluster and led to the production of a diterpene compound, which was characterized with GC/MS as ent-pimara-8(14),15-diene. A change in morphology was also observed in the strains overexpressing pbcR. The activation of a cryptic gene cluster by overexpression of its putative Zn(II)2Cys6–type transcription factor led to discovery of a novel secondary metabolite in Aspergillus nidulans. Quantitative real-time PCR and DNA array analysis allowed us to predict the borders of the biosynthetic gene cluster. Furthermore, we identified a novel fungal pimaradiene cyclase gene as well as genes encoding 3-hydroxy-3-methyl-glutaryl-coenzyme A (HMG-CoA) reductase and a geranylgeranyl pyrophosphate (GGPP) synthase. None of these genes have been previously implicated in the biosynthesis of terpenes in Aspergillus nidulans. These results identify the first Aspergillus nidulans diterpene gene cluster and suggest a biosynthetic pathway for ent-pimara-8(14),15-diene. PMID:22506079

  9. Nonlinear biosynthetic gene cluster dose effect on penicillin production by Penicillium chrysogenum. (United States)

    Nijland, Jeroen G; Ebbendorf, Bjorg; Woszczynska, Marta; Boer, Rémon; Bovenberg, Roel A L; Driessen, Arnold J M


    Industrial penicillin production levels by the filamentous fungus Penicillium chrysogenum increased dramatically by classical strain improvement. High-yielding strains contain multiple copies of the penicillin biosynthetic gene cluster that encodes three key enzymes of the β-lactam biosynthetic pathway. We have analyzed the gene cluster dose effect on penicillin production using the high-yielding P. chrysogenum strain DS17690 that was cured from its native clusters. The amount of penicillin V produced increased with the penicillin biosynthetic gene cluster number but was saturated at high copy numbers. Likewise, transcript levels of the biosynthetic genes pcbAB [δ-(l-α-aminoadipyl)-l-cysteinyl-d-valine synthetase], pcbC (isopenicillin N synthase), and penDE (acyltransferase) correlated with the cluster copy number. Remarkably, the protein level of acyltransferase, which localizes to peroxisomes, was saturated already at low cluster copy numbers. At higher copy numbers, intracellular levels of isopenicillin N increased, suggesting that the acyltransferase reaction presents a limiting step at a high gene dose. Since the number and appearance of the peroxisomes did not change significantly with the gene cluster copy number, we conclude that the acyltransferase activity is limiting for penicillin biosynthesis at high biosynthetic gene cluster copy numbers. These results suggest that at a high penicillin production level, productivity is limited by the peroxisomal acyltransferase import activity and/or the availability of coenzyme A (CoA)-activated side chains.

  10. Genes for iron-sulphur cluster assembly are targets of abiotic stress in rice, Oryza sativa. (United States)

    Liang, Xuejiao; Qin, Lu; Liu, Peiwei; Wang, Meihuan; Ye, Hong


    Iron-sulphur (Fe-S) cluster assembly occurs in chloroplasts, mitochondria and cytosol, involving dozens of genes in higher plants. In this study, we have identified 41 putative Fe-S cluster assembly genes in the rice (Oryza sativa) genome, and the expression of all genes was verified. To investigate the role of Fe-S cluster assembly as a metabolic pathway, we applied abiotic stresses to rice seedlings and analysed Fe-S cluster assembly gene expression by qRT-PCR. Our data showed that genes for Fe-S cluster assembly in chloroplasts of leaves are particularly sensitive to heavy metal treatments, and that Fe-S cluster assembly genes in roots were up-regulated in response to iron toxicity, oxidative stress and some heavy metal treatments. The effect of each stress treatment on the Fe-S cluster assembly machinery demonstrated an unexpected tissue or organelle specificity, suggesting that the physiological relevance of the Fe-S cluster assembly is more complex than thought. Furthermore, our results may reveal potential candidate genes for molecular breeding of rice.

  11. Identification and structural analysis of a novel snoRNA gene cluster from Arabidopsis thaliana

    Institute of Scientific and Technical Information of China (English)

    周惠; 孟清; 屈良鹄


    A 22 snoRNA gene cluster, consisting of four antisense snoRNA genes, was identified from Arabidopsis thaliana. The sequence and structural analysis showed that the 22 snoRNA gene cluster might be transcribed as a polycistronic precursor from an upstream promoter, and that the intergenic spacers of the gene cluster encode 'hairpin' structures similar to the processing recognition signals of the polycistronic snoRNA precursor of the yeast Saccharomyces cerevisiae. The results also revealed that the presence of multiple gene copies is a common characteristic of plant snoRNA genes, and that this cluster provides a good system for further study of the transcription and expression mechanisms of plant snoRNA gene clusters.

  12. The urease gene cluster of Vibrio parahaemolyticus does not influence the expression of the thermostable direct hemolysin (TDH) gene or the TDH-related hemolysin gene. (United States)

    Nakaguchi, Yoshitsugu; Okuda, Jun; Iida, Tetsuya; Nishibuchi, Mitsuaki


    In order to investigate why the thermostable direct hemolysin (TDH) and the TDH-related hemolysin (TRH) of Vibrio parahaemolyticus are produced at low levels from urease-positive strains, the effect of the functional urease gene cluster of V. parahaemolyticus on the expression of the tdh and trh genes was examined. Transcriptional lacZ fusions with the tdh1, tdh2, trh1 and trh2 genes representing variants of the tdh and trh genes were integrated into the chromosome of an Escherichia coli strain and a urease-negative V. parahaemolyticus strain. The plasmid-borne urease gene cluster introduced and expressed in these constructs did not affect expression of any of the fusion genes. The amount of TDH produced from a Kanagawa phenomenon-positive V. parahaemolyticus did not change by introduction of the urease gene cluster either. It was concluded therefore that the urease gene cluster is not involved in the regulation of tdh and trh expression.

  13. A putative gene cluster from a Lyngbya wollei bloom that encodes paralytic shellfish toxin biosynthesis.

    Directory of Open Access Journals (Sweden)

    Troco K Mihali

    Full Text Available Saxitoxin and its analogs cause the paralytic shellfish-poisoning syndrome, adversely affecting human health and coastal shellfish industries worldwide. Here we report the isolation, sequencing, annotation, and predicted pathway of the saxitoxin biosynthetic gene cluster in the cyanobacterium Lyngbya wollei. The gene cluster spans 36 kb and encodes enzymes for the biosynthesis and export of the toxins. The Lyngbya wollei saxitoxin gene cluster differs from previously identified saxitoxin clusters as it contains genes that are unique to this cluster, whereby the carbamoyltransferase is truncated and replaced by an acyltransferase, explaining the unique toxin profile presented by Lyngbya wollei. These findings will enable the creation of toxin probes for water monitoring purposes, as well as proof-of-concept for the combinatorial biosynthesis of these naturally occurring alkaloids for the production of novel, biologically active compounds.

  14. AutoSOME: a clustering method for identifying gene expression modules without prior knowledge of cluster number

    Directory of Open Access Journals (Sweden)

    Cooper James B


    Full Text Available Abstract Background Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry. Results We integrated strategies from machine learning, cartography, and graph theory into a new informatics method for automatically clustering self-organizing map ensembles of high-dimensional data. Our new method, called AutoSOME, readily identifies discrete and fuzzy data clusters without prior knowledge of cluster number or structure in diverse datasets including whole genome microarray data. Visualization of AutoSOME output using network diagrams and differential heat maps reveals unexpected variation among well-characterized cancer cell lines. Co-expression analysis of data from human embryonic and induced pluripotent stem cells using AutoSOME identifies >3400 up-regulated genes associated with pluripotency, and indicates that a recently identified protein-protein interaction network characterizing pluripotency was underestimated by a factor of four. Conclusions By effectively extracting important information from high-dimensional microarray data without prior knowledge or the need for data filtration, AutoSOME can yield systems-level insights from whole genome microarray expression studies. Due to its generality, this new method should also have practical utility for a variety of data-intensive applications, including the results of deep sequencing experiments. AutoSOME is available for download at

  15. Improvement of gougerotin and nikkomycin production by engineering their biosynthetic gene clusters. (United States)

    Du, Deyao; Zhu, Yu; Wei, Junhong; Tian, Yuqing; Niu, Guoqing; Tan, Huarong


    Nikkomycins and gougerotin are peptidyl nucleoside antibiotics with broad biological activities. The nikkomycin biosynthetic gene cluster comprises one pathway-specific regulatory gene (sanG) and 21 structural genes, whereas the gene cluster for gougerotin biosynthesis includes one putative regulatory gene, one major facilitator superfamily transporter gene, and 13 structural genes. In the present study, we introduced sanG driven by six different promoters into Streptomyces ansochromogenes TH322. Nikkomycin production was increased significantly with the highest increase in engineered strain harboring hrdB promoter-driven sanG. In the meantime, we replaced the native promoter of key structural genes in the gougerotin (gou) gene cluster with the hrdB promoters. The heterologous producer Streptomyces coelicolor M1146 harboring the modified gene cluster produced gougerotin up to 10-fold more than strains carrying the unmodified cluster. Therefore, genetic manipulations of genes involved in antibiotics biosynthesis with the constitutive hrdB promoter present a robust, easy-to-use system generally useful for the improvement of antibiotics production in Streptomyces.

  16. Many nonuniversal archaeal ribosomal proteins are found in conserved gene clusters

    Directory of Open Access Journals (Sweden)

    Jiachen Wang


    Full Text Available The genomic associations of the archaeal ribosomal proteins (r-proteins) were examined in detail. The archaeal versions of the universal r-protein genes are typically in clusters similar or identical to those found in bacteria. Of the 35 nonuniversal archaeal r-protein genes examined, the gene encoding L18e was found to be associated with the conserved L13 cluster, whereas the genes for S4e, L32e and L19e were found in the archaeal version of the spc operon. Eleven nonuniversal protein genes were not associated with any common genomic context. Of the remaining 19 protein genes, 17 were convincingly assigned to one of 10 previously unrecognized gene clusters. Examination of the gene content of these clusters revealed multiple associations with genes involved in the initiation of protein synthesis, transcription or other cellular processes. The lack of such associations in the universal clusters suggests that initially the ribosome evolved largely independently of other processes. More recently it likely has evolved in concert with other cellular systems. It was also verified that a second copy of the gene encoding L7ae found in some bacteria is actually a homolog of the gene encoding L30e and should be annotated as such.

  17. Sequencing and comparative analysis of fugu protocadherin clusters reveal diversity of protocadherin genes among teleosts

    Directory of Open Access Journals (Sweden)

    Rajasegaran Vikneswari


    Full Text Available Abstract Background The synaptic cell adhesion molecules, protocadherins, are a vertebrate innovation that accompanied the emergence of the neural tube and the elaborate central nervous system. In mammals, the protocadherins are encoded by three closely-linked clusters (α, β and γ) of tandem genes and are hypothesized to provide a molecular code for specifying the remarkably-diverse neural connections in the central nervous system. Like mammals, the coelacanth, a lobe-finned fish, contains a single protocadherin locus, also arranged into α, β and γ clusters. Zebrafish, however, possesses two protocadherin loci that contain more than twice the number of genes as the coelacanth, but arranged only into α and γ clusters. To gain further insight into the evolutionary history of protocadherin clusters, we have sequenced and analyzed protocadherin clusters from the compact genome of the pufferfish, Fugu rubripes. Results Fugu contains two unlinked protocadherin loci, Pcdh1 and Pcdh2, that collectively consist of at least 77 genes. The fugu Pcdh1 locus has been subject to extensive degeneration, resulting in the complete loss of Pcdh1γ cluster. The fugu Pcdh genes have undergone lineage-specific regional gene conversion processes that have resulted in a remarkable regional sequence homogenization among paralogs in the same subcluster. Phylogenetic analyses show that most protocadherin genes are orthologous between fugu and zebrafish either individually or as paralog groups. Based on the inferred phylogenetic relationships of fugu and zebrafish genes, we have reconstructed the evolutionary history of protocadherin clusters in the teleost fish lineage. Conclusion Our results demonstrate the exceptional evolutionary dynamism of protocadherin genes in vertebrates in general, and in teleost fishes in particular. Besides the 'fish-specific' whole genome duplication, the evolution of protocadherin genes in teleost fishes is influenced by lineage

  18. Sphingolipids regulate telomere clustering by affecting the transcription of genes involved in telomere homeostasis. (United States)

    Ikeda, Atsuko; Muneoka, Tetsuya; Murakami, Suguru; Hirota, Ayaka; Yabuki, Yukari; Karashima, Takefumi; Nakazono, Kota; Tsuruno, Masahiro; Pichler, Harald; Shirahige, Katsuhiko; Kodama, Yukiko; Shimamoto, Toshi; Mizuta, Keiko; Funato, Kouichi


    In eukaryotic organisms, including mammals, nematodes and yeasts, the ends of chromosomes, telomeres, are clustered at the nuclear periphery. Telomere clustering is assumed to be functionally important because proper organization of chromosomes is necessary for proper genome function and stability. However, the mechanisms and physiological roles of telomere clustering remain poorly understood. In this study, we demonstrate a role for sphingolipids in telomere clustering in the budding yeast Saccharomyces cerevisiae. Because abnormal sphingolipid metabolism causes downregulation of expression levels of genes involved in telomere organization, sphingolipids appear to control telomere clustering at the transcriptional level. In addition, the data presented here provide evidence that telomere clustering is required to protect chromosome ends from DNA-damage checkpoint signaling. As sphingolipids are found in all eukaryotes, we speculate that sphingolipid-based regulation of telomere clustering and the protective role of telomere clusters in maintaining genome stability might be conserved in eukaryotes.

  19. The B-type lamin is required for somatic repression of testis-specific gene clusters (United States)

    Shevelyov, Y. Y.; Lavrov, S. A.; Mikhaylova, L. M.; Nurminsky, I. D.; Kulathinal, R. J.; Egorova, K. S.; Rozovsky, Y. M.; Nurminsky, D. I.


    Large clusters of coexpressed tissue-specific genes are abundant on chromosomes of diverse species. The genes coordinately misexpressed in diverse diseases are also found in similar clusters, suggesting that evolutionarily conserved mechanisms regulate expression of large multigenic regions both in normal development and in its pathological disruptions. Studies on individual loci suggest that silent clusters of coregulated genes are embedded in repressed chromatin domains, often localized to the nuclear periphery. To test this model at the genome-wide scale, we studied transcriptional regulation of large testis-specific gene clusters in somatic tissues of Drosophila. These gene clusters showed a drastic paucity of known expressed transgene insertions, indicating that they indeed are embedded in repressed chromatin. Bioinformatics analysis suggested the major role for the B-type lamin, LamDmo, in repression of large testis-specific gene clusters, showing that in somatic cells as many as three-quarters of these clusters interact with LamDmo. Ablation of LamDmo by using mutants and RNAi led to detachment of testis-specific clusters from nuclear envelope and to their selective transcriptional up-regulation in somatic cells, thus providing the first direct evidence for involvement of the B-type lamin in tissue-specific gene repression. Finally, we found that transcriptional activation of the lamina-bound testis-specific gene cluster in male germ line is coupled with its translocation away from the nuclear envelope. Our studies, which directly link nuclear architecture with coordinated regulation of tissue-specific genes, advance understanding of the mechanisms underlying both normal cell differentiation and developmental disorders caused by lesions in the B-type lamins and interacting proteins. PMID:19218438

  20. Genetic Algorithms Applied to Multi-Class Clustering for Gene Expression Data

    Institute of Scientific and Technical Information of China (English)

    Haiyan Pan; Jun Zhu; Danfu Han


    A hybrid GA (genetic algorithm)-based clustering (HGACLUS) schema, combining merits of the Simulated Annealing, was described for finding an optimal or near-optimal set of medoids. This schema maximized the clustering success by achieving internal cluster cohesion and external cluster isolation. The performance of HGACLUS and other methods was compared by using simulated data and open microarray gene-expression datasets. HGACLUS was generally found to be more accurate and robust than other methods discussed in this paper by the exact validation strategy and the explicit cluster number.
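    The abstract above describes a medoid search that borrows from simulated annealing. As a rough illustration only, the following sketch applies a simulated-annealing acceptance rule to single-medoid swaps on one-dimensional data. It is not the HGACLUS schema: the genetic-algorithm component is omitted, and the function names, the cooling schedule and the toy data are assumptions made for this example.

```python
import math
import random


def medoid_cost(points, medoids):
    """Sum of distances from every point to its nearest medoid (1-D data)."""
    return sum(min(abs(p - points[m]) for m in medoids) for p in points)


def anneal_medoids(points, k, steps=500, temp=1.0, cooling=0.99, seed=0):
    """Search for k medoid indices using simulated-annealing swap moves.

    Hypothetical helper for illustration; requires k < len(points).
    """
    rng = random.Random(seed)
    current = rng.sample(range(len(points)), k)
    cost = medoid_cost(points, current)
    best, best_cost = list(current), cost
    for _ in range(steps):
        # Propose a neighbour: replace one medoid with a random non-medoid.
        candidate = list(current)
        non_medoids = [i for i in range(len(points)) if i not in current]
        candidate[rng.randrange(k)] = rng.choice(non_medoids)
        cand_cost = medoid_cost(points, candidate)
        delta = cand_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cost = candidate, cand_cost
            if cost < best_cost:
                best, best_cost = list(current), cost
        temp *= cooling  # geometric cooling schedule (an assumption)
    return sorted(best), best_cost


# Toy data: two well-separated groups of three values each.
points = [1.0, 1.2, 0.8, 10.0, 10.3, 9.7]
medoids, cost = anneal_medoids(points, k=2)
print(medoids, cost)
```

    Minimising the summed distance to the nearest medoid rewards tight clusters, which corresponds loosely to the internal cohesion criterion mentioned in the abstract; cluster isolation is not scored separately in this sketch.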

  1. Identification of certain cancer-mediating genes using Gaussian fuzzy cluster validity index

    Indian Academy of Sciences (India)

    Anupam Ghosh; Rajat K De


    In this article, we have used an index, called Gaussian fuzzy index (GFI), recently developed by the authors, based on the notion of fuzzy set theory, for validating the clusters obtained by a clustering algorithm applied on cancer gene expression data. GFI is then used for the identification of genes that have altered quite significantly from normal state to carcinogenic state with respect to their mRNA expression patterns. The effectiveness of the methodology has been demonstrated on three gene expression cancer datasets dealing with human lung, colon and leukemia. The performance of GFI is compared with 19 existing cluster validity indices. The results are appropriately validated biologically and statistically. In this context, we have used biochemical pathways, p-value statistics of GO attributes, t-test and z-score for the validation of the results. It has been reported that GFI is capable of identifying high-quality enriched clusters of genes, and thereby is able to select more cancer-mediating genes.

  2. An Effective Tri-Clustering Algorithm Combining Expression Data with Gene Regulation Information

    Directory of Open Access Journals (Sweden)

    Ao Li


    Full Text Available Motivation: Bi-clustering algorithms aim to identify sets of genes sharing similar expression patterns across a subset of conditions. However, direct interpretation or prediction of gene regulatory mechanisms may be difficult as only gene expression data is used. Information about gene regulators may also be available, most commonly about which transcription factors may bind to the promoter region and thus control the expression level of a gene. Thus a method to integrate gene expression and gene regulation information is desirable for clustering and analyzing. Methods: By incorporating gene regulatory information with gene expression data, we define regulated expression values (REV) as indicators of how a gene is regulated by a specific factor. Existing bi-clustering methods are extended to a three dimensional data space by developing a heuristic TRI-Clustering algorithm. An additional approach named Automatic Boundary Searching algorithm (ABS) is introduced to automatically determine the boundary threshold. Results: Results based on incorporating ChIP-chip data representing transcription factor-gene interactions show that the algorithms are efficient and robust for detecting tri-clusters. Detailed analysis of the tri-cluster extracted from yeast sporulation REV data shows genes in this cluster exhibited significant differences during the middle and late stages. The implicated regulatory network was then reconstructed for further study of defined regulatory mechanisms. Topological and statistical analysis of this network demonstrated evidence of significant changes of TF activities during the different stages of yeast sporulation, and suggests this approach might be a general way to study regulatory networks undergoing transformations.

  3. A rough set based rational clustering framework for determining correlated genes. (United States)

    Jeyaswamidoss, Jeba Emilyn; Thangaraj, Kesavan; Ramar, Kadarkarai; Chitra, Muthusamy


    Cluster analysis plays a foremost role in identifying groups of genes that show similar behavior under a set of experimental conditions. Several clustering algorithms have been proposed for identifying gene behaviors and to understand their significance. The principal aim of this work is to develop an intelligent rough clustering technique, which will efficiently remove the irrelevant dimensions in a high-dimensional space and obtain appropriate meaningful clusters. This paper proposes a novel biclustering technique that is based on rough set theory. The proposed algorithm uses correlation coefficient as a similarity measure to simultaneously cluster both the rows and columns of a gene expression data matrix and mean squared residue to generate the initial biclusters. Furthermore, the biclusters are refined to form the lower and upper boundaries by determining the membership of the genes in the clusters using mean squared residue. The algorithm is illustrated with yeast gene expression data and the experiment proves the effectiveness of the method. The main advantage is that it overcomes the problem of selection of initial clusters and also the restriction of one object belonging to only one cluster by allowing overlapping of biclusters.
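    The abstract above names mean squared residue as the score used to build and refine biclusters. A minimal sketch of the standard Cheng and Church mean squared residue follows, assuming the authors use that common definition; the function name and the toy matrix are illustrative, and the rough-set boundaries and correlation step of the proposed algorithm are not reproduced.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the bicluster given by row and column indices.

    Each entry's residue is a_ij - rowmean_i - colmean_j + overall_mean,
    computed within the selected submatrix.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy expression matrix (genes x conditions), invented for this example.
# The first three genes differ only by a constant shift over the first
# three conditions, so that submatrix is a perfectly coherent bicluster.
expr = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 1.0],
    [4.0, 5.0, 6.0, 7.0],
    [0.0, 8.0, 1.0, 5.0],
]
coherent = mean_squared_residue(expr, rows=[0, 1, 2], cols=[0, 1, 2])
noisy = mean_squared_residue(expr, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
print(coherent, noisy)
```

    A score near zero indicates that the selected genes rise and fall together across the selected conditions, which is why a low residue is typically used as the criterion for admitting a gene into a bicluster.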

  4. Functional Analysis of Promoters in the Nisin Gene Cluster of Lactococcus lactis

    NARCIS (Netherlands)

    Ruyter, Pascalle G.G.A. de; Kuipers, Oscar P.; Beerthuyzen, Marke M.; Alen-Boerrigter, Ingrid van; Vos, Willem M. de


    The promoters in the nisin gene cluster nisABTCIPRKFEG of Lactococcus lactis were characterized by primer extension and transcriptional fusions to the Escherichia coli promoterless β-glucuronidase gene (gusA). Three promoters preceding the nisA, nisR, and nisF genes, which all give rise to gusA expr

  5. Horizontal transfer of a nitrate assimilation gene cluster and ecological transitions in fungi: a phylogenetic study.

    Directory of Open Access Journals (Sweden)

    Jason C Slot

    Full Text Available High affinity nitrate assimilation genes in fungi occur in a cluster (fHANT-AC) that can be coordinately regulated. The clustered genes include nrt2, which codes for a high affinity nitrate transporter; euknr, which codes for nitrate reductase; and NAD(P)H-nir, which codes for nitrite reductase. Homologs of genes in the fHANT-AC occur in other eukaryotes and prokaryotes, but they have only been found clustered in the oomycete Phytophthora (heterokonts). We performed independent and concatenated phylogenetic analyses of homologs of all three genes in the fHANT-AC. Phylogenetic analyses limited to fungal sequences suggest that the fHANT-AC has been transferred horizontally from a basidiomycete (mushrooms and smuts) to an ancestor of the ascomycetous mold Trichoderma reesei. Phylogenetic analyses of sequences from diverse eukaryotes and eubacteria, and cluster structure, are consistent with a hypothesis that the fHANT-AC was assembled in a lineage leading to the oomycetes and was subsequently transferred to the Dikarya (Ascomycota+Basidiomycota), which is a derived fungal clade that includes the vast majority of terrestrial fungi. We propose that the acquisition of high affinity nitrate assimilation contributed to the success of Dikarya on land by allowing exploitation of nitrate in aerobic soils, and the subsequent transfer of a complete assimilation cluster improved the fitness of T. reesei in a new niche. Horizontal transmission of this cluster of functionally integrated genes supports the "selfish operon" hypothesis for maintenance of gene clusters.

  6. Bayesian History Reconstruction of Complex Human Gene Clusters on a Phylogeny

    CERN Document Server

    Vinař, Tomáš; Song, Giltae; Siepel, Adam


    Clusters of genes that have evolved by repeated segmental duplication present difficult challenges throughout genomic analysis, from sequence assembly to functional analysis. Improved understanding of these clusters is of utmost importance, since they have been shown to be the source of evolutionary innovation, and have been linked to multiple diseases, including HIV and a variety of cancers. Previously, Zhang et al. (2008) developed an algorithm for reconstructing parsimonious evolutionary histories of such gene clusters, using only human genomic sequence data. In this paper, we propose a probabilistic model for the evolution of gene clusters on a phylogeny, and an MCMC algorithm for reconstruction of duplication histories from genomic sequences in multiple species. Several projects are underway to obtain high quality BAC-based assemblies of duplicated clusters in multiple species, and we anticipate that our method will be useful in analyzing these valuable new data sets.

  7. MADIBA: A web server toolkit for biological interpretation of Plasmodium and plant gene clusters

    Directory of Open Access Journals (Sweden)

    Louw Abraham I


    Full Text Available Abstract Background Microarray technology makes it possible to identify changes in gene expression of an organism, under various conditions. Data mining is thus essential for deducing significant biological information such as the identification of new biological mechanisms or putative drug targets. While many algorithms and software have been developed for analysing gene expression, the extraction of relevant information from experimental data is still a substantial challenge, requiring significant time and skill. Description MADIBA (MicroArray Data Interface for Biological Annotation) facilitates the assignment of biological meaning to gene expression clusters by automating the post-processing stage. A relational database has been designed to store the data from gene to pathway for Plasmodium, rice and Arabidopsis. Tools within the web interface allow rapid analyses for the identification of the Gene Ontology terms relevant to each cluster; visualising the metabolic pathways where the genes are implicated, their genomic localisations, putative common transcriptional regulatory elements in the upstream sequences, and an analysis specific to the organism being studied. Conclusion MADIBA is an integrated, online tool that will assist researchers in interpreting their results and understanding the meaning of the co-expression of a cluster of genes. Functionality of MADIBA was validated by analysing a number of gene clusters from several published experiments – expression profiling of the Plasmodium life cycle, and salt stress treatments of Arabidopsis and rice. In most of the cases, the same conclusions found by the authors were quickly and easily obtained after analysing the gene clusters with MADIBA.

  8. A gene cluster for amylovoran synthesis in Erwinia amylovora: characterization and relationship to cps genes in Erwinia stewartii. (United States)

    Bernhard, F; Coplin, D L; Geider, K


    A large ams gene cluster required for production of the acidic extracellular polysaccharide (EPS) amylovoran by the fire blight pathogen Erwinia amylovora was cloned. Tn5 mutagenesis and gene replacement were used to construct chromosomal ams mutants. Five complementation groups, essential for amylovoran synthesis and virulence in E. amylovora, were identified and designated amsA-E. The ams gene cluster is about 7 kb in size and functionally equivalent to the cps gene cluster involved in EPS synthesis by the related pathogen Erwinia stewartii. Mucoidy and virulence were restored to E. stewartii mutants in four cps complementation groups by the cloned E. amylovora ams genes. Conversely, the E. stewartii cps gene cluster was able to complement mutations in E. amylovora ams genes. Correspondence was found between the amsA-E complementation groups and the cpsB-D region, but the arrangement of the genes appears to be different. EPS production and virulence were also restored to E. amylovora amsE and E. stewartii cpsD mutants by clones containing the Rhizobium meliloti exoA gene.

  9. The biosynthetic gene cluster for the beta-lactam carbapenem thienamycin in Streptomyces cattleya. (United States)

    Núñez, Luz Elena; Méndez, Carmen; Braña, Alfredo F; Blanco, Gloria; Salas, José A


    beta-lactam ring formation in carbapenem and clavam biosynthesis proceeds through an alternative mechanism to the biosynthetic pathway of classic beta-lactam antibiotics. This involves the participation of a beta-lactam synthetase. Using available information from beta-lactam synthetases, we generated a probe for the isolation of the thienamycin cluster from Streptomyces cattleya. Genes homologous to carbapenem and clavulanic acid biosynthetic genes have been identified. They would participate in early steps of thienamycin biosynthesis leading to the formation of the beta-lactam ring. Other genes necessary for the biosynthesis of thienamycin have also been identified in the cluster (methyltransferases, cysteinyl transferases, oxidoreductases, hydroxylase, etc.) together with two regulatory genes, genes involved in exportation and/or resistance, and a quorum sensing system. Involvement of the cluster in thienamycin biosynthesis was demonstrated by insertional inactivation of several genes generating thienamycin nonproducing mutants.

  10. Characterization of the fumonisin B2 biosynthetic gene cluster in Aspergillus niger and A. awamori. (United States)

    Aspergillus niger and A. awamori strains isolated from grapes cultivated in Mediterranean basin were examined for fumonisin B2 (FB2) production and presence/absence of sequences within the fumonisin biosynthetic gene (fum) cluster. Presence of 13 regions in the fum cluster was evaluated by PCR assay...

  11. Bayesian hierarchical clustering for studying cancer gene expression data with unknown statistics.

    Directory of Open Access Journals (Sweden)

    Korsuk Sirinukunwattana

    Full Text Available Clustering analysis is an important tool in studying gene expression data. The Bayesian hierarchical clustering (BHC) algorithm can automatically infer the number of clusters and uses Bayesian model selection to improve clustering quality. In this paper, we present an extension of the BHC algorithm. Our Gaussian BHC (GBHC) algorithm represents data as a mixture of Gaussian distributions. It uses a normal-gamma distribution as a conjugate prior on the mean and precision of each of the Gaussian components. We tested GBHC over 11 cancer and 3 synthetic datasets. The results on cancer datasets show that in sample clustering, GBHC on average produces a clustering partition that is more concordant with the ground truth than those obtained from other commonly used algorithms. Furthermore, GBHC frequently infers a number of clusters that is close to the ground truth. In gene clustering, GBHC also produces a clustering partition that is more biologically plausible than several other state-of-the-art methods. This suggests GBHC as an alternative tool for studying gene expression data. The implementation of GBHC is available at
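    The abstract above relies on the normal-gamma distribution being conjugate to a Gaussian with unknown mean and precision. The following sketch shows the textbook posterior update for that prior on one-dimensional data. It is not taken from the GBHC implementation; the function name, the hyperparameter names (mu0, kappa0, alpha0, beta0) and the example values are assumptions chosen for illustration.

```python
def normal_gamma_posterior(data, mu0, kappa0, alpha0, beta0):
    """Update a normal-gamma prior with i.i.d. Gaussian observations.

    Prior: precision ~ Gamma(alpha0, rate=beta0),
           mean | precision ~ Normal(mu0, 1 / (kappa0 * precision)).
    Returns the posterior hyperparameters (mu_n, kappa_n, alpha_n, beta_n).
    """
    n = len(data)
    sample_mean = sum(data) / n
    sum_sq_dev = sum((x - sample_mean) ** 2 for x in data)

    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * sample_mean) / kappa_n
    alpha_n = alpha0 + n / 2.0
    beta_n = (beta0
              + 0.5 * sum_sq_dev
              + kappa0 * n * (sample_mean - mu0) ** 2 / (2.0 * kappa_n))
    return mu_n, kappa_n, alpha_n, beta_n


# Three made-up expression values and a weak prior centred at zero.
posterior = normal_gamma_posterior([1.0, 2.0, 3.0],
                                   mu0=0.0, kappa0=1.0, alpha0=1.0, beta0=1.0)
print(posterior)
```

    Because the posterior stays in the normal-gamma family, the marginal likelihood of a candidate cluster has a closed form, which is the property that makes Bayesian model selection between merges tractable in hierarchical schemes of this kind.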

  12. Integrating Data Clustering and Visualization for the Analysis of 3D Gene Expression Data

    Energy Technology Data Exchange (ETDEWEB)

    Data Analysis and Visualization (IDAV) and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis CA 95616, USA; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Computer Science Division, University of California, Berkeley, CA, USA; Computer Science Department, University of California, Irvine, CA, USA; All authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory; Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd


    The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.

  13. Regulation of transcription of cell division genes in the Escherichia coli dcw cluster. (United States)

    Vicente, M; Gomez, M J; Ayala, J A


    The Escherichia coli dcw cluster contains cell division genes, such as the phylogenetically ubiquitous ftsZ, and genes involved in peptidoglycan synthesis. Transcription in the cluster proceeds in the same direction as the progress of the replication fork along the chromosome. Regulation is exerted at the transcriptional and post-transcriptional levels. The absence of transcriptional termination signals may, in principle, allow extension of the transcripts initiated at the up-stream promoter (mraZ1p) even to the furthest down-stream gene (envA). Complementation tests suggest that they extend into ftsW in the central part of the cluster. In addition, the cluster contains other promoters individually regulated by cis- and trans-acting signals. Dissociation of the expression of the ftsZ gene, located after ftsQ and A near the 3' end of the cluster, from its natural regulatory signals leads to an alteration in the physiology of cell division. The complexities observed in the regulation of gene expression in the cluster may then have an important biological role. Among them, LexA-binding SOS boxes have been found at the 5' end of the cluster, preceding promoters which direct the expression of ftsI (coding for PBP3, the penicillin-binding protein involved in septum formation). A gearbox promoter, ftsQ1p, forms part of the signals regulating the transcription of ftsQ, A and Z. It is an inversely growth-dependent mechanism driven by RNA polymerase containing sigma s, the factor involved in the expression of stationary phase-specific genes. Although the dcw cluster is conserved to a different extent in a variety of bacteria, the regulation of gene expression, the presence or absence of individual genes, and even the essentiality of some of them, show variations in the phylogenetic scale which may reflect adaptation to specific life cycles.

  14. Yeast homologous recombination-based promoter engineering for the activation of silent natural product biosynthetic gene clusters. (United States)

    Montiel, Daniel; Kang, Hahk-Soo; Chang, Fang-Yuan; Charlop-Powers, Zachary; Brady, Sean F


    Large-scale sequencing of prokaryotic (meta)genomic DNA suggests that most bacterial natural product gene clusters are not expressed under common laboratory culture conditions. Silent gene clusters represent a promising resource for natural product discovery and the development of a new generation of therapeutics. Unfortunately, the characterization of molecules encoded by these clusters is hampered owing to our inability to express these gene clusters in the laboratory. To address this bottleneck, we have developed a promoter-engineering platform to transcriptionally activate silent gene clusters in a model heterologous host. Our approach uses yeast homologous recombination, an auxotrophy complementation-based yeast selection system and sequence orthogonal promoter cassettes to exchange all native promoters in silent gene clusters with constitutively active promoters. As part of this platform, we constructed and validated a set of bidirectional promoter cassettes consisting of orthogonal promoter sequences, Streptomyces ribosome binding sites, and yeast selectable marker genes. Using these tools we demonstrate the ability to simultaneously insert multiple promoter cassettes into a gene cluster, thereby expediting the reengineering process. We apply this method to model active and silent gene clusters (rebeccamycin and tetarimycin) and to the silent, cryptic pseudogene-containing, environmental DNA-derived Lzr gene cluster. Complete promoter refactoring and targeted gene exchange in this "dead" cluster led to the discovery of potent indolotryptoline antiproliferative agents, lazarimides A and B. This potentially scalable and cost-effective promoter reengineering platform should streamline the discovery of natural products from silent natural product biosynthetic gene clusters.

  15. Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

    Energy Technology Data Exchange (ETDEWEB)

    Cameron, R A; Rowen, L; Nesbitt, R; Bloom, S; Rast, J P; Berney, K; Arenas-Mena, C; Martinez, P; Lucas, S; Richardson, P M; Davidson, E H; Peterson, K J; Hood, L


    The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is: 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.

  16. Unusual Gene Order and Organization of the Sea Urchin Hox Cluster

    Energy Technology Data Exchange (ETDEWEB)

    Richardson, Paul M.; Lucas, Susan; Cameron, R. Andrew; Rowen, Lee; Nesbitt, Ryan; Bloom, Scott; Rast, Jonathan P.; Berney, Kevin; Arenas-Mena, Cesar; Martinez, Pedro; Davidson, Eric H.; Peterson, Kevin J.; Hood, Leroy


    The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is: 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.

  17. Natural product proteomining, a quantitative proteomics platform, allows rapid discovery of biosynthetic gene clusters for different classes of natural products. (United States)

    Gubbens, Jacob; Zhu, Hua; Girard, Geneviève; Song, Lijiang; Florea, Bogdan I; Aston, Philip; Ichinose, Koji; Filippov, Dmitri V; Choi, Young H; Overkleeft, Herman S; Challis, Gregory L; van Wezel, Gilles P


    Information on gene clusters for natural product biosynthesis is accumulating rapidly because of the current boom of available genome sequencing data. However, linking a natural product to a specific gene cluster remains challenging. Here, we present a widely applicable strategy for the identification of gene clusters for specific natural products, which we name natural product proteomining. The method is based on using fluctuating growth conditions that ensure differential biosynthesis of the bioactivity of interest. Subsequent combination of metabolomics and quantitative proteomics establishes correlations between abundance of natural products and concomitant changes in the protein pool, which allows identification of the relevant biosynthetic gene cluster. We used this approach to elucidate gene clusters for different natural products in Bacillus and Streptomyces, including a novel juglomycin-type antibiotic. Natural product proteomining does not require prior knowledge of the gene cluster or secondary metabolite and therefore represents a general strategy for identification of all types of gene clusters.

  18. The Local Maximum Clustering Method and Its Application in Microarray Gene Expression Data Analysis

    Directory of Open Access Journals (Sweden)

    Chen Yidong


    Full Text Available An unsupervised data clustering method, called the local maximum clustering (LMC) method, is proposed for identifying clusters in experimental data sets based on research interest. A magnitude property is defined according to research purposes, and data sets are clustered around each local maximum of the magnitude property. By properly defining a magnitude property, this method can overcome many difficulties in microarray data clustering such as reduced projection in similarities, noise, and arbitrary gene distribution. To critically evaluate the performance of this clustering method in comparison with other methods, we designed three model data sets with known cluster distributions and applied the LMC method as well as the hierarchical clustering method, the k-means clustering method, and the self-organizing map method to these model data sets. The results show that the LMC method produces the most accurate clustering results. As an example of application, we applied the method to cluster the leukemia samples reported in the microarray study of Golub et al. (1999).
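The idea of clustering data around each local maximum of a magnitude property can be sketched as a simple hill-climbing procedure. This is an illustrative assumption about how such a method might work, not the published LMC algorithm: the neighbourhood radius, the Euclidean distance, and the tie-breaking rule are all hypothetical choices.

```python
import math

def local_maximum_clustering(points, magnitude, radius):
    """Assign every point to the local maximum of `magnitude` that it
    reaches by repeatedly stepping to its best neighbour within `radius`.
    Returns, for each point, the index of its local maximum."""
    n = len(points)

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def key(i):
        # Ties in magnitude are broken in favour of the lower index,
        # so every uphill step strictly increases the key and must stop.
        return (magnitude[i], -i)

    uphill = []
    for i in range(n):
        best = i
        for j in range(n):
            if dist(points[i], points[j]) <= radius and key(j) > key(best):
                best = j
        uphill.append(best)

    labels = []
    for i in range(n):
        j = i
        while uphill[j] != j:  # follow uphill links to a local maximum
            j = uphill[j]
        labels.append(j)
    return labels

# Toy one-dimensional data with two well-separated groups.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,)]
magnitude = [1, 3, 2, 5, 4]
print(local_maximum_clustering(points, magnitude, radius=1.5))
```

The number of clusters is not supplied in advance; it follows from how many local maxima the chosen magnitude property has at the chosen radius.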

  19. β-globin gene cluster haplotypes in ethnic minority populations of southwest China (United States)

    Sun, Hao; Liu, Hongxian; Huang, Kai; Lin, Keqin; Huang, Xiaoqin; Chu, Jiayou; Ma, Shaohui; Yang, Zhaoqing


    The genetic diversity and relationships among ethnic minority populations of southwest China were investigated using seven polymorphic restriction enzyme sites in the β-globin gene cluster. The haplotypes of 1392 chromosomes from ten ethnic populations living in southwest China were determined. Linkage equilibrium and recombination hotspot were found between the 5′ sites and 3′ sites of the β-globin gene cluster. 5′ haplotypes 2 (+−−−), 6 (−++−+), 9 (−++++) and 3′ haplotype FW3 (−+) were the predominant haplotypes. Notably, haplotype 9 frequency was significantly high in the southwest populations, indicating their difference with other Chinese. The interpopulation differentiation of southwest Chinese minority populations is less than those in populations of northern China and other continents. Phylogenetic analysis shows that populations sharing same ethnic origin or language clustered to each other, indicating current β-globin cluster diversity in the Chinese populations reflects their ethnic origin and linguistic affiliations to a great extent. This study characterizes β-globin gene cluster haplotypes in southwest Chinese minorities for the first time, and reveals the genetic variability and affinity of these populations using β-globin cluster haplotype frequencies. The results suggest that ethnic origin plays an important role in shaping variations of the β-globin gene cluster in the southwestern ethnic populations of China. PMID:28205625

  20. Ensemble attribute profile clustering: discovering and characterizing groups of genes with similar patterns of biological features

    Directory of Open Access Journals (Sweden)

    Bissell MJ


    Full Text Available Abstract Background Ensemble attribute profile clustering is a novel, text-based strategy for analyzing a user-defined list of genes and/or proteins. The strategy exploits annotation data present in gene-centered corpora and utilizes ideas from statistical information retrieval to discover and characterize properties shared by subsets of the list. The practical utility of this method is demonstrated by employing it in a retrospective study of two non-overlapping sets of genes defined by a published investigation as markers for normal human breast luminal epithelial cells and myoepithelial cells. Results Each genetic locus was characterized using a finite set of biological properties and represented as a vector of features indicating attributes associated with the locus (a gene attribute profile). In this study, the vector space models for a pre-defined list of genes were constructed from the Gene Ontology (GO) terms and the Conserved Domain Database (CDD) protein domain terms assigned to the loci by the gene-centered corpus LocusLink. This data set of GO- and CDD-based gene attribute profiles, vectors of binary random variables, was used to estimate multiple finite mixture models and each ensuing model utilized to partition the profiles into clusters. The resultant partitionings were combined using a unanimous voting scheme to produce consensus clusters, sets of profiles that co-occurred consistently in the same cluster. Attributes that were important in defining the genes assigned to a consensus cluster were identified. The clusters and their attributes were inspected to ascertain the GO and CDD terms most associated with subsets of genes and, in conjunction with external knowledge such as chromosomal location, used to gain functional insights into human breast biology. The 52 luminal epithelial cell markers and 89 myoepithelial cell markers are disjoint sets of genes. Ensemble attribute profile clustering-based analysis indicated that both lists
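The unanimous voting step that combines several partitionings into consensus clusters can be sketched as follows. The function name and the input layout (one list of cluster labels per partitioning) are assumptions for illustration; the sketch shows only the voting rule, not the mixture-model fitting that produces the partitionings.

```python
from collections import defaultdict

def consensus_clusters(partitions):
    """Group items that share a cluster in every partitioning.

    `partitions` is a list of label lists; partitions[p][i] is the
    cluster label of item i in partitioning p. Labels need not be
    comparable across partitionings."""
    n_items = len(partitions[0])
    groups = defaultdict(list)
    for item in range(n_items):
        # Two items have the same signature exactly when every
        # partitioning places them in the same cluster.
        signature = tuple(p[item] for p in partitions)
        groups[signature].append(item)
    return sorted(groups.values())

# Three hypothetical partitionings of four gene attribute profiles.
partitions = [
    [0, 0, 1, 1],
    [5, 5, 5, 7],
    [2, 2, 3, 3],
]
print(consensus_clusters(partitions))
```

Unanimity is a strict rule: a single dissenting partitioning splits a group, so consensus clusters tend to be small but stable.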

  1. Clustering based gene expression feature selection method: A computational approach to enrich the classifier efficiency of differentially expressed genes

    KAUST Repository

    Abusamra, Heba


    The high-dimension, low-sample-size nature of gene expression data makes the classification task more challenging, so feature (gene) selection becomes a clear need. Selecting meaningful and relevant genes for a classifier not only decreases computational time and cost but also improves classification performance. Most existing feature selection methods, however, suffer from problems such as lack of robustness and validation issues. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods We used a leukemia gene expression dataset [1]. The effectiveness of the selected features was evaluated by four different classification methods: support vector machines, k-nearest neighbor, random forest, and linear discriminant analysis. The method evaluates the importance and relevance of each gene cluster by summing the expression levels of the genes belonging to that cluster. A gene cluster is considered important if it satisfies conditions that depend on thresholds and percentages; otherwise it is eliminated. Results Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a); after applying our feature selection methodology we ended up with 1117 specific genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with more stringent higher positive and lower negative threshold conditions reduced the number to 58 genes, which were tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions The feature selection method gave good results with minimum classification error. Our heat-map result shows a distinct pattern of refined genes discriminating between the two classes of leukemia.
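The cluster-scoring step can be sketched under stated assumptions. The abstract does not give the exact conditions, so the rule below (keep a gene cluster when its summed expression is at or beyond a positive or negative threshold in at least a given fraction of samples) is a hypothetical reading, and all names and values are illustrative.

```python
def select_gene_clusters(expression, clusters, pos_threshold,
                         neg_threshold, min_fraction):
    """Return the ids of gene clusters judged important.

    expression: dict mapping gene -> list of per-sample values
    clusters:   dict mapping cluster id -> list of genes
    A cluster is kept when, in at least `min_fraction` of samples, the
    sum of its genes' expression is >= pos_threshold or <= neg_threshold
    (an assumed rule, not taken from the abstract)."""
    kept = []
    for cluster_id, genes in clusters.items():
        n_samples = len(expression[genes[0]])
        sums = [sum(expression[g][s] for g in genes)
                for s in range(n_samples)]
        extreme = sum(1 for v in sums
                      if v >= pos_threshold or v <= neg_threshold)
        if extreme / n_samples >= min_fraction:
            kept.append(cluster_id)
    return kept

# Hypothetical standardized expression values for three samples.
expression = {
    "g1": [2.0, 2.5, -0.1],
    "g2": [1.5, 1.0, 0.2],
    "g3": [0.1, -0.2, 0.0],
    "g4": [-0.1, 0.3, 0.1],
}
clusters = {"A": ["g1", "g2"], "B": ["g3", "g4"]}
print(select_gene_clusters(expression, clusters, 3.0, -3.0, 0.6))
```

Raising the positive threshold and lowering the negative one makes the filter more stringent, which matches the reduction in gene count the abstract reports for the stricter setting.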

  2. Prediction of operon-like gene clusters in the Arabidopsis thaliana genome based on co-expression analysis of neighboring genes. (United States)

    Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko


    Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or neighboring positions within the immediate vicinity of chromosomal regions. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and provide novel insight into the evolutionary aspects of acquiring certain biological functions as well. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficient of neighboring genes and randomly selected gene pairs, based on a statistical method that takes false discovery rate (FDR) into consideration for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of EOperon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system.
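The comparison of neighboring-gene correlations against randomly selected gene pairs can be sketched as below. This is a simplified illustration, not the authors' statistical method: it computes an empirical p-value against a random-pair null and omits the FDR correction, and the function names, the seed, and the toy profiles are assumptions.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length profiles.
    Assumes neither profile is constant (non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def neighbor_correlations(profiles):
    """Correlation of each gene with the next gene in chromosomal order."""
    return [pearson(profiles[i], profiles[i + 1])
            for i in range(len(profiles) - 1)]

def random_pair_correlations(profiles, n_pairs, seed=0):
    """Null distribution from randomly selected gene pairs."""
    rng = random.Random(seed)
    null = []
    for _ in range(n_pairs):
        i, j = rng.sample(range(len(profiles)), 2)
        null.append(pearson(profiles[i], profiles[j]))
    return null

def empirical_p(observed, null):
    """Fraction of null correlations at least as large as `observed`,
    with the usual +1 correction so the estimate is never zero."""
    return (1 + sum(1 for r in null if r >= observed)) / (1 + len(null))

# Toy expression profiles, listed in chromosomal order.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 1.0, 3.0, 2.0],
    [2.0, 4.0, 1.0, 3.0],
]
observed = neighbor_correlations(profiles)
null = random_pair_correlations(profiles, n_pairs=50)
print([round(empirical_p(r, null), 3) for r in observed])
```

In a genome-wide setting the resulting p-values would still need a multiple-testing correction before runs of significantly co-expressed neighbors are called clusters.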

  3. Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering

    Directory of Open Access Journals (Sweden)

    Li Weizhong


    Full Text Available Abstract Background The identification and study of proteins from metagenomic datasets can shed light on the roles and interactions of the source organisms in their communities. However, metagenomic datasets are characterized by the presence of organisms with varying GC composition, codon usage biases etc., and consequently gene identification is challenging. The vast amount of sequence data also requires faster protein family classification tools. Results We present a computational improvement to a sequence clustering approach that we developed previously to identify and classify protein coding genes in large microbial metagenomic datasets. The clustering approach can be used to identify protein coding genes in prokaryotes, viruses, and intron-less eukaryotes. The computational improvement is based on an incremental clustering method that does not require the expensive all-against-all compute that was required by the original approach, while still preserving the remote homology detection capabilities. We present evaluations of the clustering approach in protein-coding gene identification and classification, and also present the results of updating the protein clusters from our previous work with recent genomic and metagenomic sequences. The clustering results are available via CAMERA, ( Conclusion The clustering paradigm is shown to be a very useful tool in the analysis of microbial metagenomic data. The incremental clustering method is shown to be much faster than the original approach in identifying genes, grouping sequences into existing protein families, and also identifying novel families that have multiple members in a metagenomic dataset. These clusters provide a basis for further studies of protein families.
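Why incremental clustering avoids an all-against-all comparison can be shown with a generic greedy sketch. It is not the authors' method: the k-mer based similarity is a toy stand-in for a real alignment score, and the function names and threshold are assumptions. Each new sequence is compared only with cluster representatives, so existing clusters can be extended with new data without recomputing earlier comparisons.

```python
def kmers(seq, k=3):
    """Set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def similarity(a, b, k=3):
    """Toy similarity: shared k-mers relative to the smaller k-mer set."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / min(len(ka), len(kb))

def incremental_cluster(sequences, threshold=0.8, clusters=None):
    """Greedy incremental clustering.

    `clusters` is a list of (representative, members) pairs; passing an
    existing list extends it in place with the new sequences."""
    clusters = [] if clusters is None else clusters
    # Longest first, so representatives tend to cover their members.
    for seq in sorted(sequences, key=len, reverse=True):
        for representative, members in clusters:
            if similarity(seq, representative) >= threshold:
                members.append(seq)
                break
        else:
            clusters.append((seq, [seq]))  # found no match: new cluster
    return clusters

# Initial clustering, then an incremental update with a new sequence.
clusters = incremental_cluster(["MKTAYIAKQR", "MKTAYIAKQ", "GGGSSGGGSS"])
clusters = incremental_cluster(["MKTAYIAKQRQ"], clusters=clusters)
for representative, members in clusters:
    print(representative, members)
```

The cost of adding a batch grows with the number of representatives rather than with the total number of sequences already clustered.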

  4. Shared gene structures and clusters of mutually exclusive spliced exons within the metazoan muscle myosin heavy chain genes.

    Directory of Open Access Journals (Sweden)

    Martin Kollmar

    Full Text Available Multicellular animals possess two to three different types of muscle tissues. Striated muscles have considerable ultrastructural similarity and contain a core set of proteins including the muscle myosin heavy chain (Mhc) protein. The ATPase activity of this myosin motor protein largely dictates muscle performance at the molecular level. Two different solutions to adjusting myosin properties to different muscle subtypes have been identified so far: Vertebrates and nematodes contain many independent differentially expressed Mhc genes while arthropods have single Mhc genes with clusters of mutually exclusive spliced exons (MXEs). The availability of hundreds of metazoan genomes now allowed us to study whether the ancient bilateria already contained MXEs, how MXE complexity subsequently evolved, and whether additional scenarios to control contractile properties in different muscles could be proposed. By reconstructing the Mhc genes from 116 metazoans we showed that all intron positions within the motor domain coding regions are conserved in all bilateria analysed. The last common ancestor of the bilateria already contained a cluster of MXEs coding for part of the loop-2 actin-binding sequence. Subsequently the protostomes and later the arthropods gained many further clusters while MXEs got completely lost independently in several branches (vertebrates and nematodes) and species (for example the annelid Helobdella robusta and the salmon louse Lepeophtheirus salmonis). Several bilateria have been found to encode multiple Mhc genes that might all or in part contain clusters of MXEs. Notable examples are a cluster of six tandemly arrayed Mhc genes, of which two contain MXEs, in the owl limpet Lottia gigantea and four Mhc genes with three encoding MXEs in the predatory mite Metaseiulus occidentalis. Our analysis showed that similar solutions to provide different myosin isoforms (multiple genes or clusters of MXEs or both) have independently been developed

  5. An Sp185/333 gene cluster from the purple sea urchin and putative microsatellite-mediated gene diversification

    Directory of Open Access Journals (Sweden)

    Buckley Katherine M


    Full Text Available Abstract Background The immune system of the purple sea urchin, Strongylocentrotus purpuratus, is complex and sophisticated. An important component of sea urchin immunity is the Sp185/333 gene family, which is significantly upregulated in immunologically challenged animals. The Sp185/333 genes are less than 2 kb with two exons and are members of a large diverse family composed of greater than 40 genes. The S. purpuratus genome assembly, however, contains only six Sp185/333 genes. This underrepresentation could be due to the difficulties that large gene families present in shotgun assembly, where multiple similar genes can be collapsed into a single consensus gene. Results To understand the genomic organization of the Sp185/333 gene family, a BAC insert containing Sp185/333 genes was assembled, with careful attention to avoiding artifacts resulting from collapse or artificial duplication/expansion of very similar genes. Twelve candidate BAC assemblies were generated with varying parameters and the optimal assembly was identified by PCR, restriction digests, and subclone sequencing. The validated assembly contained six Sp185/333 genes that were clustered in a 34 kb region at one end of the BAC with five of the six genes tightly clustered within 20 kb. The Sp185/333 genes in this cluster were no more similar to each other than to previously sequenced Sp185/333 genes isolated from three different animals. This was unexpected given their proximity and putative effects of gene homogenization in closely linked, similar genes. All six genes displayed significant similarity including both 5' and 3' flanking regions, which were bounded by microsatellites. Three of the Sp185/333 genes and their flanking regions were tandemly duplicated such that each repeated segment consisted of a gene plus 0.7 kb 5' and 2.4 kb 3' of the gene (4.5 kb total). Both edges of the segmental duplications were bounded by different microsatellites. Conclusions The high sequence

  6. Molecular population genetics of the β-esterase gene cluster of Drosophila melanogaster

    Indian Academy of Sciences (India)

    Evgeniy S. Balakirev; Francisco J. Ayala


    We have investigated nucleotide polymorphism at the β-esterase gene cluster including the Est-6 gene and ψEst-6 putative pseudogene in four samples of Drosophila melanogaster derived from natural populations of southern Africa (Zimbabwe), Europe (Spain), North America (USA: California), and South America (Venezuela). A complex haplotype structure is revealed in both Est-6 and ψEst-6. Total nucleotide diversity is twice as high in ψEst-6 as in Est-6; diversity is higher in the African sample than in the non-African ones. Strong linkage disequilibrium occurs within the β-esterase gene cluster in non-African samples, but not in the African one. Intragenic gene conversion events are detected within Est-6 and, to a much greater extent, within ψEst-6; intergenic gene conversion events are rare. Tests of neutrality with recombination are significant for the β-esterase gene cluster in the non-African samples but not significant in the African one. We suggest that the demographic history (bottleneck and admixture of genetically differentiated populations) is the major factor shaping the pattern of nucleotide polymorphism in the β-esterase gene cluster. However, there are some ‘footprints’ of directional and balancing selection shaping specific distribution of nucleotide polymorphism within the cluster. Intergenic epistatic selection between Est-6 and ψEst-6 may play an important role in the evolution of the β-esterase gene cluster, preserving the putative pseudogene from degenerative destruction and reflecting possible functional interaction between the functional gene and the putative pseudogene. Est-6 and ψEst-6 may represent an indivisible intergenic complex (‘intergene’) in which each single component (Est-6 or ψEst-6) cannot separately carry out the full functional role.

  7. Phylogenomics of the benzoxazinoid biosynthetic pathway of Poaceae: gene duplications and origin of the Bx cluster

    Directory of Open Access Journals (Sweden)

    Dutartre Leslie


    Full Text Available Abstract Background The benzoxazinoids 2,4-dihydroxy-1,4-benzoxazin-3-one (DIBOA) and 2,4-dihydroxy-7-methoxy-1,4-benzoxazin-3-one (DIMBOA) are key defense compounds present in major agricultural crops such as maize and wheat. Their biosynthesis involves nine enzymes thought to form a linear pathway leading to the storage of DI(M)BOA as glucoside conjugates. Seven of the genes (Bx1-Bx6 and Bx8) form a cluster at the tip of the short arm of maize chromosome 4 that includes four P450 genes (Bx2-5) belonging to the same CYP71C subfamily. The origin of this cluster is unknown. Results We show that the pathway appeared following several duplications of the TSA gene (α-subunit of tryptophan synthase) and of a Bx2-like ancestral CYP71C gene and the recruitment of Bx8 before the radiation of Poaceae. The origins of Bx6 and Bx7 remain unclear. We demonstrate that the Bx2-like CYP71C ancestor was not committed to the benzoxazinoid pathway and that after duplications the Bx2-Bx5 genes were under positive selection on a few sites and underwent functional divergence, leading to the current specific biochemical properties of the enzymes. The absence of synteny between available Poaceae genomes involving the Bx gene regions is in contrast with the conserved synteny in the TSA gene region. Conclusions These results demonstrate that rearrangements following duplications of an IGL/TSA gene and of a CYP71C gene probably resulted in the clustering of the new copies (Bx1 and Bx2) at the tip of a chromosome in an ancestor of grasses. Clustering favored cosegregation and tip chromosomal location favored gene rearrangements that allowed the further recruitment of genes to the pathway. These events, a founding event and elongation events, may have been the key to the subsequent evolution of the benzoxazinoid biosynthetic cluster.

  8. Degeneration of aflatoxin gene cluster in Aspergillus flavus from Africa and North America (United States)

    Aspergillus flavus is the primary causal agent of food and feed contamination with the toxic fungal metabolites aflatoxins. Aflatoxin-producing potential of A. flavus is known to vary among isolates. The genes involved in aflatoxin biosynthesis are clustered together and the order of genes within th...

  9. Mycobiota and identification of aflatoxin gene cluster in marketed spices in West Africa

    DEFF Research Database (Denmark)

    Gnonlonfin, G. J. B.; Adjovi, Y. C.; Tokpo, A. F.


    of Aspergillus were dominant on all marketed dried and milled spices irrespective of country. Gene characterization and amplification analysis showed that most of the Aspergillus flavus isolates possess the cluster genes for aflatoxin production. Aflatoxin B1 assessment by Thin Layer Chromatography showed...

  10. The clustering of functionally related genes contributes to CNV-mediated disease

    NARCIS (Netherlands)

    Andrews, T.; Honti, F.; Pfundt, R.P.; Leeuw, N. de; Hehir, J.Y.; Vulto-van Silfhout, A.T.; Vries, B. de; Webber, C.


    Clusters of functionally related genes can be disrupted by a single copy number variant (CNV). We demonstrate that the simultaneous disruption of multiple functionally related genes is a frequent and significant characteristic of de novo CNVs in patients with developmental disorders (P = 1 × 10⁻³)

  11. Sequence breakpoints in the aflatoxin biosynthesis gene cluster and flanking regions in nonaflatoxigenic Aspergillus flavus isolates. (United States)

    Chang, Perng-Kuang; Horn, Bruce W; Dorner, Joe W


    Aspergillus flavus populations are genetically diverse. Isolates that produce either, neither, or both aflatoxins and cyclopiazonic acid (CPA) are present in the field. We investigated defects in the aflatoxin gene cluster in 38 nonaflatoxigenic A. flavus isolates collected from southern United States. PCR assays using aflatoxin-gene-specific primers grouped these isolates into eight (A-H) deletion patterns. Patterns C, E, G, and H, which contain 40 kb deletions, were examined for their sequence breakpoints. Pattern C has one breakpoint in the cypA 3' untranslated region (UTR) and another in the verA coding region. Pattern E has a breakpoint in the amdA coding region and another in the ver1 5'UTR. Pattern G contains a deletion identical to the one found in pattern C and has another deletion that extends from the cypA coding region to one end of the chromosome as suggested by the presence of telomeric sequence repeats, CCCTAATGTTGA. Pattern H has a deletion of the entire aflatoxin gene cluster from the hexA coding region in the sugar utilization gene cluster to the telomeric region. Thus, deletions in the aflatoxin gene cluster among A. flavus isolates are not rare, and the patterns appear to be diverse. Genetic drift may be a driving force that is responsible for the loss of the entire aflatoxin gene cluster in nonaflatoxigenic A. flavus isolates when aflatoxins have lost their adaptive value in nature.

  12. Lack of clinical manifestations in asymptomatic dengue infection is attributed to broad down-regulation and selective up-regulation of host defence response genes.

    Directory of Open Access Journals (Sweden)

    Adeline S L Yeo

    Full Text Available OBJECTIVES: Dengue represents one of the most serious life-threatening vector-borne infectious diseases that afflicts approximately 50 million people across the globe annually. Whilst symptomatic infections are frequently reported, asymptomatic dengue remains largely unnoticed. Therefore, we sought to investigate the immune correlates conferring protection to individuals that remain clinically asymptomatic. METHODS: We determined the levels of neutralizing antibodies (nAbs) and gene expression profiles of host immune factors in individuals with asymptomatic infections, and whose cognate household members showed symptoms consistent with clinical dengue infection. RESULTS: We observed broad down-regulation of host defense response (innate, adaptive and matrix metalloprotease) genes in asymptomatic individuals as against symptomatic patients, with selective up-regulation of distinct genes that have been associated with protection. Selected down-regulated genes include: TNF α (TNF), IL8, C1S, factor B (CFB), IL2, IL3, IL4, IL5, IL8, IL9, IL10 and IL13, CD80, CD28, and IL18, MMP8, MMP10, MMP12, MMP15, MMP16, and MMP24. Selected up-regulated genes include: RANTES (CCL5), MIP-1α (CCL3L1/CCL3L3), MIP-1β (CCL4L1), TGFβ (TGFB), and TIMP1. CONCLUSION: Our findings highlight the potential association of certain host genes conferring protection against clinical dengue. These data are valuable to better explore the mysteries behind the hitherto poorly understood immunopathogenesis of subclinical dengue infection.

  13. Structural variation of the ribosomal gene cluster within the class Insecta

    Energy Technology Data Exchange (ETDEWEB)

    Mukha, D.V.; Sidorenko, A.P.; Lazebnaya, I.V. [Vavilov Institute of General Genetics, Moscow (Russian Federation)] [and others]


    A general estimation of ribosomal DNA variation within the class Insecta is presented. It is shown that, using blot-hybridization, one can detect differences in the structure of the ribosomal gene cluster not only between genera within an order, but also between species within a genus, including sibling species. The structure of the ribosomal gene cluster of the Coccinellidae family (ladybirds) is analyzed. It is shown that cloned highly conserved regions of ribosomal DNA of Tetrahymena pyriformis can be used as probes for analyzing ribosomal genes in insects. 24 refs., 4 figs.

  14. Identification of transcriptional activators for thienamycin and cephamycin C biosynthetic genes within the thienamycin gene cluster from Streptomyces cattleya. (United States)

    Rodríguez, Miriam; Núñez, Luz Elena; Braña, Alfredo F; Méndez, Carmen; Salas, José A; Blanco, Gloria


    Two regulatory genes, thnI and thnU, were identified in the thienamycin (thn) gene cluster from Streptomyces cattleya. ThnI resembles LysR-type transcriptional activators and ThnU belongs to the SARP family of transcriptional activators. Their functional role was established after independent inactivation by gene replacement together with transcriptional analysis involving reverse transcription polymerase chain reaction (RT-PCR). Deletion of thnI abolished thienamycin production showing its involvement in thienamycin biosynthesis. Gene expression analysis applied to the thn gene cluster demonstrated that ThnI is a transcriptional activator essential for thienamycin biosynthesis that regulates the expression of nine genes involved in thienamycin assembly and export (thnH, thnJ, thnK, thnL, thnM, thnN, thnO, thnP and thnQ). Unexpectedly, the thnU disrupted mutant was not affected in thienamycin production but turned out to be essential for cephamycin C biosynthesis. Transcript analysis applied to early and late structural genes for cephamycin C biosynthesis (pcbAB and cmcI), revealed that ThnU is the transcriptional activator of these cephamycin C genes although they are not physically linked to the thn cluster. In addition, it was shown that deletion of thnI has an upregulatory effect on pcbAB and cmcI transcription consistent with a significant increase in cephamycin C biosynthesis in this mutant.

  15. Identification and manipulation of the pleuromutilin gene cluster from Clitopilus passeckerianus for increased rapid antibiotic production (United States)

    Bailey, Andy M.; Alberti, Fabrizio; Kilaru, Sreedhar; Collins, Catherine M.; de Mattos-Shipley, Kate; Hartley, Amanda J.; Hayes, Patrick; Griffin, Alison; Lazarus, Colin M.; Cox, Russell J.; Willis, Christine L.; O’Dwyer, Karen; Spence, David W.; Foster, Gary D.


    Semi-synthetic derivatives of the tricyclic diterpene antibiotic pleuromutilin from the basidiomycete Clitopilus passeckerianus are important in combatting bacterial infections in human and veterinary medicine. These compounds belong to the only new class of antibiotics for human applications, with novel mode of action and lack of cross-resistance, representing a class with great potential. Basidiomycete fungi, being dikaryotic, are not generally amenable to strain improvement. We report identification of the seven-gene pleuromutilin gene cluster and verify that using various targeted approaches aimed at increasing antibiotic production in C. passeckerianus, no improvement in yield was achieved. The seven-gene pleuromutilin cluster was reconstructed within Aspergillus oryzae giving production of pleuromutilin in an ascomycete, with a significant increase (2106%) in production. This is the first gene cluster from a basidiomycete to be successfully expressed in an ascomycete, and paves the way for the exploitation of a metabolically rich but traditionally overlooked group of fungi.


    Directory of Open Access Journals (Sweden)

    Silaghi Gheorghe Cosmin


    Full Text Available Previously we employed the Gene Trajectory Clustering methodology to search for different associations of the stocks composing the DJA index, with the aim of finding different, logic clusters, supported by economic reasons, preferably different than the

  17. MS/MS networking guided analysis of molecule and gene cluster families. (United States)

    Nguyen, Don Duy; Wu, Cheng-Hsuan; Moree, Wilna J; Lamsa, Anne; Medema, Marnix H; Zhao, Xiling; Gavilan, Ronnie G; Aparicio, Marystella; Atencio, Librada; Jackson, Chanaye; Ballesteros, Javier; Sanchez, Joel; Watrous, Jeramie D; Phelan, Vanessa V; van de Wiel, Corine; Kersten, Roland D; Mehnaz, Samina; De Mot, René; Shank, Elizabeth A; Charusanti, Pep; Nagarajan, Harish; Duggan, Brendan M; Moore, Bradley S; Bandeira, Nuno; Palsson, Bernhard Ø; Pogliano, Kit; Gutiérrez, Marcelino; Dorrestein, Pieter C


    The ability to correlate the production of specialized metabolites to the genetic capacity of the organism that produces such molecules has become an invaluable tool in aiding the discovery of biotechnologically applicable molecules. Here, we accomplish this task by matching molecular families with gene cluster families, making these correlations to 60 microbes at one time instead of connecting one molecule to one organism at a time, such as how it is traditionally done. We can correlate these families through the use of nanospray desorption electrospray ionization MS/MS, an ambient pressure MS technique, in conjunction with MS/MS networking and peptidogenomics. We matched the molecular families of peptide natural products produced by 42 bacilli and 18 pseudomonads through the generation of amino acid sequence tags from MS/MS data of specific clusters found in the MS/MS network. These sequence tags were then linked to biosynthetic gene clusters in publicly accessible genomes, providing us with the ability to link particular molecules with the genes that produced them. As an example of its use, this approach was applied to two unsequenced Pseudoalteromonas species, leading to the discovery of the gene cluster for a molecular family, the bromoalterochromides, in the previously sequenced strain P. piscicida JCM 20779(T). The approach itself is not limited to 60 related strains, because spectral networking can be readily adopted to look at molecular family-gene cluster families of hundreds or more diverse organisms in one single MS/MS network.

  18. Genome mining demonstrates the widespread occurrence of gene clusters encoding bacteriocins in cyanobacteria. (United States)

    Wang, Hao; Fewer, David P; Sivonen, Kaarina


    Cyanobacteria are a rich source of natural products with interesting biological activities. Many of these are peptides and the end products of a non-ribosomal pathway. However, several cyanobacterial peptide classes were recently shown to be produced through the proteolytic cleavage and post-translational modification of short precursor peptides. A new class of bacteriocins produced through the proteolytic cleavage and heterocyclization of precursor proteins was recently identified from marine cyanobacteria. Here we show the widespread occurrence of bacteriocin gene clusters in cyanobacteria through comparative analysis of 58 cyanobacterial genomes. A total of 145 bacteriocin gene clusters were discovered through genome mining. These clusters encoded 290 putative bacteriocin precursors. They ranged in length from 28 to 164 amino acids with very little sequence conservation of the core peptide. The gene clusters could be classified into seven groups according to their gene organization and domain composition. This classification is supported by phylogenetic analysis, which further indicated independent evolutionary trajectories of gene clusters in different groups. Our data suggests that cyanobacteria are a prolific source of low-molecular weight post-translationally modified peptides.

  19. Genome mining demonstrates the widespread occurrence of gene clusters encoding bacteriocins in cyanobacteria.

    Directory of Open Access Journals (Sweden)

    Hao Wang

    Full Text Available Cyanobacteria are a rich source of natural products with interesting biological activities. Many of these are peptides and the end products of a non-ribosomal pathway. However, several cyanobacterial peptide classes were recently shown to be produced through the proteolytic cleavage and post-translational modification of short precursor peptides. A new class of bacteriocins produced through the proteolytic cleavage and heterocyclization of precursor proteins was recently identified from marine cyanobacteria. Here we show the widespread occurrence of bacteriocin gene clusters in cyanobacteria through comparative analysis of 58 cyanobacterial genomes. A total of 145 bacteriocin gene clusters were discovered through genome mining. These clusters encoded 290 putative bacteriocin precursors. They ranged in length from 28 to 164 amino acids with very little sequence conservation of the core peptide. The gene clusters could be classified into seven groups according to their gene organization and domain composition. This classification is supported by phylogenetic analysis, which further indicated independent evolutionary trajectories of gene clusters in different groups. Our data suggests that cyanobacteria are a prolific source of low-molecular weight post-translationally modified peptides.

  20. A genomics based discovery of secondary metabolite biosynthetic gene clusters in Aspergillus ustus.

    Directory of Open Access Journals (Sweden)

    Borui Pi

    Full Text Available Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, and little is known about rare pathogens. In this study, we performed a genomics based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, such as the sterigmatocystin biosynthesis pathway that was commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were identified for the first time in Aspergillus that were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but far fewer with other Aspergilli like A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in the genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in the clinic.

  1. Close linkage of the two keratin gene clusters in the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Milisavljevic, V.; Freedberg, I.M.; Blumenberg, M. [New York Univ. Medical Center, New York, NY (United States)]


    Mapping studies of functional keratin genes in the human genome have localized most of the acidic keratin genes to chromosome 17q12-q21 and the basic keratin genes to chromosome 12q11-q13. Within the acidic keratin locus two clusters were identified, one containing the genes for K15 and K19, the other the genes for K14, K16, and K17. The relative positions and the distance between the two clusters have not been determined previously. In this paper we describe our analysis of P1 clones containing multiple acidic keratin genes, which were studied using restriction analysis and Southern blot hybridization with PCR-amplified probes specific for functional human keratin genes 15, 17, and 19. Our results show that the two clusters are very closely linked to each other, within a 55-kb region in the human genome. The genes are organized 5' to 3' in the following order: 5'-K19-K15-K17-K16-K14. Between K15 and K17 at least one additional, unidentified keratin gene is present. 30 refs., 2 figs.

  2. Operon and non-operon gene clusters in the C. elegans genome. (United States)

    Blumenthal, Thomas; Davis, Paul; Garrido-Lecca, Alfonso


    Nearly 15% of the ~20,000 C. elegans genes are contained in operons, multigene clusters controlled by a single promoter. The vast majority of these are of a type where the genes in the cluster are ~100 bp apart and the pre-mRNA is processed by 3' end formation accompanied by trans-splicing. A spliced leader, SL2, is specialized for operon processing. Here we summarize current knowledge on several variations on this theme including: (1) hybrid operons, which have additional promoters between genes; (2) operons with exceptionally long (> 1 kb) intercistronic regions; (3) operons with a second 3' end formation site close to the trans-splice site; (4) alternative operons, in which the exons are sometimes spliced as a single gene and sometimes as two genes; (5) SL1-type operons, which use SL1 instead of SL2 to trans-splice and in which there is no intercistronic space; (6) operons that make dicistronic mRNAs; and (7) non-operon gene clusters, in which either two genes use a single exon as the 3' end of one and the 5' end of the next, or the 3' UTR of one gene serves as the outron of the next. Each of these variations is relatively infrequent, but together they show a remarkable variety of tight-linkage gene arrangements in the C. elegans genome.

  3. Expression Analysis of Genes in the Nif Cluster of Clostridium beijerinckii



    The nif genes of Clostridium beijerinckii NRRL B593 occupy a region of about 16 kilobases. Besides the two glnB-like genes, five other genes are interspersed between the nifNB and the nifVw genes. An expression analysis of the nif genes in nitrogen-fixing and non-nitrogen-fixing cells with probes generated from various regions of the nif cluster by northern blot analysis revealed the presence of four different transcripts in nitrogen-fixing cells. Two of these transcripts had the predicted si...

  4. Genetic localization and in vivo characterization of a Monascus azaphilone pigment biosynthetic gene cluster. (United States)

    Balakrishnan, Bijinu; Karki, Suman; Chiu, Shih-Hau; Kim, Hyun-Ju; Suh, Jae-Won; Nam, Bora; Yoon, Yeo-Min; Chen, Chien-Chi; Kwon, Hyung-Jin


    Monascus spp. produce several well-known polyketides such as monacolin K, citrinin, and azaphilone pigments. In this study, the azaphilone pigment biosynthetic gene cluster was identified through T-DNA random mutagenesis in Monascus purpureus. The albino mutant W13 bears a T-DNA insertion upstream of a transcriptional regulator gene (mppR1). The transcription of mppR1 and the nearby polyketide synthase gene (MpPKS5) was significantly repressed in the W13 mutant. Targeted inactivation of MpPKS5 also gave rise to an albino mutant, confirming that mppR1 and MpPKS5 belong to an azaphilone pigment biosynthetic gene cluster. This M. purpureus sequence was used to identify the whole biosynthetic gene cluster in the Monascus pilosus genome. MpPKS5 contains SAT/KS/AT/PT/ACP/MT/R domains, and this domain organization is preserved in other azaphilone polyketide synthases. This biosynthetic gene cluster also encodes fatty acid synthase (FAS), which is predicted to assist the synthesis of 3-oxooctanoyl-CoA and 3-oxodecanoyl-CoA. These 3-oxoacyl compounds are proposed to be incorporated into the azaphilone backbone to complete the pigment biosynthesis. A monooxygenase gene (an azaH and tropB homolog) that is located far downstream of the FAS gene is proposed to be involved in pyrone ring formation. A homology search on other fungal genome sequences suggests that this azaphilone pigment gene cluster also exists in the Penicillium marneffei and Talaromyces stipitatus genomes.

  5. Simultaneous clustering of gene expression data with clinical chemistry and pathological evaluations reveals phenotypic prototypes

    Directory of Open Access Journals (Sweden)

    Wolfinger Russell D


    Full Text Available Abstract Background Commonly employed clustering methods for analysis of gene expression data do not directly incorporate phenotypic data about the samples. Furthermore, clustering of samples with known phenotypes is typically performed in an informal fashion. The inability of clustering algorithms to incorporate biological data in the grouping process can limit proper interpretation of the data and its underlying biology. Results We present a more formal approach, the modk-prototypes algorithm, for clustering biological samples based on simultaneously considering microarray gene expression data and classes of known phenotypic variables such as clinical chemistry evaluations and histopathologic observations. The strategy involves constructing an objective function with the sum of the squared Euclidean distances for numeric microarray and clinical chemistry data and simple matching for histopathology categorical values in order to measure dissimilarity of the samples. Separate weighting terms are used for microarray, clinical chemistry and histopathology measurements to control the influence of each data domain on the clustering of the samples. The dynamic validity index for numeric data was modified with a category utility measure for determining the number of clusters in the data sets. A cluster's prototype, formed from the mean of the values for numeric features and the mode of the categorical values of all the samples in the group, is representative of the phenotype of the cluster members. The approach is shown to work well with a simulated mixed data set and two real data examples containing numeric and categorical data types: one from a heart disease study and another from acetaminophen (an analgesic) exposure in rat liver that causes centrilobular necrosis. Conclusion The modk-prototypes algorithm partitioned the simulated data into clusters with samples in their respective class group and the heart disease samples into two groups (sick and
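
    The dissimilarity and prototype described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the dictionary keys (`expr`, `chem`, `histo`), the function names, and the default weights are assumptions made for the example, and the cluster-number selection step (the modified dynamic validity index) is not shown.

```python
from collections import Counter


def mixed_dissimilarity(sample, prototype, w_expr=1.0, w_chem=1.0, w_histo=1.0):
    """Weighted dissimilarity between a sample and a cluster prototype.

    Numeric domains (expression, clinical chemistry) use the sum of squared
    Euclidean differences; the categorical histopathology domain uses simple
    matching (count of mismatched categories). The weights are hypothetical
    knobs controlling the influence of each data domain.
    """
    d_expr = sum((a - b) ** 2 for a, b in zip(sample["expr"], prototype["expr"]))
    d_chem = sum((a - b) ** 2 for a, b in zip(sample["chem"], prototype["chem"]))
    d_histo = sum(a != b for a, b in zip(sample["histo"], prototype["histo"]))
    return w_expr * d_expr + w_chem * d_chem + w_histo * d_histo


def prototype_of(samples):
    """Cluster prototype: mean of numeric features, mode of categorical ones."""
    n = len(samples)

    def mean_vec(key):
        return [sum(col) / n for col in zip(*(s[key] for s in samples))]

    def mode_vec(key):
        return [Counter(col).most_common(1)[0][0]
                for col in zip(*(s[key] for s in samples))]

    return {"expr": mean_vec("expr"), "chem": mean_vec("chem"),
            "histo": mode_vec("histo")}


# Toy cluster of three samples (values invented for illustration only).
cluster = [
    {"expr": [1.0, 2.0], "chem": [10.0], "histo": ["necrosis"]},
    {"expr": [3.0, 4.0], "chem": [14.0], "histo": ["necrosis"]},
    {"expr": [2.0, 3.0], "chem": [12.0], "histo": ["normal"]},
]
proto = prototype_of(cluster)
```

    In a full k-prototypes-style loop, each sample would be reassigned to the prototype with the smallest dissimilarity and the prototypes recomputed until assignments stop changing.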

  6. Epigenetic characterization of the growth hormone gene identifies SmcHD1 as a regulator of autosomal gene clusters.

    Directory of Open Access Journals (Sweden)

    Shabnam Massah

    Full Text Available Regulatory elements for the mouse growth hormone (GH) gene are located distally in a putative locus control region (LCR) in addition to key elements in the promoter proximal region. The role of promoter DNA methylation for GH gene regulation is not well understood. Pit-1 is a POU transcription factor required for normal pituitary development and obligatory for GH gene expression. In mammals, Pit-1 mutations eliminate GH production resulting in a dwarf phenotype. In this study, dwarf mice illustrated that Pit-1 function was obligatory for GH promoter hypomethylation. By monitoring promoter methylation levels during developmental GH expression we found that the GH promoter became hypomethylated coincident with gene expression. We identified a promoter differentially methylated region (DMR) that was used to characterize a methylation-dependent DNA binding activity. Upon DNA affinity purification using the DMR and nuclear extracts, we identified structural maintenance of chromosomes hinge domain containing-1 (SmcHD1). To better understand the role of SmcHD1 in genome-wide gene expression, we performed microarray analysis and compared changes in gene expression upon reduced levels of SmcHD1 in human cells. Knock-down of SmcHD1 in human embryonic kidney (HEK293) cells revealed a disproportionate number of up-regulated genes were located on the X-chromosome, but also suggested regulation of genes on non-sex chromosomes. Among those, we identified several genes located in the protocadherin β cluster. In addition, we found that imprinted genes in the H19/Igf2 cluster associated with Beckwith-Wiedemann and Silver-Russell syndromes (BWS & SRS) were dysregulated. For the first time using human cells, we showed that SmcHD1 is an important regulator of imprinted and clustered genes.

  7. Methods for simultaneously identifying coherent local clusters with smooth global patterns in gene expression profiles

    Directory of Open Access Journals (Sweden)

    Lee Yun-Shien


    Full Text Available Abstract Background The hierarchical clustering tree (HCT) with a dendrogram [1] and the singular value decomposition (SVD) with a dimension-reduced representative map [2] are popular methods for two-way sorting the gene-by-array matrix map employed in gene expression profiling. While HCT dendrograms tend to optimize local coherent clustering patterns, SVD leading eigenvectors usually identify better global grouping and transitional structures. Results This study proposes a flipping mechanism for a conventional agglomerative HCT using a rank-two ellipse (R2E, an improved SVD algorithm for sorting purpose) seriation by Chen [3] as an external reference. While HCTs always produce permutations with good local behaviour, the rank-two ellipse seriation gives the best global grouping patterns and smooth transitional trends. The resulting algorithm automatically integrates the desirable properties of each method so that users have access to a clustering and visualization environment for gene expression profiles that preserves coherent local clusters and identifies global grouping trends. Conclusion We demonstrate, through four examples, that the proposed method not only possesses better numerical and statistical properties, it also provides more meaningful biomedical insights than other sorting algorithms. We suggest that sorted proximity matrices for genes and arrays, in addition to the gene-by-array expression matrix, can greatly aid in the search for comprehensive understanding of gene expression structures. Software for the proposed methods can be obtained at

  8. Identification of the Fucose Synthetase Gene in the Colanic Acid Gene Cluster of Escherichia coli K-12


    Andrianopoulos, Kanella; Wang, Lei; Reeves, Peter R.


    GDP–l-fucose, the substrate for fucosyltransferases for addition of fucose to polysaccharides or glycoproteins in both procaryotes and eucaryotes, is made from GDP–d-mannose. l-Fucose is a component of bacterial surface antigens, including the extracellular polysaccharide colanic acid produced by most Escherichia coli strains. We previously sequenced the E. coli colanic acid gene cluster and identified one of the GDP–l-fucose biosynthetic pathway genes, gmd. We report here the identification ...

  9. Clustering Time Series Gene Expression Data Based on Sum-of-Exponentials Fitting

    Directory of Open Access Journals (Sweden)

    Giurcăneanu Ciprian Doru


    Full Text Available This paper presents a method based on fitting a sum-of-exponentials model to the nonuniformly sampled data, for clustering the time series of gene expression data. The structure of the model is estimated by using the minimum description length (MDL) principle for nonlinear regression, in a new form, incorporating a normalized maximum-likelihood (NML) model for a subset of the parameters. The performance of the structure estimation method is studied using simulated data, and the superiority of the new selection criterion over earlier criteria is demonstrated. The accuracy of the nonlinear estimates of the model parameters is analyzed with respect to the Cramér-Rao lower bounds. Clustering examples of gene expression data sets from a developmental biology application are presented, revealing gene grouping into clusters according to functional classes.
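
    To make the sum-of-exponentials idea concrete, the sketch below fits y(t) ≈ a0 + a1·exp(-rate·t) to nonuniformly sampled points. It is an assumption-laden illustration rather than the paper's method: the decay rate is chosen by a plain grid search, the amplitudes by linear least squares for each candidate rate, and the MDL/NML structure selection described in the abstract is not implemented. All function names and the sample data are hypothetical.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def fit_amplitudes(t, y, rates):
    """Least-squares amplitudes for y(t) ~ sum_k a_k * exp(-rates[k] * t).

    With the rates held fixed the model is linear in the amplitudes, so the
    normal equations can be solved directly. Returns (amplitudes, RSS).
    """
    basis = [[math.exp(-r * ti) for r in rates] for ti in t]
    k = len(rates)
    AtA = [[sum(row[i] * row[j] for row in basis) for j in range(k)] for i in range(k)]
    Aty = [sum(row[i] * yi for row, yi in zip(basis, y)) for i in range(k)]
    amps = solve_linear(AtA, Aty)
    rss = sum((yi - sum(a * v for a, v in zip(amps, row))) ** 2
              for row, yi in zip(basis, y))
    return amps, rss


def fit_constant_plus_decay(t, y, rate_grid):
    """Grid-search the decay rate; rate 0.0 supplies the constant term.

    rate_grid must not contain 0.0, which would duplicate the constant column.
    """
    best = None
    for rate in rate_grid:
        amps, rss = fit_amplitudes(t, y, [0.0, rate])
        if best is None or rss < best[2]:
            best = (rate, amps, rss)
    return best


# Nonuniform sampling times and noise-free synthetic data (invented values).
times = [0.0, 0.5, 1.0, 2.0, 4.0, 7.0]
values = [1.0 + 2.0 * math.exp(-0.8 * ti) for ti in times]
rate, amps, rss = fit_constant_plus_decay(times, values, [i / 10 for i in range(1, 21)])
```

    The fitted parameters (here the rate and the two amplitudes) could then serve as a low-dimensional feature vector per gene for a subsequent clustering step.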

  10. Form gene clustering method about pan-ethnic-group products based on emotional semantic (United States)

    Chen, Dengkai; Ding, Jingjing; Gao, Minzhuo; Ma, Danping; Liu, Donghui


    The use of pan-ethnic-group products form knowledge primarily depends on a designer's subjective experience without user participation. The majority of studies primarily focus on the detection of the perceptual demands of consumers from the target product category. A pan-ethnic-group products form gene clustering method based on emotional semantic is constructed. Consumers' perceptual images of the pan-ethnic-group products are obtained by means of product form gene extraction and coding and computer aided product form clustering technology. A case of form gene clustering about the typical pan-ethnic-group products is investigated which indicates that the method is feasible. This paper opens up a new direction for the future development of product form design which improves the agility of product design process in the era of Industry 4.0.

  11. Functional identification of gene cluster for the aniline metabolic pathway mediated by transposable element

    Institute of Scientific and Technical Information of China (English)

    LIANG Quanfeng; Takeo Masahiro; LIN Min; CHEN Ming; XU Yuquan; ZHANG Wei; PING Shuzhen; LU Wei; SONG Xianlong; WANG Weiwei; GENG Lizhao


    A convenient and widely applicable method has been developed to clone the aniline metabolic gene cluster in this study. Three positive recombinant plasmids, pDA1, pDB2 and pDB11, were cloned from a genomic library of the aniline-degrading strain AD9. The results of the aniline dioxygenase (AD) activity and catechol 2,3-dioxygenase (C23O) activity assays showed that pDA1 and pDB11 contain aniline dioxygenase genes and catechol 2,3-dioxygenase genes, respectively. The sequence analysis of the total 24.7-kb region revealed that this region contains 25 ORFs, of which 17 genes are involved in the metabolism of aniline. In the gene cluster, the first five genes (tadQTA1A2B) and the subsequent gene (tadR1) were predicted to encode a multi-component aniline dioxygenase and a LysR-type regulator, respectively, while the others (tadD1C1D2C2EFGIJKL) were expected to encode meta-cleavage pathway enzymes for catechol degradation. The gene cluster was surrounded by two IS1071 sequences.

  12. A remarkably stable TipE gene cluster: evolution of insect Para sodium channel auxiliary subunits

    Directory of Open Access Journals (Sweden)

    Li Jia


    Full Text Available Abstract Background First identified in fruit flies with temperature-sensitive paralysis phenotypes, the Drosophila melanogaster TipE locus encodes four voltage-gated sodium (NaV) channel auxiliary subunits. This cluster of TipE-like genes on chromosome 3L, and a fifth family member on chromosome 3R, are important for the optimal expression and functionality of the Para NaV channel but appear quite distinct from auxiliary subunits in vertebrates. Here, we exploited available arthropod genomic resources to trace the origin of TipE-like genes by mapping their evolutionary histories and examining their genomic architectures. Results We identified a remarkably conserved synteny block of TipE-like orthologues with well-maintained local gene arrangements from 21 insect species. Homologues in the water flea, Daphnia pulex, suggest an ancestral pancrustacean repertoire of four TipE-like genes; a subsequent gene duplication may have generated functional redundancy allowing gene losses in the silk moth and mosquitoes. Intronic nesting of the insect TipE gene cluster probably occurred following the divergence from crustaceans, but in the flour beetle and silk moth genomes the clusters apparently escaped from nesting. Across Pancrustacea, TipE gene family members have experienced intronic nesting, escape from nesting, retrotransposition, translocation, and gene loss events while generally maintaining their local gene neighbourhoods. D. melanogaster TipE-like genes exhibit coordinated spatial and temporal regulation of expression distinct from their host gene but well-correlated with their regulatory target, the Para NaV channel, suggesting that functional constraints may preserve the TipE gene cluster. We identified homology between TipE-like NaV channel regulators and vertebrate Slo-beta auxiliary subunits of big-conductance calcium-activated potassium (BKCa) channels, which suggests that ion channel regulatory partners have evolved distinct lineage

  13. Physical and genetic map of the major nif gene cluster from Azotobacter vinelandii. (United States)

    Jacobson, M R; Brigle, K E; Bennett, L T; Setterquist, R A; Wilson, M S; Cash, V L; Beynon, J; Newton, W E; Dean, D R


    Determination of a 28,793-base-pair DNA sequence of a region from the Azotobacter vinelandii genome that includes and flanks the nitrogenase structural gene region was completed. This information was used to revise the previously proposed organization of the major nif cluster. The major nif cluster from A. vinelandii encodes 15 nif-specific genes whose products bear significant structural identity to the corresponding nif-specific gene products from Klebsiella pneumoniae. These genes include nifH, nifD, nifK, nifT, nifY, nifE, nifN, nifX, nifU, nifS, nifV, nifW, nifZ, nifM, and nifF. Although there are significant spatial differences, the identified A. vinelandii nif-specific genes have the same sequential arrangement as the corresponding nif-specific genes from K. pneumoniae. Twelve other potential genes whose expression could be subject to nif-specific regulation were also found interspersed among the identified nif-specific genes. These potential genes do not encode products that are structurally related to the identified nif-specific gene products. Eleven potential nif-specific promoters were identified within the major nif cluster, and nine of these are preceded by an appropriate upstream activator sequence. A + T-rich regions were identified between 8 of the 11 proposed nif promoter sequences and their upstream activator sequences. Site-directed deletion-and-insertion mutagenesis was used to establish a genetic map of the major nif cluster.

  14. Gene microarray data analysis using parallel point-symmetry-based clustering. (United States)

    Sarkar, Anasua; Maulik, Ujjwal


    Identification of co-expressed genes is the central goal in microarray gene expression analysis. Point-symmetry-based clustering is an important unsupervised learning technique for recognising symmetrical convex- or non-convex-shaped clusters. To enable fast clustering of large microarray data, we propose a distributed, time-efficient, scalable approach for the point-symmetry-based K-Means algorithm. A natural basis for analysing gene expression data using a symmetry-based algorithm is to group together genes with similar symmetrical expression patterns. This new parallel implementation also achieves linear speedup in timing without sacrificing the quality of the clustering solution on large microarray data sets. The parallel point-symmetry-based K-Means algorithm is compared with another new parallel symmetry-based K-Means and with existing parallel K-Means over eight artificial and benchmark microarray data sets, to demonstrate its superiority in both timing and validity. A statistical analysis is also performed to establish the significance of this message-passing-interface-based point-symmetry K-Means implementation. We also analysed the biological relevance of the clustering solutions.
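
    The abstract above does not define its point-symmetry measure. As a minimal sketch, assuming one formulation that is common in the point-symmetry clustering literature (reflect a point through a candidate centre, then ask how close the reflection lies to real data), the distance could look like the following. The function names and the `knear` parameter are illustrative and are not taken from the record.

```python
import math


def euclid(a, b):
    """Plain Euclidean distance between two equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def point_symmetry_distance(x, center, data, knear=2):
    """Assumed formulation: symmetry term times Euclidean distance.

    The symmetry term is the mean distance from the reflected point
    2*center - x to its `knear` nearest neighbours in `data`.
    """
    reflected = tuple(2 * c - xi for c, xi in zip(center, x))
    nearest = sorted(euclid(reflected, p) for p in data)[:knear]
    symmetry = sum(nearest) / len(nearest)
    return symmetry * euclid(x, center)


# Four points placed symmetrically around the origin.
data = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
d_true = point_symmetry_distance((1.0, 0.0), (0.0, 0.0), data)
d_off = point_symmetry_distance((1.0, 0.0), (5.0, 5.0), data)
print(d_true < d_off)
```

    In a K-Means-style loop each point would be assigned to the centre with the smallest such distance; the parallel, message-passing part of the published method is not reproduced here.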

  15. Apple contains receptor-like genes homologous to the Cladosporium fulvum resistance gene family of tomato with a cluster of genes cosegregating with Vf apple scab resistance. (United States)

    Vinatzer, B A; Patocchi, A; Gianfranceschi, L; Tartarini, S; Zhang, H B; Gessler, C; Sansavini, S


    Scab caused by the fungal pathogen Venturia inaequalis is the most common disease of cultivated apple (Malus x domestica Borkh.). Monogenic resistance against scab is found in some small-fruited wild Malus species and has been used in apple breeding for scab resistance. Vf resistance of Malus floribunda 821 is the most widely used scab resistance source. Because breeding a high-quality cultivar in perennial fruit trees takes dozens of years, cloning disease resistance genes and using them in the transformation of high-quality apple varieties would be advantageous. We report the identification of a cluster of receptor-like genes with homology to the Cladosporium fulvum (Cf) resistance gene family of tomato on bacterial artificial chromosome clones derived from the Vf scab resistance locus. Three members of the cluster were sequenced completely. Similar to the Cf gene family of tomato, the deduced amino acid sequences coded by these genes contain an extracellular leucine-rich repeat domain and a transmembrane domain. The transcription of three members of the cluster was determined by reverse transcription-polymerase chain reaction to be constitutive, and the transcription and translation start of one member was verified by 5' rapid amplification of cDNA ends. We discuss the parallels between Cf resistance of tomato and Vf resistance of apple and the possibility that one of the members of the gene cluster is the Vf gene. Cf homologs from other regions of the apple genome also were identified and are likely to represent other scab resistance genes.

  16. A genome-wide analysis of nonribosomal peptide synthetase gene clusters and their peptides in a Planktothrix rubescens strain

    Directory of Open Access Journals (Sweden)

    Nederbragt Alexander J


    Full Text Available Abstract Background Cyanobacteria often produce several different oligopeptides, with unknown biological functions, by nonribosomal peptide synthetases (NRPS). Although some cyanobacterial NRPS gene cluster types are well described, the entire NRPS genomic content within a single cyanobacterial strain has never been investigated. Here we have combined a genome-wide analysis using massive parallel pyrosequencing ("454") and mass spectrometry screening of oligopeptides produced in the strain Planktothrix rubescens NIVA CYA 98 in order to identify all putative gene clusters for oligopeptides. Results Thirteen types of oligopeptides were uncovered by mass spectrometry (MS) analyses. Microcystin, cyanopeptolin and aeruginosin synthetases, highly similar to already characterized NRPS, were present in the genome. Two novel NRPS gene clusters were associated with production of anabaenopeptins and microginins, respectively. Sequence-depth of the genome and real-time PCR data revealed three copies of the microginin gene cluster. Since NRPS gene cluster candidates for microviridin and oscillatorin synthesis could not be found, putative (gene-encoded) precursor peptide sequences to microviridin and oscillatorin were found in the genes mdnA and oscA, respectively. The genes flanking the microviridin and oscillatorin precursor genes encode putative modifying enzymes of the precursor oligopeptides. We therefore propose ribosomal pathways involving modifications and cyclisation for microviridin and oscillatorin. The microviridin, anabaenopeptin and cyanopeptolin gene clusters are situated in close proximity to each other, constituting an oligopeptide island. Conclusion Altogether seven nonribosomal peptide synthetase (NRPS) gene clusters and two gene clusters putatively encoding ribosomal oligopeptide biosynthetic pathways were revealed. Our results demonstrate that whole genome shotgun sequencing combined with MS-directed determination of oligopeptides successfully

  17. The evolution and maintenance of Hox gene clusters in vertebrates and the teleost-specific genome duplication. (United States)

    Kuraku, Shigehiro; Meyer, Axel


    Hox genes are known to specify spatial identities along the anterior-posterior axis during embryogenesis. In vertebrates and most other deuterostomes, they are arranged in sets of uninterrupted clusters on chromosomes, and are in most cases expressed in a "colinear" fashion, in which genes closer to the 3'-end of the Hox clusters are expressed earlier and more anteriorly and genes close to the 5'-end of the clusters later and more posteriorly. In this review, we summarize the current understanding of how Hox gene clusters have been modified from basal lineages of deuterostomes to diverse taxa of vertebrates. Our parsimony reconstruction of Hox cluster architecture at various stages of vertebrate evolution highlights that the variation in Hox cluster structures among jawed vertebrates is mostly due to secondary lineage-specific gene losses and an additional genome duplication that occurred in the actinopterygian stem lineage, the teleost-specific genome duplication (TSGD).

  18. A phase synchronization clustering algorithm for identifying interesting groups of genes from cell cycle expression data

    Directory of Open Access Journals (Sweden)

    Tcha Hong


    Full Text Available Abstract Background Previous studies of genome-wide expression patterns show that a certain percentage of genes are cell cycle regulated. The expression data have been analyzed in a number of different ways to identify cell cycle dependent genes. In this study, we pose the hypothesis that cell cycle dependent genes can be considered as oscillating systems with a rhythm, i.e. systems producing response signals with period and frequency. Therefore, we are motivated to apply the theory of multivariate phase synchronization for clustering cell cycle specific genome-wide expression data. Results We propose a strategy to find groups of genes according to the specific biological process by analyzing cell cycle specific gene expression data. To evaluate the proposed method, we use the modified Kuramoto model, which is a phase governing equation that provides the long-term dynamics of globally coupled oscillators. With this equation, we simulate two groups of expression signals, and the simulated signals from each group share their own common rhythm. Then, the simulated expression data are mixed with randomly generated expression data to be used as the input data set to the algorithm. Using these simulated expression data, it is shown that the algorithm is able to identify expression signals that are involved in the same oscillating process. We also evaluate the method with yeast cell cycle expression data. It is shown that the output clusters of the proposed algorithm include genes that are closely associated with each other by sharing significant Gene Ontology terms of biological process and/or having relatively many known biological interactions. Therefore, the evaluation analysis indicates that the method is able to identify expression signals according to the specific biological process. Our evaluation analysis also indicates that some portion of output by the proposed algorithm is not obtainable by the traditional clustering algorithm with
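
    The record above mentions a modified Kuramoto model but does not give the modification. As a rough illustration only, the sketch below integrates the standard (unmodified) Kuramoto equation, dθ_i/dt = ω_i + (K/N) Σ_j sin(θ_j − θ_i), with a simple Euler step and reports the usual order parameter r = |(1/N) Σ_j exp(iθ_j)|, which approaches 1 when the oscillators share a common rhythm. The step size, coupling value and function names are assumptions made for this example.

```python
import cmath
import math
import random


def order_parameter(phases):
    """Degree of phase synchrony: 0 (incoherent) to 1 (fully in phase)."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def simulate_kuramoto(omegas, coupling, steps=3000, dt=0.01, seed=0):
    """Euler integration of the standard Kuramoto model (assumed form)."""
    rng = random.Random(seed)
    n = len(omegas)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for _ in range(steps):
        updated = []
        for i in range(n):
            # Mean pull of all oscillators on oscillator i.
            pull = sum(math.sin(pj - phases[i]) for pj in phases) / n
            updated.append(phases[i] + dt * (omegas[i] + coupling * pull))
        phases = updated
    return phases


# Ten oscillators with identical natural frequency and positive coupling.
final = simulate_kuramoto([1.0] * 10, coupling=2.0)
r_sync = order_parameter(final)
print(round(r_sync, 3))
```

    Simulated expression signals for one group could then be taken as, for example, the cosine of each phase over time; how the authors map phases to expression values is not stated in the record.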

  19. Sequencing, physical organization and kinetic expression of the patulin biosynthetic gene cluster from Penicillium expansum. (United States)

    Tannous, Joanna; El Khoury, Rhoda; Snini, Selma P; Lippi, Yannick; El Khoury, André; Atoui, Ali; Lteif, Roger; Oswald, Isabelle P; Puel, Olivier


    Patulin is a polyketide-derived mycotoxin produced by numerous filamentous fungi. Among them, Penicillium expansum is by far the most problematic species. This fungus is a destructive phytopathogen capable of growing on fruit, provoking the blue mold decay of apples and producing significant amounts of patulin. The biosynthetic pathway of this mycotoxin is chemically well-characterized, but its genetic bases remain largely unknown, with only a few characterized genes in less economically relevant species. The present study consisted of the identification and positional organization of the patulin gene cluster in P. expansum strain NRRL 35695. Several amplification reactions were performed with degenerate primers that were designed based on sequences from the orthologous genes available in other species. An improved genome walking approach was used in order to sequence the remaining adjacent genes of the cluster. RACE-PCR was also carried out from mRNAs to determine the start and stop codons of the coding sequences. The patulin gene cluster in P. expansum consists of 15 genes in the following order: patH, patG, patF, patE, patD, patC, patB, patA, patM, patN, patO, patL, patI, patJ, and patK. These genes share 60-70% identity with orthologous genes grouped differently within a putative patulin cluster described in a non-producing strain of Aspergillus clavatus. The kinetics of patulin cluster gene expression was studied under patulin-permissive conditions (natural apple-based medium) and patulin-restrictive conditions (Eagle's minimal essential medium), and demonstrated a significant association between gene expression and patulin production. In conclusion, the sequence of the patulin cluster in P. expansum constitutes a key step for a better understanding of the mechanisms leading to patulin production in this fungus. It will allow the role of each gene to be elucidated, and help to define strategies to reduce patulin production in apple-based products.

  20. A gene cluster for biosynthesis of the sesquiterpenoid antibiotic pentalenolactone in Streptomyces avermitilis. (United States)

    Tetzlaff, Charles N; You, Zheng; Cane, David E; Takamatsu, Satoshi; Omura, Satoshi; Ikeda, Haruo


    Streptomyces avermitilis, an industrial organism responsible for the production of the anthelminthic avermectins, harbors a 13.4 kb gene cluster containing 13 unidirectionally transcribed open reading frames corresponding to the apparent biosynthetic operon for the sesquiterpene antibiotic pentalenolactone. The advanced intermediate pentalenolactone F, along with the shunt metabolite pentalenic acid, could be isolated from cultures of S. avermitilis, thereby establishing that the pentalenolactone biosynthetic pathway is functional in S. avermitilis. Deletion of the entire 13.4 kb cluster from S. avermitilis abolished formation of pentalenolactone metabolites, while transfer of the intact cluster to the pentalenolactone nonproducer Streptomyces lividans 1326 resulted in production of pentalenic acid. Direct evidence for the biochemical function of the individual biosynthetic genes came from expression of the ptlA gene (SAV2998) in Escherichia coli. Assay of the resultant protein established that PtlA is a pentalenene synthase, catalyzing the cyclization of farnesyl diphosphate to pentalenene, the parent hydrocarbon of the pentalenolactone family of metabolites. The most upstream gene in the cluster, gap1 (SAV2990), was shown to correspond to the pentalenolactone resistance gene, based on expression in E. coli and demonstration that the resulting glyceraldehyde-3-phosphate dehydrogenase, the normal target of pentalenolactone, was insensitive to the antibiotic. Furthermore, a second GAPDH isozyme (gap2, SAV6296) has been expressed in E. coli and shown to be inactivated by pentalenolactone.

  1. Nucleotide sequence analysis of hypervariable junctions of Haemophilus influenzae pilus gene clusters. (United States)

    Read, T D; Satola, S W; Farley, M M


    Haemophilus influenzae pili are surface structures that promote attachment to human epithelial cells. The five genes that encode pili, hifABCDE, are found inserted in genomes either between pmbA and hpt (hif-1) or between purE and pepN (hif-2). We determined the sequence between the ends of the pilus clusters and bordering genes in a number of H. influenzae strains. The junctions of the hif-1 cluster (limited to biogroup aegyptius isolates) are structurally simple. In contrast, hif-2 junctions are highly diverse, complex assemblies of conserved intergenic sequences (including genes hicA and hicB) with evidence of frequent recombination. Variation at hif-2 junctions seems to be tied to multiple copies of a 23-bp Haemophilus intergenic dyad sequence. The hif-1 cluster appears to have originated in biogroup aegyptius strains from invasion of the hpt-pmbA region by a DNA template containing the hif-2 genes with termini in the hairpin loop of flanking intergenic dyad sequences. The pilus gene clusters are an interesting model of a mobile "pathogenicity island" not associated with a phage, transposon, or insertion element.

  2. Organization of the human keratin type II gene cluster at 12q13

    Energy Technology Data Exchange (ETDEWEB)

    Yoon, S.J.; LeBlanc-Straceski, J.; Krauter, K. [Albert Einstein College of Medicine, Bronx, NY (United States)] [and others]


    Keratin proteins constitute intermediate filaments and are the major differentiation products of mammalian epithelial cells. The epithelial keratins are classified into two groups, type I and type II, and one member of each group is expressed in a given epithelial cell differentiation stage. Mutations in type I and type II keratin genes have now been implicated in three different human genetic disorders, epidermolysis bullosa simplex, epidermolytic hyperkeratosis, and epidermolytic palmoplantar keratoderma. Members of the type I keratins are mapped to human chromosome 17, and the type II keratin genes are mapped to chromosome 12. To understand the organization of the type II keratin genes on chromosome 12, we isolated several yeast artificial chromosomes carrying these keratin genes and examined them in detail. We show that eight already known type II keratin genes are located in a cluster at 12q13, and their relative organization reflects their evolutionary relationship. We also determined that a type I keratin gene, KRT18, is located next to its partner, KRT8, in this cluster. Careful examination of the cluster also revealed that there may be a number of additional keratin genes at this locus that have not been described previously. 41 refs., 3 figs., 1 tab.

  3. Copy number variants in the kallikrein gene cluster.

    Directory of Open Access Journals (Sweden)

    Pernilla Lindahl

    Full Text Available The kallikrein gene family (KLK1-KLK15) is the largest contiguous group of protease genes within the human genome and is associated with both risk and outcome of cancer and other diseases. We searched for copy number variants in all KLK genes using quantitative PCR analysis and analysis of inheritance patterns of single nucleotide polymorphisms. Two deletions were identified: one 2235-bp deletion in KLK9 present in 1.2% of alleles, and one 3394-bp deletion in KLK15 present in 4.0% of alleles. Each deletion eliminated one complete exon and created out-of-frame coding that eliminated the catalytic triad of the resulting truncated gene product, which therefore likely is a non-functional protein. Deletion breakpoints identified by DNA sequencing located the KLK9 deletion breakpoint to a long interspersed element (LINE) repeated sequence, while the deletion in KLK15 is located in a single copy sequence. To search for an association between each deletion and risk of prostate cancer (PC), we analyzed a cohort of 667 biopsied men (266 PC cases and 401 men with no evidence of PC at biopsy) using short deletion-specific PCR assays. There was no association between evidence of PC in this cohort and the presence of either gene deletion. Haplotyping revealed a single origin of each deletion, with most recent common ancestor estimates of 3000-8000 and 6000-14 000 years for the deletions in KLK9 and KLK15, respectively. The presence of the deletions on the same haplotypes in 1000 Genomes data of both European and African populations indicates an early origin of both deletions. The old age in combination with homozygous presence of loss-of-function variants suggests that some kallikrein-related peptidases have non-essential functions.

  4. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate.

    Directory of Open Access Journals (Sweden)

    Heike Hadrys

    Full Text Available Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.

  5. Identification of a gene cluster associated with triclosan catabolism. (United States)

    Kagle, Jeanne M; Paxson, Clayton; Johnstone, Precious; Hay, Anthony G


    Aerobic degradation of bis-aryl ethers like the antimicrobial triclosan typically proceeds through oxygenase-dependent catabolic pathways. Although several studies have reported on bacteria capable of degrading triclosan aerobically, there are no reports describing the genes responsible for this process. In this study, a gene encoding the large subunit of a putative triclosan oxygenase, designated tcsA was identified in a triclosan-degrading fosmid clone from a DNA library of Sphingomonas sp. RD1. Consistent with tcsA's similarity to two-part dioxygenases, a putative FMN-dependent ferredoxin reductase, designated tcsB was found immediately downstream of tcsA. Both tcsAB were found in the midst of a putative chlorocatechol degradation operon. We show that RD1 produces hydroxytriclosan and chlorocatechols during triclosan degradation and that tcsA is induced by triclosan. This is the first study to report on the genetics of triclosan degradation.

  6. The O28 Antigen Gene Clusters of Salmonella enterica subsp. enterica Serovar Dakar and Serovar Pomona Are Different

    Directory of Open Access Journals (Sweden)

    Clifford G. Clark


    Full Text Available A 10 kb O-antigen gene cluster was sequenced from a Salmonella enterica subsp. enterica Dakar O28 reference strain and from two S. Pomona serogroup O28 isolates. The two S. Pomona O antigen gene clusters showed only moderate identity with the S. Dakar O28 gene cluster, suggesting that the O antigen oligosaccharides may contain one or more sugars conferring the O28 epitope but may otherwise be different. These novel findings are absolutely critical for the correct interpretation of molecular serotyping assays targeting genes within the O antigen gene clusters of these Salmonella serotypes and suggest the possibility that the O antigen gene clusters of other Salmonella serovars may also be heterogeneous.

  7. Characterization of a major cluster of nif, fix, and associated genes in a sugarcane endophyte, Acetobacter diazotrophicus. (United States)

    Lee, S; Reth, A; Meletzus, D; Sevilla, M; Kennedy, C


    A major 30.5-kb cluster of nif and associated genes of Acetobacter diazotrophicus (syn. Gluconacetobacter diazotrophicus), a nitrogen-fixing endophyte of sugarcane, was sequenced and analyzed. This cluster represents the largest assembly of contiguous nif-fix and associated genes so far characterized in any diazotrophic bacterial species. Northern blots and promoter sequence analysis indicated that the genes are organized into eight transcriptional units. The overall arrangement of genes is most like that of the nif-fix cluster in Azospirillum brasilense, while the individual gene products are more similar to those in species of Rhizobiaceae or in Rhodobacter capsulatus.

  8. Cloning and Characterization of the Polyether Salinomycin Biosynthesis Gene Cluster of Streptomyces albus XM211


    Jiang, Chunyan; Wang, Hougen; Kang, Qianjin; Liu, Jing; Bai, Linquan


    Salinomycin is widely used in animal husbandry as a food additive due to its antibacterial and anticoccidial activities. However, its biosynthesis had only been studied by feeding experiments with isotope-labeled precursors. A strategy with degenerate primers based on the polyether-specific epoxidase sequences was successfully developed to clone the salinomycin gene cluster. Using this strategy, a putative epoxidase gene, slnC, was cloned from the salinomycin producer Streptomyces albus XM211...

  9. The Serratia gene cluster encoding biosynthesis of the red antibiotic, prodigiosin, shows species- and strain-dependent genome context variation

    DEFF Research Database (Denmark)

    Harris, Abigail K P; Williamson, Neil R; Slater, Holly;


    The prodigiosin biosynthesis gene cluster (pig cluster) from two strains of Serratia (S. marcescens ATCC 274 and Serratia sp. ATCC 39006) has been cloned, sequenced and expressed in heterologous hosts. Sequence analysis of the respective pig clusters revealed 14 ORFs in S. marcescens ATCC 274 and...

  10. Cluster Analysis and Significance of Novel Genes Related to Molecular Classification of Glioma

    Institute of Scientific and Technical Information of China (English)

    Juxiang Chen; Yicheng Lu; Guohan Hu; Kehua Sun; Chun Luo; Meiqing Lou; Kang Ying; Yao Li


    OBJECTIVE To screen differentially expressed genes in the development of human glioma and establish a primary molecular classification of glioma based on gene expression using cDNA microarrays. METHODS Brain specimens were obtained from 18 patients with glioma, 10 males and 8 females, aged 14-62 with an average age of 44.4. The total RNAs of these glioma specimens and two specimens of donated brain of normal adults were extracted. BioStarH140S microarrays (including 8,347 old genes and 5,592 novel genes) were adopted and hybridized with probes which were prepared from the total RNAs. Differentially expressed genes between normal tissues and glioma tissues were assayed after scanning cDNA microarrays with ScanArray4000. Northern hybridization and in situ hybridization (ISH) were used to identify functions of novel genes. Those differentially expressed genes were studied with a hierarchical method and molecular classification of glioma was preliminarily carried out. RESULTS Among the 13,939 target genes, there were 1,200 (8.61%) differentially expressed genes, of which 395 (2.83%) were novel genes. A total of 348 genes were up-regulated and 852 genes were down-regulated in the gliomas. The results of bioinformatic analysis, Northern hybridization and ISH revealed that those novel genes were highly associated with gliomas. There were multiple genes, such as the MAP gene and cytoskeleton & matrix motility genes, which were of relevance to classification by the hierarchical method. Molecular classification of glioma using a hierarchical cluster was in accordance with pathology and suggested a molecular process of tumorigenesis and development. CONCLUSION Multiple genes play important roles in development of glioma. cDNA microarray technology is a powerful technique in screening for differentially expressed genes between two different kinds of tissues. Further analysis of gene expression and novel genes would be helpful to understand the molecular mechanism of glioma
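
    The counts reported in the record above are internally consistent, which can be confirmed with a few lines of arithmetic. The variable names are illustrative; all figures are taken from the abstract.

```python
# Gene counts on the array, as stated in the abstract.
old_genes = 8347
novel_genes = 5592
total_targets = old_genes + novel_genes

# Differentially expressed genes, split by direction of change.
up_regulated = 348
down_regulated = 852
differential = up_regulated + down_regulated
novel_differential = 395


def percent_of_targets(count):
    """Share of all array targets, rounded to two decimals."""
    return round(100 * count / total_targets, 2)


print(total_targets, differential)
print(percent_of_targets(differential), percent_of_targets(novel_differential))
```

    Both percentages in the abstract are therefore expressed relative to all 13,939 targets, not relative to the 1,200 differentially expressed genes.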

  11. antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters

    DEFF Research Database (Denmark)

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth


    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we...... introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration...... of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products...

  12. Accurate prediction of secondary metabolite gene clusters in filamentous fungi

    DEFF Research Database (Denmark)

    Andersen, Mikael Rørdam; Nielsen, Jakob Blæsbjerg; Klitgaard, Andreas


    Biosynthetic pathways of secondary metabolites from fungi are currently subject to an intense effort to elucidate the genetic basis for these compounds due to their large potential within pharmaceutics and synthetic biochemistry. The preferred method is methodical gene deletions to identify suppo...... used A. nidulans for our method development and validation due to the wealth of available biochemical data, but the method can be applied to any fungus with a sequenced and assembled genome, thus supporting further secondary metabolite pathway elucidation in the fungal kingdom....

  13. Identification and Characterization of a Gene Cluster Mediating Enteroaggregative Escherichia Coli Aggregative Adherence Fimbria I Biogenesis (United States)


    adherent E. coli ( DAEC ). respectively. The LA ties to other known fimbrial biogenesis systems of pathogenic pattern is typified by the formation of...agg gene cluster is configured similarly to 60 to 80% of DAEC strains share relatedness with F1845 the determinants of members of the Dr adhesin

  14. Design-based re-engineering of biosynthetic gene clusters : plug-and-play in practice

    NARCIS (Netherlands)

    Frasch, Hans-Jörg; Medema, Marnix H.; Takano, Eriko; Breitling, Rainer; Gago, Federico; Parayil, Ajikumar


    Synthetic biology is revolutionizing the way in which the biosphere is explored for natural products. Through computational genome mining, thousands of biosynthetic gene clusters are being identified in microbial genomes, which constitute a rich source of potential novel pharmaceuticals. New methods

  15. Characterization and biological role of the O-polysaccharide gene cluster of Yersinia enterocolitica serotype O:9

    DEFF Research Database (Denmark)

    Skurnik, Mikael; Biedzka-Sarek, Marta; Lubeck, Peter S.


    as an attachment site for both the outer core (OC) hexasaccharide and the O-polysaccharide (OPS; a homopolymer of N-formylperosamine). In this work, we cloned the OPS gene cluster of O:9 and identified 12 genes organized into four operons upstream of the gnd gene. Ten genes were predicted to encode...... glycosyltransferases, the ATP-binding cassette polysaccharide translocators, or enzymes required for the biosynthesis of GDP-N-formylperosamine. The two remaining genes within the OPS gene cluster, galF and galU, were not ascribed a clear function in OPS biosynthesis; however, the latter gene appeared to be essential...

  16. Identification, characterization and metagenome analysis of oocyte-specific genes organized in clusters in the mouse genome

    Directory of Open Access Journals (Sweden)

    Vaiman Daniel


    Full Text Available Abstract Background Genes specifically expressed in the oocyte play key roles in oogenesis, ovarian folliculogenesis, fertilization and/or early embryonic development. In an attempt to identify novel oocyte-specific genes in the mouse, we have used an in silico subtraction methodology, and we have focused our attention on genes that are organized in genomic clusters. Results In the present work, five clusters have been studied: a cluster of thirteen genes characterized by an F-box domain localized on chromosome 9, a cluster of six genes related to T-cell leukaemia/lymphoma protein 1 (Tcl1) on chromosome 12, a cluster composed of a SPErm-associated glutamate (E)-Rich (Speer) protein expressed in the oocyte in the vicinity of four unknown genes specifically expressed in the testis on chromosome 14, a cluster composed of the oocyte secreted protein-1 (Oosp-1) gene and two Oosp-related genes on chromosome 19, all three being characterized by a partial N-terminal zona pellucida-like domain, and another small cluster of two genes on chromosome 19 as well, composed of a TWIK-Related spinal cord K+ channel encoding-gene, and an unknown gene predicted in silico to be testis-specific. The specificity of expression was confirmed by RT-PCR and in situ hybridization for eight and five of them, respectively. Finally, we showed by comparing all of the isolated and clustered oocyte-specific genes identified so far in the mouse genome, that the oocyte-specific clusters are significantly closer to telomeres than isolated oocyte-specific genes are. Conclusion We have studied five clusters of genes specifically expressed in female germ cells, some of them being also expressed in male germ cells. Moreover, contrary to non-clustered oocyte-specific genes, those that are organized in clusters tend to map near chromosome ends, suggesting that this specific near-telomere position of oocyte-clusters in rodents could constitute an evolutionary advantage. Understanding the biological

  17. Complete Genome Sequence of the Filamentous Fungus Aspergillus westerdijkiae Reveals the Putative Biosynthetic Gene Cluster of Ochratoxin A (United States)

    Chakrabortti, Alolika; Li, Jinming


    Ochratoxin A (OTA) is a common mycotoxin that contaminates food and agricultural products. Sequencing of the complete genome of Aspergillus westerdijkiae, a major producer of OTA, reveals more than 50 biosynthetic gene clusters, including a putative OTA biosynthetic gene cluster that encodes a dozen enzymes, transporters, and regulatory proteins. PMID:27635003

  18. Clostridium botulinum strain Af84 contains three neurotoxin gene clusters: bont/A2, bont/F4 and bont/F5.

    Directory of Open Access Journals (Sweden)

    Nir Dover

    Full Text Available Sanger and shotgun sequencing of Clostridium botulinum strain Af84 type Af and its botulinum neurotoxin gene (bont) clusters identified the presence of three bont gene clusters rather than the expected two. The three toxin gene clusters consisted of bont subtypes A2, F4 and F5. The bont/A2 and bont/F4 gene clusters were located within the chromosome (the latter in a novel location), while the bont/F5 toxin gene cluster was located within a large 246 kb plasmid. These findings are the first identification of a C. botulinum strain that contains three botulinum neurotoxin gene clusters.

  19. Identification and analysis of the paulomycin biosynthetic gene cluster and titer improvement of the paulomycins in Streptomyces paulus NRRL 8115.

    Directory of Open Access Journals (Sweden)

    Jine Li

    Full Text Available The paulomycins are a group of glycosylated compounds featuring a unique paulic acid moiety. To locate their biosynthetic gene clusters, the genomes of two paulomycin producers, Streptomyces paulus NRRL 8115 and Streptomyces sp. YN86, were sequenced. The paulomycin biosynthetic gene clusters were defined by comparative analyses of the two genomes together with the genome of the third paulomycin producer Streptomyces albus J1074. Subsequently, the identity of the paulomycin biosynthetic gene cluster was confirmed by inactivation of two genes involved in biosynthesis of the paulomycose branched chain (pau11) and the ring A moiety (pau18) in Streptomyces paulus NRRL 8115. After determining the gene cluster boundaries, a convergent biosynthetic model was proposed for paulomycin based on the deduced functions of the pau genes. Finally, a paulomycin high-producing strain was constructed by expressing an activator-encoding gene (pau13) in S. paulus, setting the stage for future investigations.

  20. Characterisation of the paralytic shellfish toxin biosynthesis gene clusters in Anabaena circinalis AWQC131C and Aphanizomenon sp. NH-5

    Directory of Open Access Journals (Sweden)

    Neilan Brett A


    Full Text Available Abstract Background Saxitoxin and its analogues, collectively known as the paralytic shellfish toxins (PSTs), are neurotoxic alkaloids and are the cause of the syndrome named paralytic shellfish poisoning. PSTs are produced by a unique biosynthetic pathway, which involves reactions that are rare in microbial metabolic pathways. Nevertheless, distantly related organisms such as dinoflagellates and cyanobacteria appear to produce these toxins using the same pathway. Hypothesised explanations for such an unusual phylogenetic distribution of this shared uncommon metabolic pathway include a polyphyletic origin, an involvement of symbiotic bacteria, and horizontal gene transfer. Results We describe the identification, annotation and bioinformatic characterisation of the putative paralytic shellfish toxin biosynthesis clusters in an Australian isolate of Anabaena circinalis and an American isolate of Aphanizomenon sp., both members of the Nostocales. These putative PST gene clusters span approximately 28 kb and contain genes coding for the biosynthesis and export of the toxin. A putative insertion/excision site in the Australian Anabaena circinalis AWQC131C was identified, and the organization and evolution of the gene clusters are discussed. A biosynthetic pathway leading to the formation of saxitoxin and its analogues in these organisms is proposed. Conclusion The PST biosynthesis gene cluster presents a mosaic structure, whereby genes have apparently transposed in segments of varying size, resulting in different gene arrangements in all three sxt clusters sequenced so far. The gene cluster organizational structure and sequence similarity seem to reflect the phylogeny of the producer organisms, indicating that the gene clusters have an ancient origin, or that their lateral transfer was also an ancient event.
The knowledge we gain from the characterisation of the PST biosynthesis gene clusters, including the identity and sequence of the genes involved

  1. Polymorphisms and linkage analysis for ICAM-1 and the selectin gene cluster

    Energy Technology Data Exchange (ETDEWEB)

    Vora, D.K.; Rosenbloom, C.L.; Cottingham, R.W. [Baylor College of Medicine, Houston, TX (United States)] [and others]


    Genetic polymorphisms in leukocyte and endothelial cell adhesion molecules may be important variables with regard to susceptibility to multifactorial disease processes that include an inflammatory component. For this reason, polymorphisms were sought for intercellular adhesion molecule-1 (ICAM-1; gene symbol ICAM1) and for the three genes in the selectin cluster, P-selectin, L-selectin, and E-selectin (gene symbols SELP, SELL, and SELE, respectively). Two amino acid polymorphisms were identified for ICAM-1: Gly or Arg at codon 241 and Lys or Glu at codon 469. Dinucleotide repeat polymorphisms were identified in the 3'-untranslated region for ICAM-1 and in intron 9 for P-selectin. Restriction fragment length polymorphisms were found using cDNAs for each of the three selectin genes as probes: E-selectin with BglII, P-selectin with ScaI, and L-selectin with HincII. Linkage analysis was performed for the selectin gene cluster and for ICAM-1 using the CEPH families; ICAM-1 is very tightly linked to the LDL receptor on chromosome 19, and the selectin cluster is linked to markers at chromosome 1q23. 41 refs., 2 tabs.

  2. A scan statistic to extract causal gene clusters from case-control genome-wide rare CNV data

    Directory of Open Access Journals (Sweden)

    Scherer Stephen W


    Full Text Available Abstract Background Several statistical tests have been developed for analyzing genome-wide association data by incorporating gene pathway information in terms of gene sets. Using these methods, hundreds of gene sets are typically tested, and the tested gene sets often overlap. This overlapping greatly increases the probability of generating false positives, and the results obtained are difficult to interpret, particularly when many gene sets show statistical significance. Results We propose a flexible statistical framework to circumvent these problems. Inspired by spatial scan statistics for detecting clustering of disease occurrence in the field of epidemiology, we developed a scan statistic to extract disease-associated gene clusters from a whole gene pathway. Extracting one or a few significant gene clusters from a global pathway limits the overall false positive probability, which results in increased statistical power, and facilitates the interpretation of test results. In the present study, we applied our method to genome-wide association data for rare copy-number variations, which have been strongly implicated in common diseases. Application of our method to a simulated dataset demonstrated the high accuracy of this method in detecting disease-associated gene clusters in a whole gene pathway. Conclusions The scan statistic approach proposed here shows a high level of accuracy in detecting gene clusters in a whole gene pathway. This study has provided a sound statistical framework for analyzing genome-wide rare CNV data by incorporating topological information on the gene pathway.

  3. Nitrate assimilation gene cluster from the heterocyst-forming cyanobacterium Anabaena sp. strain PCC 7120. (United States)

    Frías, J E; Flores, E; Herrero, A


    A region of the genome of the filamentous, nitrogen-fixing, heterocyst-forming cyanobacterium Anabaena sp. strain PCC 7120 that contains a cluster of genes involved in nitrate assimilation has been identified. The genes nir, encoding nitrite reductase, and nrtABC, encoding elements of a nitrate permease, have been cloned. Insertion of a gene cassette into the nir-nrtA region impaired expression of narB, the nitrate reductase structural gene which together with nrtD is found downstream from nrtC in the gene cluster. This indicates that the nir-nrtABCD-narB genes are cotranscribed, thus constituting an operon. Expression of the nir operon in strain PCC 7120 is subjected to ammonium-promoted repression and takes place from an NtcA-activated promoter located 460 bp upstream from the start of the nir gene. In the absence of ammonium, cellular levels of the products of the nir operon are higher in the presence of nitrate than in the absence of combined nitrogen.

  4. Sequencing and transcriptional analysis of the biosynthesis gene cluster of putrescine-producing Lactococcus lactis. (United States)

    Ladero, Victor; Rattray, Fergal P; Mayo, Baltasar; Martín, María Cruz; Fernández, María; Alvarez, Miguel A


    Lactococcus lactis is a prokaryotic microorganism with great importance as a culture starter and has become the model species among the lactic acid bacteria. The long and safe history of use of L. lactis in dairy fermentations has resulted in the classification of this species as GRAS (Generally Recognized As Safe) or QPS (Qualified Presumption of Safety). However, our group has identified several strains of L. lactis subsp. lactis and L. lactis subsp. cremoris that are able to produce putrescine from agmatine via the agmatine deiminase (AGDI) pathway. Putrescine is a biogenic amine that confers undesirable flavor characteristics and may even have toxic effects. The AGDI cluster of L. lactis is composed of a putative regulatory gene, aguR, followed by the genes (aguB, aguD, aguA, and aguC) encoding the catabolic enzymes. These genes are transcribed as an operon that is induced in the presence of agmatine. In some strains, an insertion (IS) element interrupts the transcription of the cluster, which results in a non-putrescine-producing phenotype. Based on this knowledge, a PCR-based test was developed in order to differentiate nonproducing L. lactis strains from those with a functional AGDI cluster. The analysis of the AGDI cluster and its flanking regions revealed that the capacity to produce putrescine via the AGDI pathway could be a specific characteristic that was lost during the adaptation to the milk environment by a process of reductive genome evolution.

  5. A polyketide synthase-peptide synthetase gene cluster from an uncultured bacterial symbiont of Paederus beetles. (United States)

    Piel, Jörn


    Many drug candidates from marine and terrestrial invertebrates are suspected metabolites of uncultured bacterial symbionts. The antitumor polyketides of the pederin family, isolated from beetles and sponges, are an example. Drug development from such sources is commonly hampered by low yields and the difficulty of sustaining invertebrate cultures. To obtain insight into the true producer and find alternative supplies of these rare drug candidates, the putative pederin biosynthesis genes were cloned from total DNA of Paederus fuscipes beetles, which use this compound for chemical defense. Sequence analysis of the gene cluster and adjacent regions revealed the presence of ORFs with typical bacterial architecture and homologies. The ped cluster, which is present only in beetle specimens with high pederin content, is located on a 54-kb region bordered by transposase pseudogenes and encodes a mixed modular polyketide synthase/nonribosomal peptide synthetase. Notably, none of the modules contains regions with homology to acyltransferase domains, but two copies of isolated monodomain acyltransferase genes were found at the upstream end of the cluster. In line with an involvement in pederin biosynthesis, the upstream cluster region perfectly mirrors pederin structure. The unexpected presence of additional polyketide synthase/nonribosomal peptide synthetase modules reveals surprising insights into the evolutionary relationship between pederin-type pathways in beetles and sponges.

  6. Copy number of pilus gene clusters in Haemophilus influenzae and variation in the hifE pilin gene. (United States)

    Read, T D; Satola, S W; Opdyke, J A; Farley, M M


    Brazilian purpuric fever (BPF)-associated Haemophilus influenzae biogroup aegyptius strain F3031 contains two identical copies of a five gene cluster (hifA to hifE) encoding pili similar to well-characterized Hif fimbriae of H. influenzae type b. HifE, the putative pilus tip adhesin of F3031, shares only 40% amino acid sequence similarity with the same molecule from type b strains, whereas the other four proteins have 75 to 95% identity. To determine whether pilus cluster duplication and the hifE(F3031) allele were special features of BPF-associated bacteria, we analyzed a collection of H. influenzae strains by PCR with hifA- and hifE-specific oligonucleotides, by Southern hybridization with a hifC gene probe, and by nucleotide sequencing. The presence of two pilus clusters was limited to some H. influenzae biogroup aegyptius strains. The hifE(F3031) allele was limited to H. influenzae biogroup aegyptius. Two strains contained one copy of hifE(F3031) and one copy of a variant hifE allele. We determined the nucleotide sequences of four hifE genes from H. influenzae biogroup aegyptius and H. influenzae capsule serotypes a and c. The predicted proteins produced by these genes demonstrated only 35 to 70% identity to the three published HifE proteins from nontypeable H. influenzae, serotype b, and BPF strains. The C-terminal third of the molecules implicated in chaperone binding was the most highly conserved region. Three conserved domains in the otherwise highly variable N-terminal putative receptor-binding region of HifE were similar to conserved portions in the N terminus of Neisseria pilus adhesin PilC. We concluded that two pilus clusters and hifE(F3031) were not specific for BPF-causing H. influenzae, and we also identified portions of HifE possibly involved in binding mammalian cell receptors.

  7. Bacillus sp. CDB3 isolated from cattle dip-sites possesses two ars gene clusters

    Institute of Scientific and Technical Information of China (English)

    Somanath Bhat; Xi Luo; Zhiqiang Xu; Lixia Liu; Ren Zhang


    Contamination of soil and water by arsenic is a global problem. In Australia, the dipping of cattle in arsenic-containing solution to control cattle ticks over the last century has left many sites heavily contaminated with arsenic and other toxicants. We had previously isolated five soil bacterial strains (CDB1-5) highly resistant to arsenic. To understand the resistance mechanism, molecular studies have been carried out. Two chromosome-encoded arsenic resistance (ars) gene clusters have been cloned from CDB3 (Bacillus sp.). They both function in Escherichia coli, and cluster 1 confers a much higher resistance to the toxic metalloid. Cluster 2 is smaller, possessing four open reading frames (ORFs), arsRorf2BC, similar to that identified in the Bacillus subtilis Skin element. Among the eight ORFs in cluster 1, five are analogs of common ars genes found in other bacteria but are organized in a unique order, arsRBCDA instead of arsRDABC. Three other putative genes are located directly downstream and designated arsTIP based on the homologies of their theoretical translation sequences to thioredoxin reductases, iron-sulphur cluster proteins and protein phosphatases, respectively. The latter two are novel to any known ars operon. The arsD gene from a Bacillus species was cloned for the first time, and the predicted protein differs from the well-studied E. coli ArsD by lacking two pairs of C-terminal cysteine residues. Its functional involvement in arsenic resistance has been confirmed by a deletion experiment. There is also an inverted repeat in the intergenic region between arsC and arsD, implying some unknown transcriptional regulation.

  8. Genetic variations and haplotype diversity of the UGT1 gene cluster in the Chinese population.

    Directory of Open Access Journals (Sweden)

    Jing Yang

    Full Text Available Vertebrates require tremendous molecular diversity to defend against numerous small hydrophobic chemicals. UDP-glucuronosyltransferases (UGTs) are a large family of detoxification enzymes that glucuronidate xenobiotics and endobiotics, facilitating their excretion from the body. The UGT1 gene cluster contains a tandem array of variable first exons, each preceded by a specific promoter, and a common set of downstream constant exons, similar to the genomic organization of the protocadherin (Pcdh), immunoglobulin, and T-cell receptor gene clusters. To assist pharmacogenomics studies in Chinese, we sequenced nine first exons, promoter and intronic regions, and five common exons of the UGT1 gene cluster in a population sample of 253 unrelated Chinese individuals. We identified 101 polymorphisms and found 15 novel SNPs. We then computed allele frequencies for each polymorphism and reconstructed their linkage disequilibrium (LD) map. The UGT1 cluster can be divided into five linkage blocks: Block 9 (UGT1A9), Block 9/7/6 (UGT1A9, UGT1A7, and UGT1A6), Block 5 (UGT1A5), Block 4/3 (UGT1A4 and UGT1A3), and Block 3' UTR. Furthermore, we inferred haplotypes and selected their tagSNPs. Finally, comparing our data with those of three other populations of the HapMap project revealed ethnic specificity of the UGT1 genetic diversity in Chinese. These findings have important implications for future molecular genetic studies of the UGT1 gene cluster as well as for personalized medical therapies in Chinese.
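
    The allele-frequency and LD computations mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes phased two-locus haplotype counts are already available (the study itself inferred haplotypes from unphased genotypes), and the function name `ld_stats` and the count keys are hypothetical. It uses the standard definitions D = p(AB) - p(A)p(B) and r^2 = D^2 / [p(A)(1 - p(A))p(B)(1 - p(B))].

```python
import math


def ld_stats(hap_counts):
    """Allele frequencies, D and r^2 for two biallelic loci.

    hap_counts is a dict of phased haplotype counts with the
    (hypothetical) keys 'AB', 'Ab', 'aB', 'ab'.
    """
    total = sum(hap_counts[k] for k in ("AB", "Ab", "aB", "ab"))
    if total == 0:
        raise ValueError("no haplotypes supplied")

    p_ab = hap_counts["AB"] / total                      # freq of A-B haplotype
    p_a = (hap_counts["AB"] + hap_counts["Ab"]) / total  # freq of allele A
    p_b = (hap_counts["AB"] + hap_counts["aB"]) / total  # freq of allele B

    d = p_ab - p_a * p_b                                 # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r^2 is undefined when either locus is monomorphic.
    r2 = d * d / denom if denom > 0 else math.nan
    return p_a, p_b, d, r2


if __name__ == "__main__":
    # Illustrative counts only; not data from the study.
    print(ld_stats({"AB": 50, "Ab": 0, "aB": 0, "ab": 50}))
```

    Pairs of SNPs with high r^2 would fall into the same linkage block; the thresholds and block-definition method used in the study are not stated in the abstract.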

  9. The Histidine Decarboxylase Gene Cluster of Lactobacillus parabuchneri Was Gained by Horizontal Gene Transfer and Is Mobile within the Species (United States)

    Wüthrich, Daniel; Berthoud, Hélène; Wechsler, Daniel; Eugster, Elisabeth; Irmler, Stefan; Bruggmann, Rémy


    Histamine in food can cause intolerance reactions in consumers. Lactobacillus parabuchneri (L. parabuchneri) is one of the major causes of elevated histamine levels in cheese. Despite its significant economic impact and negative influence on human health, no genomic study has been published so far. We sequenced and analyzed 18 L. parabuchneri strains of which 12 were histamine positive and 6 were histamine negative. We determined the complete genome of the histamine positive strain FAM21731 with PacBio as well as Illumina and the genomes of the remaining 17 strains using the Illumina technology. We developed the synteny aware ortholog finding algorithm SynOrf to compare the genomes and we show that the histidine decarboxylase (HDC) gene cluster is located in a genomic island. It is very likely that the HDC gene cluster was transferred from other lactobacilli, as it is highly conserved within several lactobacilli species. Furthermore, we have evidence that the HDC gene cluster was transferred within the L. parabuchneri species. PMID:28261177

  10. A Papaver somniferum 10-gene cluster for synthesis of the anticancer alkaloid noscapine. (United States)

    Winzer, Thilo; Gazda, Valeria; He, Zhesi; Kaminski, Filip; Kern, Marcelo; Larson, Tony R; Li, Yi; Meade, Fergus; Teodor, Roxana; Vaistij, Fabián E; Walker, Carol; Bowser, Tim A; Graham, Ian A


    Noscapine is an antitumor alkaloid from opium poppy that binds tubulin, arrests metaphase, and induces apoptosis in dividing human cells. Elucidation of the biosynthetic pathway will enable improvement in the commercial production of noscapine and related bioactive molecules. Transcriptomic analysis revealed the exclusive expression of 10 genes encoding five distinct enzyme classes in a high noscapine-producing poppy variety, HN1. Analysis of an F(2) mapping population indicated that these genes are tightly linked in HN1, and bacterial artificial chromosome sequencing confirmed that they exist as a complex gene cluster for plant alkaloids. Virus-induced gene silencing resulted in accumulation of pathway intermediates, allowing gene function to be linked to noscapine synthesis and a novel biosynthetic pathway to be proposed.

  11. MeSH key terms for validation and annotation of gene expression clusters

    Energy Technology Data Exchange (ETDEWEB)

    Rechtsteiner, A. (Andreas); Rocha, L. M. (Luis Mateus)


    Integration of different sources of information is a great challenge for the analysis of gene expression data, and for the field of Functional Genomics in general. As the availability of numerical data from high-throughput methods increases, so does the need for technologies that assist in the validation and evaluation of the biological significance of results extracted from these data. In mRNA assaying with microarrays, for example, numerical analysis often attempts to identify clusters of co-expressed genes. The important task to find the biological significance of the results and validate them has so far mostly fallen to the biological expert who had to perform this task manually. One of the most promising avenues to develop automated and integrative technology for such tasks lies in the application of modern Information Retrieval (IR) and Knowledge Management (KM) algorithms to databases with biomedical publications and data. Examples of databases available for the field are bibliographic databases containing scientific publications (e.g. MEDLINE/PUBMED), databases containing sequence data (e.g. GenBank) and databases of semantic annotations (e.g. the Gene Ontology Consortium and Medical Subject Headings (MeSH)). We present here an approach that uses the MeSH terms and their concept hierarchies to validate and obtain functional information for gene expression clusters. The controlled and hierarchical MeSH vocabulary is used by the National Library of Medicine (NLM) to index all the articles cited in MEDLINE. Such indexing with a controlled vocabulary eliminates some of the ambiguity due to polysemy (terms that have multiple meanings) and synonymy (multiple terms have similar meaning) that would be encountered if terms would be extracted directly from the articles due to differing article contexts or author preferences and background. Further, the hierarchical organization of the MeSH terms can illustrate the conceptual/functional relationships of genes

  12. Molecular cloning and characterization of the human beta-like globin gene cluster. (United States)

    Fritsch, E F; Lawn, R M; Maniatis, T


    The genes encoding human embryonic (epsilon), fetal (G gamma, A gamma) and adult (delta, beta) beta-like globin polypeptides were isolated as a set of overlapping cloned DNA fragments from bacteriophage lambda libraries of high molecular weight (15-20 kb) chromosomal DNA. The 65 kb of DNA represented in these overlapping clones contains the genes for all five beta-like polypeptides, including the embryonic epsilon-globin gene, for which the chromosomal location was previously unknown. All five genes are transcribed from the same DNA strand and are arranged in the order 5'-epsilon-(13.3 kb)-G gamma-(3.5 kb)-A gamma-(13.9 kb)-delta-(5.4 kb)-beta-3'. Thus the genes are positioned on the chromosome in the order of their expression during development. In addition to the five known beta-like globin genes, we have detected two other beta-like globin sequences which do not correspond to known polypeptides. One of these sequences has been mapped to the A gamma-delta intergenic region while the other is located 6-9 kb 5' to the epsilon gene. Cross hybridization experiments between the intergenic sequences of the gene cluster have revealed a nonglobin repeat sequence (*) which is interspersed with the globin genes in the following manner: 5'-**epsilon-*G gamma-A gamma*-**delta-beta*-3'. Fine structure mapping of the region located 5' to the delta-globin gene revealed two repeats with a maximum size of 400 bp, which are separated by approximately 700 bp of DNA not repeated within the cluster. Preliminary experiments indicate that this repeat family is also repeated many times in the human genome.

  13. Structure and gene cluster of the O-antigen of Escherichia coli O96. (United States)

    Guo, Xi; Senchenkova, Sof'ya N; Shashkov, Alexander S; Perepelov, Andrei V; Liu, Bin; Knirel, Yuriy A


    Mild acid degradation of the lipopolysaccharide of Escherichia coli O96 afforded a mixture of two polysaccharides. The following structure of the pentasaccharide repeating unit of the major polymer was established by sugar analysis, Smith degradation, and (1)H and (13)C NMR spectroscopy: [Formula: see text]. The O-antigen gene cluster of E. coli O96 between conserved galF and gnd genes was found to be consistent with this structure, and hence, the major polysaccharide represents the O96-antigen. The O96-antigen structure and gene cluster are similar to those of E. coli O170, and two proteins encoded in the gene clusters of both bacteria were putatively assigned a function of galactofuranosyltransferases. The minor polymer has the same structure as a peptidoglycan-related polysaccharide reported earlier in Providencia alcalifaciens O45 and several other O-serogroups of this species (Ovchinnikova OG, Liu B, Kocharova NA, Shashkov AS, Kondakova AN, Siwinska M, Feng L, Rozalski A, Wang L, Knirel YA. Biochemistry (Moscow) 2012;77:609-15) → 4)-β-D-GlcpNAc-(1 → 4)-β-D-GlcpNAc3(Rlac-lAla)-(1 → where Rlac-lAla indicates (R)-1-[(S)-1-carboxyethylaminocarbonyl]ethyl.

  14. Clustering Gene Expression Data Based on Predicted Differential Effects of G × V Interaction

    Institute of Scientific and Technical Information of China (English)

    Hai-Yan Pan; Jun Zhu; Dan-Fu Han


    Microarray has become a popular biotechnology in biological and medical research. However, systematic and stochastic variabilities in microarray data are expected and unavoidable, resulting in the problem that the raw measurements have inherent "noise" within microarray experiments. Currently, logarithmic ratios are usually analyzed by various clustering methods directly, which may introduce biased interpretation in identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches was proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into various variations caused by different factors using an ANOVA model, and to predict the differential effects of G × V (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted G × V interaction effects can then be used as the inputs of cluster analysis. We illustrated the application of our method with a gene expression dataset and elucidated the utility of our approach using an external validation.

  15. Evolution of coding and non-coding genes in HOX clusters of a marsupial

    Directory of Open Access Journals (Sweden)

    Yu Hongshi


    Full Text Available Abstract Background The HOX gene clusters are thought to be highly conserved amongst mammals and other vertebrates, but the long non-coding RNAs have only been studied in detail in human and mouse. The sequencing of the kangaroo genome provides an opportunity to use comparative analyses to compare the HOX clusters of a mammal with a distinct body plan to those of other mammals. Results Here we report a comparative analysis of HOX gene clusters between an Australian marsupial of the kangaroo family and the eutherians. There was a strikingly high level of conservation of HOX gene sequence and structure and non-protein coding genes including the microRNAs miR-196a, miR-196b, miR-10a and miR-10b and the long non-coding RNAs HOTAIR, HOTAIRM1 and HOXA11AS that play critical roles in regulating gene expression and controlling development. By microRNA deep sequencing and comparative genomic analyses, two conserved microRNAs (miR-10a and miR-10b) were identified and one new candidate microRNA with typical hairpin precursor structure that is expressed in both fibroblasts and testes was found. The prediction of microRNA target analysis showed that several known microRNA targets, such as miR-10, miR-414 and miR-464, were found in the tammar HOX clusters. In addition, several novel and putative miRNAs were identified that originated from elsewhere in the tammar genome and that target the tammar HOXB and HOXD clusters. Conclusions This study confirms that the emergence of known long non-coding RNAs in the HOX clusters clearly predates the marsupial-eutherian divergence 160 Ma ago. It also identified a new potentially functional microRNA as well as conserved miRNAs. These non-coding RNAs may participate in the regulation of HOX genes to influence the body plan of this marsupial.

  16. Molecular analysis of SCARECROW genes expressed in white lupin cluster roots. (United States)

    Sbabou, Laila; Bucciarelli, Bruna; Miller, Susan; Liu, Junqi; Berhada, Fatiha; Filali-Maltouf, Abdelkarim; Allan, Deborah; Vance, Carroll


    The Scarecrow (SCR) transcription factor plays a crucial role in root cell radial patterning and is required for maintenance of the quiescent centre and differentiation of the endodermis. In response to phosphorus (P) deficiency, white lupin (Lupinus albus L.) root surface area increases some 50-fold to 70-fold due to the development of cluster (proteoid) roots. Previously it was reported that SCR-like expressed sequence tags (ESTs) were expressed during early cluster root development. Here the cloning of two white lupin SCR genes, LaSCR1 and LaSCR2, is reported. The predicted amino acid sequences of both LaSCR gene products are highly similar to AtSCR and contain C-terminal conserved GRAS family domains. LaSCR1 and LaSCR2 transcript accumulation localized to the endodermis of both normal and cluster roots as shown by in situ hybridization and gene promoter::reporter staining. Transcript analysis as evaluated by quantitative real-time-PCR (qRT-PCR) and RNA gel hybridization indicated that the two LaSCR genes are expressed predominantly in roots. Expression of LaSCR genes was not directly responsive to the P status of the plant but was a function of cluster root development. Suppression of LaSCR1 in transformed roots of lupin and Medicago via RNAi (RNA interference) delivered through Agrobacterium rhizogenes resulted in decreased root numbers, reflecting the potential role of LaSCR1 in maintaining root growth in these species. The results suggest that the functional orthologues of AtSCR have been characterized.

  17. Cloning and characterization of a gene cluster for cyclododecanone oxidation in Rhodococcus ruber SC1. (United States)

    Kostichka, K; Thomas, S M; Gibson, K J; Nagarajan, V; Cheng, Q


    Biological oxidation of cyclic ketones normally results in formation of the corresponding dicarboxylic acids, which are further metabolized in the cell. Rhodococcus ruber strain SC1 was isolated from an industrial wastewater bioreactor that was able to utilize cyclododecanone as the sole carbon source. A reverse genetic approach was used to isolate a 10-kb gene cluster containing all genes required for oxidative conversion of cyclododecanone to 1,12-dodecanedioic acid (DDDA). The genes required for cyclododecanone oxidation were only marginally similar to the analogous genes for cyclohexanone oxidation. The biochemical function of the enzymes encoded on the 10-kb gene cluster, the flavin monooxygenase, the lactone hydrolase, the alcohol dehydrogenase, and the aldehyde dehydrogenase, was determined in Escherichia coli based on the ability to convert cyclododecanone. Recombinant E. coli strains grown in the presence of cyclododecanone accumulated lauryl lactone, 12-hydroxylauric acid, and/or DDDA depending on the genes cloned. The cyclododecanone monooxygenase is a type 1 Baeyer-Villiger flavin monooxygenase (FAD as cofactor) and exhibited substrate specificity towards long-chain cyclic ketones (C11 to C15), which is different from the specificity of cyclohexanone monooxygenase favoring short-chain cyclic compounds (C5 to C7).

  18. Genomic organization, tissue distribution and functional characterization of the rat Pate gene cluster.

    Directory of Open Access Journals (Sweden)

    Angireddy Rajesh

    Full Text Available The cysteine rich prostate and testis expressed (Pate) proteins identified to date are thought to resemble the three fingered protein/urokinase-type plasminogen activator receptor proteins. In this study, for the first time, we report the identification, cloning and characterization of the rat Pate gene cluster and also determine its expression pattern. The rat Pate genes are clustered on chromosome 8 and their predicted proteins retained the ten cysteine signature characteristic of the TFP/Ly-6 protein family. PATE and PATE-F three dimensional protein structure was found to be similar to that of the toxin bucandin. Though Pate gene expression is thought to be prostate and testis specific, we observed that rat Pate genes are also expressed in seminal vesicle and epididymis and in tissues beyond the male reproductive tract. In the developing rats (20-60 day old), expression of Pate genes seems to be androgen dependent in the epididymis and testis. In the adult rat, androgen ablation resulted in down regulation of the majority of Pate genes in the epididymides. PATE and PATE-F proteins were found to be expressed abundantly in the male reproductive tract of rats and on the sperm. Recombinant PATE protein exhibited potent antibacterial activity, whereas PATE-F did not exhibit any antibacterial activity. Pate expression was induced in the epididymides when challenged with LPS. Based on our results, we conclude that rat PATE proteins may contribute to the reproductive and defense functions.

  19. Multi-class clustering of cancer subtypes through SVM based ensemble of pareto-optimal solutions for gene marker identification.

    Directory of Open Access Journals (Sweden)

    Anirban Mukhopadhyay

    Full Text Available With the advancement of microarray technology, it is now possible to study the expression profiles of thousands of genes across different experimental conditions or tissue samples simultaneously. Microarray cancer datasets, organized in a samples-versus-genes fashion, are being used for classification of tissue samples into benign and malignant or their subtypes. They are also useful for identifying potential gene markers for each cancer subtype, which helps in successful diagnosis of particular cancer types. In this article, we have presented an unsupervised cancer classification technique based on multiobjective genetic clustering of the tissue samples. In this regard, a real-coded encoding of the cluster centers is used and cluster compactness and separation are simultaneously optimized. The resultant set of near-Pareto-optimal solutions contains a number of non-dominated solutions. A novel approach to combine the clustering information possessed by the non-dominated solutions through a Support Vector Machine (SVM) classifier has been proposed. Final clustering is obtained by consensus among the clusterings yielded by different kernel functions. The performance of the proposed multiobjective clustering method has been compared with that of several other microarray clustering algorithms for three publicly available benchmark cancer datasets. Moreover, statistical significance tests have been conducted to establish the statistical superiority of the proposed clustering method. Furthermore, relevant gene markers have been identified using the clustering result produced by the proposed clustering method and demonstrated visually. Biological relationships among the gene markers are also studied based on gene ontology. The results obtained are found to be promising and can possibly have important impact in the area of unsupervised cancer classification as well as gene marker identification for multiple cancer subtypes.
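
    The idea of a set of non-dominated (Pareto-optimal) solutions mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' genetic algorithm or their SVM ensemble; it only shows how a non-dominated set is filtered out of candidate solutions scored on two objectives. The function names `dominates` and `non_dominated` and the toy scores are assumptions made for illustration, and both objectives are taken to be minimized.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical scores: (compactness, inverse separation), both to be minimized.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = non_dominated(candidates)
print(front)
```

    In this toy input, (3.0, 4.0) and (2.5, 3.5) are both dominated by (2.0, 3.0), so three candidates remain on the front. Each of them is a different trade-off between the two objectives, which is why a further step is needed to combine them into one final clustering.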

  20. Onto-CC: a web server for identifying Gene Ontology conceptual clusters (United States)

    Romero-Zaliz, R.; del Val, C.; Cobb, J. P.; Zwir, I.


    The Gene Ontology (GO) vocabulary has been extensively explored to analyze the functions of coexpressed genes. However, despite its extended use in Biology and Medical Sciences, there are still high levels of uncertainty about which ontology (i.e. Molecular Process, Cellular Component or Molecular Function) should be used, and at which level of specificity. Moreover, the GO database can contain incomplete information resulting from human annotations, or highly influenced by the available knowledge about a specific branch in an ontology. In spite of these drawbacks, there is a trend to ignore these problems and even use GO terms to conduct searches of gene expression profiles (i.e. expression + GO) instead of more cautious approaches that just consider them as an independent source of validation (i.e. expression versus GO). Consequently, propagating the uncertainty and producing biased analysis of the required gene grouping hypotheses. We proposed a web tool, Onto-CC, as an automatic method specially suited for independent explanation/validation of gene grouping hypotheses (e.g. coexpressed genes) based on GO clusters (i.e. expression versus GO). Onto-CC approach reduces the uncertainty of the queries by identifying optimal conceptual clusters that combine terms from different ontologies simultaneously, as well as terms defined at different levels of specificity in the GO hierarchy. To do so, we implemented the EMO-CC methodology to find clusters in structural databases [GO Directed acyclic Graph (DAG) tree], inspired on Conceptual Clustering algorithms. This approach allows the management of optimal cluster sets as potential parallel hypotheses, guided by multiobjective/multimodal optimization techniques. Therefore, we can generate alternative and, still, optimal explanations of queries that can provide new insights for a given problem. 
    Onto-CC has been successfully used to test different medical and biological hypotheses including the explanation and prediction of

  1. Cloning of type 8 capsule genes and analysis of gene clusters for the production of different capsular polysaccharides in Staphylococcus aureus. (United States)

    Sau, S; Lee, C Y


    Eleven serotypes of capsular polysaccharide from Staphylococcus aureus have been reported. We have previously cloned a cluster of type 1 capsule (cap1) genes responsible for type 1 capsular polysaccharide biosynthesis in S. aureus M. To clone the type 8 capsule (cap8) genes, a plasmid library of type 8 strain Becker was screened with a labelled DNA fragment containing the cap1 genes under low-stringency conditions. One recombinant plasmid containing a 14-kb insert was chosen for further study and found to complement 14 of the 18 type 8 capsule-negative (Cap8-) mutants used in the study. Additional library screening, subcloning, and complementation experiments showed that all of the 18 Cap8- mutants were complemented by DNA fragments derived from a 20.5-kb contiguous region of the Becker chromosome. The mutants were mapped into six complementation groups, indicating that the cap8 genes are clustered. By Southern hybridization analyses under high-stringency conditions, we found that DNA fragments containing the cap8 gene cluster show extensive homology with all 17 strains tested, including type 1 strains. By further Southern analyses and cloning of the cap8-related homolog from strain M, we show that strain M carries an additional capsule gene cluster different from the cap1 gene cluster. In addition, by using DNA fragments containing different regions of the cap8 gene cluster as probes to hybridize DNA from different strains, we found that the central region of the cap8 gene cluster hybridizes only to DNAs from certain strains tested whereas the flanking regions hybridize to DNAs of all strains tested. Thus, the cap8 gene clusters and its closely related homologs are likely to have organizations similar to those of the encapsulation genes of other bacterial systems.

  2. Versatile Cosmid Vectors for the Isolation, Expression, and Rescue of Gene Sequences: Studies with the Human α-globin Gene Cluster (United States)

    Lau, Yun-Fai; Kan, Yuet Wai


    We have developed a series of cosmids that can be used as vectors for genomic recombinant DNA library preparations, as expression vectors in mammalian cells for both transient and stable transformations, and as shuttle vectors between bacteria and mammalian cells. These cosmids were constructed by inserting one of the SV2-derived selectable gene markers (SV2-gpt, SV2-DHFR, and SV2-neo) in cosmid pJB8. High efficiency of genomic cloning was obtained with these cosmids and the size of the inserts was 30-42 kilobases. We isolated recombinant cosmids containing the human α-globin gene cluster from these genomic libraries. The simian virus 40 DNA in these selectable gene markers provides the origin of replication and enhancer sequences necessary for replication in permissive cells such as COS 7 cells and thereby allows transient expression of α-globin genes in these cells. These cosmids and their recombinants could also be stably transformed into mammalian cells by using the respective selection systems. Both of the adult α-globin genes were more actively expressed than the embryonic ζ-globin genes in these transformed cell lines. Because of the presence of the cohesive ends of the Charon 4A phage in the cosmids, the transforming DNA sequences could readily be rescued from these stably transformed cells into bacteria by in vitro packaging of total cellular DNA. Thus, these cosmid vectors are potentially useful for direct isolation of structural genes.

  3. Non-ribosomal peptide synthetases: Identifying the cryptic gene clusters and decoding the natural product

    Indian Academy of Sciences (India)



    Non-ribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs) present in bacteria and fungi are the major multi-modular enzyme complexes which synthesize secondary metabolites like the pharmacologically important antibiotics and siderophores. Each of the multiple modules of an NRPS activates a different amino or aryl acid, followed by their condensation to synthesize a linear or cyclic natural product. The studies on NRPS domains, the knowledge of their gene cluster architecture and tailoring enzymes have helped in the in silico genetic screening of the ever-expanding sequenced microbial genomic data for the identification of novel NRPS/PKS clusters and thus deciphering novel non-ribosomal peptides (NRPs). Adenylation domain is an integral part of the NRPSs and is the substrate selecting unit for the final assembled NRP. In some cases, it also requires a small protein, the MbtH homolog, for its optimum activity. The presence of putative adenylation domain and MbtH homologs in a sequenced genome can help identify the novel secondary metabolite producers. The role of the adenylation domain in the NRPS gene clusters and its characterization as a tool for the discovery of novel cryptic NRPS gene clusters are discussed.

  4. Comparison of Expression of Secondary Metabolite Biosynthesis Cluster Genes in Aspergillus flavus, A. parasiticus, and A. oryzae

    Directory of Open Access Journals (Sweden)

    Kenneth C. Ehrlich


    Full Text Available Fifty six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help to refine the identification of probable functional gene clusters within these species. Our results suggest that A. flavus, a prevalent contaminant of maize, cottonseed, peanuts and tree nuts, is capable of producing metabolites which, besides aflatoxin, could be an underappreciated contributor to its toxicity.

  5. Comparison of expression of secondary metabolite biosynthesis cluster genes in Aspergillus flavus, A. parasiticus, and A. oryzae. (United States)

    Ehrlich, Kenneth C; Mack, Brian M


    Fifty six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help to refine the identification of probable functional gene clusters within these species. Our results suggest that A. flavus, a prevalent contaminant of maize, cottonseed, peanuts and tree nuts, is capable of producing metabolites which, besides aflatoxin, could be an underappreciated contributor to its toxicity.

  6. Hierarchical clustering of breast cancer methylomes revealed differentially methylated and expressed breast cancer genes.

    Directory of Open Access Journals (Sweden)

    I-Hsuan Lin

    Full Text Available Oncogenic transformation of normal cells often involves epigenetic alterations, including histone modification and DNA methylation. We conducted whole-genome bisulfite sequencing to determine the DNA methylomes of normal breast, fibroadenoma, invasive ductal carcinomas and MCF7. The emergence, disappearance, expansion and contraction of kilobase-sized hypomethylated regions (HMRs) and the hypomethylation of the megabase-sized partially methylated domains (PMDs) are the major forms of methylation changes observed in breast tumor samples. Hierarchical clustering of HMRs revealed tumor-specific hypermethylated clusters and differentially methylated enhancers specific to normal or breast cancer cell lines. Joint analysis of gene expression and DNA methylation data of normal breast and breast cancer cells identified differentially methylated and expressed genes associated with breast and/or ovarian cancers in cancer-specific HMR clusters. Furthermore, aberrant patterns of X-chromosome inactivation (XCI) were found in breast cancer cell lines as well as breast tumor samples in the TCGA BRCA (breast invasive carcinoma) dataset. They were characterized by a differentially hypermethylated XIST promoter, reduced expression of XIST, and over-expression of hypomethylated X-linked genes. High expression of these genes was significantly associated with lower survival rates in breast cancer patients. Comprehensive analysis of the normal and breast tumor methylomes suggests selective targeting of DNA methylation changes during breast cancer progression. The weak causal relationship between DNA methylation and gene expression observed in this study is evidence of a more complex role of DNA methylation in the regulation of gene expression in human epigenetics that deserves further investigation.

  7. In silico clustering of Salmonella global gene expression data reveals novel genes co-regulated with the SPI-1 virulence genes through HilD (United States)

    Martínez-Flores, Irma; Pérez-Morales, Deyanira; Sánchez-Pérez, Mishael; Paredes, Claudia C.; Collado-Vides, Julio; Salgado, Heladia; Bustamante, Víctor H.


    A wide variety of Salmonella enterica serovars cause intestinal and systemic infections in humans and animals. Salmonella Pathogenicity Island 1 (SPI-1) is a chromosomal region containing 39 genes that have crucial virulence roles. The AraC-like transcriptional regulator HilD, encoded in SPI-1, positively controls the expression of the SPI-1 genes, as well as of several other virulence genes located outside SPI-1. In this study, we applied a clustering method to the global gene expression data of S. enterica serovar Typhimurium from the COLOMBOS database; thus genes that show an expression pattern similar to that of SPI-1 genes were selected. This analysis revealed nine novel genes that are co-expressed with SPI-1, which are located in different chromosomal regions. Expression analyses and protein-DNA interaction assays showed regulation by HilD for six of these genes: gtgE, phoH, sinR, SL1263 (lpxR) and SL4247 were regulated directly, whereas SL1896 was regulated indirectly. Interestingly, phoH is an ancestral gene conserved in most bacteria, whereas the other genes show characteristics of genes acquired by Salmonella. A role in virulence has been previously demonstrated for gtgE, lpxR and sinR. Our results further expand the regulon of HilD and thus identify novel possible Salmonella virulence genes. PMID:27886269
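
    Selecting genes whose expression pattern resembles that of a reference gene set can be sketched as follows. This is a deliberately simpler stand-in (a Pearson-correlation threshold against one reference profile), not the clustering method applied to the COLOMBOS data in the study. The gene names, expression values, threshold and the helper names `pearson` and `coexpressed` are all hypothetical.

```python
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def coexpressed(profiles: Dict[str, List[float]],
                reference: Sequence[float],
                threshold: float = 0.9) -> List[str]:
    """Return, sorted by name, the genes whose profile correlates with
    the reference profile at or above the threshold."""
    return sorted(gene for gene, profile in profiles.items()
                  if pearson(profile, reference) >= threshold)


# Hypothetical data: one reference profile across four conditions.
reference = [1.0, 2.0, 4.0, 8.0]
profiles = {
    "geneA": [0.5, 1.1, 2.0, 4.2],  # rises with the reference
    "geneB": [8.0, 4.0, 2.0, 1.0],  # falls as the reference rises
    "geneC": [3.0, 3.1, 2.9, 3.0],  # essentially flat
}
hits = coexpressed(profiles, reference)
print(hits)
```

    Only the profile that rises together with the reference passes the threshold here. A real analysis would work on many conditions and would need to handle constant profiles, for which the correlation is undefined.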

  8. Gene cluster analysis for the biosynthesis of elgicins, novel lantibiotics produced by Paenibacillus elgii B69

    Directory of Open Access Journals (Sweden)

    Teng Yi


    Full Text Available Abstract Background The recent increase in bacterial resistance to antibiotics has promoted the exploration of novel antibacterial materials. As a result, many researchers are undertaking work to identify new lantibiotics because of their potent antimicrobial activities. The objective of this study was to provide details of a lantibiotic-like gene cluster in Paenibacillus elgii B69 and to produce the antibacterial substances coded by this gene cluster based on culture screening. Results Analysis of the P. elgii B69 genome sequence revealed the presence of a lantibiotic-like gene cluster composed of five open reading frames (elgT1, elgC, elgT2, elgB, and elgA). Screening of culture extracts for active substances possessing the predicted properties of the encoded product led to the isolation of four novel peptides (elgicins AI, AII, B, and C) with a broad inhibitory spectrum. The molecular weights of these peptides were 4536, 4593, 4706, and 4820 Da, respectively. The N-terminal sequence of elgicin B was Leu-Gly-Asp-Tyr, which corresponded to the partial sequence of the peptide ElgA encoded by elgA. Edman degradation suggested that the product elgicin B is derived from ElgA. By correlating the results of electrospray ionization-mass spectrometry analyses of elgicins AI, AII, and C, these peptides are deduced to have originated from the same precursor, ElgA. Conclusions A novel lantibiotic-like gene cluster was shown to be present in P. elgii B69. Four new lantibiotics with a broad inhibitory spectrum were isolated, and these appear to be promising antibacterial agents.

  9. Fetal Haemoglobin and β-globin Gene Cluster Haplotypes among Sickle Cell Patients in Chhattisgarh


    Bhagat, Sanjana; Patra, Pradeep Kumar; Thakur, Amar Singh


    Background: Foetal Haemoglobin (HbF) is the best-known genetic modulator of sickle cell anaemia, which varies dramatically in concentration in the blood of these patients. The patients with SCA display a remarkable variability in the disease severity. High HbF levels and the β-globin gene cluster haplotypes influence the clinical presentation of sickle cell disease. To identify the genetic modifiers which influence the disease severity, we conducted a β-globin haplotype analysis in the sickle...

  10. Structure and gene cluster of the O-antigen of Escherichia coli O133. (United States)

    Shashkov, Alexander S; Zhang, Yuanyuan; Sun, Qiangzheng; Guo, Xi; Senchenkova, Sof'ya N; Perepelov, Andrei V; Knirel, Yuriy A


    The O-specific polysaccharide (O-antigen) of Escherichia coli O133 was obtained by mild acid hydrolysis of the lipopolysaccharide of E. coli O133. The structure of the hexasaccharide repeating unit of the polysaccharide was elucidated by ¹H and ¹³C NMR spectroscopy, including a two-dimensional ¹H-¹H ROESY experiment. Functions of genes in the O-antigen gene cluster were putatively identified by comparison with sequences in the available databases and, particularly, an encoded predicted multifunctional glycosyltransferase was assigned to three α-l-rhamnosidic linkages.

  11. Gene Clusters for Insecticidal Loline Alkaloids in the Grass-Endophytic Fungus Neotyphodium uncinatum


    Spiering, Martin J.; Moon, Christina D.; Wilkinson, Heather H.; Schardl, Christopher L.


    Loline alkaloids are produced by mutualistic fungi symbiotic with grasses, and they protect the host plants from insects. Here we identify in the fungal symbiont, Neotyphodium uncinatum, two homologous gene clusters (LOL-1 and LOL-2) associated with loline-alkaloid production. Nine genes were identified in a 25-kb region of LOL-1 and designated (in order) lolF-1, lolC-1, lolD-1, lolO-1, lolA-1, lolU-1, lolP-1, lolT-1, and lolE-1. LOL-2 contained the homologs lolC-2 through lolE-2 in the same ...

  12. Burkholderia thailandensis harbors two identical rhl gene clusters responsible for the biosynthesis of rhamnolipids

    Directory of Open Access Journals (Sweden)

    Woods Donald E


    Full Text Available Abstract Background Rhamnolipids are surface active molecules composed of rhamnose and β-hydroxydecanoic acid. These biosurfactants are produced mainly by Pseudomonas aeruginosa and have been thoroughly investigated since their early discovery. Recently, they have attracted renewed attention because of their involvement in various multicellular behaviors. Despite this high interest, only very few studies have focused on the production of rhamnolipids by Burkholderia species. Results Orthologs of rhlA, rhlB and rhlC, which are responsible for the biosynthesis of rhamnolipids in P. aeruginosa, have been found in the non-infectious Burkholderia thailandensis, as well as in the genetically similar important pathogen B. pseudomallei. In contrast to P. aeruginosa, both Burkholderia species contain these three genes necessary for rhamnolipid production within a single gene cluster. Furthermore, two identical, paralogous copies of this gene cluster are found on the second chromosome of these bacteria. Both Burkholderia spp. produce rhamnolipids containing 3-hydroxy fatty acid moieties with longer side chains than those described for P. aeruginosa. Additionally, the rhamnolipids produced by B. thailandensis contain a much larger proportion of dirhamnolipids versus monorhamnolipids when compared to P. aeruginosa. The rhamnolipids produced by B. thailandensis reduce the surface tension of water to 42 mN/m while displaying a critical micelle concentration value of 225 mg/L. Separate mutations in both rhlA alleles, which are responsible for the synthesis of the rhamnolipid precursor 3-(3-hydroxyalkanoyloxy)alkanoic acid, prove that both copies of the rhl gene cluster are functional, but one contributes more to the total production than the other. Finally, a double ΔrhlA mutant that is completely devoid of rhamnolipid production is incapable of swarming motility, showing that both gene clusters contribute to this phenotype. Conclusions Collectively, these

  13. Functional dissection of HOXD cluster genes in regulation of neuroblastoma cell proliferation and differentiation.

    Directory of Open Access Journals (Sweden)

    Yunhong Zha

    Full Text Available Retinoic acid (RA) can induce growth arrest and neuronal differentiation of neuroblastoma cells and has been used in the clinic for treatment of neuroblastoma. It has been reported that RA induces the expression of several HOXD genes in human neuroblastoma cell lines, but their roles in RA action are largely unknown. The HOXD cluster contains nine genes (HOXD1, HOXD3, HOXD4, and HOXD8-13) that are positioned sequentially from 3' to 5', with HOXD1 at the 3' end and HOXD13 at the 5' end. Here we show that all HOXD genes are induced by RA in the human neuroblastoma BE(2)-C cells, with the genes located at the 3' end being activated generally earlier than those positioned more 5' within the cluster. Individual induction of HOXD8, HOXD9, HOXD10 or HOXD12 is sufficient to induce both growth arrest and neuronal differentiation, which is associated with downregulation of cell cycle-promoting genes and upregulation of neuronal differentiation genes. However, induction of other HOXD genes either has no effect (HOXD1) or has partial effects (HOXD3, HOXD4, HOXD11 and HOXD13) on BE(2)-C cell proliferation or differentiation. We further show that knockdown of HOXD8 expression, but not that of HOXD9 expression, significantly inhibits the differentiation-inducing activity of RA. HOXD8 directly activates the transcription of HOXC9, a key effector of RA action in neuroblastoma cells. These findings highlight the distinct functions of HOXD genes in RA induction of neuroblastoma cell differentiation.

  14. Genetic diversity within Clostridium botulinum serotypes, botulinum neurotoxin gene clusters and toxin subtypes. (United States)

    Hill, Karen K; Smith, Theresa J


    Clostridium botulinum is a species of spore-forming anaerobic bacteria defined by the expression of any one or two of seven serologically distinct botulinum neurotoxins (BoNTs) designated BoNT/A-G. This Gram-positive bacterium was first identified in 1897 and since then the paralyzing and lethal effects of its toxin have resulted in the recognition of different forms of the intoxication known as food-borne, infant, or wound botulism. Early microbiological and biochemical characterization of C. botulinum isolates revealed that the bacteria within the species had different characteristics and expressed different toxin types. To organize the variable bacterial traits within the species, Group I-IV designations were created. Interestingly, it was observed that isolates within different Groups could express the same toxin type and conversely a single Group could express different toxin types. This discordant phylogeny between the toxin and the host bacteria indicated that horizontal gene transfer of the toxin was responsible for the variation observed within the species. The recent availability of multiple C. botulinum genomic sequences has offered the ability to bioinformatically analyze the locations of the bont genes, the composition of their toxin gene clusters, and the genes flanking these regions to understand their variation. Comparison of the genomic sequences representing multiple serotypes indicates that the bont genes are not in random locations. Instead the analyses revealed specific regions where the toxin genes occur within the genomes representing serotype A, B, C, E, and F C. botulinum strains and C. butyricum type E strains. The genomic analyses have provided evidence of horizontal gene transfer, site-specific insertion, and recombination events. These events have contributed to the variation observed among the neurotoxins, the toxin gene clusters and the bacteria that contain them, and has supported the historical microbiological, and biochemical

  15. Differential expression of TIR-like genes embedded in the Mi-1 gene cluster in nematode-resistant and -susceptible tomato roots

    NARCIS (Netherlands)

    Seifi Abdolabad, A.R.; Visser, R.G.F.; Bai, Y.


    Transport inhibitor response 1 (TIR1) is an auxin receptor that plays a pivotal role in auxin signaling. It has been reported that TIR-like genes are present in a gene cluster carrying the Mi-1 gene which confers resistance to nematodes, aphids and whiteflies. Since auxin is involved in the pathogenicity of

  16. Mapping gene clusters within arrayed metagenomic libraries to expand the structural diversity of biomedically relevant natural products. (United States)

    Owen, Jeremy G; Reddy, Boojala Vijay B; Ternei, Melinda A; Charlop-Powers, Zachary; Calle, Paula Y; Kim, Jeffrey H; Brady, Sean F


    Complex microbial ecosystems contain large reservoirs of unexplored biosynthetic diversity. Here we provide an experimental framework and data analysis tool to facilitate the targeted discovery of natural-product biosynthetic gene clusters from the environment. Multiplex sequencing of barcoded PCR amplicons is followed by sequence similarity directed data parsing to identify sequences bearing close resemblance to biosynthetically or biomedically interesting gene clusters. Amplicons are then mapped onto arrayed metagenomic libraries to guide the recovery of targeted gene clusters. When applied to adenylation- and ketosynthase-domain amplicons derived from saturating soil DNA libraries, our analysis pipeline led to the recovery of biosynthetic clusters predicted to encode for previously uncharacterized glycopeptide- and lipopeptide-like antibiotics; thiocoraline-, azinomycin-, and bleomycin-like antitumor agents; and a rapamycin-like immunosuppressant. The utility of the approach is demonstrated by using recovered eDNA sequences to generate glycopeptide derivatives. The experiments described here constitute a systematic interrogation of a soil metagenome for gene clusters capable of encoding naturally occurring derivatives of biomedically relevant natural products. Our results show that previously undetected biosynthetic gene clusters with potential biomedical relevance are very common in the environment. This general process should permit the routine screening of environmental samples for gene clusters capable of encoding the systematic expansion of the structural diversity seen in biomedically relevant families of natural products.

  17. Comparative analysis of a cryptic thienamycin-like gene cluster identified in Streptomyces flavogriseus by genome mining. (United States)

    Blanco, Gloria


    In silico database searches allowed the identification in the S. flavogriseus ATCC 33331 genome of a carbapenem gene cluster highly related to the S. cattleya thienamycin one. This is the second cluster found for a complex highly substituted carbapenem. Comparative analysis revealed that both gene clusters display a high degree of synteny in gene organization and in protein conservation. Although the cluster appears to be silent under our laboratory conditions, the putative metabolic product was predicted from bioinformatics analyses using sequence comparison tools. These data, together with previous reports concerning epithienamycins production by S. flavogriseus strains, suggest that the cluster metabolic product might be a thienamycin-like carbapenem, possibly the epimeric epithienamycin. This finding might help in understanding the biosynthetic pathway to thienamycin and other highly substituted carbapenems. It also provides another example of genome mining in Streptomyces sequenced genomes as a powerful approach for novel antibiotic discovery.

  18. Identification of co-regulated genes through Bayesian clustering of predicted regulatory binding sites. (United States)

    Qin, Zhaohui S; McCue, Lee Ann; Thompson, William; Mayerhofer, Linda; Lawrence, Charles E; Liu, Jun S


    The identification of co-regulated genes and their transcription-factor binding sites (TFBS) are key steps toward understanding transcription regulation. In addition to effective laboratory assays, various computational approaches for the detection of TFBS in promoter regions of coexpressed genes have been developed. The availability of complete genome sequences combined with the likelihood that transcription factors and their cognate sites are often conserved during evolution has led to the development of phylogenetic footprinting. The modus operandi of this technique is to search for conserved motifs upstream of orthologous genes from closely related species. The method can identify hundreds of TFBS without prior knowledge of co-regulation or coexpression. Because many of these predicted sites are likely to be bound by the same transcription factor, motifs with similar patterns can be put into clusters so as to infer the sets of co-regulated genes, that is, the regulons. This strategy utilizes only genome sequence information and is complementary to and confirmative of gene expression data generated by microarray experiments. However, the limited data available to characterize individual binding patterns, the variation in motif alignment, motif width, and base conservation, and the lack of knowledge of the number and sizes of regulons make this inference problem difficult. We have developed a Gibbs sampling-based Bayesian motif clustering (BMC) algorithm to address these challenges. Tests on simulated data sets show that BMC produces many fewer errors than hierarchical and K-means clustering methods. The application of BMC to hundreds of predicted gamma-proteobacterial motifs correctly identified many experimentally reported regulons, inferred the existence of previously unreported members of these regulons, and suggested novel regulons.

  19. The lineage-specific evolution of aquaporin gene clusters facilitated tetrapod terrestrial adaptation.

    Directory of Open Access Journals (Sweden)

    Roderick Nigel Finn

    Full Text Available A major physiological barrier for aquatic organisms adapting to terrestrial life is desiccation in the aerial environment. This barrier was nevertheless overcome by the Devonian ancestors of extant Tetrapoda, but the origin of specific molecular mechanisms that solved this water problem remains largely unknown. Here we show that an ancient aquaporin gene cluster evolved specifically in the sarcopterygian lineage, and subsequently diverged into paralogous forms of AQP2, -5, or -6 to mediate water conservation in extant Tetrapoda. To determine the origin of these apomorphic genomic traits, we combined aquaporin sequencing from jawless and jawed vertebrates with broad taxon assembly of >2,000 transcripts amongst 131 deuterostome genomes and developed a model based upon Bayesian inference that traces their convergent roots to stem subfamilies in basal Metazoa and Prokaryota. This approach uncovered an unexpected diversity of aquaporins in every lineage investigated, and revealed that the vertebrate superfamily consists of 17 classes of aquaporins (Aqp0 - Aqp16). The oldest orthologs associated with water conservation in modern Tetrapoda are traced to a cluster of three aqp2-like genes in Actinistia that likely arose >500 Ma through duplication of an aqp0-like gene present in a jawless ancestor. In sea lamprey, we show that aqp0 first arose in a protocluster comprised of a novel aqp14 paralog and a fused aqp01 gene. To corroborate these findings, we conducted phylogenetic analyses of five syntenic nuclear receptor subfamilies, which, together with observations of extensive genome rearrangements, support the coincident loss of ancestral aqp2-like orthologs in Actinopterygii. We thus conclude that the divergence of sarcopterygian-specific aquaporin gene clusters was permissive for the evolution of water conservation mechanisms that facilitated tetrapod terrestrial adaptation.

  20. A conserved cluster of three PRD-class homeobox genes (homeobrain, rx and orthopedia) in the Cnidaria and Protostomia

    Directory of Open Access Journals (Sweden)

    Mazza Maureen E


    Full Text Available Abstract Background Homeobox genes are a superclass of transcription factors with diverse developmental regulatory functions, which are found in plants, fungi and animals. In animals, several Antennapedia (ANTP)-class homeobox genes reside in extremely ancient gene clusters (for example, the Hox, ParaHox, and NKL clusters) and the evolution of these clusters has been implicated in the morphological diversification of animal bodyplans. By contrast, similarly ancient gene clusters have not been reported among the other classes of homeobox genes (that is, the LIM, POU, PRD and SIX classes). Results Using a combination of in silico queries and phylogenetic analyses, we found that a cluster of three PRD-class homeobox genes (Homeobrain (hbn), Rax (rx) and Orthopedia (otp)) is present in cnidarians, insects and mollusks (a partial cluster comprising hbn and rx is present in the placozoan Trichoplax adhaerens). We failed to identify this 'HRO' cluster in deuterostomes; in fact, the Homeobrain gene appears to be missing from the chordate genomes we examined, although it is present in hemichordates and echinoderms. To illuminate the ancestral organization and function of this ancient cluster, we mapped the constituent genes against the assembled genome of a model cnidarian, the sea anemone Nematostella vectensis, and characterized their spatiotemporal expression using in situ hybridization. In N. vectensis, these genes reside in a span of 33 kb with the same gene order as previously reported in insects. Comparisons of genomic sequences and expressed sequence tags revealed the presence of alternative transcripts of Nv-otp and two highly unusual protein-coding polymorphisms in the terminal helix of the Nv-rx homeodomain. A population genetic survey revealed the Rx polymorphisms to be widespread in natural populations. During larval development, all three genes are expressed in the ectoderm, in non-overlapping territories along the oral-aboral axis, with distinct

  1. Alanylclavam Biosynthetic Genes Are Clustered Together with One Group of Clavulanic Acid Biosynthetic Genes in Streptomyces clavuligerus (United States)

    Zelyas, Nathan J.; Cai, Hui; Kwong, Thomas; Jensen, Susan E.


    Streptomyces clavuligerus produces at least five different clavam metabolites, including clavulanic acid and the methionine antimetabolite, alanylclavam. In vitro transposon mutagenesis was used to analyze a 13-kb region upstream of the known paralogue gene cluster. The paralogue cluster includes one group of clavulanic acid biosynthetic genes in S. clavuligerus. Twelve open reading frames (ORFs) were found in this area, and mutants were generated in each using either in vitro transposon or PCR-targeted mutagenesis. Mutants with defects in any of the genes orfA, orfB, orfC, or orfD were unable to produce alanylclavam but could produce all of the other clavams, including clavulanic acid. orfA encodes a predicted hydroxymethyltransferase, orfB encodes a YjgF/YER057c/UK114-family regulatory protein, orfC encodes an aminotransferase, and orfD encodes a dehydratase. All of these types of proteins are normally involved in amino acid metabolism. Mutants in orfC or orfD also accumulated a novel clavam metabolite instead of alanylclavam, and a complemented orfC mutant was able to produce trace amounts of alanylclavam while still producing the novel clavam. Mass spectrometric analyses, together with consideration of the enzymes involved in its production, led to tentative identification of the novel clavam as 8-OH-alanylclavam, an intermediate in the proposed alanylclavam biosynthetic pathway. PMID:18931110

  2. Identification and functional analysis of gene cluster involvement in biosynthesis of the cyclic lipopeptide antibiotic pelgipeptin produced by Paenibacillus elgii

    Directory of Open Access Journals (Sweden)

    Qian Chao-Dong


    Full Text Available Abstract Background Pelgipeptin, a potent antibacterial and antifungal agent, is a non-ribosomally synthesised lipopeptide antibiotic. This compound consists of a β-hydroxy fatty acid and nine amino acids. To date, there is no information about its biosynthetic pathway. Results A potential pelgipeptin synthetase gene cluster (plp) was identified from Paenibacillus elgii B69 through genome analysis. The gene cluster spans 40.8 kb with eight open reading frames. Among the genes in this cluster, three large genes, plpD, plpE, and plpF, were shown to encode non-ribosomal peptide synthetases (NRPSs), with one, seven, and one module(s), respectively. Bioinformatic analysis of the substrate specificity of all nine adenylation domains indicated that the sequence of the NRPS modules is collinear with the order of amino acids in pelgipeptin. Additional biochemical analysis of four recombinant adenylation domains (PlpD A1, PlpE A1, PlpE A3, and PlpF A1) provided further evidence that the plp gene cluster is involved in pelgipeptin biosynthesis. Conclusions In this study, a gene cluster (plp) responsible for the biosynthesis of pelgipeptin was identified from the genome sequence of Paenibacillus elgii B69. The identification of the plp gene cluster provides an opportunity to develop novel lipopeptide antibiotics by genetic engineering.

  3. Clustering of two genes putatively involved in cyanate detoxification evolved recently and independently in multiple fungal lineages. (United States)

    Elmore, M Holly; McGary, Kriston L; Wisecaver, Jennifer H; Slot, Jason C; Geiser, David M; Sink, Stacy; O'Donnell, Kerry; Rokas, Antonis


    Fungi that have the enzymes cyanase and carbonic anhydrase show a limited capacity to detoxify cyanate, a fungicide employed by both plants and humans. Here, we describe a novel two-gene cluster that comprises duplicated cyanase and carbonic anhydrase copies, which we name the CCA gene cluster, trace its evolution across Ascomycetes, and examine the evolutionary dynamics of its spread among lineages of the Fusarium oxysporum species complex (hereafter referred to as the FOSC), a cosmopolitan clade of purportedly clonal vascular wilt plant pathogens. Phylogenetic analysis of fungal cyanase and carbonic anhydrase genes reveals that the CCA gene cluster arose independently at least twice and is now present in three lineages, namely Cochliobolus lunatus, Oidiodendron maius, and the FOSC. Genome-wide surveys within the FOSC indicate that the CCA gene cluster varies in copy number across isolates, is always located on accessory chromosomes, and is absent in FOSC's closest relatives. Phylogenetic reconstruction of the CCA gene cluster in 163 FOSC strains from a wide variety of hosts suggests a recent history of rampant transfers between isolates. We hypothesize that the independent formation of the CCA gene cluster in different fungal lineages and its spread across FOSC strains may be associated with resistance to plant-produced cyanates or to use of cyanate fungicides in agriculture.

  4. Identification and activation of novel biosynthetic gene clusters by genome mining in the kirromycin producer Streptomyces collinus Tü 365. (United States)

    Iftime, Dumitrita; Kulik, Andreas; Härtner, Thomas; Rohrer, Sabrina; Niedermeyer, Timo Horst Johannes; Stegmann, Evi; Weber, Tilmann; Wohlleben, Wolfgang


    Streptomycetes are prolific sources of novel biologically active secondary metabolites with pharmaceutical potential. S. collinus Tü 365 is a Streptomyces strain isolated in 1972 from Kouroussa (Guinea). It is best known as a producer of the antibiotic kirromycin, an inhibitor of protein biosynthesis that interacts with elongation factor EF-Tu. Genome mining revealed 32 gene clusters encoding the biosynthesis of diverse secondary metabolites in the genome of Streptomyces collinus Tü 365, indicating an enormous biosynthetic potential of this strain. The structural diversity of secondary metabolites predicted for S. collinus Tü 365 includes PKS, NRPS, PKS-NRPS hybrids, a lanthipeptide, terpenes and siderophores. While some of these gene clusters were found to contain genes related to known secondary metabolites, which could also be detected in HPLC-MS analyses, most of the uncharacterized gene clusters are not expressed under standard laboratory conditions. With this study we aimed to characterize the genome information of S. collinus Tü 365 to make use of gene clusters that have not previously been described for this strain. We were able to connect the gene clusters of a lanthipeptide, a carotenoid, five terpenoid compounds, an ectoine, a siderophore and a spore pigment-associated gene cluster to their respective biosynthesis products.

  5. Acquisition and evolution of plant pathogenesis-associated gene clusters and candidate determinants of tissue-specificity in Xanthomonas.

    Directory of Open Access Journals (Sweden)

    Hong Lu

    Full Text Available BACKGROUND: Xanthomonas is a large genus of plant-associated and plant-pathogenic bacteria. Collectively, members cause diseases on over 392 plant species. Individually, they exhibit marked host- and tissue-specificity. The determinants of this specificity are unknown. METHODOLOGY/PRINCIPAL FINDINGS: To assess potential contributions to host- and tissue-specificity, pathogenesis-associated gene clusters were compared across genomes of eight Xanthomonas strains representing vascular or non-vascular pathogens of rice, brassicas, pepper and tomato, and citrus. The gum cluster for extracellular polysaccharide is conserved except for gumN and sequences downstream. The xcs and xps clusters for type II secretion are conserved, except in the rice pathogens, in which xcs is missing. In the otherwise conserved hrp cluster, sequences flanking the core genes for type III secretion vary with respect to insertion sequence element and putative effector gene content. Variation at the rpf (regulation of pathogenicity factors) cluster is more pronounced, though genes with established functional relevance are conserved. A cluster for synthesis of lipopolysaccharide varies highly, suggesting multiple horizontal gene transfers and reassortments, but this variation does not correlate with host- or tissue-specificity. Phylogenetic trees based on amino acid alignments of gum, xps, xcs, hrp, and rpf cluster products generally reflect strain phylogeny. However, amino acid residues at four positions correlate with tissue specificity, revealing hpaA and xpsD as candidate determinants. Examination of genome sequences of xanthomonads Xylella fastidiosa and Stenotrophomonas maltophilia revealed that the hrp, gum, and xcs clusters are recent acquisitions in the Xanthomonas lineage. CONCLUSIONS/SIGNIFICANCE: Our results provide insight into the ancestral Xanthomonas genome and indicate that differentiation with respect to host- and tissue-specificity involved not major

  6. Acquisition and Evolution of Plant Pathogenesis–Associated Gene Clusters and Candidate Determinants of Tissue-Specificity in Xanthomonas (United States)

    Van Sluys, Marie-Anne; White, Frank F.; Ryan, Robert P.; Dow, J. Maxwell; Rabinowicz, Pablo; Salzberg, Steven L.; Leach, Jan E.; Sonti, Ramesh; Brendel, Volker; Bogdanove, Adam J.


    Background Xanthomonas is a large genus of plant-associated and plant-pathogenic bacteria. Collectively, members cause diseases on over 392 plant species. Individually, they exhibit marked host- and tissue-specificity. The determinants of this specificity are unknown. Methodology/Principal Findings To assess potential contributions to host- and tissue-specificity, pathogenesis-associated gene clusters were compared across genomes of eight Xanthomonas strains representing vascular or non-vascular pathogens of rice, brassicas, pepper and tomato, and citrus. The gum cluster for extracellular polysaccharide is conserved except for gumN and sequences downstream. The xcs and xps clusters for type II secretion are conserved, except in the rice pathogens, in which xcs is missing. In the otherwise conserved hrp cluster, sequences flanking the core genes for type III secretion vary with respect to insertion sequence element and putative effector gene content. Variation at the rpf (regulation of pathogenicity factors) cluster is more pronounced, though genes with established functional relevance are conserved. A cluster for synthesis of lipopolysaccharide varies highly, suggesting multiple horizontal gene transfers and reassortments, but this variation does not correlate with host- or tissue-specificity. Phylogenetic trees based on amino acid alignments of gum, xps, xcs, hrp, and rpf cluster products generally reflect strain phylogeny. However, amino acid residues at four positions correlate with tissue specificity, revealing hpaA and xpsD as candidate determinants. Examination of genome sequences of xanthomonads Xylella fastidiosa and Stenotrophomonas maltophilia revealed that the hrp, gum, and xcs clusters are recent acquisitions in the Xanthomonas lineage. Conclusions/Significance Our results provide insight into the ancestral Xanthomonas genome and indicate that differentiation with respect to host- and tissue-specificity involved not major modifications or wholesale

  7. Localization and physical mapping of a plasmid-borne 23-kb nif gene cluster from Enterobacter agglomerans showing homology to the entire nif gene cluster of Klebsiella pneumoniae M5a1. (United States)

    Singh, M; Kreutzer, R; Acker, G; Klingmüller, W


    A physical and genetic map of the plasmid pEA3 indigenous to Enterobacter agglomerans is presented. pEA3 is a 111-kb plasmid containing a 23-kb cluster of nif genes which shows extensive homology (Southern hybridization and heteroduplex analysis) to the entire nif gene cluster of Klebsiella pneumoniae (Kp) M5a1. All the nif genes on pEA3 are organized in the same manner as in K. pneumoniae, except nifJ, which is located on the left end of the pEA3 nif gene cluster (near nifQB). A BamHI restriction map of pEA3 and a detailed restriction map of the 23-kb nif region on pEA3 are also presented. The nif genes of pEA3 showed a low level of acetylene reduction in Escherichia coli, demonstrating that these genes are functional and contain the whole genetic information required to fix nitrogen. The origin of vegetative replication (OriV) of pEA3 was localized about 5.5 kb from the right end of the nif gene cluster. In addition to pEA3, large plasmids from four other strains of E. agglomerans showed homology to all the Kp nif genes tested, indicating that in diazotrophic strains of E. agglomerans nif genes are usually located on plasmids. In contrast, in most free-living, nitrogen-fixing bacteria the nif genes are on the chromosome.

  8. Cloning and characterization of the polyether salinomycin biosynthesis gene cluster of Streptomyces albus XM211. (United States)

    Jiang, Chunyan; Wang, Hougen; Kang, Qianjin; Liu, Jing; Bai, Linquan


    Salinomycin is widely used in animal husbandry as a food additive due to its antibacterial and anticoccidial activities. However, its biosynthesis had only been studied by feeding experiments with isotope-labeled precursors. A strategy with degenerate primers based on the polyether-specific epoxidase sequences was successfully developed to clone the salinomycin gene cluster. Using this strategy, a putative epoxidase gene, slnC, was cloned from the salinomycin producer Streptomyces albus XM211. The targeted replacement of slnC and subsequent trans-complementation proved its involvement in salinomycin biosynthesis. A 127-kb DNA region containing slnC was sequenced, including genes for polyketide assembly and release, oxidative cyclization, modification, export, and regulation. In order to gain insight into the salinomycin biosynthesis mechanism, 13 gene replacements and deletions were conducted. Including slnC, 7 genes were identified as essential for salinomycin biosynthesis and putatively responsible for polyketide chain release, oxidative cyclization, modification, and regulation. Moreover, 6 genes were found to be relevant to salinomycin biosynthesis and possibly involved in precursor supply, removal of aberrant extender units, and regulation. Sequence analysis and a series of gene replacements suggest a proposed pathway for the biosynthesis of salinomycin. The information presented here expands the understanding of polyether biosynthesis mechanisms and paves the way for targeted engineering of salinomycin activity and productivity.

  9. Characterization and expression analysis of a gene cluster for nitrate assimilation from the yeast Arxula adeninivorans. (United States)

    Böer, Erik; Schröter, Anja; Bode, Rüdiger; Piontek, Michael; Kunze, Gotthard


    In Arxula adeninivorans nitrate assimilation is mediated by the combined actions of a nitrate transporter, a nitrate reductase and a nitrite reductase. Single-copy genes for these activities (AYNT1, AYNR1, AYNI1, respectively) form a 9103 bp gene cluster localized on chromosome 2. The 3210 bp AYNI1 ORF codes for a protein of 1070 amino acids, which exhibits a high degree of identity to nitrite reductases from the yeasts Pichia anomala (58%), Hansenula polymorpha (58%) and Dekkera bruxellensis (54%). The second ORF (AYNR1, 2535 bp) encodes a nitrate reductase of 845 residues that shows significant (51%) identity to nitrate reductases of P. anomala and H. polymorpha. The third ORF in the cluster (AYNT1, 1518 bp) specifies a nitrate transporter with 506 amino acids, which is 46% identical to that of H. polymorpha. The three genes are independently expressed upon induction with NaNO3. We quantitatively analysed the promoter activities by qRT-PCR and after fusing individual promoter fragments to the phytase (phyK) gene from Klebsiella sp. ASR1. The AYNI1 promoter was found to exhibit the highest activity, followed by the AYNT1 and AYNR1 elements. Direct measurements of nitrate and nitrite reductase activities performed after induction with NaNO3 are compatible with these results. Both enzymes show optimal activity at around 42 °C and near-neutral pH, and require FAD as a co-factor and NADPH as electron donor.
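
    The ORF and protein lengths quoted in this record can be cross-checked with simple codon arithmetic. The sketch below is illustrative only: the function name `residues_from_orf` and its `includes_stop` flag are hypothetical and not taken from the cited work, and it assumes the quoted base-pair lengths exclude the stop codon, since each quoted length divided by three matches the quoted residue count.

```python
def residues_from_orf(orf_bp, includes_stop=False):
    """Number of amino acids encoded by an ORF of orf_bp base pairs.

    includes_stop is a hypothetical flag: set it to True when the quoted
    length also counts the terminal stop codon, which encodes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# (ORF length in bp, protein length in residues) as quoted in the abstract.
quoted = {"AYNI1": (3210, 1070), "AYNR1": (2535, 845), "AYNT1": (1518, 506)}

for gene, (orf_bp, quoted_aa) in quoted.items():
    computed = residues_from_orf(orf_bp)
    print(gene, orf_bp, "bp ->", computed, "aa; quoted:", quoted_aa)
```

    If a sequence database reports these ORFs with the stop codon included, the same check would need `includes_stop=True` and lengths three base pairs longer.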

  10. Identification and molecular characterization of four new large deletions in the beta-globin gene cluster. (United States)

    Joly, Philippe; Lacan, Philippe; Garcia, Caroline; Couprie, Nicole; Francina, Alain


    Although mutations in the human beta-globin gene cluster are essentially point mutations, a significant number of large deletions have also been described. We present here four new large deletions in the beta-globin gene cluster that were identified in patients displaying an atypical hemoglobin phenotype (high HbF) at routine analysis. The first deletion, which spans 2.0 kb, removes the entire beta-globin gene, including its promoter, and is associated with a typical beta-thal minor phenotype. The three other deletions are larger (19.7 to 23.9 kb) and remove both the delta- and beta-globin genes. Phenotypically, they resemble an HPFH deletion, as they are associated with normal hematological parameters. The precise localization of their 5' and 3' breakpoints gives new insights into the differences between HPFH and (δβ)0-thalassemia at the molecular level. The importance of detecting these deletions in prenatal diagnosis and newborn screening of hemoglobinopathies is also discussed.

  11. An ensemble method for identifying regulatory circuits with special reference to the qa gene cluster of Neurospora crassa (United States)

    Battogtokh, D.; Asch, D. K.; Case, M. E.; Arnold, J.; Schüttler, H.-B.


    A chemical reaction network for the regulation of the quinic acid (qa) gene cluster of Neurospora crassa is proposed. An efficient Monte Carlo method for walking through the parameter space of possible chemical reaction networks is developed to identify an ensemble of deterministic kinetics models with rate constants consistent with RNA and protein profiling data. This method was successful in identifying a model ensemble fitting available RNA profiling data on the qa gene cluster. PMID:12477937

  12. Characterization of the ars Gene Cluster from Extremely Arsenic-Resistant Microbacterium sp. Strain A33 (United States)

    Achour-Rokbani, Asma; Cordi, Audrey; Poupin, Pascal; Bauda, Pascale; Billard, Patrick


    The arsenic resistance gene cluster of Microbacterium sp. A33 contains a novel pair of genes (arsTX) encoding a thioredoxin system that are cotranscribed with an unusual arsRC2 fusion gene, ACR3, and arsC1 in an operon divergent from arsC3. The whole ars gene cluster is required to complement an Escherichia coli ars mutant. ArsRC2 negatively regulates the expression of the pentacistronic operon. ArsC1 and ArsC3 are related to thioredoxin-dependent arsenate reductases; however, ArsC3 lacks the two distal catalytic cysteine residues of this class of enzymes. PMID:19966021

  13. Updated clusters of orthologous genes for Archaea: a complex ancestor of the Archaea and the byways of horizontal gene transfer

    Directory of Open Access Journals (Sweden)

    Wolf Yuri I


    Full Text Available Abstract Background Collections of Clusters of Orthologous Genes (COGs) provide indispensable tools for comparative genomic analysis, evolutionary reconstruction and functional annotation of new genomes. Initially, COGs were made for all complete genomes of cellular life forms that were available at the time. However, with the accumulation of thousands of complete genomes, construction of a comprehensive COG set has become extremely computationally demanding and prone to error propagation, necessitating the switch to taxon-specific COG collections. Previously, we reported the collection of COGs for 41 genomes of Archaea (arCOGs). Here we present a major update of the arCOGs and describe evolutionary reconstructions to reveal general trends in the evolution of Archaea. Results The updated version of the arCOG database incorporates 91% of the pangenome of 120 archaea (251,032 protein-coding genes altogether) into 10,335 arCOGs. Using this new set of arCOGs, we performed maximum likelihood reconstruction of the genome content of archaeal ancestral forms and gene gain and loss events in archaeal evolution. This reconstruction shows that the last common ancestor of the extant Archaea was an organism of greater complexity than most of the extant archaea, probably with over 2,500 protein-coding genes. The subsequent evolution of almost all archaeal lineages was apparently dominated by gene loss resulting in genome streamlining. Overall, in the evolution of Archaea as well as a representative set of bacteria that was similarly analyzed for comparison, gene losses are estimated to outnumber gene gains at least 4 to 1. Analysis of specific patterns of gene gain in Archaea shows that, although some groups, in particular Halobacteria, acquire substantially more genes than others, on the whole, gene exchange between major groups of Archaea appears to be largely random, with no major ‘highways’ of horizontal gene transfer. Conclusions The updated collection

  14. Identification of a new diterpene biosynthetic gene cluster that produces O-methylkolavelool in Herpetosiphon aurantiacus. (United States)

    Nakano, Chiaki; Oshima, Misaki; Kurashima, Nodoka; Hoshino, Tsutomu


    Diterpenoids are usually found in plants and fungi, but are rare in bacteria. We have previously reported new diterpenes, named tuberculosinol and isotuberculosinol, which are generated from the Mycobacterium tuberculosis gene products Rv3377c and Rv3378c. No homologous gene was found at that time, but we recently found highly homologous proteins in the Herpetosiphon aurantiacus ATCC 23779 genome. Haur_2145 was a class II diterpene cyclase responsible for the conversion of geranylgeranyl diphosphate into kolavenyl diphosphate. Haur_2146, homologous to Rv3378c, synthesized (+)-kolavelool through the nucleophilic addition of a water molecule to the incipient cation formed after the diphosphate moiety was released. Haur_2147 afforded (+)-O-methylkolavelool from (+)-kolavelool, so this enzyme was an O-methyltransferase. This new diterpene was indeed detected in H. aurantiacus cells. This is the first report of the identification of a (+)-O-methylkolavelool biosynthetic gene cluster.

  15. Exploration of geosmin synthase from Streptomyces peucetius ATCC 27952 by deletion of doxorubicin biosynthetic gene cluster. (United States)

    Singh, Bijay; Oh, Tae-Jin; Sohng, Jae Kyung


    Thorough investigation of the Streptomyces peucetius ATCC 27952 genome revealed a sesquiterpene synthase gene, named spterp13, which encodes a putative protein of 732 amino acids with significant similarity to S. avermitilis MA-4680 (SAV2163, GeoA) and S. coelicolor A3(2) (SCO6073). The proteins encoded by SAV2163 and SCO6073 produce geosmin in the respective strains. However, the spterp13 gene seemed to be silent in S. peucetius. Deletion of the doxorubicin gene cluster from S. peucetius resulted in increased cell growth rate along with detectable production of geosmin. When we overexpressed the spterp13 gene in S. peucetius DM07 under the control of an ermE* promoter, 2.4 ± 0.4-fold enhanced production of geosmin was observed.

  16. Transcriptional regulation of gene expression clusters in motor neurons following spinal cord injury

    DEFF Research Database (Denmark)

    Ryge, J.; Winther, Ole; Wienecke, J.;


    Background: Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Results: Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use... expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials...

  17. Characterization of the biosynthetic gene cluster of rebeccamycin from Lechevalieria aerocolonigenes ATCC 39243. (United States)

    Onaka, Hiroyasu; Taniguchi, Shin-ichi; Igarashi, Yasuhiro; Furumai, Tamotsu


    The biosynthetic gene cluster for rebeccamycin, an indolocarbazole antibiotic, from Lechevalieria aerocolonigenes ATCC 39243 has 11 ORFs. To clarify their functions, mutants with rebG, rebD, rebC, rebP, rebM, rebR, rebH, rebT, or orfD2 disrupted were constructed, and the gene products were examined. rebP disruptants produced 11,11'-dichlorochromopyrrolic acid, found to be a biosynthetic intermediate by a bioconversion experiment. Other genes encoded N-glycosyltransferase (rebG), monooxygenase (rebC), methyltransferase (rebM), a transcriptional activator (rebR), and halogenase (rebH). rebT disruptants produced as much rebeccamycin as the wild strain, so rebT was probably not involved in rebeccamycin production. Biosynthetic genes of staurosporine, another indolocarbazole antibiotic, were cloned from Streptomyces sp. TP-A0274. staO, staD, and staP were similar to rebO, rebD, and rebP, respectively, all of which are responsible for indolocarbazole biosynthesis, but a rebC homolog, encoding a putative enzyme oxidizing the C-7 site of pyrrole rings, was not found in the staurosporine biosynthetic gene cluster. These results suggest that indolocarbazole is constructed by oxidative decarboxylation of chromopyrrolic acid (11,11'-dichlorochromopyrrolic acid in rebeccamycin) generated from two molecules of tryptophan by coupling and that the oxidation state at the C-7 position depends on the additional enzyme(s) encoded by the biosynthetic genes.

  18. Molecular diversity at the major cluster of disease resistance genes in cultivated and wild Lactuca spp. (United States)

    Sicard, D; Woo, S S; Arroyo-Garcia, R; Ochoa, O; Nguyen, D; Korol, A; Nevo, E; Michelmore, R


    Diversity was analyzed in wild and cultivated Lactuca germplasm using molecular markers derived from resistance genes of the NBS-LRR type. Three molecular markers, one microsatellite marker and two SCAR markers that amplified LRR-encoding regions, were developed from sequences of resistance gene homologs at the main resistance gene cluster in lettuce. Variation for these markers was assessed in germplasm including accessions of cultivated lettuce, Lactuca sativa L., and three wild Lactuca spp., L. serriola L., L. saligna and L. virosa L. Diversity was also studied within and between natural populations of L. serriola from Israel and California; the former is close to the center of diversity for Lactuca spp. while the latter is an area of more recent colonization. Large numbers of haplotypes were detected, indicating the presence of numerous resistance genes in wild species. The diversity in haplotypes provided evidence for gene duplication and unequal crossing-over during the evolution of this cluster of resistance genes. However, there was no evidence for duplications and deletions within the LRR-encoding regions studied. The three markers were highly correlated with resistance phenotypes in L. sativa. They were able to discriminate between accessions that had previously been shown to be resistant to all known isolates of Bremia lactucae. Therefore, these markers will be highly informative for the establishment of core collections and marker-aided selection. A hierarchical analysis of the population structure of L. serriola showed that countries, as well as locations, were significantly differentiated. These differences may reflect local founder effects and/or divergent selection.

  19. Diplotype Trend Regression Analysis of the ADH Gene Cluster and the ALDH2 Gene: Multiple Significant Associations with Alcohol Dependence (United States)

    Luo, Xingguang; Kranzler, Henry R.; Zuo, Lingjun; Wang, Shuang; Schork, Nicholas J.; Gelernter, Joel


    The set of alcohol-metabolizing enzymes has considerable genetic and functional complexity. The relationships between some alcohol dehydrogenase (ADH) and aldehyde dehydrogenase (ALDH) genes and alcohol dependence (AD) have long been studied in many populations, but not comprehensively. In the present study, we genotyped 16 markers within the ADH gene cluster (including the ADH1A, ADH1B, ADH1C, ADH5, ADH6, and ADH7 genes), 4 markers within the ALDH2 gene, and 38 unlinked ancestry-informative markers in a case-control sample of 801 individuals. Associations between markers and disease were analyzed by a Hardy-Weinberg equilibrium (HWE) test, a conventional case-control comparison, a structured association analysis, and a novel diplotype trend regression (DTR) analysis. Finally, the disease alleles were fine mapped by a Hardy-Weinberg disequilibrium (HWD) measure (J). All markers were found to be in HWE in controls, but some markers showed HWD in cases. Genotypes of many markers were associated with AD. DTR analysis showed that ADH5 genotypes and diplotypes of ADH1A, ADH1B, ADH7, and ALDH2 were associated with AD in European Americans and/or African Americans. The risk-influencing alleles were fine mapped from among the markers studied and were found to coincide with some well-known functional variants. We demonstrated that DTR was more powerful than many other conventional association methods. We also found that several ADH genes and the ALDH2 gene were susceptibility loci for AD, and the associations were best explained by several independent risk genes. PMID:16685648
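
    The Hardy-Weinberg equilibrium (HWE) test mentioned in this record is commonly run as a chi-square goodness-of-fit test on genotype counts at a biallelic marker. The sketch below shows that generic textbook form and is not claimed to be the procedure used in the cited study; the function name `hwe_chi_square` and the genotype counts are hypothetical, and one degree of freedom is assumed (three genotype classes, one estimated allele frequency).

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions at a biallelic marker.

    Arguments are observed counts of the two homozygotes (n_aa, n_bb) and
    the heterozygote (n_ab). Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        # Monomorphic marker: observed counts equal expected counts trivially.
        return 0.0, 1.0
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts, for illustration only.
print(hwe_chi_square(25, 50, 25))  # counts in exact HWE proportions
print(hwe_chi_square(50, 0, 50))   # complete heterozygote deficit
```

    For markers with small expected counts, an exact test is usually preferred over the chi-square approximation.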

  20. Functional analysis of alcS, a gene of the alc cluster in Aspergillus nidulans. (United States)

    Flipphi, Michel; Robellet, Xavier; Dequier, Emmanuel; Leschelle, Xavier; Felenbok, Béatrice; Vélot, Christian


    The ethanol utilization pathway (alc system) of Aspergillus nidulans requires two structural genes, alcA and aldA, which encode the two enzymes (alcohol dehydrogenase and aldehyde dehydrogenase, respectively) allowing conversion of ethanol into acetate via acetaldehyde, and a regulatory gene, alcR, encoding the pathway-specific autoregulated transcriptional activator. The alcR and alcA genes are clustered with three other genes that are also positively regulated by alcR, although they are dispensable for growth on ethanol. In this study, we characterized alcS, the most abundantly transcribed of these three genes. alcS is strictly co-regulated with alcA, and encodes a 262-amino acid protein. Sequence comparison with protein databases detected a putative conserved domain that is characteristic of the novel GPR1/FUN34/YaaH membrane protein family. It was shown that the AlcS protein is located in the plasma membrane. Deletion or overexpression of alcS did not result in any obvious phenotype. In particular, AlcS does not appear to be essential for the transport of ethanol, acetaldehyde or acetate. Basic Local Alignment Search Tool analysis against the A. nidulans genome led to the identification of two novel ethanol- and ethylacetate-induced genes encoding other members of the GPR1/FUN34/YaaH family, AN5226 and AN8390.

  1. Pathogen corruption and site-directed recombination at a plant disease resistance gene cluster (United States)

    Nagy, Ervin D.; Bennetzen, Jeffrey L.


    The Pc locus of sorghum (Sorghum bicolor) determines dominant sensitivity to a host-selective toxin produced by the fungal pathogen Periconia circinata. The Pc region was cloned by a map-based approach and found to contain three tandemly repeated genes with the structures of nucleotide binding site–leucine-rich repeat (NBS–LRR) disease resistance genes. Thirteen independent Pc-to-pc mutations were analyzed, and each was found to remove all or part of the central gene of the threesome. Hence, this central gene is Pc. Most Pc-to-pc mutations were associated with unequal recombination. Eight recombination events were localized to different sites in a 560-bp region within the ∼3.7-kb NBS–LRR genes. Because any unequal recombination located within the flanking NBS–LRR genes would have removed Pc, the clustering of cross-over events within a 560-bp segment indicates that a site-directed recombination process exists that specifically targets unequal events to generate LRR diversity in NBS–LRR loci. PMID:18719093

  2. Regulation of a novel gene cluster involved in secondary metabolite production in Streptomyces coelicolor. (United States)

    Hindra; Pak, Patricia; Elliot, Marie A


    Antibiotic biosynthesis in the streptomycetes is a complex and highly regulated process. Here, we provide evidence for the contribution of a novel genetic locus to antibiotic production in Streptomyces coelicolor. The overexpression of a gene cluster comprising four protein-encoding genes (abeABCD) and an antisense RNA-encoding gene (α-abeA) stimulated the production of the blue-pigmented metabolite actinorhodin on solid medium. Actinorhodin production also was enhanced by the overexpression of an adjacent gene (abeR) encoding a predicted Streptomyces antibiotic regulatory protein (SARP), while the deletion of this gene impaired actinorhodin production. We found the abe genes to be differentially regulated and controlled at multiple levels. Upstream of abeA was a promoter that directed the transcription of abeABCD at a low but constitutive level. The expression of abeBCD was, however, significantly upregulated at a time that coincided with the initiation of aerial development and the onset of secondary metabolism; this expression was activated by the binding of AbeR to four heptameric repeats upstream of a promoter within abeA. Expressed divergently to the abeBCD promoter was α-abeA, whose expression mirrored that of abeBCD but did not require activation by AbeR. Instead, α-abeA transcript levels were subject to negative control by the double-strand-specific RNase, RNase III.

  3. Teaching Gene Technology in an Outreach Lab: Students' Assigned Cognitive Load Clusters and the Clusters' Relationships to Learner Characteristics, Laboratory Variables, and Cognitive Achievement (United States)

    Scharfenberg, Franz-Josef; Bogner, Franz X.


    This study classified students into different cognitive load (CL) groups by means of cluster analysis based on their experienced CL in a gene technology outreach lab which has instructionally been designed with regard to CL theory. The relationships of the identified student CL clusters to learner characteristics, laboratory variables, and cognitive achievement were examined using a pre-post-follow-up design. Participants of our day-long module Genetic Fingerprinting were 409 twelfth-graders. During the module instructional phases (pre-lab, theoretical, experimental, and interpretation phases), we measured the students' mental effort (ME) as an index of CL. By clustering the students' module-phase-specific ME pattern, we found three student CL clusters which were independent of the module instructional phases, labeled as low-level, average-level, and high-level loaded clusters. Additionally, we found two student CL clusters that were each particular to a specific module phase. Their members reported especially high ME invested in one phase each: within the pre-lab phase and within the interpretation phase. Differentiating the clusters, we identified uncertainty tolerance, prior experience in experimentation, epistemic interest, and prior knowledge as relevant learner characteristics. We found relationships to cognitive achievement, but no relationships to the examined laboratory variables. Our results underscore the importance of pre-lab and interpretation phases in hands-on teaching in science education and the need for teachers to pay attention to these phases, both inside and outside of outreach laboratory learning settings.

  4. CLUSEAN: a computer-based framework for the automated analysis of bacterial secondary metabolite biosynthetic gene clusters. (United States)

    Weber, T; Rausch, C; Lopez, P; Hoof, I; Gaykova, V; Huson, D H; Wohlleben, W


    Bacterial secondary metabolites are an important source of antimicrobial and cytostatic drugs. These molecules are often synthesized in a stepwise fashion by multimodular megaenzymes that are encoded in clusters of genes encoding enzymes for precursor supply and modification. In this work, we present an open source software pipeline, CLUSEAN (CLUster SEquence ANalyzer), that helps to annotate and analyze such gene clusters. CLUSEAN integrates standard analysis tools, like BLAST and HMMer, with specific tools for the identification of the functional domains and motifs in nonribosomal peptide synthetases (NRPS)/type I polyketide synthases (PKS) and the prediction of specificities of NRPS.

  5. Motif-Independent De Novo Detection of Secondary Metabolite Gene Clusters – Towards Identification of Novel Secondary Metabolisms from Filamentous Fungi -

    Directory of Open Access Journals (Sweden)

    Myco Umemura


    Secondary metabolites are produced mostly by clustered genes that are essential to their biosynthesis. The transcriptional expression of these genes is often cooperatively regulated by a transcription factor located inside or close to a cluster. Most of the secondary metabolism biosynthesis (SMB) gene clusters identified to date contain so-called core genes with distinctive sequence features, such as polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS). Recent efforts in sequencing fungal genomes have revealed far more SMB gene clusters than expected based on the number of core genes in the genomes. Several bioinformatics tools have been developed to survey SMB gene clusters using the sequence motif information of the core genes, including SMURF and antiSMASH. More recently, as sequencing techniques have made large-scale genomic and transcriptomic data obtainable, motif-independent prediction methods for SMB gene clusters, including MIDDAS-M, have been developed. Most of these methods detect clusters in which the genes are cooperatively regulated at the transcriptional level, thus allowing the identification of novel SMB gene clusters regardless of the presence of core genes. Another type of method, MIPS-CG, uses the characteristics of SMB genes, which are highly enriched in non-syntenic blocks (NSBs), enabling prediction even without transcriptome data, although the results have not been evaluated in detail. Considering that a large portion of SMB gene clusters might be sufficiently expressed only under limited, uncommon conditions, it seems that prediction of SMB gene clusters by bioinformatics followed by experimental validation is the only way to efficiently uncover hidden SMB gene clusters. Here, we describe and discuss possible novel approaches for the determination of SMB gene clusters that have not been identified using conventional methods.

  6. Prevalence and characteristics of pks genotoxin gene cluster-positive clinical Klebsiella pneumoniae isolates in Taiwan (United States)

    Chen, Ying-Tsong; Lai, Yi-Chyi; Tan, Mei-Chen; Hsieh, Li-Yun; Wang, Jann-Tay; Shiau, Yih-Ru; Wang, Hui-Ying; Lin, Ann-Chi; Lai, Jui-Fen; Huang, I-Wen; Lauderdale, Tsai-Ling


    The pks gene cluster encodes enzymes responsible for the synthesis of colibactin, a genotoxin that has been shown to induce DNA damage and contribute to increased virulence. The present study investigated the prevalence of pks in clinical K. pneumoniae isolates from a national surveillance program in Taiwan, and identified microbiological and molecular factors associated with pks-carriage. The pks gene cluster was detected in 67 (16.7%) of 400 isolates from various specimen types. Multivariate analysis revealed that isolates of K1, K2, K20, and K62 capsular types (p < 0.001), and those more susceptible to antimicrobial agents (p = 0.001) were independent factors strongly associated with pks-carriage. Phylogenetic studies on the sequence type (ST) and pulsed-field gel electrophoresis patterns indicated that the pks-positive isolates belong to a clonal group of ST23 in K1, a locally expanding ST65 clone in K2, a ST268-related K20 group, and a highly clonal ST36:K62 group. Carriage of rmpA, iutC, and ybtA, the genes associated with hypervirulence, was significantly higher in the pks-positive isolates than the pks-negative isolates (95.5% vs. 13.2%, p < 0.001). Further studies to determine the presence of hypervirulent pks-bearing bacterial populations in the flora of community residents and their association with different disease entities may be warranted. PMID:28233784

  7. Characterization of the Biosynthetic Gene Cluster for Benzoxazole Antibiotics A33853 Reveals Unusual Assembly Logic. (United States)

    Lv, Meinan; Zhao, Junfeng; Deng, Zixin; Yu, Yi


    A33853, which shows excellent bioactivity against Leishmania, is a benzoxazole-family compound formed from two moieties of 3-hydroxyanthranilic acid and one 3-hydroxypicolinic acid. In this study, we have identified the gene cluster responsible for the biosynthesis of A33853 in Streptomyces sp. NRRL12068 through genome mining and heterologous expression. Bioinformatics analysis and functional characterization of the orfs contained in the gene cluster revealed that the biosynthesis of A33853 is directed by a group of unusual enzymes. In particular, BomK, annotated as a ketosynthase, was found to catalyze the amide bond formation between 3-hydroxypicolinic and 3-hydroxyanthranilic acid during the assembly of A33853. BomJ, a putative ATP-dependent coenzyme A ligase, and BomN, a putative amidohydrolase, were further proposed to be involved in the benzoxazole formation in A33853 according to gene deletion experiments. Finally, we have successfully utilized mutasynthesis to generate two analogs of A33853, which were reported previously to possess excellent anti-leishmanial activity.

  8. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters. (United States)

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko; Medema, Marnix H


    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software.

  9. Clique-Based Clustering of Correlated SNPs in a Gene Can Improve Performance of Gene-Based Multi-Bin Linear Combination Test

    Directory of Open Access Journals (Sweden)

    Yun Joo Yoo


    Gene-based analysis of multiple single nucleotide polymorphisms (SNPs) in a gene region is an alternative to single SNP analysis. The multi-bin linear combination test (MLC) proposed in previous studies utilizes the correlation among SNPs within a gene to construct a gene-based global test. SNPs are partitioned into clusters of highly correlated SNPs, and the MLC test statistic quadratically combines linear combination statistics constructed for each cluster. The test has degrees of freedom equal to the number of clusters and can be more powerful than a fully quadratic or fully linear test statistic. In this study, we develop a new SNP clustering algorithm designed to find cliques, which are complete subnetworks of SNPs with all pairwise correlations above a threshold. We evaluate the performance of the MLC test using the clique-based CLQ algorithm versus using the tag-SNP-based LDSelect algorithm. In our numerical power calculations we observed that the two clustering algorithms produce identical clusters about 40~60% of the time, yielding similar power on average. However, because the CLQ algorithm tends to produce smaller clusters with stronger positive correlation, the MLC test is less likely to be affected by the occurrence of opposing signs in the individual SNP effect coefficients.

  10. Clique-Based Clustering of Correlated SNPs in a Gene Can Improve Performance of Gene-Based Multi-Bin Linear Combination Test. (United States)

    Yoo, Yun Joo; Kim, Sun Ah; Bull, Shelley B


    Gene-based analysis of multiple single nucleotide polymorphisms (SNPs) in a gene region is an alternative to single SNP analysis. The multi-bin linear combination test (MLC) proposed in previous studies utilizes the correlation among SNPs within a gene to construct a gene-based global test. SNPs are partitioned into clusters of highly correlated SNPs, and the MLC test statistic quadratically combines linear combination statistics constructed for each cluster. The test has degrees of freedom equal to the number of clusters and can be more powerful than a fully quadratic or fully linear test statistic. In this study, we develop a new SNP clustering algorithm designed to find cliques, which are complete subnetworks of SNPs with all pairwise correlations above a threshold. We evaluate the performance of the MLC test using the clique-based CLQ algorithm versus using the tag-SNP-based LDSelect algorithm. In our numerical power calculations we observed that the two clustering algorithms produce identical clusters about 40~60% of the time, yielding similar power on average. However, because the CLQ algorithm tends to produce smaller clusters with stronger positive correlation, the MLC test is less likely to be affected by the occurrence of opposing signs in the individual SNP effect coefficients.
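
    The idea of partitioning SNPs into cliques, meaning groups in which every pair has a correlation above a threshold, can be sketched with a simple greedy heuristic. This is an illustrative stand-in and not the published CLQ algorithm; the function name, the greedy growth rule, the toy correlation matrix, and the threshold are all assumptions made for this example.

```python
def greedy_clique_clusters(corr, threshold):
    """Partition SNP indices into cliques of the thresholded correlation graph.

    corr is a symmetric matrix (list of lists) of pairwise correlations.
    Two SNPs are connected when their correlation exceeds threshold.
    Every returned cluster is a clique: all of its pairs are connected.
    """
    n = len(corr)
    adj = {
        i: {j for j in range(n) if j != i and corr[i][j] > threshold}
        for i in range(n)
    }
    unassigned = set(range(n))
    clusters = []

    while unassigned:
        # Seed with the unassigned SNP that has the most unassigned neighbours.
        seed = max(sorted(unassigned), key=lambda i: len(adj[i] & unassigned))
        clique = [seed]
        candidates = adj[seed] & unassigned

        # Grow the clique; candidates always stay adjacent to every member.
        while candidates:
            nxt = max(sorted(candidates), key=lambda i: len(adj[i] & candidates))
            clique.append(nxt)
            candidates &= adj[nxt]

        clusters.append(sorted(clique))
        unassigned -= set(clique)

    return clusters


# Toy matrix: SNPs 0, 1, 2 are mutually correlated; SNP 3 correlates only with SNP 2.
toy_corr = [
    [1.0, 0.9, 0.9, 0.1],
    [0.9, 1.0, 0.9, 0.1],
    [0.9, 0.9, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
toy_clusters = greedy_clique_clusters(toy_corr, threshold=0.8)
```

    In the toy matrix, SNP 3 ends up in its own cluster because it is not correlated with SNPs 0 and 1, which is the behaviour that distinguishes a clique requirement from looser tag-SNP style grouping.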

  11. Dissection of Two Complex Clusters of Resistance Genes in Lettuce (Lactuca sativa). (United States)

    Christopoulou, Marilena; McHale, Leah K; Kozik, Alex; Reyes-Chin Wo, Sebastian; Wroblewski, Tadeusz; Michelmore, Richard W


    Of the over 50 phenotypic resistance genes mapped in lettuce, 25 colocalize to three major resistance clusters (MRC) on chromosomes 1, 2, and 4. Similarly, the majority of candidate resistance genes encoding nucleotide binding-leucine rich repeat (NLR) proteins genetically colocalize with phenotypic resistance loci. MRC1 and MRC4 span over 66 and 63 Mb containing 84 and 21 NLR-encoding genes, respectively, as well as 765 and 627 genes that are not related to NLR genes. Forward and reverse genetic approaches were applied to dissect MRC1 and MRC4. Transgenic lines exhibiting silencing were selected using silencing of β-glucuronidase as a reporter. Silencing of two of five NLR-encoding gene families resulted in abrogation of nine of 14 tested resistance phenotypes mapping to these two regions. At MRC1, members of the coiled coil-NLR-encoding RGC1 gene family were implicated in host and nonhost resistance through requirement for Dm5/8- and Dm45-mediated resistance to downy mildew caused by Bremia lactucae as well as the hypersensitive response to effectors AvrB, AvrRpm1, and AvrRpt2 of the nonpathogen Pseudomonas syringae. At MRC4, RGC12 family members, which encode toll interleukin receptor-NLR proteins, were implicated in Dm4-, Dm7-, Dm11-, and Dm44-mediated resistance to B. lactucae. Lesions were identified in the sequence of a candidate gene within dm7 loss-of-resistance mutant lines, confirming that RGC12G confers Dm7.

  12. Beta-globin gene cluster haplotypes in Venezuelan sickle cell patients from the State of Aragua


    Nancy Moreno; Martínez, José A.; Zorella Blanco; Leidys Osorio; Patrick Hackshaw


    Seven polymorphic sites in the beta-globin gene cluster were analyzed on a sample of 96 chromosomes of Venezuelan sickle cell patients from the State of Aragua. The Benin haplotype was predominant with a frequency of 0.479, followed by the Bantu haplotype (0.406); a minority of cases with other haplotypes was also identified: atypical Bantu A2 (0.042), Senegal (0.031), atypical Bantu A7 (0.021) and Saudi Arabia/Indian (0.021) haplotypes; however, the Cameroon haplotype was not identified in t...

  13. Genomic organization and differential signature of positive selection in the alpha and beta globin gene clusters in two cetacean species. (United States)

    Nery, Mariana F; Arroyo, José Ignacio; Opazo, Juan C


    The hemoglobin of jawed vertebrates is a heterotetramer protein that contains two α- and two β-chains, which are encoded by members of α- and β-globin gene families. Given hemoglobin's role in mediating an adaptive response to chronic hypoxia, it is likely that this molecule may have experienced a selective pressure during the evolution of cetaceans, which have to deal with hypoxia tolerance during prolonged diving. This selective pressure could have generated a complex history of gene turnover in these clusters and/or changes in protein structure themselves. Accordingly, we aimed to characterize the genomic organization of α- and β-globin gene clusters in two cetacean species and to detect a possible role of positive selection on them using a phylogenetic framework. Maximum likelihood and Bayesian phylogeny reconstructions revealed that both cetacean species had retained a similar complement of putatively functional genes. For the α-globin gene cluster, the killer whale presents a complement of genes composed of HBZ, HBK, and two functional copies of HBA and HBQ genes, whereas the dolphin possesses HBZ, HBK, HBA and HBQ genes, and one HBA pseudogene. For the β-globin gene cluster, both species retained a complement of four genes, two early expressed genes (HBE and HBH) and two adult expressed genes (HBD and HBB). Our natural selection analysis detected two positively selected sites in the HBB gene (56 and 62) and four in HBA (15, 21, 49, 120). Interestingly, only the genes that are expressed during adulthood showed the signature of positive selection.

  14. A Novel Type Pathway-Specific Regulator and Dynamic Genome Environments of a Solanapyrone Biosynthesis Gene Cluster in the Fungus Ascochyta rabiei. (United States)

    Kim, Wonyong; Park, Jeong-Jin; Gang, David R; Peever, Tobin L; Chen, Weidong


    Secondary metabolite genes are often clustered together and situated in particular genomic regions, like the subtelomere, that can facilitate niche adaptation in fungi. Solanapyrones are toxic secondary metabolites produced by fungi occupying different ecological niches. Full-genome sequencing of the ascomycete Ascochyta rabiei revealed a solanapyrone biosynthesis gene cluster embedded in an AT-rich region proximal to a telomere end and surrounded by Tc1/Mariner-type transposable elements. The highly AT-rich environment of the solanapyrone cluster is likely the product of repeat-induced point mutations. Several secondary metabolism-related genes were found in the flanking regions of the solanapyrone cluster. Although the solanapyrone cluster appears to be resistant to repeat-induced point mutations, a P450 monooxygenase gene adjacent to the cluster has been degraded by such mutations. Among the six solanapyrone cluster genes (sol1 to sol6), sol4 encodes a novel type of Zn(II)2Cys6 zinc cluster transcription factor. Deletion of sol4 resulted in the complete loss of solanapyrone production but did not compromise growth, sporulation, or virulence. Gene expression studies with the sol4 deletion and sol4-overexpressing mutants delimited the boundaries of the solanapyrone gene cluster and revealed that sol4 is likely a specific regulator of solanapyrone biosynthesis and appears to be necessary and sufficient for induction of the solanapyrone cluster genes. Despite the dynamic surrounding genomic regions, the solanapyrone gene cluster has maintained its integrity, suggesting important roles of solanapyrones in fungal biology.

  15. The entire β-globin gene cluster is deleted in a form of γδβ-thalassemia.

    NARCIS (Netherlands)

    E.R. Fearon; H.H.Jr. Kazazian; P.G. Waber (Pamela); J.I. Lee (Joseph); S.E. Antonarakis; S.H. Orkin (Stuart); E.F. Vanin; P.S. Henthorn; F.G. Grosveld (Frank); A.F. Scott; G.R. Buchanan


    We have used restriction endonuclease mapping to study a deletion involving the beta-globin gene cluster in a Mexican-American family with gamma delta beta-thalassemia. Analysis of DNA polymorphisms demonstrated deletion of the beta-globin gene from the affected chromosome. Using a DNA f

  16. A functional gene cluster for toxoflavin biosynthesis in the genome of the soil bacterium Pseudomonas protegens Pf-5 (United States)

    Toxoflavin is a broad-spectrum toxin best known for its role in virulence of Burkholderia glumae, which causes panicle blight of rice. A gene cluster containing homologs of toxoflavin biosynthesis genes (toxA-E) of B. glumae is present in the genome of Pseudomonas protegens Pf-5, a biological contr...

  17. Structure elucidation and gene cluster characterization of the O-antigen of Escherichia coli O80. (United States)

    Senchenkova, Sof'ya N; Guo, Xi; Filatov, Andrei V; Perepelov, Andrei V; Liu, Bin; Shashkov, Alexander S; Knirel, Yuriy A


    Mild alkaline degradation of the lipopolysaccharide of Escherichia coli O80 afforded a polysaccharide, which was studied by sugar analysis, selective cleavage of glycosidic linkages, and ¹H and ¹³C NMR spectroscopy. Solvolysis of the polysaccharide with CF3CO2H cleaved the linkages of α-Fuc and β-linked GlcNAc and GalNAc residues to give two disaccharides. The following structure of the hexasaccharide repeating unit of the O-polysaccharide was established: The polysaccharide repeat also contains a minor O-acetyl group but its position was not determined. The O-antigen gene cluster of E. coli O80 between the conserved galF and gnd genes was analyzed and found to be consistent with the O-polysaccharide structure established.

  18. Soft Topographic Maps for Clustering and Classifying Bacteria Using Housekeeping Genes

    Directory of Open Access Journals (Sweden)

    Massimo La Rosa


    The Self-Organizing Map (SOM) algorithm is widely used for building topographic maps of data represented in a vectorial space, but it does not operate with dissimilarity data. The Soft Topographic Map (STM) algorithm is an extension of SOM to arbitrary distance measures, and it creates a map using a set of units, organized in a rectangular lattice, defining data neighbourhood relationships. In recent years, a new standard for identifying bacteria using genotypic information began to be developed. In this new approach, phylogenetic relationships of bacteria could be determined by comparing a stable part of the bacterial genetic code, the so-called “housekeeping genes.” The goal of this work is to build a topographic representation of bacteria clusters, by means of self-organizing maps, starting from genotypic features regarding housekeeping genes.

  19. The HOX-5 and surfeit gene clusters are linked in the proximal portion of mouse chromosome 2. (United States)

    Stubbs, L; Huxley, C; Hogan, B; Evans, T; Fried, M; Duboule, D; Lehrach, H


    Using an interspecies backcross, we have mapped the HOX-5 and surfeit (surf) gene clusters within the proximal portion of mouse chromosome 2. While the HOX-5 cluster of homeobox-containing genes has been localized to chromosome 2, bands C3-E1, by in situ hybridization, its more precise position relative to the genes and cloned markers of chromosome 2 was not known. Surfeit, a tight cluster of at least six highly conserved "housekeeping" genes, has not been previously mapped in mouse, but has been localized to human chromosome 9q, a region of the human genome with strong homology to proximal mouse chromosome 2. The data presented here place HOX-5 in the vicinity of the closely linked set of developmental mutations rachiterata, lethargic, and fidget and place surf close to the proto-oncogene Abl, near the centromere of chromosome 2.

  20. Sequencing and transcriptional analysis of the Streptococcus thermophilus histamine biosynthesis gene cluster: factors that affect differential hdcA expression

    DEFF Research Database (Denmark)

    Calles-Enríquez, Marina; Hjort, Benjamin Benn; Andersen, Pia Skov;


    Histamine, a toxic compound that is formed by the decarboxylation of histidine through the action of microbial decarboxylases, can accumulate in fermented food products. From a total of 69 Streptococcus thermophilus strains screened, two strains, CHCC1524 and CHCC6483, showed the capacity...... to produce histamine. The hdc clusters of S. thermophilus CHCC1524 and CHCC6483 were sequenced, and the factors that affect histamine biosynthesis and histidine-decarboxylating gene (hdcA) expression were studied. The hdc cluster began with the hdcA gene, was followed by a transporter (hdcP), and ended...... acquisition through a horizontal transfer mechanism. Transcriptional analysis of the hdc cluster revealed the existence of a polycistronic mRNA covering the three genes. The histidine-decarboxylating gene (hdcA) of S. thermophilus demonstrated maximum expression during the stationary growth phase, with high...

  1. Inter-MAR association contributes to transcriptionally active looping events in human beta-globin gene cluster. (United States)

    Wang, Li; Di, Li-Jun; Lv, Xiang; Zheng, Wei; Xue, Zheng; Guo, Zhi-Chen; Liu, De-Pei; Liang, Chi-Chuan


    Matrix attachment regions (MARs) are important in chromatin organization and gene regulation. Although it is known that there are a number of MAR elements in the beta-globin gene cluster, it is unclear how these MAR elements are involved in regulating beta-globin gene expression. Here, we report the identification of a new MAR element at the LCR (locus control region) of the human beta-globin gene cluster and the detection of inter-MAR association within the beta-globin gene cluster. We also demonstrate that SATB1, a protein factor that has been implicated in the formation of network-like higher-order chromatin structures at some gene loci, takes part in beta-globin-specific inter-MAR association by binding the specific MARs. Knockdown of SATB1 clearly reduces the binding of SATB1 to the MARs and diminishes the frequency of the inter-MAR association. As a result, ACH establishment and the expression of the alpha-like and beta-like globin genes are also affected. In summary, our results suggest that SATB1 is a regulatory factor of hemoglobin genes, especially the early differentiation genes, at least through its effect on the higher-order chromatin structure.

  2. Identification of the nik Gene Cluster of Brucella suis: Regulation and Contribution to Urease Activity (United States)

    Jubier-Maurin, Véronique; Rodrigue, Agnès; Ouahrani-Bettache, Safia; Layssac, Marion; Mandrand-Berthelot, Marie-Andrée; Köhler, Stephan; Liautard, Jean-Pierre


    Analysis of a Brucella suis 1330 gene fused to a gfp reporter, and identified as being induced in J774 murine macrophage-like cells, allowed the isolation of a gene homologous to nikA, the first gene of the Escherichia coli operon encoding the specific transport system for nickel. DNA sequence analysis of the corresponding B. suis nik locus showed that it was highly similar to that of E. coli except for localization of the nikR regulatory gene, which lies upstream from the structural nikABCDE genes and in the opposite orientation. Protein sequence comparisons suggested that the deduced nikABCDE gene products belong to a periplasmic binding protein-dependent transport system. The nikA promoter-gfp fusion was activated in vitro by low oxygen tension and metal ion deficiency and was repressed by NiCl2 excess. Insertional inactivation of nikA strongly reduced the activity of the nickel metalloenzyme urease, which was restored by addition of a nickel excess. Moreover, the nikA mutant of B. suis was functionally complemented with the E. coli nik gene cluster, leading to the recovery of urease activity. Reciprocally, an E. coli strain harboring a deleted nik operon recovered hydrogenase activity by heterologous complementation with the B. suis nik locus. Taking into account these results, we propose that the nik locus of B. suis encodes a nickel transport system. The results further suggest that nickel could enter B. suis via other transport systems. Intracellular growth rates of the B. suis wild-type and nikA mutant strains in human monocytes were similar, indicating that nikA was not essential for this step of infection. We discuss a possible role of nickel transport in maintaining enzymatic activities which could be crucial for survival of the bacteria under the environmental conditions encountered within the host. PMID:11133934

  3. Extended genetic effects of ADH cluster genes on the risk of alcohol dependence: from GWAS to replication. (United States)

    Park, Byung Lae; Kim, Jee Wook; Cheong, Hyun Sub; Kim, Lyoung Hyo; Lee, Boung Chul; Seo, Cheong Hoon; Kang, Tae-Cheon; Nam, Young-Woo; Kim, Goon-Bo; Shin, Hyoung Doo; Choi, Ihn-Geun


    Alcohol dependence (AD) is a multifactorial and polygenic disorder involving complex gene-to-gene and gene-to-environment interactions. Several genome-wide association studies have reported numerous risk factors for AD, but replication results following these studies have been controversial. To identify new candidate genes, the present study used GWAS and replication studies in a Korean cohort with AD. Genome-wide association analysis revealed that two chromosomal regions, Chr. 4q22-q23 (ADH gene cluster, including ADH5, ADH4, ADH6, ADH1A, ADH1B, and ADH7) and Chr. 12q24 (ALDH2), showed multiple association signals for the risk of AD. To investigate detailed genetic effects of these ADH genes on AD, a follow-up study of the ADH gene cluster on 4q22-q23 was performed. A total of 90 SNPs, including ADH1B rs1229984 (H47R), were genotyped in an additional 975 Korean subjects. In case-control analysis, ADH1B rs1229984 (H47R) showed the most significant association with the risk of AD (p = 2.63 × 10^-21, OR = 2.35). Moreover, subsequent conditional analyses revealed that all positive associations of other ADH genes in the cluster disappeared, which suggested that ADH1B rs1229984 (H47R) might be the sole functional genetic marker across the ADH gene cluster. Our findings could provide additional information on the ADH gene cluster regarding the risk of AD, as well as a new and important insight into the genetic factors associated with AD.

  4. Apicidin F: characterization and genetic manipulation of a new secondary metabolite gene cluster in the rice pathogen Fusarium fujikuroi.

    Directory of Open Access Journals (Sweden)

    Eva-Maria Niehaus

    Full Text Available The fungus F. fujikuroi is well known for its production of gibberellins causing the 'bakanae' disease of rice. Besides these plant hormones, it is able to produce other secondary metabolites (SMs), such as pigments and mycotoxins. Genome sequencing revealed altogether 45 potential SM gene clusters, most of which are cryptic and silent. In this study we characterize a new non-ribosomal peptide synthetase (NRPS) gene cluster that is responsible for the production of the cyclic tetrapeptide apicidin F (APF). This new SM has structural similarities to the known histone deacetylase inhibitor apicidin. To gain insight into the biosynthetic pathway, most of the 11 cluster genes were deleted, and the mutants were analyzed by HPLC-DAD and HPLC-HRMS for their ability to produce APF or new derivatives. Structure elucidation was carried out by HPLC-HRMS and NMR analysis. We identified two new derivatives of APF named apicidin J and K. Furthermore, we studied the regulation of APF biosynthesis and showed that the cluster genes are expressed under conditions of high nitrogen and acidic pH in a manner dependent on the nitrogen regulator AreB and the pH regulator PacC. In addition, over-expression of the atypical pathway-specific transcription factor (TF)-encoding gene APF2 led to elevated expression of the cluster genes under inducing and even repressing conditions and to significantly increased product yields. Bioinformatic analyses allowed the identification of a putative Apf2 DNA-binding ("Api-box") motif in the promoters of the APF genes. Point mutations in this sequence motif caused a drastic decrease of APF production, indicating that this motif is essential for activating the cluster genes. Finally, we provide a model of the APF biosynthetic pathway based on chemical identification of derivatives in the cultures of deletion mutants.

  5. The dppBCDF gene cluster of Haemophilus influenzae: Role in heme utilization

    Directory of Open Access Journals (Sweden)

    Morton Daniel J


    Full Text Available Abstract Background Haemophilus influenzae requires a porphyrin source for aerobic growth and possesses multiple mechanisms to obtain this essential nutrient. This porphyrin requirement may be satisfied by either heme alone, or protoporphyrin IX in the presence of an iron source. One protein involved in heme acquisition by H. influenzae is the periplasmic heme binding protein HbpA. HbpA exhibits significant homology to the dipeptide and heme binding protein DppA of Escherichia coli. DppA is a component of the DppABCDF peptide-heme permease of E. coli. H. influenzae homologs of dppBCDF are located in the genome at a point distant from hbpA. The object of this study was to investigate the potential role of the H. influenzae dppBCDF locus in heme utilization. Findings An insertional mutation in dppC was constructed and the impact of the mutation on the utilization of both free heme and various proteinaceous heme sources as well as utilization of protoporphyrin IX was determined in growth curve studies. The dppC insertion mutant strain was significantly impaired in utilization of all tested heme sources and protoporphyrin IX. Complementation of the dppC mutation with an intact dppBCDF gene cluster in trans corrected the growth defects seen in the dppC mutant strain. Conclusion The dppBCDF gene cluster constitutes part of the periplasmic heme-acquisition systems of H. influenzae.

  6. Nonribosomal peptide synthase gene clusters for lipopeptide biosynthesis in Bacillus subtilis 916 and their phenotypic functions. (United States)

    Luo, Chuping; Liu, Xuehui; Zhou, Huafei; Wang, Xiaoyu; Chen, Zhiyi


    Bacillus cyclic lipopeptides (LPs) have been well studied for their phytopathogen-antagonistic activities. Recently, research has shown that these LPs also contribute to the phenotypic features of Bacillus strains, such as hemolytic activity, swarming motility, biofilm formation, and colony morphology. Bacillus subtilis 916 not only coproduces the three families of well-known LPs, i.e., surfactins, bacillomycin Ls (iturin family), and fengycins, but also produces a new family of LP called locillomycins. The genome of B. subtilis 916 contains four nonribosomal peptide synthase (NRPS) gene clusters, srf, bmy, fen, and loc, which are responsible for the biosynthesis of surfactins, bacillomycin Ls, fengycins, and locillomycins, respectively. By studying B. subtilis 916 mutants lacking production of one, two, or three LPs, we attempted to unveil the connections between LPs and phenotypic features. We demonstrated that bacillomycin Ls and fengycins contribute mainly to antifungal activity. Although surfactins have weak antifungal activity in vitro, the strain mutated in srfAA had significantly decreased antifungal activity. This may be due to the impaired productions of fengycins and bacillomycin Ls. We also found that the disruption of any LP gene cluster other than fen resulted in a change in colony morphology. While surfactins and bacillomycin Ls play very important roles in hemolytic activity, swarming motility, and biofilm formation, the fengycins and locillomycins had little influence on these phenotypic features. In conclusion, B. subtilis 916 coproduces four families of LPs which contribute to the phenotypic features of B. subtilis 916 in an intricate way.

  7. Microbial communication leading to the activation of silent fungal secondary metabolite gene clusters

    Directory of Open Access Journals (Sweden)

    Tina Netzker


    Full Text Available Microorganisms form diverse multispecies communities in various ecosystems. The high abundance of fungal and bacterial species in these consortia results in specific communication between the microorganisms. A key role in this communication is played by secondary metabolites (SMs), which are also called natural products. Recently, it was shown that interspecies ‘talk’ between microorganisms represents a physiological trigger to activate silent gene clusters, leading to the formation of novel SMs by the involved species. This review focuses on mixed microbial cultivation, mainly between bacteria and fungi, with a special emphasis on the induced formation of fungal SMs in co-cultures. In addition, the role of chromatin remodeling in the induction is examined, and methodical perspectives for the analysis of natural products are presented. As an example of an intermicrobial interaction elucidated at the molecular level, we discuss the specific interaction between the filamentous fungi Aspergillus nidulans and Aspergillus fumigatus with the soil bacterium Streptomyces rapamycinicus, which provides an excellent model system to illuminate molecular concepts behind regulatory mechanisms and will pave the way to a novel avenue of drug discovery through targeted activation of silent SM gene clusters by co-cultivation of microorganisms.

  8. Directed natural product biosynthesis gene cluster capture and expression in the model bacterium Bacillus subtilis

    KAUST Repository

    Li, Yongxin


    Bacilli are ubiquitous low G+C environmental Gram-positive bacteria that produce a wide assortment of specialized small molecules. Although their natural product biosynthetic potential is high, robust molecular tools to support the heterologous expression of large biosynthetic gene clusters in Bacillus hosts are rare. Herein we adapt transformation-associated recombination (TAR) in yeast to design a single genomic capture and expression vector for antibiotic production in Bacillus subtilis. After validating this direct cloning plug-and-play approach with surfactin, we genetically interrogated the amicoumacin biosynthetic gene cluster from the marine isolate Bacillus subtilis 1779. Its heterologous expression allowed us to explore an unusual maturation process involving the N-acyl-asparagine pro-drug intermediates preamicoumacins, which are hydrolyzed by the asparagine-specific peptidase into the active component amicoumacin A. This work represents the first direct-cloning-based heterologous expression of natural products in the model organism B. subtilis and paves the way to the development of future genome mining efforts in this genus.

  9. CTCF Is Required for Neural Development and Stochastic Expression of Clustered Pcdh Genes in Neurons

    Directory of Open Access Journals (Sweden)

    Teruyoshi Hirayama


    Full Text Available The CCCTC-binding factor (CTCF) is a key molecule for chromatin conformational changes that promote cellular diversity, but nothing is known about its role in neurons. Here, we produced mice with a conditional knockout (cKO) of CTCF in postmitotic projection neurons, mostly in the dorsal telencephalon. The CTCF-cKO mice exhibited postnatal growth retardation and abnormal behavior and had defects in functional somatosensory mapping in the brain. In terms of gene expression, 390 transcripts were expressed at significantly different levels between CTCF-deficient and control cortex and hippocampus. In particular, the levels of 53 isoforms of the clustered protocadherin (Pcdh) genes, which are stochastically expressed in each neuron, declined markedly. Each CTCF-deficient neuron showed defects in dendritic arborization and spine density during brain development. Their excitatory postsynaptic currents showed normal amplitude but occurred with low frequency. Our results indicate that CTCF regulates functional neural development and neuronal diversity by controlling clustered Pcdh expression.

  10. Divergence and transcriptional analysis of the division cell wall (dcw) gene cluster in Neisseria spp. (United States)

    Snyder, Lori A S; Shafer, William M; Saunders, Nigel J


    Three of the 18 open reading frames in the division and cell wall synthesis cluster of the pathogenic Neisseria spp. are not present in the clusters of other bacterial species. The region containing two of these, dcaB and dcaC, displays interstrain and interspecies variability uncharacteristic of such clusters. 3' of dcaB is a Correia repeat enclosed element (CREE), which is only present in some strains. It has been suggested that this CREE is a transcriptional terminator, although we demonstrate otherwise. A gearbox-like promoter within this CREE is active in Escherichia coli but not in Neisseria meningitidis. There is an active promoter 5' of dcaC, although its sequence is not conserved. The presence of similarly located promoters has not been demonstrated in other species. In Neisseria lactamica, this promoter involves another dcw-associated CREE, the first demonstration of active promoter generation at the 5' end of this common intergenic, apparently mobile, element. Upstream of this promoter is an inverted pair of neisserial uptake signal sequences, which are commonly considered to be transcriptional terminators. It has been proposed to terminate transcription in this location, although we have demonstrated transcript extending through this uptake signal sequence. dcaC contains a 108 bp tandem repeat, which is present in different copy numbers in the neisserial strains examined. This investigation reveals extensive sequence variation, disputes the presence of transcriptional terminators and identifies active internal promoters in this normally highly conserved cluster of essential genes, and addresses the transcriptional activity of two common neisserial intergenic components.
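
The abstract above reports that dcaC carries a 108 bp tandem repeat present in different copy numbers across strains. As a minimal sketch of how such a copy number could be counted once the repeat unit is known, the following uses a short toy unit and made-up sequences (both hypothetical, not taken from the study) and simple exact matching; real repeats may be imperfect and would need a tolerant matcher.

```python
def count_tandem_copies(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back exact copies of `unit` in `sequence`."""
    if not unit:
        raise ValueError("unit must be non-empty")
    best = 0
    start = sequence.find(unit)
    while start != -1:
        # Walk forward from this occurrence, one unit length at a time.
        copies = 0
        pos = start
        while sequence.startswith(unit, pos):
            copies += 1
            pos += len(unit)
        best = max(best, copies)
        start = sequence.find(unit, start + 1)
    return best


# Toy data: a 6 bp unit stands in for the 108 bp repeat (hypothetical sequences).
unit = "ACGTTG"
strain_a = "TTT" + unit * 2 + "GGA"
strain_b = "TTT" + unit * 4 + "GGA"

copies_a = count_tandem_copies(strain_a, unit)
copies_b = count_tandem_copies(strain_b, unit)
```

Comparing `copies_a` and `copies_b` mirrors the strain-to-strain comparison described in the abstract, under the assumption of perfect repeats.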

  11. Nonblack patients with sickle cell disease have African β^s gene cluster haplotypes

    Energy Technology Data Exchange (ETDEWEB)

    Rogers, Z.R.; Powars, D.R.; Williams, W.D. (Univ. of Southern California School of Medicine, Los Angeles (USA)); Kinney, T.R. (Duke Univ., Durham, NC (USA)); Schroeder, W.A. (California Institute of Technology, Pasadena (USA))


    Of 18 nonblack patients with sickle cell disease, 14 had sickle cell anemia, 2 had hemoglobin SC disease, and 2 had hemoglobin S-β^0-thalassemia. The β^s gene cluster haplotypes that were determined in 7 patients were of African origin and were identified as Central African Republic, Central African Republic minor II, Benin, and Senegal. The haplotype Central African Republic minor II was present on the β^0-thalassemia chromosome in 2 patients. None of 10 patients whose α-gene status was determined had α-thalassemia-2. These data strongly support the concept that the β^s gene on chromosome 11 of these individuals is of African origin and that the α-gene locus on chromosome 16 is of white or native American origin. The clinical severity of the disease in these nonblack patients is appropriate to their haplotype without α-thalassemia-2 and is comparable with that of black patients. All persons with congenital hemolytic anemia should be examined for the presence of sickle cell disease regardless of physical appearance or ethnic background.

  12. Degradation of Benzene by Pseudomonas veronii 1YdBTEX2 and 1YB2 Is Catalyzed by Enzymes Encoded in Distinct Catabolism Gene Clusters (United States)

    de Lima-Morales, Daiana; Chaves-Moreno, Diego; Wos-Oxley, Melissa L.; Jáuregui, Ruy; Vilchez-Vargas, Ramiro


    Pseudomonas veronii 1YdBTEX2, a benzene and toluene degrader, and Pseudomonas veronii 1YB2, a benzene degrader, have previously been shown to be key players in a benzene-contaminated site. These strains harbor unique catabolic pathways for the degradation of benzene comprising a gene cluster encoding an isopropylbenzene dioxygenase where genes encoding downstream enzymes were interrupted by stop codons. Extradiol dioxygenases were recruited from gene clusters comprising genes encoding a 2-hydroxymuconic semialdehyde dehydrogenase necessary for benzene degradation but typically absent from isopropylbenzene dioxygenase-encoding gene clusters. The benzene dihydrodiol dehydrogenase-encoding gene was not clustered with any other aromatic degradation genes, and the encoded protein was only distantly related to dehydrogenases of aromatic degradation pathways. The involvement of the different gene clusters in the degradation pathways was suggested by real-time quantitative reverse transcription PCR. PMID:26475106

  13. Self-cloning in Streptomyces griseus of an str gene cluster for streptomycin biosynthesis and streptomycin resistance.


    Ohnuki, T; Imanaka, T; Aiba, S


    An str gene cluster containing at least four genes (strR, strA, strB, and strC) involved in streptomycin biosynthesis or streptomycin resistance or both was self-cloned in Streptomyces griseus by using plasmid pOA154. The strA gene was verified to encode streptomycin 6-phosphotransferase, a streptomycin resistance factor in S. griseus, by examining the gene product expressed in Escherichia coli. The other three genes were determined by complementation tests with streptomycin-nonproducing muta...

  14. The fnr Gene of Bacillus licheniformis and the Cysteine Ligands of the C-Terminal FeS Cluster


    Klinger, Anette; Schirawski, Jan; Glaser, Philippe; Unden, Gottfried


    In the facultatively anaerobic bacterium Bacillus licheniformis a gene encoding a protein of the fumarate nitrate reductase family of transcriptional regulators (Fnr) was isolated. Unlike Fnr proteins from gram-negative bacteria, but like Fnr from Bacillus subtilis, the protein contained a C-terminal cluster of cysteine residues. Unlike in Fnr from B. subtilis, this cluster (Cys226-X2-Cys229-X4-Cys234) is composed of only three Cys residues, which are supposed to serve together with an intern...

  15. Carotenogenesis gene cluster and phytoene desaturase catalyzing both three- and four-step desaturations from Rhodobacter azotoformans. (United States)

    Zhang, Jinhua; Lu, Lili; Yin, Lijie; Xie, Shen; Xiao, Min


    A carotenogenesis gene cluster from the purple nonsulfur photosynthetic bacterium Rhodobacter azotoformans CGMCC 6086 was cloned. A total of eight carotenogenesis genes (crtA, crtI, crtB, tspO, crtC, crtD, crtE, and crtF) were located in two separate regions within the genome, a 4.9 kb region containing four clustered genes of crtAIB-tspO and a 5.3 kb region containing four clustered genes of crtCDEF. The organization was unusual for a carotenogenesis gene cluster in purple photosynthetic bacteria. A gene encoding phytoene desaturase (CrtI) from Rba. azotoformans was expressed in Escherichia coli. The recombinant CrtI could catalyze both three- and four-step desaturations of phytoene to produce neurosporene and lycopene, and the relative contents of neurosporene and lycopene formed by CrtI were approximately 23% and 75%, respectively. Even small amounts of five-step desaturated 3,4-didehydrolycopene could be produced by CrtI. This product pattern was novel because CrtI produced only neurosporene leading to spheroidene pathway in the cells of Rba. azotoformans. In the in vitro reaction, the relative content of lycopene in desaturated products increased from 19.6% to 62.5% when phytoene was reduced from 2.6 to 0.13 μM. The results revealed that the product pattern of CrtI might be affected by the kinetics.

  16. Genome mining of the hitachimycin biosynthetic gene cluster: involvement of a phenylalanine-2,3-aminomutase in biosynthesis. (United States)

    Kudo, Fumitaka; Kawamura, Koichi; Uchino, Asuka; Miyanaga, Akimasa; Numakura, Mario; Takayanagi, Ryuichi; Eguchi, Tadashi


    Hitachimycin is a macrolactam antibiotic with (S)-β-phenylalanine (β-Phe) at the starter position of its polyketide skeleton. To understand the incorporation mechanism of β-Phe and the modification mechanism of the unique polyketide skeleton, the biosynthetic gene cluster for hitachimycin in Streptomyces scabrisporus was identified by genome mining. The identified gene cluster contains a putative phenylalanine-2,3-aminomutase (PAM), five polyketide synthases, four β-amino-acid-carrying enzymes, and a characteristic amidohydrolase. A hitA knockout mutant showed no hitachimycin production, but antibiotic production was restored by feeding with (S)-β-Phe. We also confirmed the enzymatic activity of the HitA PAM. The results suggest that the identified gene cluster is responsible for the biosynthesis of hitachimycin. A plausible biosynthetic pathway for hitachimycin, including a unique polyketide skeletal transformation mechanism, is proposed.

  17. Ancient expansion of the hox cluster in lepidoptera generated four homeobox genes implicated in extra-embryonic tissue formation. (United States)

    Ferguson, Laura; Marlétaz, Ferdinand; Carter, Jean-Michel; Taylor, William R; Gibbs, Melanie; Breuker, Casper J; Holland, Peter W H


    Gene duplications within the conserved Hox cluster are rare in animal evolution, but in Lepidoptera an array of divergent Hox-related genes (Shx genes) has been reported between pb and zen. Here, we use genome sequencing of five lepidopteran species (Polygonia c-album, Pararge aegeria, Callimorpha dominula, Cameraria ohridella, Hepialus sylvina) plus a caddisfly outgroup (Glyphotaelius pellucidus) to trace the evolution of the lepidopteran Shx genes. We demonstrate that Shx genes originated by tandem duplication of zen early in the evolution of large clade Ditrysia; Shx are not found in a caddisfly and a member of the basally diverging Hepialidae (swift moths). Four distinct Shx genes were generated early in ditrysian evolution, and were stably retained in all descendent Lepidoptera except the silkmoth which has additional duplications. Despite extensive sequence divergence, molecular modelling indicates that all four Shx genes have the potential to encode stable homeodomains. The four Shx genes have distinct spatiotemporal expression patterns in early development of the Speckled Wood butterfly (Pararge aegeria), with ShxC demarcating the future sites of extraembryonic tissue formation via strikingly localised maternal RNA in the oocyte. All four genes are also expressed in presumptive serosal cells, prior to the onset of zen expression. Lepidopteran Shx genes represent an unusual example of Hox cluster expansion and integration of novel genes into ancient developmental regulatory networks.

  18. A WDR Gene Is a Conserved Member of a Chitin Synthase Gene Cluster and Influences the Cell Wall in Aspergillus nidulans

    Directory of Open Access Journals (Sweden)

    Gea Guerriero


    Full Text Available WD40 repeat (WDR) proteins are pleiotropic molecular hubs. We identify a WDR gene that is a conserved genomic neighbor of a chitin synthase gene in Ascomycetes. The WDR gene is unique to fungi and plants, and was called Fungal Plant WD (FPWD). FPWD is within a cell wall metabolism gene cluster in the Ascomycetes (Pezizomycotina) comprising chsD, a Chs activator and a GH17 glucanase. The FPWD, AN1556.2 locus was deleted in Aspergillus nidulans strain SAA.111 by gene replacement and only heterokaryon transformants were obtained. The re-annotation of Aspergilli genomes shows that AN1556.2 consists of two tightly linked separate genes, i.e., the WDR gene and a putative beta-flanking gene of unknown function. The WDR and the beta-flanking genes are conserved genomic neighbors localized within a recently identified metabolic cell wall gene cluster in genomes of Aspergilli. The heterokaryons displayed increased susceptibility to drugs affecting the cell wall, and their phenotypes, observed by optical, confocal, scanning electron and atomic force microscopy, suggest cell wall alterations. Quantitative real-time PCR shows altered expression of some cell wall-related genes. The possible implications on cell wall biosynthesis are discussed.

  19. Evolution of C2H2-zinc finger genes and subfamilies in mammals: Species-specific duplication and loss of clusters, genes and effector domains

    Directory of Open Access Journals (Sweden)

    Aubry Muriel


    Full Text Available Abstract Background C2H2 zinc finger genes (C2H2-ZNF) constitute the largest class of transcription factors in humans and one of the largest gene families in mammals. Often arranged in clusters in the genome, these genes are thought to have undergone a massive expansion in vertebrates, primarily by tandem duplication. However, this view is based on limited datasets restricted to a single chromosome or a specific subset of genes belonging to the large KRAB domain-containing C2H2-ZNF subfamily. Results Here, we present the first comprehensive study of the evolution of the C2H2-ZNF family in mammals. We assembled the complete repertoire of human C2H2-ZNF genes (718 in total), about 70% of which are organized into 81 clusters across all chromosomes. Based on an analysis of their N-terminal effector domains, we identified two new C2H2-ZNF subfamilies encoding genes with a SET or a HOMEO domain. We searched for the syntenic counterparts of the human clusters in other mammals for which complete gene data are available: chimpanzee, mouse, rat and dog. Cross-species comparisons show a large variation in the numbers of C2H2-ZNF genes within homologous mammalian clusters, suggesting differential patterns of evolution. Phylogenetic analysis of selected clusters reveals that the disparity in C2H2-ZNF gene repertoires across mammals not only originates from differential gene duplication but also from gene loss. Further, we discovered variations among orthologs in the number of zinc finger motifs and association of the effector domains, the latter often undergoing sequence degeneration. Combined with phylogenetic studies, physical maps and an analysis of the exon-intron organization of genes from the SCAN and KRAB domains-containing subfamilies, this result suggests that the SCAN subfamily emerged first, followed by the SCAN-KRAB and finally by the KRAB subfamily. Conclusion Our results are in agreement with the "birth and death hypothesis" for the evolution of

  20. Genome-wide significant association between alcohol dependence and a variant in the ADH gene cluster. (United States)

    Frank, Josef; Cichon, Sven; Treutlein, Jens; Ridinger, Monika; Mattheisen, Manuel; Hoffmann, Per; Herms, Stefan; Wodarz, Norbert; Soyka, Michael; Zill, Peter; Maier, Wolfgang; Mössner, Rainald; Gaebel, Wolfgang; Dahmen, Norbert; Scherbaum, Norbert; Schmäl, Christine; Steffens, Michael; Lucae, Susanne; Ising, Marcus; Müller-Myhsok, Bertram; Nöthen, Markus M; Mann, Karl; Kiefer, Falk; Rietschel, Marcella


    Alcohol dependence (AD) is an important contributory factor to the global burden of disease. The etiology of AD involves both environmental and genetic factors, and the disorder has a heritability of around 50%. The aim of the present study was to identify susceptibility genes for AD by performing a genome-wide association study (GWAS). The sample comprised 1333 male in-patients with severe AD according to the Diagnostic and Statistical Manual of Mental Disorders, 4th edition, and 2168 controls. These included 487 patients and 1358 controls from a previous GWAS study by our group. All individuals were of German descent. Single-marker tests and a polygenic score-based analysis to assess the combined contribution of multiple markers with small effects were performed. The single nucleotide polymorphism (SNP) rs1789891, which is located between the ADH1B and ADH1C genes, achieved genome-wide significance [P = 1.27E-8, odds ratio (OR) = 1.46]. Other markers from this region were also associated with AD, and conditional analyses indicated that these made a partially independent contribution. The SNP rs1789891 is in complete linkage disequilibrium with the functional Arg272Gln variant (P = 1.24E-7, OR = 1.31) of the ADH1C gene, which has been reported to modify the rate of ethanol oxidation to acetaldehyde in vitro. A polygenic score-based approach produced a significant result (P = 9.66E-9). This is the first GWAS of AD to provide genome-wide significant support for the role of the ADH gene cluster and to suggest a polygenic component to the etiology of AD. The latter result may indicate that many more AD susceptibility genes still await identification.
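
The abstract above distinguishes a marker that reached genome-wide significance (rs1789891) from a linked functional variant that did not. A minimal sketch of that comparison follows, assuming the conventional genome-wide threshold of 5 × 10^-8; the abstract does not state which threshold the authors applied, so `GENOME_WIDE_ALPHA` is an assumption, while the P values are the ones quoted in the abstract.

```python
# Conventional genome-wide significance threshold (assumption: the abstract
# does not say which cutoff the study itself used).
GENOME_WIDE_ALPHA = 5e-8

# Single-marker P values as quoted in the abstract.
reported_p_values = {
    "rs1789891": 1.27e-8,
    "ADH1C Arg272Gln": 1.24e-7,
}


def is_genome_wide_significant(p_value: float, alpha: float = GENOME_WIDE_ALPHA) -> bool:
    """Return True when a single-marker P value falls below the chosen threshold."""
    return p_value < alpha


significance = {
    marker: is_genome_wide_significant(p) for marker, p in reported_p_values.items()
}
```

Under this assumed cutoff, only rs1789891 passes, which is consistent with the abstract describing it as the marker that achieved genome-wide significance.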

  1. Development and mapping of SSR markers linked to resistance-gene homologue clusters in common bean

    Institute of Scientific and Technical Information of China (English)

    Luz; Nayibe; Garzon; Matthew; Wohlgemuth; Blair


    Common bean is an important but often disease-susceptible legume crop of temperate, subtropical and tropical regions worldwide. The crop is affected by bacterial, fungal and viral pathogens. The strategy of resistance-gene homologue (RGH) cloning has proven to be an efficient tool for identifying markers and R (resistance) genes associated with resistances to diseases. Microsatellite or SSR markers can be identified by physical association with RGH clones on large-insert DNA clones such as bacterial artificial chromosomes (BACs). Our objectives in this work were to identify RGH-SSRs in a BAC library from the Andean genotype G19833 and to test and map any polymorphic markers to identify associations with known positions of disease resistance genes. We developed a set of specific probes designed for clades of common bean RGH genes and then identified positive BAC clones and developed microsatellites from BACs having SSR loci in their end sequences. A total of 629 new RGH-SSRs were identified and named BMr (bean microsatellite RGH-associated markers). A subset of these markers was screened for detecting polymorphism in the genetic mapping population DOR364 × G19833. A genetic map was constructed with a total of 264 markers, among which were 80 RGH loci anchored to single-copy RFLP and SSR markers. Clusters of RGH-SSRs were observed on most of the linkage groups of common bean and in positions associated with R-genes and QTL. The use of these new markers to select for disease resistance is discussed.

  2. Mutational analysis of the thienamycin biosynthetic gene cluster from Streptomyces cattleya. (United States)

    Rodríguez, Miriam; Núñez, Luz Elena; Braña, Alfredo F; Méndez, Carmen; Salas, José A; Blanco, Gloria


    The generation of non-thienamycin-producing mutants with mutations in the thnL, thnN, thnO, and thnI genes within the thn gene cluster from Streptomyces cattleya and their involvement in thienamycin biosynthesis and regulation were previously reported. Four additional mutations were independently generated in the thnP, thnG, thnR, and thnT genes by insertional inactivation. Only the first two genes were found to play a role in thienamycin biosynthesis, since these mutations negatively or positively affect antibiotic production. A mutation of thnP results in the absence of thienamycin production, whereas a 2- to 3-fold increase in thienamycin production was observed for the thnG mutant. On the other hand, mutations in thnR and thnT showed that although these genes were previously reported to participate in this pathway, they seem to be nonessential for thienamycin biosynthesis, as thienamycin production was not affected in these mutants. High-performance liquid chromatography (HPLC)-mass spectrometry (MS) analysis of all available mutants revealed some putative intermediates in the thienamycin biosynthetic pathway. A compound with a mass corresponding to carbapenam-3-carboxylic acid was detected in some of the mutants, suggesting that the assembly of the bicyclic nucleus of thienamycin might proceed in a way analogous to that of the simplest natural carbapenem, 1-carbapen-2-em-3-carboxylic acid biosynthesis. The accumulation of a compound with a mass corresponding to 2,3-dihydrothienamycin in the thnG mutant suggests that it might be the last intermediate in the biosynthetic pathway. These data, together with the establishment of cross-feeding relationships by the cosynthesis analysis of the non-thienamycin-producing mutants, lead to a proposal for some enzymatic steps during thienamycin assembly.

  3. MicroRNAs located in the Hox gene clusters are implicated in Huntington's disease pathogenesis.

    Directory of Open Access Journals (Sweden)

    Andrew G Hoss


    Full Text Available Transcriptional dysregulation has long been recognized as central to the pathogenesis of Huntington's disease (HD). MicroRNAs (miRNAs) represent a major system of post-transcriptional regulation, by either preventing translational initiation or by targeting transcripts for storage or for degradation. Using next-generation miRNA sequencing in prefrontal cortex (Brodmann Area 9) of twelve HD and nine controls, we identified five miRNAs (miR-10b-5p, miR-196a-5p, miR-196b-5p, miR-615-3p and miR-1247-5p) up-regulated in HD at genome-wide significance (FDR q-value<0.05). Three of these, miR-196a-5p, miR-196b-5p and miR-615-3p, were expressed at near zero levels in control brains. Expression was verified for all five miRNAs using reverse transcription quantitative PCR and all but miR-1247-5p were replicated in an independent sample (8HD/8C). Ectopic miR-10b-5p expression in PC12 HTT-Q73 cells increased survival by MTT assay and cell viability staining, suggesting increased expression may be a protective response. All of the miRNAs but miR-1247-5p are located in intergenic regions of Hox clusters. Total mRNA sequencing in the same samples identified fifteen of 55 genes within the Hox cluster gene regions as differentially expressed in HD, and the Hox genes immediately adjacent to the four Hox cluster miRNAs as up-regulated. Pathway analysis of mRNA targets of these miRNAs implicated functions for neuronal differentiation, neurite outgrowth, cell death and survival. In regression models among the HD brains, huntingtin CAG repeat size, onset age and age at death were independently found to be inversely related to miR-10b-5p levels. CAG repeat size and onset age were independently inversely related to miR-196a-5p, onset age was inversely related to miR-196b-5p and age at death was inversely related to miR-615-3p expression. These results suggest these Hox-related miRNAs may be involved in neuroprotective response in HD. Recently, miRNAs have shown promise as

  4. The Epipolythiodiketopiperazine Gene Cluster in Claviceps purpurea: Dysfunctional Cytochrome P450 Enzyme Prevents Formation of the Previously Unknown Clapurines.

    Directory of Open Access Journals (Sweden)

    Julian Dopstadt

    Full Text Available Claviceps purpurea is an important food contaminant and well known for the production of the toxic ergot alkaloids. Apart from that, little is known about its secondary metabolism and not all toxic substances going along with the food contamination with Claviceps are known yet. We explored the metabolite profile of a gene cluster in C. purpurea with a high homology to gene clusters, which are responsible for the formation of epipolythiodiketopiperazine (ETP) toxins in other fungi. By overexpressing the transcription factor, we were able to activate the cluster in the standard C. purpurea strain 20.1. Although all necessary genes for the formation of the characteristic disulfide bridge were expressed in the overexpression mutants, the fungus did not produce any ETPs. Isolation of pathway intermediates showed that the common biosynthetic pathway stops after the first steps. Our results demonstrate that hydroxylation of the diketopiperazine backbone is the critical step during the ETP biosynthesis. Due to a dysfunctional enzyme, the fungus is not able to produce toxic ETPs. Instead, the pathway end-products are new unusual metabolites with a unique nitrogen-sulfur bond. By heterologous expression of the Leptosphaeria maculans cytochrome P450 encoding gene sirC, we were able to identify the end-products of the ETP cluster in C. purpurea. The thioclapurines are so far unknown ETPs, which might contribute to the toxicity of other C. purpurea strains with a potentially intact ETP cluster.

  5. The Epipolythiodiketopiperazine Gene Cluster in Claviceps purpurea: Dysfunctional Cytochrome P450 Enzyme Prevents Formation of the Previously Unknown Clapurines (United States)

    Tudzynski, Paul; Humpf, Hans-Ulrich


    Claviceps purpurea is an important food contaminant and well known for the production of the toxic ergot alkaloids. Apart from that, little is known about its secondary metabolism and not all toxic substances going along with the food contamination with Claviceps are known yet. We explored the metabolite profile of a gene cluster in C. purpurea with a high homology to gene clusters, which are responsible for the formation of epipolythiodiketopiperazine (ETP) toxins in other fungi. By overexpressing the transcription factor, we were able to activate the cluster in the standard C. purpurea strain 20.1. Although all necessary genes for the formation of the characteristic disulfide bridge were expressed in the overexpression mutants, the fungus did not produce any ETPs. Isolation of pathway intermediates showed that the common biosynthetic pathway stops after the first steps. Our results demonstrate that hydroxylation of the diketopiperazine backbone is the critical step during the ETP biosynthesis. Due to a dysfunctional enzyme, the fungus is not able to produce toxic ETPs. Instead, the pathway end-products are new unusual metabolites with a unique nitrogen-sulfur bond. By heterologous expression of the Leptosphaeria maculans cytochrome P450 encoding gene sirC, we were able to identify the end-products of the ETP cluster in C. purpurea. The thioclapurines are so far unknown ETPs, which might contribute to the toxicity of other C. purpurea strains with a potentially intact ETP cluster. PMID:27390873

  6. Characterization of the urease gene cluster from Rhizobium leguminosarum bv. viciae. (United States)

    Toffanin, Annita; Cadahia, Esther; Imperial, Juan; Ruiz-Argüeso, Tomás; Palacios, Manuel


    Moderate levels of urease activity (ca. 300 mU mg(-1)) were detected in Rhizobium leguminosarum bv. viciae UPM791 vegetative cells. This activity did not require urea for induction and was partially repressed by the addition of ammonium into the medium. Lower levels of urease activity (ca. 100 mU mg(-1)) were detected also in pea bacteroids. A DNA region of ca. 9 kb containing the urease structural genes (ureA, ureB and ureC), accessory genes (ureD, ureE, ureF, and ureG), and five additional ORFs (orf83, orf135, orf207, orf223, and orf287) encoding proteins of unknown function was sequenced. Three of these ORFs (orf83, orf135 and orf207) have a homologous counterpart in a gene cluster from Sinorhizobium meliloti, reported to be involved in urease and hydrogenase activities. R. leguminosarum mutant strains carrying Tn5 insertions within this region exhibited a urease-negative phenotype, but induced wild-type levels of hydrogenase and nitrogenase activities in bacteroids. orf287 encodes a potential transmembrane protein with a C-terminal GGDEF domain. A mutant affected in orf287 exhibited normal levels of urease activity in culture cells. Experiments aimed at cross-complementing Ni-binding proteins required for urease and hydrogenase synthesis (UreE and HypB, respectively) indicated that these two proteins are not functionally interchangeable in R. leguminosarum.

  7. Regulation of the Apolipoprotein Gene Cluster by a Long Noncoding RNA

    Directory of Open Access Journals (Sweden)

    Paul Halley


    Full Text Available Apolipoprotein A1 (APOA1) is the major protein component of high-density lipoprotein (HDL) in plasma. We have identified an endogenously expressed long noncoding natural antisense transcript, APOA1-AS, which acts as a negative transcriptional regulator of APOA1 both in vitro and in vivo. Inhibition of APOA1-AS in cultured cells resulted in the increased expression of APOA1 and two neighboring genes in the APO cluster. Chromatin immunoprecipitation (ChIP) analyses of a ∼50 kb chromatin region flanking the APOA1 gene demonstrated that APOA1-AS can modulate distinct histone methylation patterns that mark active and/or inactive gene expression through the recruitment of histone-modifying enzymes. Targeting APOA1-AS with short antisense oligonucleotides also enhanced APOA1 expression in both human and monkey liver cells and induced an increase in hepatic RNA and protein expression in African green monkeys. Furthermore, the results presented here highlight the significant local modulatory effects of long noncoding antisense RNAs and demonstrate the therapeutic potential of manipulating the expression of these transcripts both in vitro and in vivo.

  8. Unsupervised clustering of gene expression data points at hypoxia as possible trigger for metabolic syndrome

    Directory of Open Access Journals (Sweden)

    York David


    Full Text Available Abstract Background Classification of large volumes of data produced in a microarray experiment allows for the extraction of important clues as to the nature of a disease. Results Using the multi-dimensional unsupervised FOREL (FORmal ELement) algorithm we have re-analyzed three public datasets of skeletal muscle gene expression in connection with insulin resistance and type 2 diabetes (DM2). Our analysis revealed the major line of variation between expression profiles of normal, insulin resistant, and diabetic skeletal muscle. A cluster of most "metabolically sound" samples occupied one end of this line. The distance along this line coincided with the classic markers of diabetes risk, namely obesity and insulin resistance, but did not follow the accepted clinical diagnosis of DM2 as defined by the presence or absence of hyperglycemia. Genes implicated in this expression pattern are those controlling skeletal muscle fiber type and glycolytic metabolism. Additionally myoglobin and hemoglobin were upregulated and ribosomal genes deregulated in insulin resistant patients. Conclusion Our findings are concordant with the changes seen in skeletal muscle with altitude hypoxia. This suggests that hypoxia and shift to glycolytic metabolism may also drive insulin resistance.

  9. Sexuality generates diversity in the aflatoxin gene cluster: evidence on a global scale.

    Directory of Open Access Journals (Sweden)

    Geromy G Moore

    Full Text Available Aflatoxins are produced by Aspergillus flavus and A. parasiticus in oil-rich seed and grain crops and are a serious problem in agriculture, with aflatoxin B₁ being the most carcinogenic natural compound known. Sexual reproduction in these species occurs between individuals belonging to different vegetative compatibility groups (VCGs). We examined natural genetic variation in 758 isolates of A. flavus, A. parasiticus and A. minisclerotigenes sampled from single peanut fields in the United States (Georgia), Africa (Benin), Argentina (Córdoba), Australia (Queensland) and India (Karnataka). Analysis of DNA sequence variation across multiple intergenic regions in the aflatoxin gene clusters of A. flavus, A. parasiticus and A. minisclerotigenes revealed significant linkage disequilibrium (LD) organized into distinct blocks that are conserved across different localities, suggesting that genetic recombination is nonrandom and a global occurrence. To assess the contributions of asexual and sexual reproduction to fixation and maintenance of toxin chemotype diversity in populations from each locality/species, we tested the null hypothesis of an equal number of MAT1-1 and MAT1-2 mating-type individuals, which is indicative of a sexually recombining population. All samples were clone-corrected using multi-locus sequence typing which associates closely with VCG. For both A. flavus and A. parasiticus, when the proportions of MAT1-1 and MAT1-2 were significantly different, there was more extensive LD in the aflatoxin cluster and populations were fixed for specific toxin chemotype classes, either the non-aflatoxigenic class in A. flavus or the B₁-dominant and G₁-dominant classes in A. parasiticus. A mating type ratio close to 1∶1 in A. flavus, A. parasiticus and A. minisclerotigenes was associated with higher recombination rates in the aflatoxin cluster and less pronounced chemotype differences in populations. This work shows that the reproductive nature of

  10. The type VI secretion system gene cluster of Salmonella typhimurium: required for full virulence in mice. (United States)

    Liu, Ji; Guo, Ji-Tao; Li, Yong-Guo; Johnston, Randal N; Liu, Gui-Rong; Liu, Shu-Lin


    Type VI secretion system (T6SS) has increasingly been believed to participate in the infection process for many bacterial pathogens, but its role in the virulence of Salmonella typhimurium remains unclear. To look into this, we deleted the T6SS cluster from the genome of S. typhimurium 14028s and analyzed the phenotype of the resulting T6SS knockout mutant (T6SSKO mutant) in vitro and in vivo. We found that the T6SSKO mutant exhibited reduced capability in colonizing the spleen and liver in an in vivo colonization competition model in BALB/c mice infected by the oral route. Additionally, infection via intraperitoneal administration also showed that the T6SSKO mutant was less capable of colonizing the mouse spleen and liver than the wild-type strain. We did not detect significant differences between the T6SSKO and wild-type strains in epithelial cell invasion tests. However, in the macrophage RAW264.7 cell line, the T6SSKO mutant survived and proliferated significantly more poorly than the wild-type strain. These findings indicate that T6SS gene cluster is required for full virulence of S. typhimurium 14028s in BALB/c mice, possibly due to its roles in bacterial survival and proliferation in macrophages.

  11. Regulatory cross talk and microbial induction of fungal secondary metabolite gene clusters. (United States)

    Nützmann, Hans-Wilhelm; Schroeckh, Volker; Brakhage, Axel A


    Filamentous fungi are well-known producers of a wealth of secondary metabolites with various biological activities. Many of these compounds such as penicillin, cyclosporine, or lovastatin are of great importance for human health. Genome sequences of filamentous fungi revealed that the encoded potential to produce secondary metabolites is much higher than the actual number of compounds produced during cultivation in the laboratory. This finding encouraged research groups to develop new methods to exploit the silent reservoir of secondary metabolites. In this chapter, we present three successful strategies to induce the expression of secondary metabolite gene clusters. They are based on the manipulation of the molecular processes controlling the biosynthesis of secondary metabolites and the simulation of stimulating environmental conditions leading to altered metabolic profiles. The presented methods were successfully applied to identify novel metabolites. They can be also used to significantly increase product yields.

  12. Beta-globin gene cluster haplotypes in Venezuelan sickle cell patients from the State of Aragua

    Directory of Open Access Journals (Sweden)

    Nancy Moreno


    Full Text Available Seven polymorphic sites in the beta-globin gene cluster were analyzed on a sample of 96 chromosomes of Venezuelan sickle cell patients from the State of Aragua. The Benin haplotype was predominant with a frequency of 0.479, followed by the Bantu haplotype (0.406); a minority of cases with other haplotypes was also identified: atypical Bantu A2 (0.042), Senegal (0.031), atypical Bantu A7 (0.021) and Saudi Arabia/Indian (0.021) haplotypes; however, the Cameroon haplotype was not identified in this study. Our results are in agreement with the historical records that establish Sudanese and Bantu origins for the African slaves brought into Venezuela.
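
    The reported frequencies can be turned back into approximate chromosome counts as a quick consistency check. The short Python sketch below assumes that each frequency is a rounded fraction of the 96 chromosomes stated in the abstract; the dictionary and function names are illustrative and do not come from the study.

```python
# Haplotype frequencies as reported in the abstract (fractions of 96 chromosomes).
REPORTED_FREQUENCIES = {
    "Benin": 0.479,
    "Bantu": 0.406,
    "atypical Bantu A2": 0.042,
    "Senegal": 0.031,
    "atypical Bantu A7": 0.021,
    "Saudi Arabia/Indian": 0.021,
}
TOTAL_CHROMOSOMES = 96


def frequencies_to_counts(frequencies, total):
    """Convert rounded frequencies to the nearest whole chromosome counts."""
    return {name: round(freq * total) for name, freq in frequencies.items()}


counts = frequencies_to_counts(REPORTED_FREQUENCIES, TOTAL_CHROMOSOMES)
for name, count in counts.items():
    print(f"{name}: {count} chromosomes")
print("sum of counts:", sum(counts.values()))
```

    Under this assumption the rounded counts add up to 96, so the six listed haplotypes would account for every chromosome in the sample.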

  13. beta(S)-Globin gene cluster haplotypes in the West Bank of Palestine. (United States)

    Samarah, Fekri; Ayesh, Suhail; Athanasiou, Miranda; Christakis, John; Vavatsi, Norma


    Sickle cell disease is an inherited autosomal recessive disorder of the beta-globin chain. In Palestine it is accompanied by a low level of Hb F (mean 5.14%) and a severe clinical presentation. In this study, 59 Palestinian patients, homozygotes for Hb S were studied for their haplotype background. Eight polymorphic sites in the beta-globin gene cluster were examined. The Benin haplotype was predominant with a frequency of 88.1%, followed by a frequency of 5.1% for the Bantu haplotype. One chromosome was found to carry the Cameroon haplotype (0.85%). Three atypical haplotypes were also found (5.95%). Heterogeneity was observed in Hb F production, ranging between 1.5 and 17.0%, whereas the (G)gamma ratio was homogeneous among all haplotypes with a normal amount of about 41%. Our results are in agreement with previous reports of the Benin haplotype origin in the Mediterranean.
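
    Because all 59 patients are Hb S homozygotes, the percentages above presumably refer to 2 x 59 = 118 beta(S) chromosomes; that reading is an assumption, although it agrees with one chromosome being reported as 0.85%. A minimal Python sketch of the conversion, with illustrative names that are not from the study:

```python
PATIENTS = 59
CHROMOSOMES = 2 * PATIENTS  # each homozygote carries two beta(S) chromosomes

# Percentages as reported in the abstract.
REPORTED_PERCENT = {
    "Benin": 88.1,
    "Bantu": 5.1,
    "Cameroon": 0.85,
    "atypical": 5.95,
}


def percent_to_count(percent, total):
    """Nearest whole number of chromosomes for a reported percentage."""
    return round(percent / 100 * total)


haplotype_counts = {
    name: percent_to_count(pct, CHROMOSOMES) for name, pct in REPORTED_PERCENT.items()
}
print(haplotype_counts)
print("total:", sum(haplotype_counts.values()), "of", CHROMOSOMES)
```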

  14. Interferon-α/β receptor-mediated selective induction of a gene cluster by CpG oligodeoxynucleotide 2006

    Directory of Open Access Journals (Sweden)

    Wakiguchi Hiroshi


    Full Text Available Abstract Background Oligodeoxynucleotides containing unmethylated CpG motifs (CpG ODN) are known to exert a strong adjuvant effect on Th1 immune responses. Although several genes have been reported, no comprehensive study of the gene expression profiles in human cells after stimulation with CpG ODN has been reported. Results This study was designed to identify a CpG-inducible gene cluster that potentially predicts for the molecular mechanisms of clinical efficacy of CpG ODN, by determining mRNA expression in human PBMC after stimulation with CpG ODN. PBMCs were obtained from the peripheral blood of healthy volunteers and cultured in the presence or absence of CpG ODN 2006 for up to 24 hours. The mRNA expression profile was evaluated using a high-density oligonucleotide probe array, GeneChip®. Using hierarchical clustering-analysis, out of a total of 10,000 genes we identified a cluster containing 77 genes as having been up-regulated by CpG ODN. This cluster was further divided into two sub-clusters by means of time-kinetics. (1) Inflammatory cytokines such as IL-6 and GM-CSF were up-regulated predominantly 3 to 6 hours after stimulation with CpG ODN, presumably through activation of a transcription factor, NF-κB. (2) Interferon (IFN)-inducible anti-viral proteins, including IFIT1, OAS1 and Mx1, and Th1 chemoattractant IP-10, were up-regulated predominantly 6 to 24 hours after stimulation. Blocking with mAb against IFN-α/β receptor strongly inhibited the induction of these IFN-inducible genes by CpG ODN. Conclusion This study provides new information regarding the possible immunomodulatory effects of CpG ODN in vivo via an IFN-α/β receptor-mediated paracrine pathway.

  15. Characterization and expression of genes from the RubisCO gene cluster of the chemoautotrophic symbiont of Solemya velum: cbbLSQO. (United States)

    Schwedock, Julie; Harmer, Tara L; Scott, Kathleen M; Hektor, Harm J; Seitz, Angelica P; Fontana, Matthew C; Distel, Daniel L; Cavanaugh, Colleen M


    Chemoautotrophic endosymbionts residing in Solemya velum gills provide this shallow water clam with most of its nutritional requirements. The cbb gene cluster of the S. velum symbiont, including cbbL and cbbS, which encode the large and small subunits of the carbon-fixing enzyme ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO), was cloned and expressed in Escherichia coli. The recombinant RubisCO had a high specific activity, approximately 3 micromol min(-1) mg protein(-1), and a KCO2 of 40.3 microM. Based on sequence identity and phylogenetic analyses, these genes encode a form IA RubisCO, both subunits of which are closely related to those of the symbiont of the deep-sea hydrothermal vent gastropod Alviniconcha hessleri and the photosynthetic bacterium Allochromatium vinosum. In the cbb gene cluster of the S. velum symbiont, the cbbLS genes were followed by cbbQ and cbbO, which are found in some but not all cbb gene clusters and whose products are implicated in enhancing RubisCO activity post-translationally. cbbQ shares sequence similarity with nirQ and norQ, found in denitrification clusters of Pseudomonas stutzeri and Paracoccus denitrificans. The 3' region of cbbO from the S. velum symbiont, like that of the three other known cbbO genes, shares similarity to the 3' region of norD in the denitrification cluster. This is the first study to explore the cbb gene structure for a chemoautotrophic endosymbiont, which is critical both as an initial step in evaluating cbb operon structure in chemoautotrophic endosymbionts and in understanding the patterns and forces governing RubisCO evolution and physiology.

  16. Comparing large covariance matrices under weak conditions on the dependence structure and its application to gene clustering. (United States)

    Chang, Jinyuan; Zhou, Wen; Zhou, Wen-Xin; Wang, Lan


    Comparing large covariance matrices has important applications in modern genomics, where scientists are often interested in understanding whether relationships (e.g., dependencies or co-regulations) among a large number of genes vary between different biological states. We propose a computationally fast procedure for testing the equality of two large covariance matrices when the dimensions of the covariance matrices are much larger than the sample sizes. A distinguishing feature of the new procedure is that it imposes no structural assumptions on the unknown covariance matrices. Hence, the test is robust with respect to various complex dependence structures that frequently arise in genomics. We prove that the proposed procedure is asymptotically valid under weak moment conditions. As an interesting application, we derive a new gene clustering algorithm which shares the same nice property of avoiding restrictive structural assumptions for high-dimensional genomics data. Using an asthma gene expression dataset, we illustrate how the new test helps compare the covariance matrices of the genes across different gene sets/pathways between the disease group and the control group, and how the gene clustering algorithm provides new insights on the way gene clustering patterns differ between the two groups. The proposed methods have been implemented in an R-package HDtest and are available on CRAN.
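
    To make the idea of comparing two covariance matrices concrete, the Python sketch below runs a simple permutation test on the largest entrywise difference between two sample covariance matrices. This is a generic stand-in chosen for illustration; it is not the procedure proposed in the paper or implemented in HDtest, and the toy data, function names and defaults are all assumptions. Strictly, a permutation test of this kind checks whether the two groups are exchangeable, which implies equal covariances.

```python
import random


def sample_covariance(rows):
    """Unbiased sample covariance matrix of a list of equal-length observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    return [
        [
            sum((row[j] - means[j]) * (row[k] - means[k]) for row in rows) / (n - 1)
            for k in range(p)
        ]
        for j in range(p)
    ]


def max_abs_difference(group_a, group_b):
    """Largest entrywise gap between the two groups' covariance matrices."""
    cov_a, cov_b = sample_covariance(group_a), sample_covariance(group_b)
    p = len(cov_a)
    return max(abs(cov_a[j][k] - cov_b[j][k]) for j in range(p) for k in range(p))


def permutation_p_value(group_a, group_b, n_permutations=500, seed=0):
    """Permutation p-value for the null that both groups are exchangeable."""
    rng = random.Random(seed)
    observed = max_abs_difference(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        if max_abs_difference(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Toy data (made up): gene 2 tracks gene 1 in one group and mirrors it in the other.
data_rng = random.Random(1)
values = [data_rng.gauss(0, 1) for _ in range(40)]
control = [(x, x) for x in values[:20]]
disease = [(x, -x) for x in values[20:]]
print("statistic:", max_abs_difference(control, disease))
print("p-value:", permutation_p_value(control, disease))
```

    The pure-Python covariance loop is only practical for a handful of genes; in the high-dimensional setting the abstract describes, a dedicated implementation would be needed.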

  17. The Sound of Silence: Activating Silent Biosynthetic Gene Clusters in Marine Microorganisms

    Directory of Open Access Journals (Sweden)

    F. Jerry Reen


    Full Text Available Unlocking the rich harvest of marine microbial ecosystems has the potential to both safeguard the existence of our species for the future, while also presenting significant lifestyle benefits for commercial gain. However, while significant advances have been made in the field of marine biodiscovery, leading to the introduction of new classes of therapeutics for clinical medicine, cosmetics and industrial products, much of what this natural ecosystem has to offer is locked in, and essentially hidden from our screening methods. Releasing this silent potential represents a significant technological challenge, the key to which is a comprehensive understanding of what controls these systems. Heterologous expression systems have been successful in awakening a number of these cryptic marine biosynthetic gene clusters (BGCs). However, this approach is limited by the typically large size of the encoding sequences. More recently, focus has shifted to the regulatory proteins associated with each BGC, many of which are signal responsive raising the possibility of exogenous activation. Abundant among these are the LysR-type family of transcriptional regulators, which are known to control production of microbial aromatic systems. Although the environmental signals that activate these regulatory systems remain unknown, it offers the exciting possibility of evoking mimic molecules and synthetic expression systems to drive production of potentially novel natural products in microorganisms. Success in this field has the potential to provide a quantum leap forward in medical and industrial bio-product development. To achieve these new endpoints, it is clear that the integrated efforts of bioinformaticians and natural product chemists will be required as we strive to uncover new and potentially unique structures from silent or cryptic marine gene clusters.

  18. Analysis of healthy cohorts for single nucleotide polymorphisms in C1q gene cluster

    Directory of Open Access Journals (Sweden)



    Full Text Available C1q is the first component of the classical pathway of complement activation. The coding region for C1q is localized on chromosome 1p34.1–36.3. Mutations or single nucleotide polymorphisms (SNPs) in the C1q gene cluster can lead to the development of systemic lupus erythematosus (SLE) because of C1q deficiency or for other, unknown reasons. We selected five SNPs located in a 7.121 kbp region on chromosome 1, which were previously associated with SLE and/or low C1q level but do not cause C1q deficiency, and analyzed them in terms of allele frequencies and genotype distribution in comparison with Hispanic, Asian, African and other Caucasian cohorts. These SNPs were: rs587585, rs292001, rs172378, rs294179 and rs631090. One hundred eighty-five healthy Bulgarian volunteers were genotyped for the selected five C1q SNPs by quantitative real-time PCR methods. The International HapMap Project was used for information about genotype distribution and allele frequencies of the five SNPs in Hispanic, Asian, African and other Caucasian cohorts. Bulgarian healthy volunteers and another pooled Caucasian cohort had similar frequencies of genotypes and alleles of the rs587585, rs292001, rs294179 and rs631090 SNPs. Nevertheless, genotype AA of rs172378 was significantly overrepresented in Bulgarians when compared to other healthy Caucasians from the USA and UK (60% vs 31%). Genotype distribution of rs172378 in Bulgarians was similar to that in Greek-Cypriot Caucasians. For all Caucasians the major allele of rs172378 was A. This is the first study analyzing the allele frequencies and genotype distribution of C1q gene cluster SNPs in the Bulgarian healthy population.

  19. Gene-Set Local Hierarchical Clustering (GSLHC)--A Gene Set-Based Approach for Characterizing Bioactive Compounds in Terms of Biological Functional Groups.

    Directory of Open Access Journals (Sweden)

    Feng-Hsiang Chung

    Full Text Available Gene-set-based analysis (GSA), which uses the relative importance of functional gene-sets, or molecular signatures, as units for analysis of genome-wide gene expression data, has exhibited major advantages with respect to greater accuracy, robustness, and biological relevance, over individual gene analysis (IGA), which uses log-ratios of individual genes for analysis. Yet IGA remains the dominant mode of analysis of gene expression data. The Connectivity Map (CMap), an extensive database on genomic profiles of effects of drugs and small molecules and widely used for studies related to repurposed drug discovery, has been mostly employed in IGA mode. Here, we constructed a GSA-based version of CMap, Gene-Set Connectivity Map (GSCMap), in which all the genomic profiles in CMap are converted, using gene-sets from the Molecular Signatures Database, to functional profiles. We showed that GSCMap essentially eliminated cell-type dependence, a weakness of CMap in IGA mode, and yielded significantly better performance on sample clustering and drug-target association. As a first application of GSCMap we constructed the platform Gene-Set Local Hierarchical Clustering (GSLHC) for discovering insights on coordinated actions of biological functions and facilitating classification of heterogeneous subtypes on drug-driven responses. GSLHC was shown to tightly cluster drugs of known similar properties. We used GSLHC to identify the therapeutic properties and putative targets of 18 compounds of previously unknown characteristics listed in CMap, eight of which suggest anti-cancer activities. The GSLHC website contains 1,857 local hierarchical clusters accessible by querying 555 of the 1,309 drugs and small molecules listed in CMap. We expect GSCMap and GSLHC to be widely useful in providing new insights in the biological effect of bioactive compounds, in drug repurposing, and in function-based classification of complex diseases.

  20. Co-transcription of the celC gene cluster in Clostridium thermocellum. (United States)

    Newcomb, Michael; Millen, Jonathan; Chen, Chun-Yu; Wu, J H David


    Clostridium thermocellum, an anaerobic, thermophilic, and ethanogenic bacterium, produces a large cellulase complex termed the cellulosome and many free glycosyl hydrolases. Most cellulase genes scatter around the genome. We mapped the transcripts of the six-gene cluster celC-glyR3-licA-orf4-manB-celT and determined their transcription initiation sites by primer extension. Northern blot showed that celC-glyR3-licA were co-transcribed into a polycistronic messenger with the transcription initiation site at -20 bp. Furthermore, RT-PCR mapping showed that manB and celT, two cellulosomal genes immediately downstream, were co-transcribed into a bicistronic messenger with the initiation site at -233 bp. In contrast, orf4 was transcribed alone with two initiation sites, at -130 and -138 bp. Finally, quantitative RT-PCR analysis showed that celC, glyR3, and licA were coordinately induced by growing on laminarin, a β-1,3 glucan. Gene expression peaked at the late exponential phase. Taken together with our previous report that GlyR3 binds to the celC promoter in the absence of laminaribiose, a β-1,3 glucose dimer, these results indicate that celC, glyR3, and licA form an operon repressible by GlyR3 and inducible by laminaribiose, signaling the availability of β-1,3 glucan. The celC operon is the first glycosyl hydrolase operon reported in this bacterium.

  1. Automatic extraction of gene ontology annotation and its correlation with clusters in protein networks

    Directory of Open Access Journals (Sweden)

    Mazo Ilya


    Full Text Available Abstract Background Uncovering cellular roles of a protein is a task of tremendous importance and complexity that requires dedicated experimental work as well as often sophisticated data mining and processing tools. Protein functions, often referred to as its annotations, are believed to manifest themselves through topology of the networks of inter-protein interactions. In particular, there is a growing body of evidence that proteins performing the same function are more likely to interact with each other than with proteins with other functions. However, since functional annotation and protein network topology are often studied separately, the direct relationship between them has not been comprehensively demonstrated. In addition to having the general biological significance, such demonstration would further validate the data extraction and processing methods used to compose protein annotation and protein-protein interactions datasets. Results We developed a method for automatic extraction of protein functional annotation from scientific text based on the Natural Language Processing (NLP) technology. For the protein annotation extracted from the entire PubMed, we evaluated the precision and recall rates, and compared the performance of the automatic extraction technology to that of manual curation used in public Gene Ontology (GO) annotation. In the second part of our presentation, we reported a large-scale investigation into the correspondence between communities in the literature-based protein networks and GO annotation groups of functionally related proteins. We found a comprehensive two-way match: proteins within biological annotation groups form significantly denser linked network clusters than expected by chance and, conversely, densely linked network communities exhibit a pronounced non-random overlap with GO groups. We also expanded the publicly available GO biological process annotation using the relations extracted by our NLP technology
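
    For readers unfamiliar with the precision and recall rates mentioned in this abstract, the Python sketch below shows how they are usually computed when extracted annotations are compared with a curated reference set. The protein and term pairs are invented for illustration and are not data from the study; the function name is likewise an assumption.

```python
def precision_recall(extracted, reference):
    """Precision and recall of extracted (protein, term) pairs against a reference set."""
    extracted, reference = set(extracted), set(reference)
    true_positives = len(extracted & reference)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical annotations, invented purely for illustration.
curated = {
    ("P53", "apoptotic process"),
    ("P53", "DNA repair"),
    ("AKT1", "cell growth"),
}
text_mined = {
    ("P53", "apoptotic process"),
    ("AKT1", "cell growth"),
    ("AKT1", "DNA repair"),
    ("MYC", "cell growth"),
}

precision, recall = precision_recall(text_mined, curated)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

    Here two of the four extracted pairs appear in the curated set (precision 0.5) and two of the three curated pairs were recovered (recall about 0.67).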

  2. A novel and complete gene cluster involved in the degradation of aniline by Delftia sp. AN3

    Institute of Scientific and Technical Information of China (English)

    ZHANG Tao; ZHANG Jinglei; LIU Shuangjiang; LIU Zhipei


    A recombinant strain, Escherichia coli JM109-AN1, was obtained by constructing a genomic library of the total DNA of Delftia sp. AN3 in E. coli JM109 and screening for catechol 2,3-dioxygenase activity. This recombinant strain could grow on aniline as sole carbon, nitrogen and energy source. Enzymatic assays revealed that the exogenous genes, including the aniline dioxygenase (AD) and catechol 2,3-dioxygenase (C23O) genes, were well expressed in the recombinant strain, with AD and C23O activities of up to 0.31 U/mg wet cell and 1.92 U/mg crude protein, respectively. The AD and C23O of strain AN3 could catalyze only aniline and catechol, respectively, but not any other substituted substrates. This recombinant strain contained a recombinant plasmid, pKC505-AN1, in which a 29.7-kb DNA fragment from Delftia sp. AN3 was inserted. Sequencing and open reading frame (ORF) analysis of this 29.7-kb fragment revealed that it contained at least 27 ORFs; among them, a gene cluster (consisting of at least 16 genes, named danQTA1A2BRDCEFG1HIJKG2) was responsible for the complete metabolism of aniline to TCA-cycle intermediates. This gene cluster could be divided into two main parts: the upper sequences, consisting of 7 genes (danQTA1A2BRD), were predicted to encode a multi-component aniline dioxygenase and a LysR-type regulator, and the central genes (danCEFG1HIJKG2) were expected to encode meta-cleavage pathway enzymes for catechol degradation to TCA-cycle intermediates. Unlike the clusters tad from Delftia tsuruhatensis AD9 and tdn from Pseudomonas putida UCC22, in this gene cluster all the genes were in the same transcriptional direction. There was only one set of the C23O gene (danC) and the ferredoxin-like protein gene (danD). The presence of only one set of these two genes and the specificity of AD and C23O might be the reason that strain AN3 could degrade only aniline. The products of danQTA1A2BRDC showed 99%-100% identity to those from Delftia acidovorans 7N, and 50%-85% identity to those of the tad cluster from D. tsuruhatensis AD9 in

  3. Variability in the sxt Gene Clusters of PSP Toxin Producing Aphanizomenon gracile Strains from Norway, Spain, Germany and North America. (United States)

    Ballot, Andreas; Cerasino, Leonardo; Hostyeva, Vladyslava; Cirés, Samuel


    Paralytic shellfish poisoning (PSP) toxin production has been detected worldwide in the cyanobacterial genera Anabaena, Lyngbya, Scytonema, Cuspidothrix and Aphanizomenon. In Europe Aphanizomenon gracile and Cuspidothrix issatschenkoi are the only known producers of PSP toxins and are found in Southwest and Central European freshwater bodies. In this study the PSP toxin producing Aphanizomenon sp. strain NIVA-CYA 851 was isolated from the Norwegian Lake Hillestadvannet. In a polyphasic approach NIVA-CYA 851 was morphologically and phylogenetically classified, and investigated for toxin production. The strain NIVA-CYA 851 was identified as A. gracile using 16S rRNA gene phylogeny and was confirmed to produce neosaxitoxin, saxitoxin and gonyautoxin 5 by LC-MS. The whole sxt gene clusters (circa 27.3 kb) of four A. gracile strains: NIVA-CYA 851 (Norway); NIVA-CYA 655 & NIVA-CYA 676 (Germany); and UAM 529 (Spain), all from latitudes between 40° and 59° North were sequenced and compared with the sxt gene cluster of reference strain A. gracile NH-5 from the USA. All five sxt gene clusters are highly conserved with similarities exceeding 99.4%, but they differ slightly in the number and presence of single nucleotide polymorphisms (SNPs) and insertions/deletions (In/Dels). Altogether 178 variable sites (44 SNPs and 4 In/Dels, comprising 134 nucleotides) were found in the sxt gene clusters of the Norwegian, German and Spanish strains compared to the reference strain. Thirty-nine SNPs were located in 16 of the 27 coding regions. The sxt gene clusters of NIVA-CYA 851, NIVA-CYA 655, NIVA-CYA 676 and UAM 529, were characterized by 15, 16, 19 and 23 SNPs respectively. Only the Norwegian strain NIVA-CYA 851 possessed an insertion of 126 base pairs (bp) in the noncoding area between the sxtA and sxtE genes and a deletion of 6 nucleotides in the sxtN gene. The sxtI gene showed the highest variability and is recommended as the best genetic marker for further phylogenetic studies

  4. Gene cloning, purification, and characterization of two cyanobacterial NifS homologs driving iron-sulfur cluster formation. (United States)

    Kato, S; Mihara, H; Kurihara, T; Yoshimura, T; Esaki, N


    Iron-sulfur proteins are essential in the photosynthetic system and many other biological processes. We have isolated and characterized enzymes driving the formation of iron-sulfur clusters from Synechocystis sp. PCC6803. Two genes (slr0387 and sll0704), showing similarity to nifS of Azotobacter vinelandii, were cloned, and their gene products (SsCsd1 and SsCsd2) were purified. They catalyzed the desulfuration of L-cysteine. Reconstitution of a [2Fe-2S] cluster of cyanobacterial ferredoxin proceeded much faster in the presence of L-cysteine and either of these enzymes than when using sodium sulfide. These results suggest that SsCsd1 and SsCsd2 facilitate the iron-sulfur cluster assembly by producing inorganic sulfur from L-cysteine. Synechocystis sp. PCC6803 has no gene coding for a protein with similarity to the N-terminal domain of NifU of A. vinelandii, which is believed to cooperate with NifS to assemble iron-sulfur clusters. Thus, the cluster formation in the cyanobacterium probably proceeds through a mechanism that is different from that in A. vinelandii.

  5. A minimal nitrogen fixation gene cluster from Paenibacillus sp. WLY78 enables expression of active nitrogenase in Escherichia coli.

    Directory of Open Access Journals (Sweden)

    Liying Wang

    Full Text Available Most biological nitrogen fixation is catalyzed by molybdenum-dependent nitrogenase, an enzyme complex comprising two component proteins that contains three different metalloclusters. Diazotrophs contain a common core of nitrogen fixation nif genes that encode the structural subunits of the enzyme and components required to synthesize the metalloclusters. However, the complement of nif genes required to enable diazotrophic growth varies significantly amongst nitrogen fixing bacteria and archaea. In this study, we identified a minimal nif gene cluster consisting of nine nif genes in the genome of Paenibacillus sp. WLY78, a gram-positive, facultative anaerobe isolated from the rhizosphere of bamboo. We demonstrate that the nif genes in this organism are organized as an operon comprising nifB, nifH, nifD, nifK, nifE, nifN, nifX, hesA and nifV and that the nif cluster is under the control of a σ(70) (σ(A))-dependent promoter located upstream of nifB. To investigate genetic requirements for diazotrophy, we transferred the Paenibacillus nif cluster to Escherichia coli. The minimal nif gene cluster enables synthesis of catalytically active nitrogenase in this host, when expressed either from the native nifB promoter or from the T7 promoter. Deletion analysis indicates that in addition to the core nif genes, hesA plays an important role in nitrogen fixation and is responsive to the availability of molybdenum. Whereas nif transcription in Paenibacillus is regulated in response to nitrogen availability and by the external oxygen concentration, transcription from the nifB promoter is constitutive in E. coli, indicating that negative regulation of nif transcription is bypassed in the heterologous host. This study demonstrates the potential for engineering nitrogen fixation in a non-nitrogen fixing organism with a minimum set of nine nif genes.

  6. A minimal nitrogen fixation gene cluster from Paenibacillus sp. WLY78 enables expression of active nitrogenase in Escherichia coli. (United States)

    Wang, Liying; Zhang, Lihong; Liu, Zhanzhi; Liu, Zhangzhi; Zhao, Dehua; Liu, Xiaomeng; Zhang, Bo; Xie, Jianbo; Hong, Yuanyuan; Li, Pengfei; Chen, Sanfeng; Dixon, Ray; Li, Jilun


    Most biological nitrogen fixation is catalyzed by molybdenum-dependent nitrogenase, an enzyme complex comprising two component proteins that contains three different metalloclusters. Diazotrophs contain a common core of nitrogen fixation nif genes that encode the structural subunits of the enzyme and components required to synthesize the metalloclusters. However, the complement of nif genes required to enable diazotrophic growth varies significantly amongst nitrogen fixing bacteria and archaea. In this study, we identified a minimal nif gene cluster consisting of nine nif genes in the genome of Paenibacillus sp. WLY78, a gram-positive, facultative anaerobe isolated from the rhizosphere of bamboo. We demonstrate that the nif genes in this organism are organized as an operon comprising nifB, nifH, nifD, nifK, nifE, nifN, nifX, hesA and nifV and that the nif cluster is under the control of a σ(70) (σ(A))-dependent promoter located upstream of nifB. To investigate genetic requirements for diazotrophy, we transferred the Paenibacillus nif cluster to Escherichia coli. The minimal nif gene cluster enables synthesis of catalytically active nitrogenase in this host, when expressed either from the native nifB promoter or from the T7 promoter. Deletion analysis indicates that in addition to the core nif genes, hesA plays an important role in nitrogen fixation and is responsive to the availability of molybdenum. Whereas nif transcription in Paenibacillus is regulated in response to nitrogen availability and by the external oxygen concentration, transcription from the nifB promoter is constitutive in E. coli, indicating that negative regulation of nif transcription is bypassed in the heterologous host. This study demonstrates the potential for engineering nitrogen fixation in a non-nitrogen fixing organism with a minimum set of nine nif genes.

  7. The human met-ase gene (GZMM): Structure, sequence, and close physical linkage to the serine protease gene cluster on 19p13.3

    Energy Technology Data Exchange (ETDEWEB)

    Pilat, D.; Zimmer, M.; Wekerle, H. [Max-Planck-Institut fuer Psychiatrie, Martinsried (Germany)] [and others]


    Cosmid clones containing the genes for the human and murine natural killer cell serine protease Met-ase (gene symbol GZMM; granzyme M) were identified by screening human and murine cosmid libraries with rat Met-ase (RNIK-Met-1) cDNA. The human gene has a size of 7.5 kb and an exon-intron structure identical to that of serine protease genes located on human chromosomes 5q11-q12, 14q11.2, and 19p13.3 that are expressed by lymphocytes, mast cells, or myelomonocyte precursors. Using cosmid DNA as a probe for fluorescence in situ hybridization, we identified the chromosomal position of human Met-ase as 19p13.3. Interphase studies with two differentially labeled probes for Met-ase and the azurocidin (AZU1), proteinase 3 (PRTN3), and neutrophil elastase (ELA2) gene cluster revealed that the distance of Met-ase from this gene cluster is in the range of 200 to 500 kb. Using differentially labeled mouse cosmid probes, we also mapped the murine gene for Met-ase to chromosomal band 10C, close to the gene for lamin B2. Thus, the Met-ase, AZU1, PRTN3, and ELA2 genes fall into an established region of homology between mouse chromosomal band 10C and human 19p13.3. 35 refs., 4 figs.

  8. Assessment of clusters of transcription factor binding sites in relationship to human promoter, CpG islands and gene expression

    Directory of Open Access Journals (Sweden)

    Sakaki Yoshiyuki


    Full Text Available Abstract Background Gene expression is regulated mainly by transcription factors (TFs) that interact with regulatory cis-elements on DNA sequences. To identify functional regulatory elements, computer searching can predict TF binding sites (TFBS) using position weight matrices (PWMs) that represent positional base frequencies of collected experimentally determined TFBS. A disadvantage of this approach is the large output of results for genomic DNA. One strategy to identify genuine TFBS is to utilize local concentrations of predicted TFBS. It is unclear whether there is a general tendency for TFBS to cluster at promoter regions, although this is the case for certain TFBS. Also unclear is which TFs have TFBS concentrated in promoters and to what level this occurs. This study aims to answer some of these questions. Results We developed the cluster score measure to evaluate the correlation between predicted TFBS clusters and promoter sequences for each PWM. Non-promoter sequences were used as a control. Using the cluster score, we identified a PWM group called PWM-PCP, in which TFBS clusters positively correlate with promoters, and another PWM group called PWM-NCP, in which TFBS clusters negatively correlate with promoters. The PWM-PCP group comprises 47% of the 199 vertebrate PWMs, while the PWM-NCP group comprises 11%. After reducing the effect of CpG islands (CGI) against the clusters using partial correlation coefficients among three properties (promoter, CGI and predicted TFBS cluster), we identified two PWM groups including those strongly correlated with CGI and those not correlated with CGI. Conclusion Not all PWMs predict TFBS correlated with human promoter sequences. Two main PWM groups were identified: (1) those that show TFBS clustered in promoters associated with CGI, and (2) those that show TFBS clustered in promoters independent of CGI. Assessment of PWM matches will allow more positive interpretation of TFBS in
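
    The PWM-based prediction step described in this abstract can be illustrated with a minimal sketch. The matrix values, the uniform background frequency, the threshold and the function names below are all assumptions made for illustration; this is a generic log-odds scan, not the cluster score defined by the authors.

```python
import math

# Hypothetical 4-column PWM of positional base frequencies (illustrative values,
# not taken from any real transcription factor). Consensus is "TGAC".
PWM = [
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def log_odds(window, pwm=PWM, background=BACKGROUND):
    """Sum of per-position log2(frequency / background) for one window (A/C/G/T only)."""
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, window))


def scan(sequence, pwm=PWM, threshold=4.0):
    """Return (start, score) for every window scoring at or above the threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        score = log_odds(sequence[start:start + width], pwm)
        if score >= threshold:
            hits.append((start, score))
    return hits


def hits_per_kb(sequence, pwm=PWM, threshold=4.0):
    """Crude local-concentration measure: predicted sites per 1000 bases."""
    return 1000.0 * len(scan(sequence, pwm, threshold)) / len(sequence)


if __name__ == "__main__":
    print(scan("AATGACGGTGACTT"))
```

    Comparing `hits_per_kb` between promoter and non-promoter sequences is one simple way to ask whether predicted sites concentrate in promoters; the published cluster score may be defined differently.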

  9. DNA assembler: a synthetic biology tool for characterizing and engineering natural product gene clusters. (United States)

    Shao, Zengyi; Zhao, Huimin


    The majority of existing antibacterial and anticancer drugs are natural products or their derivatives. However, the characterization and engineering of these compounds are often hampered by limited ability to manipulate the corresponding biosynthetic pathways. Recently, we developed a genomics-driven, synthetic biology-based method, DNA assembler, for discovery, characterization, and engineering of natural product biosynthetic pathways (Shao, Luo, & Zhao, 2011). By taking advantage of the highly efficient yeast in vivo homologous recombination mechanism, this method synthesizes the entire expression vector containing the target biosynthetic pathway and the genetic elements needed for DNA maintenance and replication in individual hosts in a single-step manner. In this chapter, we describe the general guidelines for construct design. By using two distinct biosynthetic pathways, we demonstrate that DNA assembler can perform multiple tasks, including heterologous expression, introduction of single or multiple point mutations, scar-less gene deletion, generation of product derivatives, and creation of artificial gene clusters. As such, this method offers unprecedented flexibility and versatility in pathway manipulations.

  10. Glutamic acid promotes monacolin K production and monacolin K biosynthetic gene cluster expression in Monascus. (United States)

    Zhang, Chan; Liang, Jian; Yang, Le; Chai, Shiyuan; Zhang, Chenxi; Sun, Baoguo; Wang, Chengtao


    This study investigated the effects of glutamic acid on production of monacolin K and expression of the monacolin K biosynthetic gene cluster. When Monascus M1 was grown in glutamic medium instead of in the original medium, monacolin K production increased from 48.4 to 215.4 mg l(-1), an increase of 3.5 times. Glutamic acid enhanced monacolin K production by upregulating the expression of mokB-mokI; on day 8, the expression level of mokA tended to decrease, as measured by reverse transcription-polymerase chain reaction. Our findings demonstrated that mokA was not a key gene responsible for the quantity of monacolin K production in the presence of glutamic acid. Observation of Monascus mycelium morphology by scanning electron microscopy showed that glutamic acid significantly increased the content of Monascus mycelium, altered the permeability of Monascus mycelium, enhanced secretion of monacolin K from the cell, and reduced the monacolin K content in Monascus mycelium, thereby enhancing monacolin K production.

  11. [Association analysis between polymorphisms of PON gene cluster with coronary heart disease in Chinese]. (United States)

    Wang, Xiao-Ling; Fan, Zhong-Jie; Huang, Jian-Feng; Su, Shao-Yong; Zhao, Jian-Gong; Gu, Dong-Feng


    An extensive association analysis of the PON gene cluster (PONs) with coronary heart disease (CHD) was performed in the Chinese Han population. Eleven polymorphisms of the PON1, PON2 and PON3 genes were investigated for association with CHD in 474 male patients and 475 controls. Univariate analyses showed the cases had significantly higher frequencies of the PON1 192Q allele, 160R allele, -162A allele and PON2 311C allele than were seen in the controls. Logistic regression analyses revealed that only the PON1 R160G and -162G/A polymorphisms remained significantly associated with CHD (P = 0.0054 and P = 0.0002). Haplotype analyses for various polymorphism combinations further confirmed the results of individual polymorphism analyses. Only the frequencies of haplotypes containing the -162A allele were significantly higher, whereas only the frequencies of haplotypes containing the 160G allele were significantly lower, in cases than in controls in various polymorphism combinations. This extensive association study has identified the PON1 -162G/A and R160G polymorphisms as independently associated with CHD in the Chinese Han population, and warrants further study to elucidate the biological mechanism.

  12. Candidate gene analysis of selectin cluster in patients with multiple sclerosis. (United States)

    Fenoglio, Chiara; Scalabrini, Diego; Piccio, Laura; De Riz, Milena; Venturelli, Eliana; Cortini, Francesca; Villa, Chiara; Serpente, Maria; Parks, Becky; Rinker, John; Cross, Anne H; Bresolin, Nereo; Scarpini, Elio; Galimberti, Daniela


    Three single nucleotide polymorphisms (SNPs) with a potential impact on the function of selectins (rs6133, rs4987310 and rs5368 substitutions localized in the coding regions of P-sel, L-sel and E-sel, respectively) were analyzed in an Italian population of 165 patients with multiple sclerosis (MS) as compared with 149 controls and in a replication American population of Caucasian descent consisting of 122 patients and 50 controls. No significant differences in either allelic or genotypic frequency in all the SNPs tested were found in the Italian population. A tendency to an increased frequency of the rs6133 T allele was observed in the American population, but applying the Bonferroni correction the significance threshold was not reached. Haploview analysis demonstrated that rs4987310 and rs5368 markers are in strong LD (D' = 0.97) in both populations. Combining the two SNPs, we found no difference in haplotype distribution in patients compared with controls, either in Italian or in American population. Despite the fact that selectins play a role in the pathogenesis of MS and their encoding genes are located in regions associated with the disease, the selectin gene cluster studied likely does not influence the susceptibility to MS in Caucasians.

  13. Causal and Synthetic Associations of Variants in the SERPINA Gene Cluster with Alpha1-antitrypsin Serum Levels

    DEFF Research Database (Denmark)

    Thun, Gian Andri; Imboden, Medea; Ferrarotti, Ilaria


    a genome-wide association study (GWAS) in 1392 individuals of the SAPALDIA cohort. Five common SNPs, defined by showing minor allele frequencies (MAFs) >5%, reached genome-wide significance, all located in the SERPINA gene cluster at 14q32.13. The top-ranking genotyped SNP rs4905179 was associated...

  14. An ALMT1 gene cluster controlling aluminium (aluminum) tolerance at the Alt4 locus of rye (Secale cereale L.) (United States)

    Aluminium toxicity is a major problem in agriculture worldwide. Among the cultivated triticeae, rye (Secale cereale L.) is one of the most Al-tolerant and represents an important potential source of Al-tolerance for improvement of wheat. The Alt4 Al-tolerance locus of rye contains a cluster of genes...

  15. Evolution and genetic population structure of prickly lettuce (Lactuca serriola) and its RGC2 resistance gene cluster

    NARCIS (Netherlands)

    Kuang, H.; Eck, van H.J.; Sicard, D.; Michelmore, R.; Nevo, E.


    Genetic structure and diversity of natural populations of prickly lettuce (Lactuca serriola) were studied using AFLP markers and then compared with the diversity of the RGC2 disease resistance gene cluster. Screening of 696 accessions from 41 populations using 319 AFLP markers showed that eastern Tu

  16. Deletion of a regulatory gene within the cpk gene cluster reveals novel antibacterial activity in Streptomyces coelicolor A3(2)

    NARCIS (Netherlands)

    Gottelt, Marco; Kol, Stefan; Gomez-Escribano, Juan Pablo; Bibb, Mervyn; Takano, Eriko; Herron, P.R.


    Genome sequencing of Streptomyces coelicolor A3(2) revealed an uncharacterized type I polyketide synthase gene cluster (cpk). Here we describe the discovery of a novel antibacterial activity (abCPK) and a yellow-pigmented secondary metabolite (yCPK) after deleting a presumed pathway-specific regulato

  17. Evolution of the C-Type Lectin-Like Receptor Genes of the DECTIN-1 Cluster in the NK Gene Complex

    Directory of Open Access Journals (Sweden)

    Susanne Sattler


    Full Text Available Pattern recognition receptors are crucial in initiating and shaping innate and adaptive immune responses and often belong to families of structurally and evolutionarily related proteins. The human C-type lectin-like receptors encoded in the DECTIN-1 cluster within the NK gene complex contain prominent receptors with pattern recognition function, such as DECTIN-1 and LOX-1. All members of this cluster share significant homology and are considered to have arisen from subsequent gene duplications. Recent developments in sequencing and the availability of comprehensive sequence data comprising many species showed that the receptors of the DECTIN-1 cluster are not only homologous to each other but also highly conserved between species. Even in Caenorhabditis elegans, genes displaying homology to the mammalian C-type lectin-like receptors have been detected. In this paper, we conduct a comprehensive phylogenetic survey and give an up-to-date overview of the currently available data on the evolutionary emergence of the DECTIN-1 cluster genes.

  18. GenCLiP: a software program for clustering gene lists by literature profiling and constructing gene co-occurrence networks related to custom keywords

    Directory of Open Access Journals (Sweden)

    Zhou Yi-Bo


    Full Text Available Abstract Background Biomedical researchers often want to explore pathogenesis and pathways regulated by abnormally expressed genes, such as those identified by microarray analyses. Literature mining is an important way to assist in this task. Many literature mining tools are now available. However, few of them allow the user to make manual adjustments to zero in on what he/she wants to know in particular. Results We present our software program, GenCLiP (Gene Cluster with Literature Profiles), which is based on the methods presented by Chaussabel and Sher (Genome Biol 2002, 3(10):RESEARCH0055) that search gene lists to identify functional clusters of genes based on up-to-date literature profiling. Four features were added to this previously described method: the ability to (1) manually curate keywords extracted from the literature, (2) search genes and gene co-occurrence networks related to custom keywords, (3) compare analyzed gene results with negative and positive controls generated by GenCLiP, and (4) calculate probabilities that the resulting genes and gene networks are randomly related. In this paper, we show, with a set of genes differentially expressed between keloids and normal controls, how implementation of functions in GenCLiP successfully identified keywords related to the pathogenesis of keloids and unknown gene pathways involved in the pathogenesis of keloids. Conclusion With regard to the identification of disease-susceptibility genes, GenCLiP allows one to quickly acquire a primary pathogenesis profile and identify pathways involving abnormally expressed genes not previously associated with the disease.
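
    The idea of a gene co-occurrence network, as mentioned in this abstract, can be sketched as pair counting over abstracts. This is a hedged illustration only: the function name, the input layout (one set of gene symbols per abstract) and the toy gene symbols are assumptions, and GenCLiP's own keyword filtering and probability calculations are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(abstract_genes):
    """Count, for each unordered gene pair, the number of abstracts mentioning both.

    abstract_genes: iterable of sets of gene symbols, one set per abstract
    (hypothetical input format chosen for this sketch).
    """
    counts = Counter()
    for genes in abstract_genes:
        # Sort so each pair is stored under a single canonical key.
        for pair in combinations(sorted(genes), 2):
            counts[pair] += 1
    return counts


def network_edges(counts, min_abstracts=2):
    """Keep pairs supported by at least `min_abstracts` abstracts as network edges."""
    return sorted(pair for pair, n in counts.items() if n >= min_abstracts)


if __name__ == "__main__":
    # Toy input; the gene symbols are placeholders, not results from the paper.
    toy = [{"TGFB1", "SMAD3"}, {"TGFB1", "SMAD3", "COL1A1"}, {"COL1A1"}]
    print(network_edges(cooccurrence(toy)))
```

    Restricting the input to abstracts that also match a user-chosen keyword would give a keyword-specific network, which is the kind of manual adjustment the abstract emphasizes.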

  19. Organization of nif gene cluster in Frankia sp. EuIK1 strain, a symbiont of Elaeagnus umbellata. (United States)

    Oh, Chang Jae; Kim, Ho Bang; Kim, Jitae; Kim, Won Jin; Lee, Hyoungseok; An, Chung Sun


    The nucleotide sequence of a 20.5-kb genomic region harboring nif genes was determined and analyzed. The fragment was obtained from Frankia sp. EuIK1 strain, an indigenous symbiont of Elaeagnus umbellata. A total of 20 ORFs including 12 nif genes were identified and subjected to comparative analysis with the genome sequences of 3 Frankia strains representing diverse host plant specificities. The nucleotide and deduced amino acid sequences showed highest levels of identity with orthologous genes from an Elaeagnus-infecting strain. The gene organization patterns around the nif gene clusters were well conserved among all 4 Frankia strains. However, characteristic features appeared in the location of the nifV gene for each Frankia strain, depending on the type of host plant. Sequence analysis was performed to determine the transcription units and suggested that there could be an independent operon starting from the nifW gene in the EuIK strain. Considering the organization patterns and their total extensions on the genome, we propose that the nif gene clusters remained stable despite genetic variations occurring in the Frankia genomes.

  20. A novel snoRNA gene cluster in yeast is transcribed as polycistronic pre-snoRNAs

    Institute of Scientific and Technical Information of China (English)

    陆勇军; 周惠; 周惟欣; 朱远琪; 屈良鹄


    Small nucleolar RNAs (snoRNAs) play an important role in eukaryotic rRNA biogenesis. By combining a computer search of the EMBL database with experimental procedures, a novel snoRNA coding sequence (Z8) was identified and characterized from the genome of the yeast Saccharomyces cerevisiae. The Z8 snoRNA gene encodes a box C/D antisense snoRNA which, as deduced from structure analysis, guides the 2’-O-ribose methylation at U2421 of 25S rRNA. After disruption of the Z8 snoRNA gene, the methylation at the corresponding site was abolished, but no growth delay was observed at various culture temperatures. Z8 DNA is the first gene of a gene cluster consisting of three cognate snoRNA genes which are located in an intergenic region of chromosome ⅩⅢ. This gene cluster is co-transcribed as a polycistronic precursor from a+247 bp U snoRNA gene promoter, followed by processing to release individual snoRNAs, representing a new expression pattern of snoRNA genes.

  1. A Functional Bikaverin Biosynthesis Gene Cluster in Rare Strains of Botrytis cinerea Is Positively Controlled by VELVET (United States)

    Schumacher, Julia; Gautier, Angélique; Morgant, Guillaume; Studt, Lena; Ducrot, Paul-Henri; Le Pêcheur, Pascal; Azeddine, Saad; Fillinger, Sabine; Leroux, Pierre; Tudzynski, Bettina; Viaud, Muriel


    The gene cluster responsible for the biosynthesis of the red polyketidic pigment bikaverin has only been characterized in Fusarium ssp. so far. Recently, a highly homologous but incomplete and nonfunctional bikaverin cluster has been found in the genome of the unrelated phytopathogenic fungus Botrytis cinerea. In this study, we provided evidence that rare B. cinerea strains such as 1750 have a complete and functional cluster comprising the six genes orthologous to Fusarium fujikuroi ffbik1-ffbik6 and do produce bikaverin. Phylogenetic analysis confirmed that the whole cluster was acquired from Fusarium through a horizontal gene transfer (HGT). In the bikaverin-nonproducing strain B05.10, the genes encoding bikaverin biosynthesis enzymes are nonfunctional due to deleterious mutations (bcbik2-3) or missing (bcbik1) but interestingly, the genes encoding the regulatory proteins BcBIK4 and BcBIK5 do not harbor deleterious mutations which suggests that they may still be functional. Heterologous complementation of the F. fujikuroi Δffbik4 mutant confirmed that bcbik4 of strain B05.10 is indeed fully functional. Deletion of bcvel1 in the pink strain 1750 resulted in loss of bikaverin and overproduction of melanin indicating that the VELVET protein BcVEL1 regulates the biosynthesis of the two pigments in an opposite manner. Although strain 1750 itself expresses a truncated BcVEL1 protein (100 instead of 575 aa) that is nonfunctional with regard to sclerotia formation, virulence and oxalic acid formation, it is sufficient to regulate pigment biosynthesis (bikaverin and melanin) and fenhexamid HydR2 type of resistance. Finally, a genetic cross between strain 1750 and a bikaverin-nonproducing strain sensitive to fenhexamid revealed that the functional bikaverin cluster is genetically linked to the HydR2 locus. PMID:23308280

  2. ThioFinder: a web-based tool for the identification of thiopeptide gene clusters in DNA sequences.

    Directory of Open Access Journals (Sweden)

    Jing Li

    Full Text Available Thiopeptides are a growing class of sulfur-rich, highly modified heterocyclic peptides that are mainly active against Gram-positive bacteria including various drug-resistant pathogens. Recent studies also reveal that many thiopeptides inhibit the proliferation of human cancer cells, further expanding their potential for clinical use. Thiopeptide biosynthesis shares a common paradigm, featuring a ribosomally synthesized precursor peptide and conserved posttranslational modifications, to afford a characteristic core system, but differs in tailoring to furnish individual members. Identification of new thiopeptide gene clusters, by taking advantage of the increasing amount of DNA sequence information from bacteria, may facilitate new thiopeptide discovery and enrichment of the unique biosynthetic elements to produce novel drug leads by applying the principle of combinatorial biosynthesis. In this study, we have developed a web-based tool, ThioFinder, to rapidly identify thiopeptide biosynthetic gene clusters from DNA sequence using a profile Hidden Markov Model approach. Fifty-four new putative thiopeptide biosynthetic gene clusters were found in the sequenced bacterial genomes of previously unknown producing microorganisms. ThioFinder is fully supported by an open-access database, ThioBase, which contains information on the 99 known thiopeptides regarding the chemical structure, biological activity, producing organism, and biosynthetic gene (cluster) along with the associated genome if available. The ThioFinder website offers researchers a unique resource and great flexibility for sequence analysis of thiopeptide biosynthetic gene clusters. ThioFinder is freely available at

  3. A functional bikaverin biosynthesis gene cluster in rare strains of Botrytis cinerea is positively controlled by VELVET.

    Directory of Open Access Journals (Sweden)

    Julia Schumacher

    Full Text Available The gene cluster responsible for the biosynthesis of the red polyketidic pigment bikaverin has only been characterized in Fusarium ssp. so far. Recently, a highly homologous but incomplete and nonfunctional bikaverin cluster has been found in the genome of the unrelated phytopathogenic fungus Botrytis cinerea. In this study, we provided evidence that rare B. cinerea strains such as 1750 have a complete and functional cluster comprising the six genes orthologous to Fusarium fujikuroi ffbik1-ffbik6 and do produce bikaverin. Phylogenetic analysis confirmed that the whole cluster was acquired from Fusarium through a horizontal gene transfer (HGT). In the bikaverin-nonproducing strain B05.10, the genes encoding bikaverin biosynthesis enzymes are nonfunctional due to deleterious mutations (bcbik2-3) or missing (bcbik1) but interestingly, the genes encoding the regulatory proteins BcBIK4 and BcBIK5 do not harbor deleterious mutations which suggests that they may still be functional. Heterologous complementation of the F. fujikuroi Δffbik4 mutant confirmed that bcbik4 of strain B05.10 is indeed fully functional. Deletion of bcvel1 in the pink strain 1750 resulted in loss of bikaverin and overproduction of melanin indicating that the VELVET protein BcVEL1 regulates the biosynthesis of the two pigments in an opposite manner. Although strain 1750 itself expresses a truncated BcVEL1 protein (100 instead of 575 aa) that is nonfunctional with regard to sclerotia formation, virulence and oxalic acid formation, it is sufficient to regulate pigment biosynthesis (bikaverin and melanin) and fenhexamid HydR2 type of resistance. Finally, a genetic cross between strain 1750 and a bikaverin-nonproducing strain sensitive to fenhexamid revealed that the functional bikaverin cluster is genetically linked to the HydR2 locus.

  4. Insights into the evolutionary origins of clostridial neurotoxins from analysis of the Clostridium botulinum strain A neurotoxin gene cluster

    Directory of Open Access Journals (Sweden)

    Meiering Elizabeth M


    Full Text Available Abstract Background Clostridial neurotoxins (CNTs) are the most deadly toxins known and causal agents of botulism and tetanus neuroparalytic diseases. Despite considerable progress in understanding CNT structure and function, the evolutionary origins of CNTs remain a mystery as they are unique to Clostridium and possess a sequence and structural architecture distinct from other protein families. Uncovering the origins of CNTs would be a significant contribution to our understanding of how pathogens evolve and generate novel toxin families. Results The C. botulinum strain A genome was examined for potential homologues of CNTs. A key link was identified between the neurotoxin and the flagellin gene (CBO0798) located immediately upstream of the BoNT/A neurotoxin gene cluster. This flagellin sequence displayed the strongest sequence similarity to the neurotoxin and NTNH homologue out of all proteins encoded within C. botulinum strain A. The CBO0798 gene contains a unique hypervariable region, which in closely related flagellins encodes a collagenase-like domain. Remarkably, these collagenase-containing flagellins were found to possess the characteristic HEXXH zinc-protease motif responsible for the neurotoxin's endopeptidase activity. Additional links to collagenase-related sequences and functions were detected by further analysis of CNTs and surrounding genes, including sequence similarities to collagen-adhesion domains and collagenases. Furthermore, the neurotoxin's HCRn domain was found to exhibit both structural and sequence similarity to eukaryotic collagen jelly-roll domains. Conclusion Multiple lines of evidence suggest that the neurotoxin and adjacent genes evolved from an ancestral collagenase-like gene cluster, linking CNTs to another major family of clostridial proteolytic toxins. Duplication, reshuffling and assembly of neighboring genes within the BoNT/A neurotoxin gene cluster may have led to the neurotoxin's unique architecture.

  5. Three classes of plasmid (47-63 kb) carry the type B neurotoxin gene cluster of group II Clostridium botulinum. (United States)

    Carter, Andrew T; Austin, John W; Weedmark, Kelly A; Corbett, Cindi; Peck, Michael W


    Pulsed-field gel electrophoresis and DNA sequence analysis of 26 strains of Group II (nonproteolytic) Clostridium botulinum type B4 showed that 23 strains carried their neurotoxin gene cluster on a 47-63 kb plasmid (three strains lacked any hybridization signal for the neurotoxin gene, presumably having lost their plasmid). Unexpectedly, no neurotoxin genes were found on the chromosome. This apparent constraint on neurotoxin gene transfer to the chromosome stands in marked contrast to Group I C. botulinum, in which neurotoxin gene clusters are routinely found in both locations. The three main classes of type B4 plasmid identified in this study shared different regions of homology, but were unrelated to any Group I or Group III plasmid. An important evolutionary aspect firmly links plasmid class to geographical origin, with one class apparently dominant in marine environments, whereas a second class is dominant in European terrestrial environments. A third class of plasmid is a hybrid between the other two classes, providing evidence for contact between these seemingly geographically separated populations. Mobility via conjugation has been previously demonstrated for the type B4 plasmid of strain Eklund 17B, and similar genes associated with conjugation are present in all type B4 plasmids now described. A plasmid toxin-antitoxin system pemI gene located close to the neurotoxin gene cluster and conserved in each type B4 plasmid class may be important in understanding the mechanism which regulates this unique and unexpected bias toward plasmid-borne neurotoxin genes in Group II C. botulinum type B4.

  6. Power training and postmenopausal hormone therapy affect transcriptional control of specific co-regulated gene clusters in skeletal muscle (United States)

    Fey, Vidal; Törmäkangas, Timo; Ronkainen, Paula H. A.; Taaffe, Dennis R.; Takala, Timo; Koskinen, Satu; Cheng, Sulin; Puolakka, Jukka; Kujala, Urho M.; Suominen, Harri; Sipilä, Sarianna; Kovanen, Vuokko


    At the moment, there is no clear molecular explanation for the steeper decline in muscle performance after menopause or the mechanisms of counteractive treatments. The goal of this genome-wide study was to identify the genes and gene clusters through which power training (PT) comprising jumping activities or estrogen-containing hormone replacement therapy (HRT) may affect skeletal muscle properties after menopause. We used musculus vastus lateralis samples from early stage postmenopausal (50–57 years old) women participating in a yearlong randomized double-blind placebo-controlled trial with PT and HRT interventions. Using a microarray platform with over 24,000 probes, we identified 665 differentially expressed genes. The hierarchical clustering method was used to assort the genes. Additionally, enrichment analysis of gene ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways was carried out to clarify whether assorted gene clusters are enriched with particular functional categories. The analysis revealed transcriptional regulation of 49 GO/KEGG categories. PT upregulated transcription in the “response to contraction” category, revealing novel candidate genes for contraction-related regulation of muscle function, while HRT upregulated gene expression related to functionality of mitochondria. Moreover, several functional categories tightly related to muscle energy metabolism, development, and function were affected regardless of the treatment. Our results emphasize that during the early stages of the postmenopause, muscle properties are under transcriptional modulation, which both PT and HRT partially counteract, leading to preservation of muscle power and potentially reducing the risk for aging-related muscle weakness. More specifically, PT and HRT may function through improving energy metabolism and response to contraction as well as by preserving functionality of the mitochondria.
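The abstract does not say which statistic was used for the GO/KEGG enrichment analysis. A common choice for this kind of question is an upper-tail hypergeometric test, sketched below under that assumption. The function name `enrichment_p_value` and the example numbers are hypothetical and are not taken from the study.

```python
from math import comb


def enrichment_p_value(universe, category_size, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe      -- number of genes that could have been selected
    category_size -- genes in the universe annotated to the category
    selected      -- genes in the cluster being tested
    overlap       -- selected genes that carry the annotation
    """
    total = comb(universe, selected)
    upper = min(category_size, selected)
    tail = sum(
        comb(category_size, i) * comb(universe - category_size, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical inputs: 24,000 genes measured, 200 in a category,
    # 665 differentially expressed, 15 of those in the category.
    print(enrichment_p_value(24000, 200, 665, 15))
```

A small p-value would suggest the category is over-represented in the selected gene set; in practice the values would also need multiple-testing correction across all categories tested.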

  7. Characterization and transcriptional analysis of two gene clusters for type IV secretion machinery in Wolbachia of Armadillidium vulgare

    DEFF Research Database (Denmark)

    Félix, Christine; Pichon, Samuel; Braquart-Varnier, Christine;


    Wolbachia are maternally inherited alpha-proteobacteria that induce feminization of genetic males in most terrestrial crustacean isopods. Two clusters of vir genes for a type IV secretion machinery have been identified at two separate loci and characterized for the first time in a feminizing...... Wolbachia. Furthermore, we demonstrated that these operons are transcriptionally active in ovaries and in all other tissues tested, suggesting that T4SS has a significant role in Wolbachia biology. These observations and the identification of homologous vir genes in Wolbachia strains infecting insects...... or nematodes show that vir genes are conserved among Wolbachia strains whatever the phenotype induced by the bacteria....

  8. Presence of CTX gene cluster in environmental non-O1/O139 Vibrio cholerae and its potential clinical significance

    Directory of Open Access Journals (Sweden)

    B Bakhshi


    Full Text Available Purpose: The aim of this study was to understand the epidemiological linkage of clinical and environmental isolates of Vibrio cholerae and to determine their genotypes and virulence gene content. Materials and Methods: A total of 60 V. cholerae strains obtained from clinical specimens (n = 40) and surface waters (n = 20) were subjected to genotyping using PFGE and determination of their virulence-associated gene clusters. Result: PCR analysis showed the presence of chromosomally located hly and RTX genetic elements in 100% and 90% of the environmental isolates, respectively. The phage-mediated genetic elements such as CTX, TLC and VPI were detected in 5% of the environmental isolates suggesting that the environmental isolates cannot acquire certain mobile gene clusters. A total of 4 and 18 pulsotypes were obtained among the clinical and environmental V. cholerae isolates, respectively. Non-pathogenic environmentally isolated V. cholerae constituted a distinct cluster with one single non-O1, non-O139 strain (EP6) carrying the virulence genes similar to the epidemic strains. This may suggest the possible potential of conversion of non-pathogenic to a pathogenic environmental strain. Conclusions: The emergence of a single environmental isolate in our study containing the pathogenicity genes amongst the diverse non-pathogenic environmental isolates needs to be further studied in the context of V. cholerae pathogenicity sero-conversion.

  9. Genetic variants of the FADS gene cluster and ELOVL gene family, colostrums LC-PUFA levels, breastfeeding, and child cognition.

    Directory of Open Access Journals (Sweden)

    Eva Morales

    Full Text Available INTRODUCTION: Breastfeeding effects on cognition are attributed to long-chain polyunsaturated fatty acids (LC-PUFAs), but controversy persists. Genetic variation in fatty acid desaturase (FADS) and elongase (ELOVL) enzymes has been overlooked when studying the effects of LC-PUFAs supply on cognition. We aimed to: 1) to determine whether maternal genetic variants in the FADS cluster and ELOVL genes contribute to differences in LC-PUFA levels in colostrum; 2) to analyze whether these maternal variants are related to child cognition; and 3) to assess whether children's variants modify breastfeeding effects on cognition. METHODS: Data come from two population-based birth cohorts (n = 400 mother-child pairs from INMA-Sabadell; and n = 340 children from INMA-Menorca). LC-PUFAs were measured in 270 colostrum samples from INMA-Sabadell. Tag SNPs were genotyped both in mothers and children (13 in the FADS cluster, 6 in ELOVL2, and 7 in ELOVL5). Child cognition was assessed at 14 mo and 4 y using the Bayley Scales of Infant Development and the McCarthy Scales of Children's Abilities, respectively. RESULTS: Children of mothers carrying genetic variants associated with lower FADS1 activity (regulating AA and EPA synthesis), higher FADS2 activity (regulating DHA synthesis), and with higher EPA/AA and DHA/AA ratios in colostrum showed a significant advantage in cognition at 14 mo (3.5 to 5.3 points). Not being breastfed conferred an 8- to 9-point disadvantage in cognition among children GG homozygote for rs174468 (low FADS1 activity) but not among those with the A allele. Moreover, not being breastfed resulted in a disadvantage in cognition (5 to 8 points) among children CC homozygote for rs2397142 (low ELOVL5 activity), but not among those carrying the G allele. CONCLUSION: Genetically determined maternal supplies of LC-PUFAs during pregnancy and lactation appear to be crucial for child cognition. Breastfeeding effects on cognition are modified by child genetic

  10. Cluster-cluster clustering (United States)

    Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C. S.


    The cluster correlation function xi sub c(r) is compared with the particle correlation function, xi(r) in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, xi sub c and xi are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of xi sub c increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), xi sub c is steeper than xi, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of xi sub c found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales.
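The comparison between xi sub c(r) and xi(r) rests on estimating a two-point correlation function from point positions. The abstract does not state which estimator was used, so the following is only a minimal sketch of the simple "natural" pair-count estimator, xi(r) = DD(r)/RR(r) - 1 with normalisation for unequal sample sizes. The function names `pair_counts` and `xi_natural`, the bin edges, and the sample sizes are illustrative assumptions, and periodic boundaries are ignored.

```python
import math
import random


def pair_counts(points, edges):
    """Count unordered pairs whose separation falls in each radial bin."""
    counts = [0] * (len(edges) - 1)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            r = math.dist(points[i], points[j])
            for b in range(len(counts)):
                if edges[b] <= r < edges[b + 1]:
                    counts[b] += 1
                    break
    return counts


def xi_natural(data, randoms, edges):
    """Natural estimator DD/RR - 1; bins with no random pairs give NaN."""
    dd = pair_counts(data, edges)
    rr = pair_counts(randoms, edges)
    # Ratio of the number of random pairs to the number of data pairs.
    norm = (len(randoms) * (len(randoms) - 1)) / (len(data) * (len(data) - 1))
    return [
        norm * d / r - 1.0 if r > 0 else float("nan")
        for d, r in zip(dd, rr)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical unclustered sample in a unit cube, compared against an
    # independent random catalogue drawn in the same volume.
    data = [(rng.random(), rng.random(), rng.random()) for _ in range(200)]
    randoms = [(rng.random(), rng.random(), rng.random()) for _ in range(400)]
    print(xi_natural(data, randoms, [0.0, 0.1, 0.2, 0.4]))
```

Passing particle positions as `data` would give an estimate of xi(r), while passing the centres returned by a cluster finder would give xi sub c(r); the two curves can then be compared bin by bin.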

  11. Cluster-cluster clustering

    Energy Technology Data Exchange (ETDEWEB)

    Barnes, J.; Dekel, A.; Efstathiou, G.; Frenk, C.S.


    The cluster correlation function xi sub c(r) is compared with the particle correlation function, xi(r) in cosmological N-body simulations with a wide range of initial conditions. The experiments include scale-free initial conditions, pancake models with a coherence length in the initial density field, and hybrid models. Three N-body techniques and two cluster-finding algorithms are used. In scale-free models with white noise initial conditions, xi sub c and xi are essentially identical. In scale-free models with more power on large scales, it is found that the amplitude of xi sub c increases with cluster richness; in this case the clusters give a biased estimate of the particle correlations. In the pancake and hybrid models (with n = 0 or 1), xi sub c is steeper than xi, but the cluster correlation length exceeds that of the points by less than a factor of 2, independent of cluster richness. Thus the high amplitude of xi sub c found in studies of rich clusters of galaxies is inconsistent with white noise and pancake models and may indicate a primordial fluctuation spectrum with substantial power on large scales. 30 references.

  12. Identification and activation of novel biosynthetic gene clusters by genome mining in the kirromycin producer Streptomyces collinus Tü 365

    DEFF Research Database (Denmark)

    Iftime, Dumitrita; Kulik, Andreas; Härtner, Thomas;


    Streptomycetes are prolific sources of novel biologically active secondary metabolites with pharmaceutical potential. S. collinus Tü 365 is a Streptomyces strain, isolated in 1972 from Kouroussa (Guinea). It is best known as a producer of the antibiotic kirromycin, an inhibitor of the protein...... metabolisms predicted for S. collinus Tü 365 includes PKS, NRPS, PKS-NRPS hybrids, a lanthipeptide, terpenes and siderophores. While some of these gene clusters were found to contain genes related to known secondary metabolites, which also could be detected in HPLC–MS analyses, most of the uncharacterized...... biosynthesis interacting with elongation factor EF-Tu. Genome Mining revealed 32 gene clusters encoding the biosynthesis of diverse secondary metabolites in the genome of Streptomyces collinus Tü 365, indicating an enormous biosynthetic potential of this strain. The structural diversity of secondary...

  13. Molecular evolution of the nif gene cluster carrying nifI1 and nifI2 genes in the Gram-positive phototrophic bacterium Heliobacterium chlorum. (United States)

    Enkh-Amgalan, Jigjiddorj; Kawasaki, Hiroko; Seki, Tatsuji


    A major nif cluster was detected in the strictly anaerobic, Gram-positive phototrophic bacterium Heliobacterium chlorum. The cluster consisted of 11 genes arranged within a 10 kb region in the order nifI1, nifI2, nifH, nifD, nifK, nifE, nifN, nifX, fdx, nifB and nifV. The phylogenetic position of Hbt. chlorum was the same in the NifH, NifD, NifK, NifE and NifN trees; Hbt. chlorum formed a cluster with Desulfitobacterium hafniense, the closest neighbour of heliobacteria based on the 16S rRNA phylogeny, and two species of the genus Geobacter belonging to the Deltaproteobacteria. Two nifI genes, known to occur in the nif clusters of methanogenic archaea between nifH and nifD, were found upstream of the nifH gene of Hbt. chlorum. The organization of the nif operon and the phylogeny of individual and concatenated gene products showed that the Hbt. chlorum nif operon carrying nifI genes upstream of the nifH gene was an intermediate between the nif operon with nifI downstream of nifH (group II and III of the nitrogenase classification) and the nif operon lacking nifI (group I). Thus, the phylogenetic position of Hbt. chlorum nitrogenase may reflect an evolutionary stage of a divergence of the two nitrogenase groups, with group I consisting of the aerobic diazotrophs and group II consisting of strictly anaerobic prokaryotes.
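The classification argument above depends only on where the nifI genes sit relative to nifH in the gene order. A minimal sketch of that check follows, using the 11-gene order given in the abstract; the function name `nifI_arrangement` and the label strings are hypothetical, and a real analysis would work from annotated coordinates rather than a plain list.

```python
# Gene order reported in the abstract for the Hbt. chlorum nif cluster.
HBT_CHLORUM_NIF = [
    "nifI1", "nifI2", "nifH", "nifD", "nifK", "nifE",
    "nifN", "nifX", "fdx", "nifB", "nifV",
]


def nifI_arrangement(gene_order):
    """Describe where nifI genes lie relative to nifH in an ordered cluster."""
    nifI_positions = [
        i for i, gene in enumerate(gene_order) if gene.startswith("nifI")
    ]
    if not nifI_positions:
        return "no nifI (group I-like)"
    if "nifH" not in gene_order:
        raise ValueError("gene order does not contain nifH")
    h = gene_order.index("nifH")
    if all(p < h for p in nifI_positions):
        return "nifI upstream of nifH (Hbt. chlorum-like)"
    if all(p > h for p in nifI_positions):
        return "nifI downstream of nifH (group II/III-like)"
    return "nifI on both sides of nifH"


if __name__ == "__main__":
    print(len(HBT_CHLORUM_NIF), nifI_arrangement(HBT_CHLORUM_NIF))
```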

  14. The chloroplast atpA gene cluster in Chlamydomonas reinhardtii. Functional analysis of a polycistronic transcription unit. (United States)

    Drapier, D; Suzuki, H; Levy, H; Rimbault, B; Kindle, K L; Stern, D B; Wollman, F A


    Most chloroplast genes in vascular plants are organized into polycistronic transcription units, which generate a complex pattern of mono-, di-, and polycistronic transcripts. In contrast, most Chlamydomonas reinhardtii chloroplast transcripts characterized to date have been monocistronic. This paper describes the atpA gene cluster in the C. reinhardtii chloroplast genome, which includes the atpA, psbI, cemA, and atpH genes, encoding the alpha-subunit of the coupling-factor-1 (CF1) ATP synthase, a small photosystem II polypeptide, a chloroplast envelope membrane protein, and subunit III of the CF0 ATP synthase, respectively. We show that promoters precede the atpA, psbI, and atpH genes, but not the cemA gene, and that cemA mRNA is present only as part of di-, tri-, or tetracistronic transcripts. Deletions introduced into the gene cluster reveal, first, that CF1-alpha can be translated from di- or polycistronic transcripts, and, second, that substantial reductions in mRNA quantity have minimal effects on protein synthesis rates. We suggest that posttranscriptional mRNA processing is common in C. reinhardtii chloroplasts, permitting the expression of multiple genes from a single promoter.

  15. Transgene-induced silencing of the zoosporogenesis-specific NIFC gene cluster of Phytophthora infestans involves chromatin alterations. (United States)

    Judelson, Howard S; Tani, Shuji


    Clustered within the genome of the oomycete phytopathogen Phytophthora infestans are four genes encoding spore-specific nuclear LIM interactor-interacting factors (NIF proteins, a type of transcriptional regulator) that are moderately conserved in DNA sequence. NIFC1, NIFC2, and NIFC3 are zoosporogenesis-induced and grouped within 4 kb, and 20 kb away resides a sporulation-induced form, NIFS. To test the function of the NIFC family, plasmids expressing full-length hairpin constructs of NIFC1 or NIFC2 were stably transformed into P. infestans. This triggered silencing of the cognate gene in about one-third of transformants, and all three NIFC genes were usually cosilenced. However, NIFS escaped silencing despite its high sequence similarity to the NIFC genes. Silencing of the three NIFC genes impaired zoospore cyst germination by 60% but did not affect other aspects of the life cycle. Silencing was transcriptional based on nuclear run-on assays and associated with tighter chromatin packing based on nuclease accessibility experiments. The chromatin alterations extended a few hundred nucleotides beyond the boundaries of the transcribed region of the NIFC cluster and were not associated with increased DNA methylation. A plasmid expressing a short hairpin RNA having sequence similarity only to NIFC1 silenced both that gene and an adjacent member of the gene cluster, likely due to the expansion of a heterochromatic domain from the targeted locus. These data help illuminate the mechanism of silencing in Phytophthora and suggest that caution should be used when interpreting silencing experiments involving closely spaced genes.

  16. Serial changes in expression of functionally clustered genes in progression of liver fibrosis in hepatitis C patients

    Institute of Scientific and Technical Information of China (English)

    Yoshiyuki Takahara; Mitsuo Takahashi; Qing-Wei Zhang; Hirotaka Wagatsuma; Maiko Mori; Akihiro Tamori; Susumu Shiomi; Shuhei Nishiguchi


    AIM: To investigate the relationship of changes in expression of marker genes in functional categories or molecular networks comprising one functional category or multiple categories in progression of hepatic fibrosis in hepatitis C (HCV) patients. METHODS: Marker genes were initially identified using DNA microarray data from a rat liver fibrosis model. The expression level of each fibrosis associated marker gene was analyzed using reverse transcription-polymerase chain reaction (RT-PCR) in clinical biopsy specimens from HCV-positive patients (n = 61). Analysis of changes in expression patterns and interactions of marker genes in functional categories was used to assess the biological mechanism of fibrosis. RESULTS: The profile data showed several biological changes associated with progression of hepatic fibrosis. Clustered genes in functional categories showed sequential changes in expression. Several sets of clustered genes, including those related to the extracellular matrix (ECM), inflammation, lipid metabolism, steroid metabolism, and some transcription factors important for hepatic biology showed expression changes in the immediate early phase (F1/F2) of fibrosis. Genes associated with aromatic amino acid (AA) metabolism, sulfur-containing AA metabolism and insulin/Wnt signaling showed expression changes in the middle phase (F2/F3), and some genes related to glucose metabolism showed altered expression in the late phase of fibrosis (F3/F4). Therefore, molecular networks showing serial changes in gene expression are present in liver fibrosis progression in hepatitis C patients. CONCLUSION: Analysis of gene expression profiles from a perspective of functional categories or molecular networks provides an understanding of disease and suggests new diagnostic methods. Selected marker genes have potential utility for biological identification of advanced fibrosis.

  17. Mutation of the iron-sulfur cluster assembly gene IBA57 causes fatal infantile leukodystrophy. (United States)

    Debray, François-Guillaume; Stümpfig, Claudia; Vanlander, Arnaud V; Dideberg, Vinciane; Josse, Claire; Caberg, Jean-Hubert; Boemer, François; Bours, Vincent; Stevens, René; Seneca, Sara; Smet, Joél; Lill, Roland; van Coster, Rudy


    Leukodystrophies are a heterogeneous group of severe genetic neurodegenerative disorders. A multiple mitochondrial dysfunctions syndrome was found in an infant presenting with a progressive leukoencephalopathy. Homozygosity mapping, whole exome sequencing, and functional studies were used to define the underlying molecular defect. Respiratory chain studies in skeletal muscle isolated from the proband revealed a combined deficiency of complexes I and II. In addition, western blotting indicated lack of protein lipoylation. The combination of these findings was suggestive of a defect in the iron-sulfur (Fe/S) protein assembly pathway. SNP array identified loss of heterozygosity in large chromosomal regions, covering the NFU1 and BOLA3, and the IBA57 and ABCB10 candidate genes, in 2p15-p11.2 and 1q31.1-q42.13, respectively. A homozygous c.436C > T (p.Arg146Trp) variant was detected in IBA57 using whole exome sequencing. Complementation studies in a HeLa cell line depleted for IBA57 showed that the mutant protein with the semi-conservative amino acid exchange was unable to restore the biochemical phenotype, indicating a loss-of-function mutation of IBA57. In conclusion, defects in the Fe/S protein assembly gene IBA57 can cause autosomal recessive neurodegeneration associated with progressive leukodystrophy and fatal outcome at young age. In the affected patient, the biochemical phenotype was characterized by a defect in the respiratory chain complexes I and II and a decrease in mitochondrial protein lipoylation, both resulting from impaired assembly of Fe/S clusters.

  18. Beta-globin gene cluster haplotypes in sickle cell patients from southwest Iran. (United States)

    Rahimi, Z; Karimi, M; Haghshenass, M; Merat, A


    Sickle cell anemia in Iran is accompanied by a high level of HbF and mild clinical presentation. Here we report haplotypes of the beta gene cluster found in 81 randomly selected sickle cell patients, including 47 sickle cell anemia (SS), 17 sickle cell trait (AS), and 17 sickle/thalassemia (S/thal) from southwest Iran. We found all five common typical haplotypes as well as five atypical haplotypes in our patients. Except for four patients with homozygous Benin haplotype, none of the other African typical haplotypes were found in a homozygous state. Arab-Indian was found to be the most prevalent haplotype in the study population. This haplotype accounted for 51.1% as the homozygous form in SS patients, where 69.1% of chromosomes in these patients had the Arab-Indian haplotype. Bantu A2 was the second most prevalent haplotype among all patients. The mean %HbF in SS patients was 27.83 and in the homozygous Arab-Indian haplotype it was still higher (30.40%), while in AS patients the %HbF was only 1.20. The high %Ggamma chain (71.81) in the Arab-Indian homozygous haplotype was concomitant with the presence of an Xmn I site in both chromosomes. The presence of the Arab-Indian haplotype as the predominant haplotype might be suggestive of a gene flow to/from Saudi Arabia or India. More haplotype investigations of a normal population can clarify the high incidence of Bantu A2 haplotype in our population.
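The abstract reports the Arab-Indian haplotype as percentages only. As a worked check, the sketch below recovers the whole-number counts consistent with those percentages for the 47 SS patients (94 chromosomes). The counts are inferred here from the rounded percentages, not stated in the source, and the helper name `counts_matching` is hypothetical.

```python
def counts_matching(percent, total):
    """Return every integer count whose share of total rounds to percent."""
    return [
        k for k in range(total + 1)
        if round(100 * k / total, 1) == percent
    ]


if __name__ == "__main__":
    ss_patients = 47
    chromosomes = 2 * ss_patients

    # 51.1% of SS patients are homozygous for the Arab-Indian haplotype.
    homozygous = counts_matching(51.1, ss_patients)
    # 69.1% of chromosomes in SS patients carry the Arab-Indian haplotype.
    carrying = counts_matching(69.1, chromosomes)
    print(homozygous, carrying)

    # Chromosomes not accounted for by homozygotes must sit in heterozygotes.
    print(carrying[0] - 2 * homozygous[0])
```

Under these assumptions the percentages are mutually consistent: 24 homozygous patients account for 48 of the 65 Arab-Indian chromosomes, leaving 17 carried in the heterozygous state.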

  19. The Fdb3 transcription factor of the Fusarium Detoxification of Benzoxazolinone gene cluster is required for MBOA but not BOA degradation in Fusarium pseudograminearum. (United States)

    Kettle, Andrew J; Carere, Jason; Batley, Jacqueline; Manners, John M; Kazan, Kemal; Gardiner, Donald M


    A number of cereals produce the benzoxazolinone class of phytoalexins. Fusarium species pathogenic towards these hosts can typically degrade these compounds via an aminophenol intermediate, and the ability to do so is encoded by a group of genes found in the Fusarium Detoxification of Benzoxazolinone (FDB) cluster. A zinc finger transcription factor encoded by one of the FDB cluster genes (FDB3) has been proposed to regulate the expression of other genes in the cluster and hence is potentially involved in benzoxazolinone degradation. Herein we show that Fdb3 is essential for the ability of Fusarium pseudograminearum to efficiently detoxify the predominant wheat benzoxazolinone, 6-methoxy-benzoxazolin-2-one (MBOA), but not benzoxazoline-2-one (BOA). Furthermore, additional genes thought to be part of the FDB gene cluster, based upon transcriptional response to benzoxazolinones, are regulated by Fdb3. However, deletion mutants for these latter genes remain capable of benzoxazolinone degradation, suggesting that they are not essential for this process.

  20. Genes encoding CheR-TPR fusion proteins are predominantly found in gene clusters encoding chemosensory pathways with alternative cellular functions. (United States)

    Muñoz-Martínez, Francisco; García-Fontana, Cristina; Rico-Jiménez, Miriam; Alfonso, Carlos; Krell, Tino


    Chemosensory pathways correspond to major signal transduction mechanisms and can be classified into the functional families flagellum-mediated taxis, type four pili-mediated taxis or pathways with alternative cellular functions (ACF). CheR methyltransferases are core enzymes in all of these families. CheR proteins fused to tetratricopeptide repeat (TPR) domains have been reported and we present an analysis of this uncharacterized family. We show that CheR-TPRs are widely distributed in Gram-negative but almost absent from Gram-positive bacteria. Most strains contain a single CheR-TPR and its abundance does not correlate with the number of chemoreceptors. The TPR domain fused to CheR is comparatively short and frequently composed of 2 repeats. The majority of CheR-TPR genes were found in gene clusters that harbor multidomain response regulators in which the REC domain is fused to different output domains like HK, GGDEF, EAL, HPT, AAA, PAS, GAF, additional REC, HTH, phosphatase or combinations thereof. The response regulator architectures coincide with those reported for the ACF family of pathways. Since the presence of multidomain response regulators is a distinctive feature of this pathway family, we conclude that CheR-TPR proteins form part of ACF type pathways. The diversity of response regulator output domains suggests that the ACF pathways form a superfamily which regroups many different regulatory mechanisms, in which all CheR-TPR proteins appear to participate. In the second part we characterize WspC of Pseudomonas putida, a representative example of CheR-TPR. The affinities of WspC-Pp for S-adenosylmethionine and S-adenosylhomocysteine were comparable to those of prototypal CheR, indicating that WspC-Pp activity is, in analogy to prototypal CheRs, controlled by product feed-back inhibition. The removal of the TPR domain did not impact significantly on the binding constants and consequently not on the product feed-back inhibition. WspC-Pp was found to be

  1. Nucleotide polymorphism of the TNF gene cluster in six Chinese populations. (United States)

    Zhang, Yongbiao; Zhang, Feng; Lin, Hongbin; Shi, Lei; Wang, Panpan; Shi, Li; Gong, Qiang; Li, Xin; Wang, Mei; Hu, Songnian; Chu, Jiayou; Wang, Duen-Mei


    DNA variants in a 31-kb region of the human major histocompatibility complex, encompassing the tumor necrosis factor (TNF) gene cluster, were surveyed by direct sequencing of 283 unrelated individuals from six Chinese populations. A total of 273 polymorphic sites were identified, with nearly half of them novel. We observed an excess of rare variants and negative values of selection tests of the region, implying either that these populations experienced a historical expansion or that the surveyed region was subjected to natural selection. Different characteristics of the sequence variation in the six populations outline the genetic differentiation between Northern and Southern Chinese populations. The distributions of recombination rates are similar among all the populations, with variation in the magnitude and/or in the fine location of hot spots. Tag single-nucleotide polymorphisms (SNPs) selected from HapMap (Phase II) CHB data accounted for an average of 64% of common SNPs from the six Chinese populations. We also observed a limited transferability of tag SNPs between Chinese populations on the 31-kb region with an excess of untaggable SNPs and ragged linkage disequilibrium blocks. It suggested that the design and interpretation of future association studies should be more cautious, and that a resequencing approach may refine tag SNP selection on Chinese-specific disease mapping.

  2. The enterobacterial common antigen-like gene cluster of Haemophilus ducreyi contributes to virulence in humans. (United States)

    Banks, Keith E; Fortney, Kate R; Baker, Beth; Billings, Steven D; Katz, Barry P; Munson, Robert S; Spinola, Stanley M


    Haemophilus ducreyi 35000HP contains a cluster of homologues of genes required for the synthesis of enterobacterial common antigen (ECA), suggesting that H. ducreyi may express a putative ECA-like glycoconjugate. WecA initiates the synthesis of ECA by transferring N-acetylglucosamine to undecaprenyl-P, to form lipid I. A wecA mutant (35000HPwecA) was constructed, and 5 volunteers were inoculated at 3 sites with fixed doses of 35000HP on one arm and at 3 sites with varying doses of 35000HPwecA on the other arm. 35000HPwecA caused pustules to form at 3 sites inoculated with a dose 2.5-fold higher than that of 35000HP. However, at sites inoculated with similar doses of 35000HP and 35000HPwecA, pustules developed at 46.7% (95% confidence interval [CI], 23.3%-70.0%) of 15 parent-strain sites and at 8.3% (95% CI, 0.01%-23.6%) of 12 mutant-strain sites (P = .013). Thus, the expression of wecA contributes to the ability of H. ducreyi to cause pustules in humans.

  3. The mouse salivary androgen-binding protein (ABP) gene cluster on chromosome 7: characterization and evolutionary relationships. (United States)

    Laukaitis, Christina M; Dlouhy, Stephen R; Karn, Robert C


    Mouse salivary androgen-binding protein (ABP) is a pair of dimers, composed of an alpha subunit disulfide bridged to either a beta or a gamma subunit. It has been proposed that each subunit is encoded by a distinct gene: Abpa, Abpb, and Abpg for the alpha, beta, and gamma subunits, respectively. We report here the structures and sequences of the genes that encode these three subunits. Each gene has three exons separated by two introns. Mouse salivary ABP is a member of the secretoglobin family, and we compare the structure of the three ABP subunit genes to those of 18 other mammalian secretoglobins. We map the three genes as a gene cluster located 10 cM from the centromere of Chromosome (Chr) 7 and show that Abpa is the closest of the three to the gene for glucose phosphate isomerase (GPI) and that Abpg is the closest to the centromere, with Abpb mapping between them. Abpa is oriented in the opposite direction to Abpb and Abpg, with its 5' end directed toward their 5' ends. We compare the location of these genes with other secretoglobin genes in the mouse genome and with the known locations of secretoglobin genes in the human genome and present evidence that strong positive selection has driven the divergence of the coding regions of Abpb and Abpg since the putative duplication event that created them.

  4. Involvement of gene methylation changes in the differentiation of human amniotic epithelial cells into islet-like cell clusters. (United States)

    Peng, Lin; Wang, Jian; Lu, Guangxiu


    Insulin-dependent diabetes results from destruction of the insulin-producing β-cells of the pancreas. Islet cell transplantation is a promising cure for diabetes. Here, we induced human amniotic epithelial cells (hAECs) to differentiate into islet-like cell clusters by nicotinamide plus betacellulin in vitro, and further investigated the DNA methylation status by a Nimble MeDIP microarray before and after cell differentiation to shed light on the molecular mechanisms of this differentiation. In addition, 5-Aza-2'-deoxycytidine was used to investigate whether the differentiation of hAECs into islet-like cells occurred through demethylation. Purified hAECs (CK18(+)/E-cadherin(+)/CD29(+)/CD90(-)/CD34(-)/CD45(-)) were isolated from human amnia. After induction, hAECs were found to be insulin positive and sensitive to glucose, indicating successful induction to islet-like cells. The methylation status of cell cytoskeleton-related genes was down-regulated and that of negative regulation of cell adhesion-related genes was up-regulated. The methylation status of pancreas development-related genes such as HNF1α and DGAT1 was decreased in hAECs after induction. After brief demethylation, INS gene expression was up-regulated in islet-like cell clusters, suggesting that DNA methylation changes were associated with the differentiation of hAECs into islet-like cell clusters.

  5. Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

    Directory of Open Access Journals (Sweden)

    Manal Helal

    Full Text Available BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm or the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
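    The 'centroid' idea in this abstract, the sequence in each cluster whose distances to all other members are minimized, can be illustrated with a minimal Python sketch. The distance matrix, the cluster labels, and the function name are hypothetical and are not taken from the study. Using the summed within-cluster distance as the criterion is also an assumption, since the abstract does not state the exact measure.

```python
def cluster_medoids(dist, labels):
    """Return {cluster label: index of the member whose summed distance
    to the other members of the same cluster is smallest}."""
    members = {}
    for idx, lab in enumerate(labels):
        members.setdefault(lab, []).append(idx)

    medoids = {}
    for lab, idxs in members.items():
        # min() returns the first index on ties, so ties resolve to the
        # earliest sequence in input order.
        medoids[lab] = min(idxs, key=lambda i: sum(dist[i][j] for j in idxs))
    return medoids


# Hypothetical symmetric pairwise distances between five sequences.
toy_dist = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.8, 0.9],
    [0.2, 0.1, 0.0, 0.9, 0.9],
    [0.9, 0.8, 0.9, 0.0, 0.3],
    [0.8, 0.9, 0.9, 0.3, 0.0],
]
toy_labels = ["A", "A", "A", "B", "B"]  # hypothetical cluster assignment
toy_medoids = cluster_medoids(toy_dist, toy_labels)
```

    In practice the distances would come from an alignment-based measure over the real 16S rRNA sequences; the toy values only show how a representative index is picked per cluster.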

  6. Clustered organization, polycistronic transcription, and evolution of modification-guide snoRNA genes in Euglena gracilis. (United States)

    Moore, Ashley N; Russell, Anthony G


    Previous studies have shown that the eukaryotic microbe Euglena gracilis contains an unusually large assortment of small nucleolar RNAs (snoRNAs) and ribosomal RNA (rRNA) modification sites. However, little is known about the evolutionary mechanisms contributing to this situation. In this study, we have examined the organization and evolution of snoRNA genes in Euglena with the additional objective of determining how these properties relate to the rRNA modification pattern in this protist. We have identified and extensively characterized a clustered pattern of genes encoding previously biochemically isolated snoRNA sequences in E. gracilis. We show that polycistronic transcription is a prevalent snoRNA gene expression strategy in this organism. Further, we have identified 121 new snoRNA coding regions through sequence analysis of these clusters. We have identified an E. gracilis U14 snoRNA homolog clustered with modification-guide snoRNA genes. The U14 snoRNAs in other eukaryotic organisms examined to date typically contain both a modification and a processing domain. E. gracilis U14 lacks the modification domain but retains the processing domain. Our analysis of U14 structure and evolution in Euglena and other eukaryotes allows us to propose a model for its evolution and suggest its processing role may be its more important function, explaining its conservation in many eukaryotes. The preponderance of apparent small and larger-scale duplication events in the genomic regions we have characterized in Euglena provides a mechanism for the generation of the unusually diverse collection and abundance of snoRNAs and modified rRNA sites. Our findings provide the framework for more extensive whole genome analysis to elucidate whether these snoRNA gene clusters are spread across multiple chromosomes and/or form dense "arrays" at a limited number of chromosomal loci.

  7. A putatively phase variable gene (dca) required for natural competence in Neisseria gonorrhoeae but not Neisseria meningitidis is located within the division cell wall (dcw) gene cluster. (United States)

    Snyder, L A; Saunders, N J; Shafer, W M


    A cluster of 18 open reading frames (ORFs), 15 of which are homologous to genes involved in division and cell wall synthesis, has been identified in Neisseria gonorrhoeae and Neisseria meningitidis. The three additional ORFs, internal to the dcw cluster, are not homologous to dcw-related genes present in other bacterial species. Analysis of the N. meningitidis strain MC58 genome for foreign DNA suggests that these additional ORFs have not been acquired by recent horizontal exchange, indicating that they are a long-standing, integral part of the neisserial dcw gene cluster. Reverse transcription-PCR analysis of RNA extracted from N. gonorrhoeae strain FA19 confirmed that all three ORFs are transcribed in gonococci. One of these ORFs (dca, for division cluster competence associated), located between murE and murF, was studied in detail and found to be essential for competence in the gonococcal but not in the meningococcal strains tested. Computer analysis predicts that dca encodes an inner membrane protein similar to hypothetical proteins produced by other gram-negative bacteria. In some meningococcal strains dca is prematurely terminated following a homopolymeric tract of G's, the length of which differs between isolates of N. meningitidis, suggesting that dca is phase variable in this species. A deletion and insertional mutation was made in the dca gene of N. gonorrhoeae strain FA19 and N. meningitidis strain NMB. This mutation abrogated the ability of the gonococci to be transformed with chromosomal DNA. Thus, we conclude that the dca-encoded gene product is an essential competence factor for gonococci.

  8. Expansion of the Clavulanic Acid Gene Cluster: Identification and In Vivo Functional Analysis of Three New Genes Required for Biosynthesis of Clavulanic Acid by Streptomyces clavuligerus (United States)

    Li, Rongfeng; Khaleeli, Nusrat; Townsend, Craig A.


    Clavulanic acid is a potent inhibitor of β-lactamase enzymes and is of demonstrated value in the treatment of infections by β-lactam-resistant bacteria. Previously, it was thought that eight contiguous genes within the genome of the producing strain Streptomyces clavuligerus were sufficient for clavulanic acid biosynthesis, because they allowed production of the antibiotic in a heterologous host (K. A. Aidoo, A. S. Paradkar, D. C. Alexander, and S. E. Jensen, p. 219–236, In V. P. Gullo et al., ed., Development in industrial microbiology series, 1993). In contrast, we report the identification of three new genes, orf10 (cyp), orf11 (fd), and orf12, that are required for clavulanic acid biosynthesis as indicated by gene replacement and trans-complementation analysis in S. clavuligerus. These genes are contained within a 3.4-kb DNA fragment located directly downstream of orf9 (cad) in the clavulanic acid cluster. While the orf10 (cyp) and orf11 (fd) proteins show homologies to other known CYP-150 cytochrome P-450 and [3Fe-4S] ferredoxin enzymes and may be responsible for an oxidative reaction late in the pathway, the protein encoded by orf12 shows no significant similarity to any known protein. The results of this study extend the biosynthetic gene cluster for clavulanic acid and attest to the importance of analyzing biosynthetic genes in the context of their natural host. Potential functional roles for these proteins are proposed. PMID:10869089

  9. Prevalence of the lmo0036-0043 gene cluster encoding arginine deiminase and agmatine deiminase systems in Listeria monocytogenes. (United States)

    Chen, Jianshun; Chen, Fan; Cheng, Changyong; Fang, Weihuan


    Arginine deiminase and agmatine deiminase systems are involved in acid tolerance, and their encoding genes form the cluster lmo0036-0043 in Listeria monocytogenes. While lmo0042 and lmo0043 were conserved in all L. monocytogenes strains, the lmo0036-0041 region of this cluster was identified in all lineages I and II, and the majority of lineage IV (83.3%) strains, but absent in all lineage III and a small fraction of lineage IV (16.7%) strains, suggesting that the presence of the complete lmo0036-0043 cluster is dependent on lineages. lmo0036-0043-complete and -deficient lineage IV strains exhibit specific ascB-dapE profiles, which might represent two subpopulations with distinct genetic characteristics.

  10. Molecular identification and characterization of clustered regularly interspaced short palindromic repeat (CRISPR) gene cluster in Taylorella equigenitalis. (United States)

    Hara, Yasushi; Hayashi, Kyohei; Nakajima, Takuya; Kagawa, Shizuko; Tazumi, Akihiro; Moore, John E; Matsuda, Motoo


    Clustered regularly interspaced short palindromic repeats (CRISPRs), of approximately 10,000 base pairs (bp) in length, were shown to occur in the Japanese Taylorella equigenitalis strain, EQ59. The locus was composed of the putative CRISPRs-associated with 5 (cas5), RAMP csd1, csd2, recB, cas1, a leader region, 13 CRISPR consensus sequence repeats (each 32 bp; 5'-TCAGCCACGTTCGCGTGGCTGTGTGTTTAAAG-3'). These were in turn separated by 12 non repetitive unique spacer regions of similar length. In addition, a leader region, a transposase/IS protein, a leader region, and cas3 were also seen. All seven putative open reading frames carry their ribosome binding sites. Promoter consensus sequences at the -35 and -10 regions and putative intrinsic ρ-independent transcription terminator regions also occurred. A possible long overlap of 170 bp in length occurred between the recB and cas1 loci. Positive reverse transcription PCR signals of cas5, RAMP csd1, csd2-recB/cas1, and cas3 were generated. A putative secondary structure of the CRISPR consensus repeats was constructed. Following this, CRISPR results of the T. equigenitalis EQ59 isolate were subsequently compared with those from the Taylorella asinigenitalis MCE3 isolate.

  11. Ribosomal protein L7a is encoded by a gene (Surf-3) within the tightly clustered mouse surfeit locus. (United States)

    Giallongo, A; Yon, J; Fried, M


    The mouse Surfeit locus, which contains a cluster of at least four genes (Surf-1 to Surf-4), is unusual in that adjacent genes are separated by no more than 73 base pairs (bp). The heterogeneous 5' ends of Surf-1 and Surf-2 are separated by only 15 to 73 bp, the 3' ends of Surf-1 and Surf-3 are only 70 bp apart, and the 3' ends of Surf-2 and Surf-4 overlap by 133 bp. This very tight clustering suggests a cis interaction between adjacent Surfeit genes. The Surf-3 gene (which could code for a basic polypeptide of 266 amino acids) is a highly expressed member of a pseudogene-containing multigene family. By use of an anti-peptide serum (against the C-terminal nine amino acids of the putative Surf-3 protein) for immunofluorescence and immunoblotting of mouse cell components and by in vitro translation of Surf-3 cDNA hybrid-selected mRNA, the Surf-3 gene product was identified as a 32-kilodalton ribosomal protein located in the 60S ribosomal subunit. From its subunit location, gel migration, and homology with a limited rat ribosomal peptide sequence, the Surf-3 gene was shown to encode the mouse L7a ribosomal protein. The Surf-3 gene is highly conserved through evolution and was detected by nucleic acid hybridization as existing in multiple copies (multigene families) in other mammals and as one or a few copies in birds, Xenopus, Drosophila, and Schizosaccharomyces pombe. The Surf-3 C-terminal anti-peptide serum detects a 32-kilodalton protein in other mammals, birds, and Xenopus but not in Drosophila and S. pombe. The possible effect of interaction of the Surf-3 ribosomal protein gene with adjacent genes in the Surfeit locus at the transcriptional or posttranscriptional level or both levels is discussed. PMID:2648130

  12. In silico analysis highlights the frequency and diversity of type 1 lantibiotic gene clusters in genome sequenced bacteria

    LENUS (Irish Health Repository)

    Marsh, Alan J


    Abstract Background Lantibiotics are lanthionine-containing, post-translationally modified antimicrobial peptides. These peptides have significant, but largely untapped, potential as preservatives and chemotherapeutic agents. Type 1 lantibiotics are those in which lanthionine residues are introduced into the structural peptide (LanA) through the activity of separate lanthionine dehydratase (LanB) and lanthionine synthetase (LanC) enzymes. Here we take advantage of the conserved nature of LanC enzymes to devise an in silico approach to identify potential lantibiotic-encoding gene clusters in genome sequenced bacteria. Results In total 49 novel type 1 lantibiotic clusters were identified which unexpectedly were associated with species, genera and even phyla of bacteria which have not previously been associated with lantibiotic production. Conclusions Multiple type 1 lantibiotic gene clusters were identified at a frequency that suggests that these antimicrobials are much more widespread than previously thought. These clusters represent a rich repository which can yield a large number of valuable novel antimicrobials and biosynthetic enzymes.

  13. The fnr gene of Bacillus licheniformis and the cysteine ligands of the C-terminal FeS cluster. (United States)

    Klinger, A; Schirawski, J; Glaser, P; Unden, G


    In the facultatively anaerobic bacterium Bacillus licheniformis a gene encoding a protein of the fumarate nitrate reductase family of transcriptional regulators (Fnr) was isolated. Unlike Fnr proteins from gram-negative bacteria, but like Fnr from Bacillus subtilis, the protein contained a C-terminal cluster of cysteine residues. Unlike in Fnr from B. subtilis, this cluster (Cys226-X2-Cys229-X4-Cys234) is composed of only three Cys residues, which are supposed to serve together with an internal residue (Cys71) as the ligands for an FeS center. Transfer of the B. licheniformis gene to an fnr mutant of B. subtilis complemented the ability for synthesis of nitrate reductase during anaerobic growth.

  14. Gene Sequence Based Clustering Assists in Dereplication of Pseudoalteromonas luteoviolacea Strains with Identical Inhibitory Activity and Antibiotic Production

    DEFF Research Database (Denmark)

    Vynne, Nikolaj Grønnegaard; Månsson, Maria; Gram, Lone


    Some microbial species are chemically homogenous, and the same secondary metabolites are found in all strains. In contrast, we previously found that five strains of P. luteoviolacea were closely related by 16S rRNA gene sequence but produced two different antibiotic profiles. The purpose of the present study was to determine whether such bioactivity differences could be linked to genotypes allowing methods from phylogenetic analysis to aid in selection of strains for biodiscovery. Thirteen P. luteoviolacea strains divided into three chemotypes based on production of known antibiotics and four... correlation to chemotypes and inhibition profiles, while clustering based on concatenated 16S rRNA, gyrB, and recA gene sequences resulted in three clusters, two of which uniformly consisted of strains of identical chemotype and inhibition profile. A major time sink in natural products discovery is the effort...

  15. Identification and characterization of another 4-nitrophenol degradation gene cluster, nps, in Rhodococcus sp. strain PN1. (United States)

    Yamamoto, Kenta; Nishimura, Munehiro; Kato, Dai-ichiro; Takeo, Masahiro; Negoro, Seiji


    4-Nitrophenol (4-NP) is a toxic compound formed in soil by the hydrolysis of organophosphorous pesticides, such as parathion. We previously reported the presence of the 4-NP degradation gene cluster (nphRA1A2) in Rhodococcus sp. strain PN1, which encodes a two-component 4-NP hydroxylase system that oxidizes 4-NP into 4-nitrocatechol. In the current study, another gene cluster (npsC and npsRA2A1B) encoding a similar 4-NP hydroxylase system was cloned from strain PN1. The enzymes from this 4-NP hydroxylase system (NpsA1 and NpsA2) were purified as histidine-tagged (His-) proteins and then characterized. His-NpsA2 showed NADH/FAD oxidoreductase activity, and His-NpsA1 showed 4-NP oxidizing activity in the presence of His-NpsA2. In the 4-NP oxidation using the reconstituted enzyme system (His-NpsA1 and His-NpsA2), hydroquinone (35% of 4-NP disappeared) and hydroxyquinol (59% of 4-NP disappeared) were detected in the presence of ascorbic acid as a reducing reagent, suggesting that, without the reducing reagent, 4-NP was converted into their oxidized forms, 1,4-benzoquinone and 2-hydroxy-1,4-benzoquinone. In addition, in the cell extract of recombinant Escherichia coli expressing npsB, a typical spectral change showing conversion of hydroxyquinol into maleylacetate was observed. These results indicate that this nps gene cluster, in addition to the nph gene cluster, is also involved in 4-NP degradation in strain PN1.

  16. Meiotic recombination in the beta globin gene cluster causing an error in prenatal diagnosis of beta thalassaemia.


    Camaschella, C.; Serra, A.; Saglio, G; Bertero, M T; Mazza, U; Terzoli, S; Brambati, B; Cremonesi, L.; Travi, M; Ferrari, M


    In the course of a prenatal diagnosis for beta thalassaemia by linkage analysis of restriction fragment length polymorphisms, a homozygous beta thalassaemia fetus was misdiagnosed as beta thalassaemia trait. Extensive studies of the polymorphic sites within the beta globin gene cluster in all the members of the family resulted in the conclusion that the paternal chromosome 11 of the newborn was different from that expected. Paternity was confirmed by HLA typing and blood group studies. The an...

  17. Investigation of pathogenic genes in peri-implantitis from implant clustering failure patients: a whole-exome sequencing pilot study.

    Directory of Open Access Journals (Sweden)

    Soohyung Lee

    Full Text Available Peri-implantitis is a frequently occurring gum disease linked to multi-factorial traits with various environmental and genetic causalities and no known concrete pathogenesis. The varying severity of peri-implantitis among patients with relatively similar environments suggests a genetic aspect which needs to be investigated to understand and regulate the pathogenesis of the disease. Six unrelated individuals with multiple clustered implant failures due to severe peri-implantitis were chosen for this study. These six individuals had relatively healthy lifestyles, with minimal environmental causalities affecting peri-implantitis. Research was undertaken to investigate pathogenic genes in peri-implantitis, albeit with a small number of subjects and incomplete elimination of environmental causalities. Whole-exome sequencing was performed on saliva samples collected via a self DNA collection kit. Common variants with minor allele frequencies (MAF >= 0.05) from all control datasets were eliminated, and variants having high and moderate impact and loss of function were used for comparison. Gene set enrichment analysis was performed to reveal functional groups associated with the genetic variants. 2,022 genes were left after filtering against dbSNP, the 1000 Genomes East Asian population, and healthy Korean randomized subsample data (GSK project). 175 (p-value <0.05) out of 927 gene sets were obtained via GSEA (DAVID). The top 10 were chosen (p-value <0.05) from cluster enrichment, showing significance of cytoskeleton, cell adhesion, and metal ion binding. Network analysis was applied to find relationships between functional clusters. Among the functional groups, metal ion binding was located in the center of all clusters, indicating that dysfunction of regulation in metal ion concentration might affect cell morphology or cell adhesion, resulting in implant failure. This result may demonstrate the feasibility of and provide pilot data for a larger research
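    The variant filtering step described in this abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the record layout, the dataset keys, and the function name are hypothetical. The rule used here, dropping a variant if its MAF reaches 0.05 in any control dataset and treating a variant absent from a dataset as having MAF 0, is one possible reading of the abstract and is not confirmed by it.

```python
MAF_THRESHOLD = 0.05                 # cutoff stated in the abstract
KEPT_IMPACTS = {"HIGH", "MODERATE"}  # hypothetical impact labels


def filter_variants(variants, control_keys, threshold=MAF_THRESHOLD):
    """Keep variants that are rare in every control dataset and that
    carry a high or moderate predicted impact."""
    kept = []
    for var in variants:
        # Assumption: a variant missing from a control dataset has MAF 0.
        rare = all(var["maf"].get(key, 0.0) < threshold for key in control_keys)
        if rare and var["impact"] in KEPT_IMPACTS:
            kept.append(var)
    return kept


# Hypothetical records; gene names and frequencies are invented.
toy_variants = [
    {"id": "v1", "gene": "GENE1", "impact": "HIGH",
     "maf": {"dbsnp": 0.01, "kg_eas": 0.02}},
    {"id": "v2", "gene": "GENE2", "impact": "HIGH",
     "maf": {"dbsnp": 0.20, "kg_eas": 0.01}},   # common in one dataset
    {"id": "v3", "gene": "GENE3", "impact": "LOW",
     "maf": {"dbsnp": 0.00}},                   # rare but low impact
    {"id": "v4", "gene": "GENE4", "impact": "MODERATE", "maf": {}},
]
toy_kept = filter_variants(toy_variants, ["dbsnp", "kg_eas"])
toy_kept_ids = [var["id"] for var in toy_kept]
```

    A real pipeline would read the frequencies from annotated variant files rather than from in-memory dictionaries.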

  18. Improved inference of gene regulatory networks through integrated Bayesian clustering and dynamic modeling of time-course expression data. (United States)

    Godsey, Brian


    Inferring gene regulatory networks from expression data is difficult, but it is common and often useful. Most network problems are under-determined--there are more parameters than data points--and therefore data or parameter set reduction is often necessary. Correlation between variables in the model also contributes to confound network coefficient inference. In this paper, we present an algorithm that uses integrated, probabilistic clustering to ease the problems of under-determination and correlated variables within a fully Bayesian framework. Specifically, ours is a dynamic Bayesian network with integrated Gaussian mixture clustering, which we fit using variational Bayesian methods. We show, using public, simulated time-course data sets from the DREAM4 Challenge, that our algorithm outperforms non-clustering methods in many cases (7 out of 25) with fewer samples, rarely underperforming (1 out of 25), and often selects a non-clustering model if it better describes the data. Source code (GNU Octave) for BAyesian Clustering Over Networks (BACON) and sample data are available at:

  19. Analysis of an inactive cyanobactin biosynthetic gene cluster leads to discovery of new natural products from strains of the genus Microcystis.

    Directory of Open Access Journals (Sweden)

    Niina Leikoski

    Full Text Available Cyanobactins are cyclic peptides assembled through the cleavage and modification of short precursor proteins. An inactive cyanobactin gene cluster has been described from the genome of Microcystis aeruginosa NIES843. Here we report the discovery of active counterparts in strains of the genus Microcystis guided by this silent cyanobactin gene cluster. The end products of the gene clusters were structurally diverse cyclic peptides, which we named piricyclamides. Some of the piricyclamides consisted solely of proteinogenic amino acids, while others contained disulfide bridges and some were prenylated or geranylated. The piricyclamide gene clusters encoded between 1 and 4 precursor genes. They encoded highly diverse core peptides ranging in length from 7 to 17 amino acids with just a single conserved amino acid. Heterologous expression of the pir gene cluster from Microcystis aeruginosa PCC7005 in Escherichia coli confirmed that this gene cluster is responsible for the biosynthesis of piricyclamides. Chemical analysis demonstrated that Microcystis strains could produce an array of piricyclamides, some of which are geranylated or prenylated. The genetic diversity of piricyclamides in a bloom sample was explored and 19 different piricyclamide precursor genes were found. This study provides evidence for a stunning array of piricyclamides in Microcystis, a bloom-forming cyanobacterial genus that occurs worldwide.

  20. Variants in linkage disequilibrium with the late cornified envelope gene cluster deletion are associated with susceptibility to psoriatic arthritis.

    LENUS (Irish Health Repository)

    Bowes, John


    A common deletion mapping to the psoriasis susceptibility locus 4 on chromosome 1q21, encompassing two genes of the late cornified envelope (LCE) gene cluster, has been associated with an increased risk of psoriasis vulgaris (PsV). One previous report found no association of the deletion with psoriatic arthritis (PsA), suggesting it may be a specific risk factor for PsV. Given the genetic overlap between PsA and PsV, a study was undertaken to investigate whether single nucleotide polymorphisms (SNPs) mapping to this locus are risk factors for PsA in a UK and Irish population.

  1. Transcriptional analysis of the jamaicamide gene cluster from the marine cyanobacterium Lyngbya majuscula and identification of possible regulatory proteins

    Directory of Open Access Journals (Sweden)

    Dorrestein Pieter C


    Full Text Available Abstract Background The marine cyanobacterium Lyngbya majuscula is a prolific producer of bioactive secondary metabolites. Although biosynthetic gene clusters encoding several of these compounds have been identified, little is known about how these clusters of genes are transcribed or regulated, and techniques targeting genetic manipulation in Lyngbya strains have not yet been developed. We conducted transcriptional analyses of the jamaicamide gene cluster from a Jamaican strain of Lyngbya majuscula, and isolated proteins that could be involved in jamaicamide regulation. Results An unusually long untranslated leader region of approximately 840 bp is located between the jamaicamide transcription start site (TSS) and gene cluster start codon. All of the intergenic regions between the pathway ORFs were transcribed into RNA in RT-PCR experiments; however, a promoter prediction program indicated the possible presence of promoters in multiple intergenic regions. Because the functionality of these promoters could not be verified in vivo, we used a reporter gene assay in E. coli to show that several of these intergenic regions, as well as the primary promoter preceding the TSS, are capable of driving β-galactosidase production. A protein pulldown assay was also used to isolate proteins that may regulate the jamaicamide pathway. Pulldown experiments using the intergenic region upstream of jamA as a DNA probe isolated two proteins that were identified by LC-MS/MS. By BLAST analysis, one of these had close sequence identity to a regulatory protein in another cyanobacterial species. Protein comparisons suggest a possible correlation between secondary metabolism regulation and light dependent complementary chromatic adaptation. Electromobility shift assays were used to evaluate binding of the recombinant proteins to the jamaicamide promoter region. Conclusion Insights into natural product regulation in cyanobacteria are of significant value to drug discovery

  2. Severe Developmental Delay in a Patient with 7p21.1-p14.3 Microdeletion Spanning the TWIST Gene and the HOXA Gene Cluster. (United States)

    Fryssira, H; Makrythanasis, P; Kattamis, A; Stokidis, K; Menten, B; Kosaki, K; Willems, P; Kanavakis, E


    We describe a patient with a rare interstitial deletion of chromosome 7p21.1-p14.3 detected by array-CGH. The deletion encompassed 74 genes and caused haploinsufficiency (or loss of allele) of 6 genes known to be implicated in different autosomal dominant genetic disorders: TWIST, DFNA5, CYCS, HOXA11, HOXA13, and GARS. The patient had several morphological abnormalities similar to Saethre-Chotzen syndrome (caused by TWIST mutations) including craniosynostosis of the coronal suture and anomalies similar to those seen in hand-foot-uterus syndrome (caused by HOXA13 mutations) including hypospadias. The combined phenotype of Saethre-Chotzen syndrome and hand-foot-uterus syndrome of our patient closely resembles a previously reported case with a cytogenetically visible small deletion spanning 7p21-p14.3. We therefore conclude that microdeletions of 7p spanning the TWIST gene and HOXA gene cluster lead to a clinically recognizable 'haploinsufficiency syndrome'.

  3. Identification of gene clusters associated with host adaptation and antibiotic resistance in Chinese Staphylococcus aureus isolates by microarray-based comparative genomics.

    Directory of Open Access Journals (Sweden)

    Henan Li

    Full Text Available A comparative genomic microarray comprising 2,457 genes from two whole genomes of S. aureus was employed for the comparative genome hybridization analysis of 50 strains of divergent clonal lineages, including methicillin-resistant S. aureus (MRSA), methicillin-susceptible S. aureus (MSSA), and swine strains in China. Large-scale validation was confirmed via polymerase chain reaction in 160 representative clinical strains. All of the 50 strains were clustered into seven different complexes by phylogenetic tree analysis. Thirteen gene clusters were specific to different S. aureus clones. Ten gene clusters, including seven known (vSa3, vSa4, vSaα, vSaβ, Tn5801, and phage ϕSa3) and three novel (C8, C9, and C10) gene clusters, were specific to human MRSA. Notably, two global regulators, sarH2 and sarH3, at cluster C9 were specific to human MRSA, and plasmid pUB110 at cluster C10 was specific to swine MRSA. Three clusters known to be part of SCCmec, vSa4 or Tn5801, and vSaα as well as one novel gene cluster C12 with homology with Tn554 of S. epidermidis were identified as MRSA-specific gene clusters. The replacement of ST239-spa t037 with ST239-spa t030 in Beijing may be a result of its acquisition of vSa4, phage ϕSa1, and ϕSa3. In summary, thirteen critical gene clusters were identified to be contributors to the evolution of host specificity and antibiotic resistance in Chinese S. aureus.

  4. Mapping of a gene for epidermolytic palmoplantar keratoderma to the region of the acidic keratin gene cluster at 17q12-q21. (United States)

    Reis, A; Küster, W; Eckardt, R; Sperling, K


    Epidermolytic palmoplantar keratoderma (EPPK) (Vörner-Unna-Thost) is an autosomal dominantly inherited skin disease of unknown etiology characterized by diffuse severe hyperkeratosis of the palms and soles and, histologically, by cellular degeneration. We have mapped a gene for EPPK to chromosome 17q11-q23, with linkage analysis using microsatellite DNA-polymorphisms, in a single large family of 7 generations. A maximum lod score of z = 6.66 was obtained with the probe D17S579 at a recombination fraction of theta = 0.00. This locus maps to the same region as the type I (acidic) keratin gene cluster. Keratins, members of the intermediate filament family, the major proteins of the cytoskeleton in epidermis, are differentially expressed in a tissue-specific manner. One acidic keratin, keratin 9 (KRT9), is expressed only in the terminally differentiated epidermis of palms and soles. The KRT9 gene has not yet been cloned; however, since the genes for most acidic keratins are clustered, it is highly probable that it too will map to this region. We therefore propose KRT9 as the candidate gene for EPPK.
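    To give a feel for the size of the lod score quoted above, here is a minimal sketch of a two-point lod score under a deliberately simplified model. The model is an assumption for illustration, not the authors' method: every meiosis is treated as phase-known and fully informative, so the likelihood is binomial in the recombination fraction. The function name `lod_score` and the meiosis counts are hypothetical; real pedigree analyses rely on dedicated linkage software.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point lod score z(theta) = log10(L(theta) / L(0.5)).

    Simplifying assumption: every meiosis is phase-known and fully
    informative, so the likelihood is binomial in theta.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie in [0, meioses]")
    non_recombinants = meioses - recombinants
    # Python evaluates 0.0 ** 0 as 1.0, so theta = 0 with no recombinants is fine.
    likelihood = (theta ** recombinants) * ((1.0 - theta) ** non_recombinants)
    if likelihood == 0.0:
        return float("-inf")  # theta = 0 is ruled out once a recombinant is seen
    null_likelihood = 0.5 ** meioses  # free recombination, theta = 0.5
    return math.log10(likelihood / null_likelihood)


# Hypothetical example: 22 informative meioses, none recombinant, theta = 0.
print(round(lod_score(0, 22, 0.0), 2))
```

    Under this simplified model each non-recombinant meiosis adds log10(2), about 0.301, to the score at theta = 0, so a lod score near 6.66 corresponds to roughly 22 such meioses.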

  5. Cloning, reassembling and integration of the entire nikkomycin biosynthetic gene cluster into Streptomyces ansochromogenes lead to an improved nikkomycin production

    Directory of Open Access Journals (Sweden)

    Yang Haihua


    Full Text Available Abstract Background Nikkomycins are a group of peptidyl nucleoside antibiotics produced by Streptomyces ansochromogenes. They are competitive inhibitors of chitin synthase and show potent fungicidal, insecticidal, and acaricidal activities. Nikkomycin X and Z are the main components produced by S. ansochromogenes. Generation of a high-producing strain is crucial to scale up nikkomycin production for further clinical trials. Results To increase the yields of nikkomycins, an additional copy of the nikkomycin biosynthetic gene cluster (35 kb) was introduced into the nikkomycin-producing strain, S. ansochromogenes 7100. The gene cluster was first reassembled into an integrative plasmid by Red/ET technology combined with classic cloning methods, and then the resulting plasmid (pNIK) was introduced into S. ansochromogenes by conjugal transfer. Introduction of pNIK led to enhanced production of nikkomycins (880 mg L-1, 4-fold nikkomycin X and 210 mg L-1, 1.8-fold nikkomycin Z) in the resulting exconjugants compared with the parent strain (220 mg L-1 nikkomycin X and 120 mg L-1 nikkomycin Z). The exconjugants are genetically stable in the absence of antibiotic resistance selection pressure. Conclusion A high nikkomycin-producing strain (1100 mg L-1 nikkomycins) was obtained by introduction of an extra nikkomycin biosynthetic gene cluster into the genome of S. ansochromogenes. The strategies presented here could be applicable to other bacteria to improve the yields of secondary metabolites.
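    The fold increases quoted above can be checked with a few lines of arithmetic. The sketch below uses only the four titres given in the abstract; the variable and function names are illustrative and not taken from the paper.

```python
# Titres quoted in the abstract above, in mg per litre.
parent = {"nikkomycin X": 220.0, "nikkomycin Z": 120.0}
exconjugant = {"nikkomycin X": 880.0, "nikkomycin Z": 210.0}


def fold_change(after: float, before: float) -> float:
    """Ratio of the engineered-strain titre to the parent-strain titre."""
    if before <= 0:
        raise ValueError("parent titre must be positive")
    return after / before


fold_changes = {name: fold_change(exconjugant[name], parent[name]) for name in parent}
total_titre = sum(exconjugant.values())

for name, ratio in fold_changes.items():
    print(f"{name}: {ratio:.2f}-fold")
print(f"combined titre: {total_titre:.0f} mg/L")
```

    The ratios come out at 4.0 and 1.75, in line with the reported 4-fold and 1.8-fold increases, and the two exconjugant titres sum to 1090 mg L-1, which matches the rounded total of 1100 mg L-1.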

  6. Novel polyoxins generated by heterologously expressing polyoxin biosynthetic gene cluster in the sanN inactivated mutant of Streptomyces ansochromogenes

    Directory of Open Access Journals (Sweden)

    Li Jine


    Full Text Available Abstract Background Polyoxins are potent inhibitors of chitin synthetases in fungi and insects. The gene cluster responsible for biosynthesis of polyoxins has been cloned and sequenced from Streptomyces cacaoi and tens of polyoxin analogs have been identified already. Results The polyoxin biosynthetic gene cluster from Streptomyces cacaoi was heterologously expressed in the sanN inactivated mutant of Streptomyces ansochromogenes as a nikkomycin producer. Besides hybrid antibiotics (polynik A and polyoxin N) and some known polyoxins, two novel polyoxin analogs were accumulated. One of them is polyoxin P that has 5-aminohexuronic acid with N-glycosidically bound thymine as the nucleoside moiety and dehydroxyl-carbamoylpolyoxic acid as the peptidyl moiety. The other analog is polyoxin O that contains 5-aminohexuronic acid bound thymine as the nucleoside moiety, but recruits polyoximic acid as the sole peptidyl moiety. Bioassay against phytopathogenic fungi showed that polyoxin P displayed comparatively strong inhibitory activity, whereas the inhibitory activity of polyoxin O was weak under the same testing conditions. Conclusion Two novel polyoxin analogs (polyoxin P and O) were generated by the heterologous expression of polyoxin biosynthetic gene cluster in the sanN inactivated mutant of Streptomyces ansochromogenes. Polyoxin P showed potent antifungal activity, while the activity of polyoxin O was weak. The strategy presented here may be applicable to other antibiotic producers.

  7. Evolution of a pentameral body plan was not linked to translocation of anterior Hox genes: the echinoderm HOX cluster revisited. (United States)

    Byrne, Maria; Martinez, Pedro; Morris, Valerie


    Echinodermata is a large phylum of marine invertebrates characterized by an adult, pentameral body plan. This morphology is clearly derived as all members of Deuterostomia (the superphylum to which they belong) have a bilateral body plan. The origin of the pentameral plan has been the subject of intense debate. It is clear that the ancestor of Echinodermata had a bilateral plan but how this ancestor transformed its body "architecture" in such a drastic manner is not clear. Data from the fossil record and ontogeny are sparse and, so far, not very informative. The sequencing of the sea urchin genome a decade ago opened the possibility that the pentameral body plan was a consequence of a broken Hox cluster and a series of papers dwelt on the putative relationship between Hox gene arrangements in the chromosomes and the origin of pentamery. This relationship, sound as it was, is challenged by the revelation that the sea star HOX cluster is, in fact, intact, thus falsifying the hypothesis of a direct relationship between HOX cluster arrangement and the origin of the pentameral body plan. Here, we explore the relationship between Hox gene arrangements and echinoderm body "architecture," the expression of Hox genes in development and alternative scenarios for the origin of pentamery, with putative roles for signaling centers in generating multiple axes.

  8. Total alpha-globin gene cluster deletion has high frequency in Filipinos

    Energy Technology Data Exchange (ETDEWEB)

    Hunt, J.A.; Haruyama, A.Z.; Chu, B.M. [Kapiolani Medical Center, Honolulu, HI (United States)] [and others]


    Most α-thalassemias [Thal] are due to large deletions. In Southeast Asians, the (--SEA) double α-globin gene deletion is common; 3 (--Tot) total α-globin cluster deletions are known: Filipino (--Fil), Thai (--Thai), and Chinese (--Chin). In a Hawaii Thal project, provisional diagnosis of α-Thal-1 heterozygotes was based on microcytosis, normal isoelectric focusing, and no iron deficiency. One in 10 unselected Filipinos was an α-Thal-1 heterozygote; 2/3 of these had a (--Tot) deletion: a ζ-cDNA probe consistently showed fainter intensity of the constant 5.5 kb ζ2 BamHI band, with no heterozygosity for ζ-globin region polymorphisms; α-cDNA or ζ-cDNA probes showed no BamHI or BglII bands diagnostic of the (--SEA) deletion; bands for the (-α) α-Thal-2 single α-globin deletions were only seen in Hb H cases. A reliable monoclonal anti-ζ-peptide antibody test for the (--SEA) deletion was always negative in (--Tot) samples. Southern digests with the Lo probe, a gift from D. Higgs of Oxford Univ., confirmed that 49 of 50 (--Tot) chromosomes in Filipinos were (--Fil). Of 20 α-Thal-1 hydrops born to Filipinos, 11 were (--Fil/--SEA) compound heterozygotes; 9 were (--SEA/--SEA) homozygotes, but none was a (--Fil/--Fil).

  9. Identification of Hox genes and rearrangements within the single homeobox (Hox) cluster (192.8 kb) of the cyclopoid copepod (Paracyclopina nana). (United States)

    Kim, Hui-Su; Kim, Bo-Mi; Lee, Bo-Young; Souissi, Sami; Park, Heum Gi; Lee, Jae-Seong


    We report the first identification of the entire complement of the eight typical homeobox (hox) genes (lab, pb, Dfd, scr, antp, ubx, Abd-A, and Abd-B) and the ftz gene in a 192.8 kb region in the cyclopoid copepod Paracyclopina nana. A Hox3 gene ortholog was not present in the P. nana hox gene cluster, while the P. nana Dfd gene was transcribed in the opposite direction to the Daphnia pulex Dfd gene, but in the same direction as the Dfd genes of the fruit fly Drosophila melanogaster and red flour beetle Tribolium castaneum. The location of the lab and pb genes was switched in the P. nana hox cluster, while the order of the remaining hox genes was generally conserved with those of other arthropods. J. Exp. Zool. (Mol. Dev. Evol.) 9999B:XX-XX, 2016. © 2016 Wiley Periodicals, Inc.

  10. Genomic sequence analysis of the 238-kb swine segment with a cluster of TRIM and olfactory receptor genes located, but with no class I genes, at the distal end of the SLA class I region. (United States)

    Ando, Asako; Shigenari, Atsuko; Kulski, Jerzy K; Renard, Christine; Chardon, Patrick; Shiina, Takashi; Inoko, Hidetoshi


    Continuous genomic sequence has been previously determined for the swine leukocyte antigen (SLA) class I region from the TNF gene cluster at the border between the major histocompatibility complex (MHC) class III and class I regions to the UBD gene at the telomeric end of the classical class I gene cluster (SLA-1 to SLA-5, SLA-9, SLA-11). To complete the genomic sequence of the entire SLA class I genomic region, we have analyzed the genomic sequences of two BAC clones carrying a continuous 237,633-bp-long segment spanning from the TRIM15 gene to the UBD gene located on the telomeric side of the classical SLA class I gene cluster. Fifteen non-class I genes, including the zinc finger and the tripartite motif (TRIM) ring-finger-related family genes and olfactory receptor genes, were identified in the 238-kilobase (kb) segment, and their location in the segment was similar to their apparent human homologs. In contrast, a human segment (alpha block) spanning about 375 kb from the gene ETF1P1 and from the HLA-J to HLA-F genes was absent from the 238-kb swine segment. We conclude that the gene organization of the MHC non-class I genes located in the telomeric side of the classical SLA class I gene cluster is remarkably similar between the swine and the human segments, although the swine lacks a 375-kb segment corresponding to the human alpha block.

  11. An original SERPINA3 gene cluster: Elucidation of genomic organization and gene expression in the Bos taurus 21q24 region

    Directory of Open Access Journals (Sweden)

    Ouali Ahmed


    Full Text Available Abstract Background The superfamily of serine proteinase inhibitors (serpins) is involved in numerous fundamental biological processes such as inflammation, blood coagulation and apoptosis. Our interest is focused on the SERPINA3 sub-family. The major human plasma protease inhibitor, α1-antichymotrypsin, encoded by the SERPINA3 gene, is homologous to genes organized in clusters in several mammalian species. However, although there is a similar genic organization with a high degree of sequence conservation, the reactive-centre-loop domains, which are responsible for the protease specificity, show significant divergences. Results We provide additional information by analyzing the situation of SERPINA3 in the bovine genome. A cluster of eight genes and one pseudogene sharing a high degree of identity and the same structural organization was characterized. Bovine SERPINA3 genes were localized by radiation hybrid mapping on 21q24 and only spanned over 235 Kilobases. For all these genes, we propose a new nomenclature from SERPINA3-1 to SERPINA3-8. They share approximately 70% identity with the human SERPINA3 homologue. In the cluster, we described an original sub-group of six members with an unexpectedly high degree of conservation for the reactive-centre-loop domain, suggesting a similar peptidase inhibitory pattern. Preliminary expression analyses of these bovSERPINA3s showed different tissue-specific patterns and diverse states of glycosylation and phosphorylation. Finally, in the context of phylogenetic analyses, we improved our knowledge of mammalian SERPINA evolution. Conclusion Our experimental results update data of the bovine genome sequencing, substantially increase the bovSERPINA3 sub-family and enrich the phylogenetic tree of serpins. We provide new opportunities for future investigations to approach the biological functions of this unusual subset of serine proteinase inhibitors.

  12. The phn Genes of Burkholderia sp. Strain RP007 Constitute a Divergent Gene Cluster for Polycyclic Aromatic Hydrocarbon Catabolism



    Cloning and molecular ecological studies have underestimated the diversity of polycyclic aromatic hydrocarbon (PAH) catabolic genes by emphasizing classical nah-like (nah, ndo, pah, and dox) sequences. Here we report the description of a divergent set of PAH catabolic genes, the phn genes, which although isofunctional to the classical nah-like genes, show very low homology. This phn locus, which contains nine open reading frames (ORFs), was isolated on an 11.5-kb HindIII fragment from phenant...

  13. Cluster of genes that encode positive and negative elements influencing filament length in a heterocyst-forming cyanobacterium. (United States)

    Merino-Puerto, Victoria; Herrero, Antonia; Flores, Enrique


    The filamentous, heterocyst-forming cyanobacteria perform oxygenic photosynthesis in vegetative cells and nitrogen fixation in heterocysts, and their filaments can be hundreds of cells long. In the model heterocyst-forming cyanobacterium Anabaena sp. strain PCC 7120, the genes in the fraC-fraD-fraE operon are required for filament integrity mainly under conditions of nitrogen deprivation. The fraC operon transcript partially overlaps gene all2395, which lies in the opposite DNA strand and ends 1 bp beyond fraE. Gene all2395 produces transcripts of 1.35 kb (major transcript) and 2.2 kb (minor transcript) that overlap fraE and whose expression is dependent on the N-control transcription factor NtcA. Insertion of a gene cassette containing transcriptional terminators between fraE and all2395 prevented production of the antisense RNAs and resulted in an increased length of the cyanobacterial filaments. Deletion of all2395 resulted in a larger increase of filament length and in impaired growth, mainly under N2-fixing conditions and specifically on solid medium. We denote all2395 the fraF gene, which encodes a protein restricting filament length. A FraF-green fluorescent protein (GFP) fusion protein accumulated significantly in heterocysts. Similar to some heterocyst differentiation-related proteins such as HglK, HetL, and PatL, FraF is a pentapeptide repeat protein. We conclude that the fraC-fraD-fraE←fraF gene cluster (where the arrow indicates a change in orientation), in which cis antisense RNAs are produced, regulates morphology by encoding proteins that influence positively (FraC, FraD, FraE) or negatively (FraF) the length of the filament mainly under conditions of nitrogen deprivation. This gene cluster is often conserved in heterocyst-forming cyanobacteria.

  14. Pathway-specific regulation revisited: cross-regulation of multiple disparate gene clusters by PAS-LuxR transcriptional regulators. (United States)

    Vicente, Cláudia M; Payero, Tamara D; Santos-Aberturas, Javier; Barreales, Eva G; de Pedro, Antonio; Aparicio, Jesús F


    PAS-LuxR regulators are highly conserved proteins devoted to the control of antifungal production by binding to operators located in given promoters of polyene biosynthetic genes. The canonical operator of PimM, archetype of this class of regulators, has been used here to search for putative targets of orthologous protein PteF in the genome of Streptomyces avermitilis, finding 97 putative operators outside the pentaene filipin gene cluster (pte). The processes putatively affected included genetic information processing; energy, carbohydrate, and lipid metabolism; DNA replication and repair; morphological differentiation; secondary metabolite biosynthesis; and transcriptional regulation, among others. Seventeen of these operators were selected, and their binding to PimM DNA-binding domain was assessed by electrophoretic mobility shift assays. Strikingly, the protein bound all predicted operators suggesting a direct control over targeted processes. As a proof of concept, we studied the biosynthesis of the ATP-synthase inhibitor oligomycin whose gene cluster included two operators. Regulator mutants showed a severe loss of oligomycin production, whereas gene complementation of the mutant restored phenotype, and gene duplication in the wild-type strain boosted oligomycin production. Comparative gene expression analyses in parental and mutant strains by reverse transcription-quantitative polymerase chain reaction of selected olm genes corroborated production results. These results demonstrate that PteF is able to cross-regulate the biosynthesis of two related secondary metabolites, filipin and oligomycin, but might be extended to all the processes indicated above. This study highlights the complexity of the network of interactions in which PAS-LuxR regulators are involved and opens new possibilities for the manipulation of metabolite production in Streptomycetes.

  15. Blast fungus-induction and developmental and tissue-specific expression of a rice P450 CYP72A gene cluster

    Institute of Scientific and Technical Information of China (English)

    WANG Yaling; LI Qun; HE Zuhua


    The cytochrome P450 gene superfamily is widely involved in diverse processes of plant development and environmental responses, including defense response to pathogens. We previously isolated a rice cDNA fragment in a DD-PCR screening for blast fungus-induced genes. In the current study, we isolated a CYP72A gene cluster consisting of 7 P450 CYP72A genes (CYP72A17~23) with the conserved cDNA sequence through the public rice genome data. There are a total of 14 putative CYP72A members in the rice genome, with high diversity in the N-terminal sequences but high homology in the C-terminal sequences of the 14 putative proteins. We analyzed expression profiles of the 7 cloned CYP72A genes during pathogen infection and development. The results showed that expression of CYP72A18, 19, 22 and 23 was differentially regulated in the incompatible and compatible interactions between rice and blast fungus. Except for CYP72A20, a pseudogene, the other 6 CYP72A genes also exhibited temporal and spatial expression patterns. These findings provide fundamental data for rice P450 gene function analysis.

  16. Organisation and expression of a cluster of yolk protein genes in the Australian sheep blowfly, Lucilia cuprina. (United States)

    Scott, Maxwell J; Atapattu, Asela; Schiemann, Anja H; Concha, Carolina; Henry, Rebecca; Carey, Brandi-lee; Belikoff, Esther J; Heinrich, Jörg C; Sarkar, Abhimanyu


    The Australian sheep blowfly Lucilia cuprina is a major pest for the Australian and New Zealand sheep industries. With the long-term aim of making a strain of L. cuprina suitable for a genetic control program, we previously developed a tetracycline-repressible female lethal genetic system in Drosophila. A key part of this system is a female-specific promoter from a yolk protein (yp) gene controlling expression of the tetracycline-dependent transactivator (tTA). Here we report the sequence of a 14.2 kb genomic clone from L. cuprina that contains a cluster of three complete yp genes and one partial yp gene. The Lcyp genes are specifically expressed in females that have received a protein meal. A bioinformatic analysis of the promoter of one of the yp genes (LcypA) identified several putative binding sites for DSX, a known regulator of yp gene expression in other Diptera. A transgenic strain of L. cuprina was made that contained the LcypA promoter driving the expression of the Escherichia coli lacZ reporter gene. Transgenic females express high levels of β-galactosidase after a protein meal. Thus the LcypA promoter could be used to obtain female-specific expression of tTA in transgenic L. cuprina.

  17. Concepts of relative sample outlier (RSO) and weighted sample similarity (WSS) for improving performance of clustering genes: co-function and co-regulation. (United States)

    Bhattacharya, Anindya; Chowdhury, Nirmalya; De, Rajat K


    The performance of clustering algorithms depends largely on the selected similarity measure. Efficiency in handling outliers is a major contributor to the success of a similarity measure. The better a similarity measure captures the similarity between genes in the presence of outliers, the better the clustering algorithm performs in forming biologically relevant groups of genes. In the present article, we discuss the problem of handling outliers with different existing similarity measures and introduce the concept of Relative Sample Outlier (RSO). We formulate a new similarity measure, called Weighted Sample Similarity (WSS), incorporated in Euclidean distance and Pearson correlation coefficient, and then use them in various clustering and biclustering algorithms to group different gene expression profiles. Our results suggest that WSS improves the performance of all the considered clustering algorithms in terms of finding biologically relevant groups of genes.
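    The outlier problem described above can be seen in a small sketch that computes the two standard measures named in the abstract, Pearson correlation and Euclidean distance, on a pair of expression profiles with and without one outlying sample. The profiles are invented for illustration, and the code does not implement RSO or WSS, whose definitions are not given in the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def euclidean(x, y):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical expression profiles of two co-regulated genes across six samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_b = [1.1, 2.1, 2.9, 4.2, 4.8, 6.1]
# Same profile with a single outlying measurement in the last sample.
gene_b_outlier = gene_b[:-1] + [-20.0]

for label, profile in (("clean", gene_b), ("one outlier", gene_b_outlier)):
    r = pearson(gene_a, profile)
    d = euclidean(gene_a, profile)
    print(f"{label}: pearson={r:.3f}, euclidean={d:.3f}")
```

    With the clean profile the two genes look almost perfectly correlated and close together; a single outlying sample turns the correlation negative and inflates the distance, which is the kind of distortion an outlier-aware weighting is meant to reduce.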

  18. Genome engineering and direct cloning of antibiotic gene clusters via phage ϕBT1 integrase-mediated site-specific recombination in Streptomyces. (United States)

    Du, Deyao; Wang, Lu; Tian, Yuqing; Liu, Hao; Tan, Huarong; Niu, Guoqing


    Several strategies have been used to clone large DNA fragments directly from bacterial genomes. Most of these approaches are based on different site-specific recombination systems consisting of a specialized recombinase and its target sites. In this study, a novel strategy based on phage ϕBT1 integrase-mediated site-specific recombination was developed and used for simultaneous Streptomyces genome engineering and cloning of antibiotic gene clusters. This method proved successful for the cloning of the actinorhodin gene cluster from Streptomyces coelicolor M145 and the napsamycin and daptomycin gene clusters from Streptomyces roseosporus NRRL 15998 at a frequency higher than 80%. Furthermore, the system could be used to increase the titer of antibiotics, as we demonstrated with actinorhodin and daptomycin, and it will be broadly applicable in many Streptomyces.

  19. Genetic studies on the APOA1-C3-A5 gene cluster in Asian Indians with premature coronary artery disease

    Directory of Open Access Journals (Sweden)

    Hebbagodi Sridhara


    Full Text Available Abstract Background The APOA1-C3-A5 gene cluster plays an important role in the regulation of lipids. Asian Indians have an increased tendency for abnormal lipid levels and high risk of Coronary Artery Disease (CAD). Therefore, the present study aimed to elucidate the relationship of four single nucleotide polymorphisms (SNPs) in the Apo11q cluster, namely the -75G>A, +83C>T SNPs in the APOA1 gene, the Sac1 SNP in the APOC3 gene and the S19W variant in the APOA5 gene to plasma lipids and CAD in 190 affected sibling pairs (ASPs) belonging to Asian Indian families with a strong CAD history. Methods & results Genotyping and lipid assays were carried out using standard protocols. Plasma lipids showed a strong heritability (h2 48% – 70%; P P A (LOD score 2.77 SNPs by single-point analysis (P A (pi 0.56 and +83C>T (pi 0.52 (P P A SNPs along with hypertension showed maximized correlations with TC, TG and Apo B by association analysis. Conclusion The APOC3-Sac1 SNP is an important genetic variant that is associated with CAD through its interaction with plasma lipids and other standard risk factors among Asian Indians.

  20. The powdery mildew resistance gene REN1 co-segregates with an NBS-LRR gene cluster in two Central Asian grapevines

    Directory of Open Access Journals (Sweden)

    Morgante Michele


    Full Text Available Abstract Background Grape powdery mildew is caused by the North American native pathogen Erysiphe necator. Eurasian Vitis vinifera varieties were all believed to be susceptible. REN1 is the first resistance gene naturally found in cultivated plants of Vitis vinifera. Results REN1 is present in 'Kishmish vatkana' and 'Dzhandzhal kara', two grapevines documented in Central Asia since the 1920's. These cultivars have a second-degree relationship (half sibs, grandparent-grandchild, or avuncular, and share by descent the chromosome on which the resistance allele REN1 is located. The REN1 interval was restricted to 1.4 cM using 38 SSR markers distributed across the locus and the segregation of the resistance phenotype in two progenies of collectively 461 offspring, derived from either resistant parent. The boundary markers delimit a 1.4-Mbp sequence in the PN40024 reference genome, which contains 27 genes with known functions, 2 full-length coiled-coil NBS-LRR genes, and 9 NBS-LRR pseudogenes. In the REN1 locus of PN40024, NBS genes have proliferated through a mixture of segmental duplications, tandem gene duplications, and intragenic recombination between paralogues, indicating that the REN1 locus has been inherently prone to producing genetic variation. Three SSR markers co-segregate with REN1, the outer ones confining the 908-kb array of NBS-LRR genes. Kinship and clustering analyses based on genetic distances with susceptible cultivars representative of Central Asian Vitis vinifera indicated that 'Kishmish vatkana' and 'Dzhandzhal kara' fit well into local germplasm. 'Kishmish vatkana' also has a parent-offspring relationship with the seedless table grape 'Sultanina'. 
In addition, the distant genetic relatedness to rootstocks, some of which are derived from North American species resistant to powdery mildew and have been used worldwide to guard against phylloxera since the late 1800's, argues against REN1 being infused into Vitis vinifera from a

  1. Global methylation silencing of clustered proto-cadherin genes in cervical cancer: serving as diagnostic markers comparable to HPV. (United States)

    Wang, Kai-Hung; Lin, Cuei-Jyuan; Liu, Chou-Jen; Liu, Dai-Wei; Huang, Rui-Lan; Ding, Dah-Ching; Weng, Ching-Feng; Chu, Tang-Yuan


    Epigenetic remodeling of cell adhesion genes is a common phenomenon in cancer invasion. This study aims to investigate global methylation of cell adhesion genes in cervical carcinogenesis and to apply them in early detection of cancer from cervical scraping. Genome-wide methylation array was performed on an investigation cohort, including 16 cervical intraepithelial neoplasia 3 (CIN3) and 20 cervical cancers (CA) versus 12 each of normal, inflammation and CIN1 as controls. Twelve members of clustered proto-cadherin (PCDH) genes were collectively methylated and silenced, which were validated in cancer cells of the cervix, endometrium, liver, head and neck, breast, and lung. In an independent cohort including 107 controls, 66 CIN1, 85 CIN2/3, and 38 CA, methylated PCDHA4 and PCDHA13 were detected in 2.8%, 24.2%, 52.9%, and 84.2% (P diagnostic markers for cervical cancer noninferior to HPV.

  2. The gp63 Gene Cluster Is Highly Polymorphic in Natural Leishmania (Viannia) braziliensis Populations, but Functional Sites Are Conserved (United States)

    Medina, Lilian S.; Souza, Bruno Araújo; Queiroz, Adriano; Guimarães, Luiz Henrique; Lima Machado, Paulo Roberto; M Carvalho, Edgar; Wilson, Mary Edythe; Schriefer, Albert


    GP63 or leishmanolysin is the major surface protease of Leishmania spp. involved in parasite virulence and host cell interaction. As such, GP63 is a potential target of eventual vaccines against these protozoa. In the current study we evaluate the polymorphism of gp63 in Leishmania (Viannia) braziliensis isolated from two sets of American tegumentary leishmaniasis (ATL) cases from Corte de Pedra, Brazil, including 35 cases diagnosed between 1994 and 2001 and 6 cases diagnosed between 2008 and 2011. Parasites were obtained from lesions by needle aspiration and cultivation. Genomic DNA was extracted, and 405 bp fragments, including sequences encoding the putative macrophage interacting sites, were amplified from gp63 genes of all isolates. DNA amplicons were cloned into plasmid vectors and ten clones per L. (V.) braziliensis isolate were sequenced. Alignment of cloned sequences showed extensive polymorphism among gp63 genes within and between parasite isolates. Overall, 45 different polymorphic alleles were detected in all samples, which could be segregated into two clusters. Cluster one included 25, and cluster two included 20 such genotypes. The predicted peptides showed overall conservation below 50%. In marked contrast, the conservation at segments with putative functional domains approached 90% (Fisher’s exact test p<0.0001). These findings show that gp63 is very polymorphic even among parasites from the same endemic focus, but the functional domains interacting with the mammalian host environment are conserved. PMID:27648939

  3. The ArcD1 and ArcD2 arginine/ornithine exchangers encoded in the arginine deiminase (ADI) pathway gene cluster of Lactococcus lactis

    NARCIS (Netherlands)

    Noens, Elke E E; Kaczmarek, Michał B; Żygo, Monika; Lolkema, Juke S


    The arginine deiminase pathway (ADI) gene cluster in Lactococcus lactis contains two copies of a gene encoding an L-arginine/L-ornithine exchanger, the arcD1 and arcD2 genes. The physiological function of ArcD1 and ArcD2 was studied by deleting the two genes. Deletion of arcD1 resulted in loss of th

  4. The type F6 neurotoxin gene cluster locus of group II Clostridium botulinum has evolved by successive disruption of two different ancestral precursors. (United States)

    Carter, Andrew T; Stringer, Sandra C; Webb, Martin D; Peck, Michael W


    Genome sequences of five different Group II (nonproteolytic) Clostridium botulinum type F6 strains were compared at a 50-kb locus containing the neurotoxin gene cluster. A clonal origin for these strains is indicated by the fact that sequences were identical except for strain Eklund 202F, with 10 single-nucleotide polymorphisms and a 15-bp deletion. The essential topB gene encoding topoisomerase III was found to have been split by the apparent insertion of 34.4 kb of foreign DNA (in a similar manner to that in Group II C. botulinum type E where the rarA gene has been disrupted by a neurotoxin gene cluster). The foreign DNA, which includes the intact 13.6-kb type F6 neurotoxin gene cluster, bears not only a newly introduced topB gene but also two nonfunctional botulinum neurotoxin gene remnants, a type B and a type E. This observation combined with the discovery of bacteriophage integrase genes and IS4 elements suggest that several rounds of recombination/horizontal gene transfer have occurred at this locus. The simplest explanation for the current genotype is that the ancestral bacterium, a Group II C. botulinum type B strain, received DNA firstly from a strain containing a type E neurotoxin gene cluster, then from a strain containing a type F6 neurotoxin gene cluster. Each event disrupted the previously functional neurotoxin gene. This degree of successive recombination at one hot spot is without precedent in C. botulinum, and it is also the first description of a Group II C. botulinum genome containing more than one neurotoxin gene sequence.

  5. Functional characterization of diverse ring-hydroxylating oxygenases and induction of complex aromatic catabolic gene clusters in Sphingobium sp. PNB

    Directory of Open Access Journals (Sweden)

    Pratick Khara


    Sphingobium sp. PNB, like other sphingomonads, has multiple ring-hydroxylating oxygenase (RHO) genes. Three different fosmid clones have been sequenced to identify the putative genes responsible for the degradation of various aromatics in this bacterial strain. Comparison of the map of the catabolic genes with that of different sphingomonads revealed a similar arrangement of gene clusters that harbors seven sets of RHO terminal components and a sole set of electron transport (ET) proteins. The presence of distinctly conserved amino acid residues in ferredoxin and in silico molecular docking analyses of ferredoxin with the well characterized terminal oxygenase components indicated the structural uniqueness of the ET component in sphingomonads. The predicted substrate specificities, derived from the phylogenetic relationship of each of the RHOs, were examined based on transformation of putative substrates and their structural homologs by the recombinant strains expressing each of the oxygenases and the sole set of available ET proteins. The RHO AhdA1bA2b was functionally characterized for the first time and was found to be capable of transforming ethylbenzene, propylbenzene, cumene, p-cymene and biphenyl, in addition to a number of polycyclic aromatic hydrocarbons. Overexpression of aromatic catabolic genes in strain PNB, revealed by real-time PCR analyses, is a way forward to understand the complex regulation of degradative genes in sphingomonads.

  6. Clustered array of ochratoxin A biosynthetic genes in Aspergillus steynii and their expression patterns in permissive conditions. (United States)

    Gil-Serna, Jessica; Vázquez, Covadonga; González-Jaén, María Teresa; Patiño, Belén


    Aspergillus steynii is probably the most relevant species of section Circumdati producing ochratoxin A (OTA). This mycotoxin contaminates a wide range of commodities and is highly toxic for humans and animals. Little is known about the biosynthetic genes and their regulation in Aspergillus species. In this work, we identified and analysed three contiguous genes in A. steynii using 5'-RACE and genome walking approaches which predicted a cytochrome P450 monooxygenase (p450ste), a non-ribosomal peptide synthetase (nrpsste) and a polyketide synthase (pksste). These three genes were contiguous within a 20742 bp long genomic DNA fragment. Their corresponding cDNAs were sequenced and their expression was analysed in three A. steynii strains using real time RT-PCR specific assays in permissive conditions in in vitro cultures. OTA was also analysed in these cultures. Comparative analyses of predicted genomic, cDNA and amino acid sequences were performed with sequences of similar gene functions. All the results obtained in these analyses were consistent, pointed to the involvement of these three genes in OTA biosynthesis by A. steynii, and showed a co-ordinated expression pattern. This is the first time that a clustered organization of OTA biosynthetic genes has been reported in the Aspergillus genus. The results also suggested that this situation might be common in Aspergillus OTA-producing species and distinct from the one described for Penicillium species.

  7. IMG-ABC: An Atlas of Biosynthetic Gene Clusters to Fuel the Discovery of Novel Secondary Metabolites

    Energy Technology Data Exchange (ETDEWEB)

    Chen, I-Min; Chu, Ken; Ratner, Anna; Palaniappan, Krishna; Huang, Jinghua; Reddy, T. B.K.; Cimermancic, Peter; Fischbach, Michael; Ivanova, Natalia; Markowitz, Victor; Kyrpides, Nikos; Pati, Amrita


    In the discovery of secondary metabolites (SMs), large-scale analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of relevant computational resources. We present IMG-ABC -- An Atlas of Biosynthetic gene Clusters within the Integrated Microbial Genomes (IMG) system. IMG-ABC is a rich repository of both validated and predicted biosynthetic clusters (BCs) in cultured isolates, single-cells and metagenomes linked with the SM chemicals they produce and enhanced with focused analysis tools within IMG. The underlying scalable framework enables traversal of phylogenetic dark matter and chemical structure space -- serving as a doorway to a new era in the discovery of novel molecules.

  8. Mandibulofacial dysostosis in a patient with a de novo 2;17 translocation that disrupts the HOXD gene cluster. (United States)

    Stevenson, David A; Bleyl, Steven B; Maxwell, Teresa; Brothman, Arthur R; South, Sarah T


    Treacher Collins syndrome (TCS) is the prototypical mandibulofacial dysostosis syndrome, but other mandibulofacial dysostosis syndromes have been described. We report an infant with mandibulofacial dysostosis and an apparently balanced de novo 2;17 translocation. She presented with severe lower eyelid colobomas requiring skin grafting, malar and mandibular hypoplasia, bilateral microtia with external auditory canal atresia, dysplastic ossicles, hearing loss, bilateral choanal stenosis, cleft palate without cleft lip, several oral frenula of the upper lip/gum, and micrognathia requiring tracheostomy. Her limbs were normal. Chromosome analysis at the 600-band level showed a 46,XX,t(2;17)(q24.3;q23) karyotype. Sequencing of the entire TCOF1 coding region did not show evidence of a sequence variation. High-resolution genomic microarray analysis did not identify a cryptic imbalance. FISH mapping refined the breakpoints to 2q31.1 and 17q24.3-25.1 and showed the 2q31.1 breakpoint likely affects the HOXD gene cluster. Several atypical findings and lack of an identifiable TCOF1 mutation suggest that this child has a provisionally unique mandibulofacial dysostosis syndrome. The apparently balanced de novo translocation provides candidate loci for atypical and TCOF1 mutation negative cases of TCS. Based on the agreement of our findings with one previous case of mandibulofacial dysostosis with a 2q31.1 translocation, we hypothesize that misexpression of genes in the HOXD gene cluster produced the described phenotype in this patient.

  9. New natural products isolated from Metarhizium robertsii ARSEF 23 by chemical screening and identification of the gene cluster through engineered biosynthesis in Aspergillus nidulans A1145. (United States)

    Kato, Hiroki; Tsunematsu, Yuta; Yamamoto, Tsuyoshi; Namiki, Takuya; Kishimoto, Shinji; Noguchi, Hiroshi; Watanabe, Kenji


    To rapidly identify novel natural products and their associated biosynthetic genes from underutilized and genetically difficult-to-manipulate microbes, we developed a method that uses (1) chemical screening to isolate novel microbial secondary metabolites, (2) bioinformatic analyses to identify a potential biosynthetic gene cluster and (3) heterologous expression of the genes in a convenient host to confirm the identity of the gene cluster and the proposed biosynthetic mechanism. The chemical screen was achieved by searching known natural product databases with data from liquid chromatographic and high-resolution mass spectrometric analyses collected on the extract from a target microbe culture. Using this method, we were able to isolate two new meroterpenes, subglutinols C (1) and D (2), from an entomopathogenic filamentous fungus Metarhizium robertsii ARSEF 23. Bioinformatics analysis of the genome allowed us to identify a gene cluster likely to be responsible for the formation of subglutinols. Heterologous expression of three genes from the gene cluster encoding a polyketide synthase, a prenyltransferase and a geranylgeranyl pyrophosphate synthase in Aspergillus nidulans A1145 afforded an α-pyrone-fused uncyclized diterpene, the expected intermediate of the subglutinol biosynthesis, thereby confirming the gene cluster to be responsible for the subglutinol biosynthesis. These results indicate the usefulness of our methodology in isolating new natural products and identifying their associated biosynthetic gene cluster from microbes that are not amenable to genetic manipulation. Our method should facilitate the natural product discovery efforts by expediting the identification of new secondary metabolites and their associated biosynthetic genes from a wider source of microbes.

  10. Organization of a resistance gene cluster linked to rhizomania resistance in sugar beet (United States)

    Genetic resistance to rhizomania has been in use for over 40 years. Characterization of the molecular basis for susceptibility and resistance has proved challenging. Nucleotide-binding leucine-rich-repeat-containing (NB-LRR) genes have been implicated in numerous gene-for-gene resistance interaction...

  11. Widespread occurrence and lateral transfer of the cyanobactin biosynthesis gene cluster in cyanobacteria. (United States)

    Leikoski, Niina; Fewer, David P; Sivonen, Kaarina


    Cyanobactins are small cyclic peptides produced by cyanobacteria. Here we demonstrate the widespread but sporadic occurrence of the cyanobactin biosynthetic pathway. We detected a cyanobactin biosynthetic gene in 48 of the 132 strains included in this study. Our results suggest that cyanobactin biosynthetic genes have a complex evolutionary history in cyanobacteria punctuated by a series of ancient horizontal gene transfer events.

  12. Widespread Occurrence and Lateral Transfer of the Cyanobactin Biosynthesis Gene Cluster in Cyanobacteria


    Leikoski, Niina; Fewer, David P.; Sivonen, Kaarina


    Cyanobactins are small cyclic peptides produced by cyanobacteria. Here we demonstrate the widespread but sporadic occurrence of the cyanobactin biosynthetic pathway. We detected a cyanobactin biosynthetic gene in 48 of the 132 strains included in this study. Our results suggest that cyanobactin biosynthetic genes have a complex evolutionary history in cyanobacteria punctuated by a series of ancient horizontal gene transfer events.

  13. Mutations in the ligand-binding domain of the androgen receptor gene cluster in two regions of the gene. (United States)

    McPhaul, M J; Marcelli, M; Zoppi, S; Wilson, C M; Griffin, J E; Wilson, J D


    We have analyzed the nucleotide sequence of the androgen receptor from 22 unrelated subjects with substitution mutations of the hormone-binding domain. Eleven had the phenotype of complete testicular feminization, four had incomplete testicular feminization, and seven had Reifenstein syndrome. The underlying functional defect in cultured skin fibroblasts included individuals with absent, qualitative, or quantitative defects in ligand binding. 19 of the 21 substitution mutations (90%) cluster in two regions that account for approximately 35% of the hormone-binding domain, namely, between amino acids 726 and 772 and between amino acids 826 and 864. The fact that one of these regions is homologous to a region of the human thyroid hormone receptor (hTR-beta) which is a known cluster site for mutations that cause thyroid hormone resistance implies that this localization of mutations is not a coincidence. These regions of the androgen receptor may be of particular importance for the formation and function of the hormone-receptor complex.

  14. Characterization of divIVA and other genes located in the chromosomal region downstream of the dcw cluster in Streptococcus pneumoniae. (United States)

    Fadda, Daniela; Pischedda, Carla; Caldara, Fabrizio; Whalen, Michael B; Anderluzzi, Daniela; Domenici, Enrico; Massidda, Orietta


    We analyzed the chromosome region of Streptococcus pneumoniae located downstream of the division and cell wall (dcw) cluster that contains the homolog of the Bacillus subtilis cell division gene divIVA and some genes of unknown function. Inactivation of divIVA in S. pneumoniae resulted in severe growth inhibition and defects in cell shape, nucleoid segregation, and cell division. Inactivation of the ylm genes resulted in some morphological and/or division abnormalities, depending on the inactivated gene. Transcriptional analysis revealed a relationship between these genes and the ftsA and ftsZ cell division genes, also indicating that the connection between the dcw cluster and the divIVA region is more extensive than just chromosomal position and gene organization.

  15. Interleukin-1 gene cluster variants in hemodialysis patients with end stage renal disease: An association and meta-analysis. (United States)

    Tripathi, G; Rangaswamy, D; Borkar, M; Prasad, N; Sharma, R K; Sankhwar, S N; Agrawal, S


    We evaluated whether polymorphisms in interleukin (IL-1) gene cluster (IL-1 alpha [IL-1A], IL-1 beta [IL-1B], and IL-1 receptor antagonist [IL-1RN]) are associated with end stage renal disease (ESRD). A total of 258 ESRD patients and 569 ethnicity matched controls were examined for IL-1 gene cluster. These were genotyped for five single-nucleotide gene polymorphisms in the IL-1A, IL-1B and IL-1RN genes and a variable number of tandem repeats (VNTR) in the IL-1RN. The IL-1B - 3953 and IL-1RN + 8006 polymorphism frequencies were significantly different between the two groups. At IL-1B, the T allele of - 3953C/T was increased among ESRD (P = 0.0001). A logistic regression model demonstrated that two repeat (240 base pair [bp]) of the IL-1Ra VNTR polymorphism was associated with ESRD (P = 0.0001). The C/C/C/C/C/1 haplotype was more prevalent in ESRD (P = 0.007). No linkage disequilibrium (LD) was observed between six loci of IL-1 gene. We further conducted a meta-analysis of existing studies and found that there is a strong association of IL-1 RN VNTR 86 bp repeat polymorphism with susceptibility to ESRD (odds ratio = 2.04, 95% confidence interval = 1.48-2.82; P = 0.000). IL-1B - 5887, +8006 and the IL-1RN VNTR polymorphisms have been implicated as potential risk factors for ESRD. The meta-analysis showed a strong association of IL-1RN 86 bp VNTR polymorphism with susceptibility to ESRD.

  16. An Ipomoea batatas iron-sulfur cluster scaffold protein gene, IbNFU1, is involved in salt tolerance.

    Directory of Open Access Journals (Sweden)

    Degao Liu

    Iron-sulfur cluster biosynthesis involving the nitrogen fixation (Nif) proteins has been proposed as a general mechanism acting in various organisms. NifU-like protein may play an important role in protecting plants against abiotic and biotic stresses. An iron-sulfur cluster scaffold protein gene, IbNFU1, was isolated from a salt-tolerant sweetpotato (Ipomoea batatas (L.) Lam.) line LM79 in our previous study, but its role in sweetpotato stress tolerance was not investigated. In the present study, the IbNFU1 gene was introduced into a salt-sensitive sweetpotato cv. Lizixiang to characterize its function in salt tolerance. The IbNFU1-overexpressing sweetpotato plants exhibited significantly higher salt tolerance compared with the wild-type. Proline and reduced ascorbate content were significantly increased, whereas malonaldehyde (MDA) content was significantly decreased in the transgenic plants. The activities of superoxide dismutase (SOD) and photosynthesis were significantly enhanced in the transgenic plants. H2O2 was also found to be significantly less accumulated in the transgenic plants than in the wild-type. Overexpression of IbNFU1 up-regulated pyrroline-5-carboxylate synthase (P5CS) and pyrroline-5-carboxylate reductase (P5CR) genes under salt stress. The systemic up-regulation of reactive oxygen species (ROS) scavenging genes was found in the transgenic plants under salt stress. These findings suggest that the IbNFU1 gene is involved in sweetpotato salt tolerance and enhances salt tolerance of the transgenic sweetpotato plants by regulating osmotic balance, protecting membrane integrity and photosynthesis and activating the ROS scavenging system.

  17. Mutations in the ligand-binding domain of the androgen receptor gene cluster in two regions of the gene.


    McPhaul, M J; Marcelli, M; Zoppi, S; Wilson, C. M.; Griffin, J E; Wilson, J. D.


    We have analyzed the nucleotide sequence of the androgen receptor from 22 unrelated subjects with substitution mutations of the hormone-binding domain. Eleven had the phenotype of complete testicular feminization, four had incomplete testicular feminization, and seven had Reifenstein syndrome. The underlying functional defect in cultured skin fibroblasts included individuals with absent, qualitative, or quantitative defects in ligand binding. 19 of the 21 substitution mutations (90%) cluster ...

  18. Structure and gene cluster of the O-antigen of Escherichia coli O156 containing a pyruvic acid acetal. (United States)

    Duan, Zhifeng; Senchenkova, Sof'ya N; Guo, Xi; Perepelov, Andrei V; Shashkov, Alexander S; Liu, Bin; Knirel, Yuriy A


    The lipopolysaccharide of Escherichia coli O156 was degraded under mild acidic and alkaline conditions and the resulting polysaccharides were studied by sugar analysis and (1)H and (13)C NMR spectroscopy. The following structure of the pentasaccharide repeating unit of the O-polysaccharide was established: where Rpyr indicates R-configurated pyruvic acid acetal. Minor O-acetyl groups also were present and tentatively localized on the Gal residues. The gene cluster for biosynthesis of the O-antigen of E. coli O156 was analyzed and shown to be consistent with the O-polysaccharide structure.

  19. Gene Repression in Haloarchaea Using the CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats)-Cas I-B System. (United States)

    Stachler, Aris-Edda; Marchfelder, Anita


    The clustered regularly interspaced short palindromic repeats (CRISPR)-Cas system is used by bacteria and archaea to fend off foreign genetic elements. Since its discovery it has been developed into numerous applications like genome editing and regulation of transcription in eukaryotes and bacteria. For archaea currently no tools for transcriptional repression exist. Because molecular biology analyses in archaea become more and more widespread such a tool is vital for investigating the biological function of essential genes in archaea. Here we use the model archaeon Haloferax volcanii to demonstrate that its endogenous CRISPR-Cas system I-B can be harnessed to repress gene expression in archaea. Deletion of cas3 and cas6b genes results in efficient repression of transcription. crRNAs targeting the promoter region reduced transcript levels down to 8%. crRNAs targeting the reading frame have only slight impact on transcription. crRNAs that target the coding strand repress expression only down to 88%, whereas crRNAs targeting the template strand repress expression down to 8%. Repression of an essential gene results in reduction of transcription levels down to 22%. Targeting efficiencies can be enhanced by expressing a catalytically inactive Cas3 mutant. Genes can be targeted on plasmids or on the chromosome, they can be monocistronic or part of a polycistronic operon.

  20. Molecular characterization of the PR-toxin gene cluster in Penicillium roqueforti and Penicillium chrysogenum: cross talk of secondary metabolite pathways. (United States)

    Hidalgo, Pedro I; Ullán, Ricardo V; Albillos, Silvia M; Montero, Olimpio; Fernández-Bodega, María Ángeles; García-Estrada, Carlos; Fernández-Aguado, Marta; Martín, Juan-Francisco


    The PR-toxin is a potent mycotoxin produced by Penicillium roqueforti in moulded grains and grass silages and may contaminate blue-veined cheese. The PR-toxin derives from the 15 carbon atoms sesquiterpene aristolochene formed by the aristolochene synthase (encoded by ari1). We have cloned and sequenced a four gene cluster that includes the ari1 gene from P. roqueforti. Gene silencing of each of the four genes (named prx1 to prx4) resulted in a reduction of 65-75% in the production of PR-toxin indicating that the four genes encode enzymes involved in PR-toxin biosynthesis. Interestingly the four silenced mutants overproduce large amounts of mycophenolic acid, an antitumor compound formed by an unrelated pathway suggesting a cross-talk of PR-toxin and mycophenolic acid production. An eleven gene cluster that includes the above mentioned four prx genes and a 14-TMS drug/H(+) antiporter was found in the genome of Penicillium chrysogenum. This eleven gene cluster has been reported to be very poorly expressed in a transcriptomic study of P. chrysogenum genes under conditions of penicillin production (strongly aerated cultures). We found that this apparently silent gene cluster is able to produce PR-toxin in P. chrysogenum under static culture conditions on hydrated rice medium. Noteworthily, the production of PR-toxin was 2.6-fold higher in P. chrysogenum npe10, a strain deleted in the 56.8kb amplifiable region containing the pen gene cluster, than in the parental strain Wisconsin 54-1255 providing another example of cross-talk between secondary metabolite pathways in this fungus. A detailed PR-toxin biosynthesis pathway is proposed based on all available evidence.

  1. Functional clustering and lineage markers: insights into cellular differentiation and gene function from large-scale microarray studies of purified primary cell populations. (United States)

    Hume, David A; Summers, Kim M; Raza, Sobia; Baillie, J Kenneth; Freeman, Thomas C


    Very large microarray datasets showing gene expression across multiple tissues and cell populations provide a window on the transcriptional networks that underpin the differences in functional activity between biological systems. Clusters of co-expressed genes provide lineage markers, candidate regulators of cell function and, by applying the principle of guilt by association, candidate functions for genes of currently unknown function. We have analysed a dataset comprising pure cell populations from hemopoietic and non-hemopoietic cell types. Using a novel network visualisation and clustering approach, we demonstrate that it is possible to identify very tight expression signatures associated specifically with embryonic stem cells, mesenchymal cells and hematopoietic lineages. Selected examples validate the prediction that gene function can be inferred by co-expression. One expression cluster was enriched in phagocytes, which, alongside endosome-lysosome constituents, contains genes that may make up a 'pathway' for phagocyte differentiation. Promoters of these genes are enriched for binding sites for the ETS/PU.1 and MITF families. Another cluster was associated with the production of a specific extracellular matrix, with high levels of gene expression shared by cells of mesenchymal origin (fibroblasts, adipocytes, osteoblasts and myoblasts). We discuss the limitations placed upon such data by the presence of alternative promoters with distinct tissue specificity within many protein-coding genes.

  2. Heterologous Reconstitution of the Intact Geodin Gene Cluster in Aspergillus nidulans through a Simple and Versatile PCR Based Approach

    DEFF Research Database (Denmark)

    Nielsen, Morten Thrane; Nielsen, Jakob Blæsbjerg; Anyaogu, Dianna Chinyere;


    Fungal natural products are a rich resource for bioactive molecules. To fully exploit this potential it is necessary to link genes to metabolites. Genetic information for numerous putative biosynthetic pathways has become available in recent years through genome sequencing. However, the lack...... was transferred in a two step procedure to an expression platform in A. nidulans. The individual cluster fragments were generated by PCR and assembled via efficient USER fusion prior to transformation and integration via re-iterative gene targeting. A total of 13 open reading frames contained in 25 kb of DNA were...... in characterizing the many exciting pathways for secondary metabolite production that are currently being uncovered by the fungal genome sequencing projects....

  3. Delineation of a scab resistance gene cluster on linkage group 2 of apple


    Bus, V.G.M.; De Weg, Van, W.E.; Durel, C.E.; Gessler, C.; Parisi, L.; Rikkerink, E.H.A.; Gardiner, S.E.; Meulenbroek, E.J.; Calenge, F.; Patocchi, A.; Laurens, F.N.D.


    With the advent of genetic maps for apple that carry common transferable markers, it is possible to investigate genomic relationships between genes present in different accessions. Co-dominant markers, such as microsatellites, are particularly useful for this purpose. In recent years, genetic markers have been developed for a number of resistance genes for apple scab (Venturia inaequalis). In this paper, we present the discovery of a new scab resistance gene (Vh8) that maps to linkage group 2...

  4. Characterization of biosynthetic gene cluster for the production of virginiamycin M, a streptogramin type A antibiotic, in Streptomyces virginiae. (United States)

    Pulsawat, Nattika; Kitani, Shigeru; Nihira, Takuya


    Virginiamycin M (VM) of Streptomyces virginiae is a hybrid polyketide-peptide antibiotic with peptide antibiotic virginiamycin S (VS) as its synergistic counterpart. VM and VS belong to the Streptogramin family, which is characterized by strong synergistic antibacterial activity, and their water-soluble derivatives are a new therapeutic option for combating vancomycin-resistant Gram-positive bacteria. Here, the VM biosynthetic gene cluster was isolated from S. virginiae in the 62-kb region located in the vicinity of the regulatory island for virginiamycin production. Sequence analysis revealed that the region consists of 19 complete open reading frames (ORFs) and one C-terminally truncated ORF, encoding hybrid polyketide synthase (PKS)-nonribosomal peptide synthetase (NRPS), typical PKS, enzymes synthesizing precursors for VM, transporters for resistance, regulatory proteins, and auxiliary enzymes. The involvement of the cloned gene cluster in VM biosynthesis was confirmed by gene disruption of virA encoding a hybrid PKS-NRPS megasynthetase, which resulted in complete loss of VM production without any effect on VS production. To assemble the VM core structure, VirA, VirF, VirG, and VirH consisting, as a whole, of 24 domains in 8 PKS modules and 7 domains in 2 NRPS modules were predicted to act as an acyltransferase (AT)-less hybrid PKS-NRPS, whereas VirB, VirC, VirD, and VirE are likely to be essential for the incorporation of the methyl group into the VM framework by a HMG-CoA synthase-based reaction. Among several uncommon features of gene organization in the VM gene cluster, the lack of AT domain in every PKS module and the presence of a discrete AT encoded by virI are notable. AT-overexpression by an additional copy of virI driven by ermEp() resulted in 1.5-fold increase of VM production, suggesting that the amount of VirI is partly limiting VM biosynthesis.

  5. Biosynthesis of Akaeolide and Lorneic Acids and Annotation of Type I Polyketide Synthase Gene Clusters in the Genome of Streptomyces sp. NPS554

    Directory of Open Access Journals (Sweden)

    Tao Zhou


    The incorporation pattern of biosynthetic precursors into two structurally unique polyketides, akaeolide and lorneic acid A, was elucidated by feeding experiments with 13C-labeled precursors. In addition, draft genome sequencing of the producer, Streptomyces sp. NPS554, was performed and the biosynthetic gene clusters for these polyketides were identified. The putative gene clusters contain all the polyketide synthase (PKS) domains necessary for assembly of the carbon skeletons. Combined with the 13C-labeling results, gene function prediction enabled us to propose biosynthetic pathways involving unusual carbon-carbon bond formation reactions. Genome analysis also indicated the presence of at least ten orphan type I PKS gene clusters that might be responsible for the production of new polyketides.

  6. Verdict Accuracy of Quick Reduct Algorithm using Clustering and Classification Techniques for Gene Expression Data

    Directory of Open Access Journals (Sweden)



    In most gene expression data, the number of training samples is very small compared to the large number of genes involved in the experiments. However, among the large number of genes, only a small fraction is effective for performing a certain task. Furthermore, a small subset of genes is desirable in developing gene expression based diagnostic tools for delivering reliable and understandable results. With the gene selection results, the cost of biological experiment and decision can be greatly reduced by analyzing only the marker genes. An important application of gene expression data in functional genomics is to classify samples according to their gene expression profiles. Feature selection (FS) is a process which attempts to select more informative features. It is one of the important steps in knowledge discovery. Conventional supervised FS methods evaluate various feature subsets using an evaluation function or metric to select only those features which are related to the decision classes of the data under consideration. This paper studies a feature selection method based on rough set theory. Further, the K-Means and Fuzzy C-Means (FCM) algorithms have been implemented for the reduced feature set without considering class labels. The obtained results are then compared with the original class labels. A Back Propagation Network (BPN) has also been used for classification. The performance of K-Means, FCM, and BPN is then analyzed through the confusion matrix. It is found that the BPN performs comparatively well.

  7. Regulation of the F11, Klkb1, Cyp4v3 gene cluster in livers of metabolically challenged mice.

    Directory of Open Access Journals (Sweden)

    Huma Safdar

    Single nucleotide polymorphisms (SNPs) in a 4q35.2 locus that harbors the coagulation factor XI (F11), prekallikrein (KLKB1), and a cytochrome P450 family member (CYP4V2) genes are associated with deep venous thrombosis (DVT). These SNPs exert their effect on DVT by modifying the circulating levels of FXI. However, SNPs associated with DVT were not necessarily all in F11, but also in KLKB1 and CYP4V2. Here, we searched for evidence for common regulatory elements within the 4q35.2 locus, outside the F11 gene, that might control FXI plasma levels and/or DVT risk. To this end, we investigated the regulation of the orthologous mouse gene cluster under several metabolic conditions that impact mouse hepatic F11 transcription. In livers of mice in which HNF4α, a key transcription factor controlling F11, was ablated, or reduced by siRNA, a strong decrease in hepatic F11 transcript levels was observed that correlated with Cyp4v3 (mouse orthologue of CYP4V2), but not with Klkb1, levels. Estrogens induced hepatic F11 and Cyp4v3, but not Klkb1 transcript levels, whereas thyroid hormone strongly induced hepatic F11 transcript levels, and reduced Cyp4v3, leaving Klkb1 levels unaffected. Mice fed a high-fat diet also had elevated F11 transcription, markedly paralleled by an induction of Klkb1 and Cyp4v3 expression. We conclude that within the mouse F11, Klkb1, Cyp4v3 gene cluster, F11 and Cyp4v3 frequently display striking parallel transcriptional responses suggesting the presence of shared regulatory elements.

  8. Fine mapping of a HvCBF gene cluster at the frost resistance locus Fr-H2 in barley. (United States)

    Francia, E; Barabaschi, D; Tondelli, A; Laidò, G; Rizza, F; Stanca, A M; Busconi, M; Fogher, C; Stockinger, E J; Pecchioni, N


    Barley is an economically important model for the Triticeae tribe. We recently developed a new resource: the 'Nure' x 'Tremois' mapping population. Two low temperature QTLs were found to segregate on the long arm of chromosome 5H (Fr-H1, distal; Fr-H2, proximal). With the final aim of positional cloning of the genetic determinants of Fr-H1 and Fr-H2, a large segregating population of 1,849 F(2) plants between parents 'Nure' and 'Tremois' was prepared. These two QT loci were first validated by using a set of F(3) families, marker-selected to harbor pairs of reciprocal haplotypes, with one QTL fixed at homozygosity and the alternate one in heterozygous phase. The study was then focused towards the isolation of the determinant of Fr-H2. Subsequent recombinant screens and phenotypic evaluation of F(4) segregants allowed us to estimate (P ≤ 0.01) a refined genomic interval of Fr-H2 (4.6 cM). Several barley genes with the CBF transcription factor signature had been already roughly mapped in cluster at Fr-H2, and they represent likely candidate genes underlying this QTL. Using the large segregating population (3,698 gametes) a high-resolution genetic map of the HvCBF gene cluster was then constructed, and after fine mapping, six recombinations between the HvCBFs were observed. It was therefore possible to genetically divide seven HvCBF subclusters in barley, in a region spanning 0.81 cM, with distances among them varying from 0.03 to 0.32 cM. The few recombinants between the different HvCBF subclusters are being marker-selected and taken to homozygosity, to phenotypically separate the effects of the single HvCBF genes.

  9. Genetic variation in the toll-like receptor gene cluster (TLR10-TLR1-TLR6) and prostate cancer risk. (United States)

    Stevens, Victoria L; Hsing, Ann W; Talbot, Jeffrey T; Zheng, Siqun Lilly; Sun, Jielin; Chen, Jinbo; Thun, Michael J; Xu, Jianfeng; Calle, Eugenia E; Rodriguez, Carmen


    Toll-like receptors (TLRs) are key players in the innate immune system and initiate the inflammatory response to foreign pathogens such as bacteria, fungi and viruses. The proposed role of chronic inflammation in prostate carcinogenesis has prompted investigation into the association of common genetic variation in TLRs with the risk of this cancer. We investigated the role of common SNPs in a gene cluster encoding the TLR10, TLR6 and TLR1 proteins in prostate cancer etiology among 1,414 cancer cases and 1,414 matched controls from the Cancer Prevention Study II Nutrition Cohort. Twenty-eight SNPs, which included the majority of the common nonsynonymous SNPs in the 54-kb gene region and haplotype-tagging SNPs that defined 5 specific haplotype blocks, were genotyped and their association with prostate cancer risk determined. Two SNPs in TLR10 [I369L (rs11096955) and N241H (rs11096957)] and 4 SNPs in TLR1 [N248S (rs4833095), S26L (rs5743596), rs5743595 and rs5743551] were associated with a statistically significant reduced risk of prostate cancer of 29-38% (for the homozygous variant genotype). The association of these SNPs was similar when the analysis was limited to cases with advanced prostate cancer. Haplotype analysis and linkage disequilibrium findings revealed that the 6 associated SNPs were not independent and represent a single association with reduced prostate cancer risk (OR = 0.55, 95% CI: 0.33, 0.90). Our study suggests that a common haplotype in the TLR10-TLR1-TLR6 gene cluster influences prostate cancer risk and clearly supports the need for further investigation of TLR genes in other populations.
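    The reported odds ratio and its confidence interval can be related to each other on the log scale. The sketch below assumes a Wald-type interval, log(OR) ± 1.96 × SE, which the abstract does not state; it only uses the two published bounds and is meant as a plausibility check, not a reconstruction of the authors' analysis.

```python
import math

or_low, or_high = 0.33, 0.90  # 95% CI bounds quoted in the abstract
z = 1.96                      # normal quantile for a two-sided 95% interval

# A Wald interval is symmetric around log(OR), so the half-width gives the SE
# and the midpoint of the logged bounds gives the implied point estimate.
se_log_or = (math.log(or_high) - math.log(or_low)) / (2 * z)
centre_or = math.exp((math.log(or_low) + math.log(or_high)) / 2)

print(round(se_log_or, 3))
print(round(centre_or, 3))
```

    The implied centre lands close to the published OR of 0.55; the small gap is what one would expect from bounds rounded to two decimals.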

  10. Characterization of the cysK2-ctl1-cysE2 gene cluster involved in sulfur metabolism in Lactobacillus casei. (United States)

    Bogicevic, Biljana; Irmler, Stefan; Portmann, Reto; Meile, Leo; Berthoud, Hélène


    The up- and downstream regions of ctl1 and ctl2 that encode a cystathionine lyase were analyzed in various Lactobacillus casei strains. ctl1 and ctl2 were found to be part of a gene cluster encoding two other open reading frames. One of the two open reading frames precedes ctl1 and encodes a putative cysteine synthase. The other open reading frame lies downstream of ctl1 and encodes a putative serine acetyltransferase. The gene cluster is not present in the publicly available genome sequences of L. casei ATCC 334, BL23 and Zhang. Apparently, the gene cluster was acquired by a horizontal gene transfer event and can also be found in other lactic acid bacteria such as Lactobacillus helveticus, Lactobacillus delbrueckii subsp. bulgaricus and Streptococcus thermophilus. RT-PCR was used to analyze the expression of the gene cluster. Additionally, a mass spectrometry-based selected reaction monitoring method was developed for quantifying Ctl1 in a cell-free extract of lactic acid bacteria. The gene cluster cysK2-ctl1-cysE2 was expressed as a single transcript, and expression was down-regulated by cysteine. In addition, cystathionine lyase activity present in cell-free extracts disappeared when L. casei was grown in the presence of cysteine. Whereas the transcript and the protein product of ctl1 were found in all studied ctl1(+) L. casei strains, only the transcript but not the protein or cystathionine lyase activity was detected in L. helveticus FAM2888, L. delbrueckii subsp. bulgaricus ATCC 11842 and S. thermophilus FAM17014, which actually possess a homolog of the cysK2-ctl1-cysE2 gene cluster.

  11. IMG-ABC: new features for bacterial secondary metabolism analysis and targeted biosynthetic gene cluster discovery in thousands of microbial genomes (United States)

    Hadjithomas, Michalis; Chen, I-Min A.; Chu, Ken; Huang, Jinghua; Ratner, Anna; Palaniappan, Krishna; Andersen, Evan; Markowitz, Victor; Kyrpides, Nikos C.; Ivanova, Natalia N.


    Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery. PMID:27903896

  12. A novel gene cluster allows preferential utilization of fucosylated milk oligosaccharides in Bifidobacterium longum subsp. longum SC596 (United States)

    Garrido, Daniel; Ruiz-Moyano, Santiago; Kirmiz, Nina; Davis, Jasmine C.; Totten, Sarah M.; Lemay, Danielle G.; Ugalde, Juan A.; German, J. Bruce; Lebrilla, Carlito B.; Mills, David A.


    The infant intestinal microbiota is often colonized by two subspecies of Bifidobacterium longum: subsp. infantis (B. infantis) and subsp. longum (B. longum). Competitive growth of B. infantis in the neonate intestine has been linked to the utilization of human milk oligosaccharides (HMO). However, little is known about how B. longum consumes HMO. In this study, infant-borne B. longum strains exhibited varying HMO growth phenotypes. While all strains efficiently utilized lacto-N-tetraose, certain strains additionally metabolized fucosylated HMO. B. longum SC596 grew vigorously on HMO, and glycoprofiling revealed a preference for consumption of fucosylated HMO. Transcriptomes of SC596 during early-stage growth on HMO were more similar to growth on fucosyllactose, transitioning later to a pattern similar to growth on neutral HMO. B. longum SC596 contains a novel gene cluster devoted to the utilization of fucosylated HMO, including genes for import of fucosylated molecules, fucose metabolism and two α-fucosidases. This cluster showed a modular induction during early growth on HMO and fucosyllactose. This work clarifies the genomic and physiological variation of infant-borne B. longum in HMO consumption, which resembles that of B. infantis. The capability to preferentially consume fucosylated HMO suggests a competitive advantage for these unique B. longum strains in the breast-fed infant gut. PMID:27756904

  13. Characterization of the Gene Cluster Involved in Isoprene Metabolism in Rhodococcus sp. Strain AD45

    NARCIS (Netherlands)

    van Hylckama Vlieg, Johan E.T.; Leemhuis, Hans; Lutje Spelberg, Jeffrey H.; Janssen, Dick B.


    The genes involved in isoprene (2-methyl-1,3-butadiene) utilization in Rhodococcus sp. strain AD45 were cloned and characterized. Sequence analysis of an 8.5-kb DNA fragment showed the presence of 10 genes of which 2 encoded enzymes which were previously found to be involved in isoprene degradation:

  14. Gene clusters involved in isethionate degradation by terrestrial and marine bacteria.

    KAUST Repository

    Weinitschke, Sonja


    Ubiquitous isethionate (2-hydroxyethanesulfonate) is dissimilated by diverse bacteria. Growth of Cupriavidus necator H16 with isethionate was observed, as was inducible membrane-bound isethionate dehydrogenase (IseJ) and inducible transcription of the genes predicted to encode IseJ and a transporter (IseU). Biodiversity in isethionate transport genes was observed and investigated by transcription experiments.

  15. Vector integration is nonrandom and clustered and influences the fate of lymphopoiesis in SCID-X1 gene therapy. (United States)

    Deichmann, Annette; Hacein-Bey-Abina, Salima; Schmidt, Manfred; Garrigue, Alexandrine; Brugman, Martijn H; Hu, Jingqiong; Glimm, Hanno; Gyapay, Gabor; Prum, Bernard; Fraser, Christopher C; Fischer, Nicolas; Schwarzwaelder, Kerstin; Siegler, Maria-Luise; de Ridder, Dick; Pike-Overzet, Karin; Howe, Steven J; Thrasher, Adrian J; Wagemaker, Gerard; Abel, Ulrich; Staal, Frank J T; Delabesse, Eric; Villeval, Jean-Luc; Aronow, Bruce; Hue, Christophe; Prinz, Claudia; Wissler, Manuela; Klanke, Chuck; Weissenbach, Jean; Alexander, Ian; Fischer, Alain; von Kalle, Christof; Cavazzana-Calvo, Marina


    Recent reports have challenged the notion that retroviruses and retroviral vectors integrate randomly into the host genome. These reports pointed to a strong bias toward integration in and near gene coding regions and, for gammaretroviral vectors, around transcription start sites. Here, we report the results obtained from a large-scale mapping of 572 retroviral integration sites (RISs) isolated from cells of 9 patients with X-linked SCID (SCID-X1) treated with a retrovirus-based gene therapy protocol. Our data showed that two-thirds of insertions occurred in or very near to genes, of which more than half were highly expressed in CD34(+) progenitor cells. Strikingly, one-fourth of all integrations were clustered as common integration sites (CISs). The highly significant incidence of CISs in circulating T cells and the nature of their locations indicate that insertion in many gene loci has an influence on cell engraftment, survival, and proliferation. Beyond the observed cases of insertional mutagenesis in 3 patients, these data help to elucidate the relationship between vector insertion and long-term in vivo selection of transduced cells in human patients with SCID-X1.

  16. Deletion of the MBII-85 snoRNA gene cluster in mice results in postnatal growth retardation.

    Directory of Open Access Journals (Sweden)

    Boris V Skryabin


    Prader-Willi syndrome (PWS [MIM 176270]) is a neurogenetic disorder characterized by decreased fetal activity, muscular hypotonia, failure to thrive, short stature, obesity, mental retardation, and hypogonadotropic hypogonadism. It is caused by the loss of function of one or more imprinted, paternally expressed genes on the proximal long arm of chromosome 15. Several potential PWS mouse models involving the orthologous region on chromosome 7C exist. Based on the analysis of deletions in the mouse and gene expression in PWS patients with chromosomal translocations, a critical region (PWScr) for neonatal lethality, failure to thrive, and growth retardation was narrowed to the locus containing a cluster of neuronally expressed MBII-85 small nucleolar RNA (snoRNA) genes. Here, we report the deletion of PWScr. Mice carrying the maternally inherited allele (PWScr(m-/p+)) are indistinguishable from wild-type littermates. All those with the paternally inherited allele (PWScr(m+/p-)) consistently display postnatal growth retardation, with about 15% postnatal lethality in C57BL/6, but not FVB/N crosses. This is the first example in a multicellular organism of genetic deletion of a C/D box snoRNA gene resulting in a pronounced phenotype.

  17. Evaluation of the vector space representation in text-based gene clustering. (United States)

    Glenisson, P; Antal, P; Mathys, J; Moreau, Y; De Moor, B


    Thanks to its increasing availability, electronic literature can now be a major source of information when developing complex statistical models where data is scarce or contains much noise. This raises the question of how to deeply integrate information from domain literature with experimental data. Evaluating what kind of statistical text representations can integrate literature knowledge in clustering still remains an insufficiently explored topic. In this work we discuss how the bag-of-words representation can be used successfully to represent genetic annotation and free-text information coming from different databases. We demonstrate the effect of various weighting schemes and information sources in a functional clustering setup. As a quantitative evaluation, we contrast for different parameter settings the functional groupings obtained from text with those obtained from expert assessments and link each of the results to a biological discussion.
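    As a rough illustration of the bag-of-words idea described in this abstract, the sketch below turns short gene annotations into weighted term vectors and compares them by cosine similarity. The tf-idf weighting, the whitespace tokenisation and the toy annotation strings are all assumptions made for illustration; the abstract does not say which weighting schemes the authors evaluated.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document using tf * idf weighting."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for tokens in tokenised for term in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy annotations, one string per gene.
annotations = [
    "chemokine receptor signalling immune",
    "chemokine ligand immune response",
    "ribosome translation elongation",
]
vecs = tfidf_vectors(annotations)
sim_01 = cosine(vecs[0], vecs[1])  # two immune-related genes
sim_02 = cosine(vecs[0], vecs[2])  # unrelated annotation
print(sim_01 > sim_02)
```

    A pairwise similarity matrix built this way could then be passed to any standard clustering routine; swapping the weighting function is the simplest way to compare schemes.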

  18. Discovering biomarkers from gene expression data for predicting cancer subgroups using neural networks and relational fuzzy clustering

    Directory of Open Access Journals (Sweden)

    Sharma Animesh


    Background: The four heterogeneous childhood cancers, neuroblastoma, non-Hodgkin lymphoma, rhabdomyosarcoma, and Ewing sarcoma present a similar histology of small round blue cell tumor (SRBCT) and thus often lead to misdiagnosis. Identification of biomarkers for distinguishing these cancers is a well studied problem. Existing methods typically evaluate each gene separately and do not take into account the nonlinear interaction between genes and the tools that are used to design the diagnostic prediction system. Consequently, more genes are usually identified as necessary for prediction. We propose a general scheme for finding a small set of biomarkers to design a diagnostic system for accurate classification of the cancer subgroups. We use multilayer networks with online gene selection ability and relational fuzzy clustering to identify a small set of biomarkers for accurate classification of the training and blind test cases of a well studied data set. Results: Our method discerned just seven biomarkers that precisely categorized the four subgroups of cancer both in training and blind samples. For the same problem, others suggested 19–94 genes. These seven biomarkers include three novel genes (NAB2, LSP1 and EHD1), not identified by others, with distinct class-specific signatures and important roles in cancer biology, including cellular proliferation, transendothelial migration and trafficking of MHC class antigens. Interestingly, NAB2 is downregulated in other tumors including non-Hodgkin lymphoma and neuroblastoma but we observed moderate to high upregulation in a few cases of Ewing sarcoma and rhabdomyosarcoma, suggesting that NAB2 might be mutated in these tumors. These genes can discover the subgroups correctly with unsupervised learning, can differentiate non-SRBCT samples and they perform equally well with other machine learning tools including support vector machines. These biomarkers lead to four simple human interpretable

  19. Role of CCL3L1-CCR5 Genotypes in the Epidemic Spread of HIV-1 and Evaluation of Vaccine Efficacy (United States)


    The rarity of HIV infection via cross-species transmission from chimpanzee to Pygmies contrasts with the fact that documented zoonosis of other viruses...viral zoonosis? In this respect, it is noteworthy that compared to other African populations that reside in geographical proximity (e.g., non-Pygmy

  20. Burkholderia mallei and Burkholderia pseudomallei cluster 1 type VI secretion system gene expression is negatively regulated by iron and zinc.

    Directory of Open Access Journals (Sweden)

    Mary N Burtnick

    Burkholderia mallei is a facultative intracellular pathogen that causes glanders in humans and animals. Previous studies have demonstrated that the cluster 1 type VI secretion system (T6SS-1) expressed by this organism is essential for virulence in hamsters and is positively regulated by the VirAG two-component system. Recently, we have shown that T6SS-1 gene expression is up-regulated following internalization of this pathogen into phagocytic cells and that this system promotes multinucleated giant cell formation in infected tissue culture monolayers. In the present study, we further investigated the complex regulation of this important virulence factor. To assess T6SS-1 expression, B. mallei strains were cultured in various media conditions and Hcp1 production was analyzed by Western immunoblotting. Transcript levels of several VirAG-regulated genes (bimA, tssA, hcp1 and tssM) were also determined using quantitative real time PCR. Consistent with previous observations, T6SS-1 was not expressed during growth of B. mallei in rich media. Curiously, growth of the organism in minimal media (M9G) or minimal media plus casamino acids (M9CG) facilitated robust expression of T6SS-1 genes whereas growth in minimal media plus tryptone (M9TG) did not. Investigation of this phenomenon confirmed a regulatory role for VirAG in this process. Additionally, T6SS-1 gene expression was significantly down-regulated by the addition of iron and zinc to M9CG. Other genes under the control of VirAG did not appear to be as tightly regulated by these divalent metals. Similar results were observed for B. pseudomallei, but not for B. thailandensis. Collectively, our findings indicate that in addition to being positively regulated by VirAG, B. mallei and B. pseudomallei T6SS-1 gene expression is negatively regulated by iron and zinc.

  1. Disruption of six open reading frames on chromosome X of Saccharomyces cerevisiae reveals a cluster of four essential genes. (United States)

    Esser, K; Scholle, B; Michaelis, G


    In this study we report the construction and basic phenotypic analysis of six Saccharomyces cerevisiae deletion mutants. The open reading frames (ORFs) YJL008C (gene symbol CCT8), YJL010C, YJL011C, YJL012C, YJL017W, and YJL020C from chromosome X have been disrupted by integration of deletion cassettes, comprising the bacterial KanMX4 marker gene and terminal long (LFH) or short (SFH) flanking sequences that are homologous to the 5' and 3' untranslated regions of the respective ORFs. For correct disruption of ORF YJL008C, it was necessary to construct a deletion cassette flanked by 300-350 bp long target guide sequences by LFH-PCR. Transformations using ORF YJL008C gene disruption cassettes synthesized by standard SFH-PCR exclusively resulted in false-positive or multiple integration events, probably because seven additional genes homologous to CCT8 exist in the yeast genome. The other five ORFs have been disrupted using cassettes generated by SFH-PCR, comprising terminal homologous regions of approximately 50 bp to each target site. Correct genomic integration of the reporter modules was verified by analytical PCR and Southern hybridization. Deletion of YJL008C, YJL010C, YJL011C, and YJL012C was found to be lethal, as shown by sporulation and tetrad analysis. This result is in contrast to the finding that only 16-20% of the genes in S. cerevisiae are estimated to be essential. The four essential genes described in this work are clustered, while the two other non-essential ORFs are separated by further ORFs. Although the two viable deletion mutants were tested against 60 different inhibitors, heavy metal ions and salts, no phenotype could be detected that co-segregated with the deletion during meiosis.

  2. Chloromethane-Dependent Expression of the cmu Gene Cluster of Hyphomicrobium chloromethanicum


    Borodina, Elena; McDonald, Ian R.; Murrell, J. Colin


    The methylotrophic bacterium Hyphomicrobium chloromethanicum CM2 can utilize chloromethane (CH3Cl) as the sole carbon and energy source. Previously genes cmuB, cmuC, cmuA, and folD were shown to be essential for the growth of Methylobacterium chloromethanicum on CH3Cl. These CH3Cl-specific genes were subsequently detected in H. chloromethanicum. Transposon and marker exchange mutagenesis studies were carried out to identify the genes essential for CH3Cl metabolism in H. chloromethanicum. New ...

  3. Distinct Loci in the CHRNA5/CHRNA3/CHRNB4 Gene Cluster Are Associated With Onset of Regular Smoking (United States)

    Stephens, Sarah H.; Hartz, Sarah M.; Hoft, Nicole R.; Saccone, Nancy L.; Corley, Robin C.; Hewitt, John K.; Hopfer, Christian J.; Breslau, Naomi; Coon, Hilary; Chen, Xiangning; Ducci, Francesca; Dueker, Nicole; Franceschini, Nora; Frank, Josef; Han, Younghun; Hansel, Nadia N.; Jiang, Chenhui; Korhonen, Tellervo; Lind, Penelope A.; Liu, Jason; Lyytikäinen, Leo-Pekka; Michel, Martha; Shaffer, John R.; Short, Susan E.; Sun, Juzhong; Teumer, Alexander; Thompson, John R.; Vogelzangs, Nicole; Vink, Jacqueline M.; Wenzlaff, Angela; Wheeler, William; Yang, Bao-Zhu; Aggen, Steven H.; Balmforth, Anthony J.; Baumeister, Sebastian E.; Beaty, Terri H.; Benjamin, Daniel J.; Bergen, Andrew W.; Broms, Ulla; Cesarini, David; Chatterjee, Nilanjan; Chen, Jingchun; Cheng, Yu-Ching; Cichon, Sven; Couper, David; Cucca, Francesco; Dick, Danielle; Foroud, Tatiana; Furberg, Helena; Giegling, Ina; Gillespie, Nathan A.; Gu, Fangyi; Hall, Alistair S.; Hällfors, Jenni; Han, Shizhong; Hartmann, Annette M.; Heikkilä, Kauko; Hickie, Ian B.; Hottenga, Jouke Jan; Jousilahti, Pekka; Kaakinen, Marika; Kähönen, Mika; Koellinger, Philipp D.; Kittner, Stephen; Konte, Bettina; Landi, Maria-Teresa; Laatikainen, Tiina; Leppert, Mark; Levy, Steven M.; Mathias, Rasika A.; McNeil, Daniel W.; Medland, Sarah E.; Montgomery, Grant W.; Murray, Tanda; Nauck, Matthias; North, Kari E.; Paré, Peter D.; Pergadia, Michele; Ruczinski, Ingo; Salomaa, Veikko; Viikari, Jorma; Willemsen, Gonneke; Barnes, Kathleen C.; Boerwinkle, Eric; Boomsma, Dorret I.; Caporaso, Neil; Edenberg, Howard J.; Francks, Clyde; Gelernter, Joel; Grabe, Hans Jörgen; Hops, Hyman; Jarvelin, Marjo-Riitta; Johannesson, Magnus; Kendler, Kenneth S.; Lehtimäki, Terho; Magnusson, Patrik K.E.; Marazita, Mary L.; Marchini, Jonathan; Mitchell, Braxton D.; Nöthen, Markus M.; Penninx, Brenda W.; Raitakari, Olli; Rietschel, Marcella; Rujescu, Dan; Samani, Nilesh J.; Schwartz, Ann G.; Shete, Sanjay; Spitz, Margaret; Swan, Gary E.; Völzke, Henry; Veijola, Juha; Wei, Qingyi; Amos, Chris; Cannon, Dale S.; Grucza, Richard; Hatsukami, Dorothy; Heath, Andrew; Johnson, Eric O.; Kaprio, Jaakko; Madden, Pamela; Martin, Nicholas G.; Stevens, Victoria L.; Weiss, Robert B.; Kraft, Peter; Bierut, Laura J.; Ehringer, Marissa A.


    Neuronal nicotinic acetylcholine receptor (nAChR) genes (CHRNA5/CHRNA3/CHRNB4) have been reproducibly associated with nicotine dependence, smoking behaviors, and lung cancer risk. Of the few reports that have focused on early smoking behaviors, association results have been mixed. This meta-analysis examines early smoking phenotypes and SNPs in the gene cluster to determine: (1) whether the most robust association signal in this region (rs16969968) for other smoking behaviors is also associated with early behaviors, and/or (2) if additional statistically independent signals are important in early smoking. We focused on two phenotypes: age of tobacco initiation (AOI) and age of first regular tobacco use (AOS). This study included 56,034 subjects (41 groups) spanning nine countries and evaluated five SNPs including rs1948, rs16969968, rs578776, rs588765, and rs684513. Each dataset was analyzed using a centrally generated script. Meta-analyses were conducted from summary statistics. AOS yielded significant associations with SNPs rs578776 (beta = 0.02, P = 0.004), rs1948 (beta = 0.023, P = 0.018), and rs684513 (beta = 0.032, P = 0.017), indicating protective effects. There were no significant associations for the AOI phenotype. Importantly, rs16969968, the most replicated signal in this region for nicotine dependence, cigarettes per day, and cotinine levels, was not associated with AOI (P = 0.59) or AOS (P = 0.92). These results provide important insight into the complexity of smoking behavior phenotypes, and suggest that association signals in the CHRNA5/A3/B4 gene cluster affecting early smoking behaviors may be different from those affecting the mature nicotine dependence phenotype. PMID:24186853
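    The abstract states that the meta-analyses were run from per-dataset summary statistics. One common way to pool such statistics is inverse-variance fixed-effect weighting, sketched below. Whether the authors used this exact model is an assumption, and the per-cohort effect sizes and standard errors are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
import math

def fixed_effect_meta(betas, ses):
    """Pool effect estimates with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical per-cohort estimates for one SNP (not taken from the study).
betas = [0.03, 0.01, 0.02]
ses = [0.01, 0.02, 0.01]

pooled, pooled_se = fixed_effect_meta(betas, ses)
z = pooled / pooled_se
p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value under a normal model

print(round(pooled, 4), round(pooled_se, 4), round(z, 2))
```

    Cohorts with smaller standard errors dominate the pooled estimate, which is why large datasets drive the combined signal in this kind of analysis.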

  4. The Conserved Dcw Gene Cluster of R. sphaeroides Is Preceded by an Uncommonly Extended 5' Leader Featuring the sRNA UpsM. (United States)

    Weber, Lennart; Thoelken, Clemens; Volk, Marcel; Remes, Bernhard; Lechner, Marcus; Klug, Gabriele


    Cell division and cell wall synthesis mechanisms are similarly conserved among bacteria. Consequently, some bacterial species have comparable sets of genes organized in the dcw (division and cell wall) gene cluster. Dcw genes, their regulation and their relative order within the cluster are outstandingly conserved among rod-shaped and Gram-negative bacteria to ensure an efficient coordination of growth and division. A well studied representative is the dcw gene cluster of E. coli. The first promoter of the gene cluster (mraZ1p) gives rise to polycistronic transcripts containing a 38 nt long 5' UTR followed by the first gene mraZ. Despite the reported conservation, we present evidence for a much longer 5' UTR in the Gram-negative and rod-shaped bacterium Rhodobacter sphaeroides and in the family of Rhodobacteraceae. This extended 268 nt long 5' UTR comprises a Rho-independent terminator, which in case of termination gives rise to a non-coding RNA (UpsM). This sRNA is conditionally cleaved by RNase E under stress conditions in an Hfq- and very likely target mRNA-dependent manner, implying its function in trans. These results raise the question of the regulatory function of this extended 5' UTR. It might represent the rarely described case of a trans-acting sRNA derived from a riboswitch with exclusive presence in the family of Rhodobacteraceae.

  5. The Conserved Dcw Gene Cluster of R. sphaeroides Is Preceded by an Uncommonly Extended 5’ Leader Featuring the sRNA UpsM (United States)

    Weber, Lennart; Thoelken, Clemens; Volk, Marcel; Remes, Bernhard; Lechner, Marcus; Klug, Gabriele


    Cell division and cell wall synthesis mechanisms are similarly conserved among bacteria. Consequently, some bacterial species have comparable sets of genes organized in the dcw (division and cell wall) gene cluster. Dcw genes, their regulation and their relative order within the cluster are outstandingly conserved among rod-shaped and Gram-negative bacteria to ensure an efficient coordination of growth and division. A well studied representative is the dcw gene cluster of E. coli. The first promoter of the gene cluster (mraZ1p) gives rise to polycistronic transcripts containing a 38 nt long 5’ UTR followed by the first gene mraZ. Despite the reported conservation, we present evidence for a much longer 5’ UTR in the Gram-negative and rod-shaped bacterium Rhodobacter sphaeroides and in the family of Rhodobacteraceae. This extended 268 nt long 5’ UTR comprises a Rho-independent terminator, which in case of termination gives rise to a non-coding RNA (UpsM). This sRNA is conditionally cleaved by RNase E under stress conditions in an Hfq- and very likely target mRNA-dependent manner, implying its function in trans. These results raise the question of the regulatory function of this extended 5’ UTR. It might represent the rarely described case of a trans-acting sRNA derived from a riboswitch with exclusive presence in the family of Rhodobacteraceae. PMID:27802301

  6. Three LIF-dependent signatures and gene clusters with atypical expression profiles, identified by transcriptome studies in mouse ES cells and early derivatives

    Directory of Open Access Journals (Sweden)

    Hummel Oliver


    Background: Mouse embryonic stem (ES) cells remain pluripotent in vitro when grown in the presence of the cytokine Leukaemia Inhibitory Factor (LIF). Identification of LIF targets and of genes regulating the transition between pluripotent and early differentiated cells is a critical step for understanding the control of ES cell pluripotency. Results: By gene profiling studies carried out with mRNAs from ES cells and their early derivatives treated or not with LIF, we have identified (i) LIF-dependent genes, highly expressed in pluripotent cells, whose expression level decreases sharply upon LIF withdrawal [Pluri genes], (ii) LIF-induced genes [Lifind genes] whose expression is differentially regulated depending upon cell context and (iii) genes specific to the reversible or irreversible committed states. In addition, by hierarchical gene clustering, we have identified, among eight independent gene clusters, two atypical groups of genes, whose expression level was highly modulated in committed cells only. Computer based analyses led to the characterization of different sub-types of Pluri and Lifind genes, and revealed their differential modulation by Oct4 or Nanog master genes. Individual knock down of a selection of Pluri and Lifind genes leads to weak changes in the expression of early differentiation markers, in cell growth conditions in which these master genes are still expressed. Conclusion: We have identified different sets of LIF-regulated genes depending upon the cell state (reversible or irreversible commitment), which allowed us to present a novel global view of LIF responses. We are also reporting on the identification of genes whose expression is strictly regulated during the commitment step. Furthermore, our studies identify sub-networks of genes with a restricted expression in pluripotent ES cells, whose down regulation occurs while the master knot (composed of OCT4, SOX2 and NANOG) is still expressed and which might be down

  7. Cluster editing

    DEFF Research Database (Denmark)

    Böcker, S.; Baumbach, Jan


    The Cluster Editing problem asks to transform a graph into a disjoint union of cliques using a minimum number of edge modifications. Although the problem has been proven NP-complete several times, it has nevertheless attracted much research both from the theoretical and the applied side. The problem has been the inspiration for numerous algorithms in bioinformatics, aiming at clustering entities such as genes, proteins, phenotypes, or patients. In this paper, we review exact and heuristic methods that have been proposed for the Cluster Editing problem, and also applications...
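    To make the Cluster Editing problem statement concrete, the sketch below solves it exactly on a tiny graph by trying every partition of the vertices and counting how many edges would have to be added or deleted to turn each block into a clique. This brute-force search is only an illustration of the definition: it is exponential in the number of vertices and is not one of the exact or heuristic methods surveyed in the paper. All function names are hypothetical.

```python
from itertools import combinations

def partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Place `first` into each existing block in turn ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + part

def edit_cost(edges, part):
    """Count edge insertions plus deletions needed to realise `part` as cliques."""
    block_of = {v: i for i, block in enumerate(part) for v in block}
    cost = 0
    for u, v in combinations(sorted(block_of), 2):
        same_block = block_of[u] == block_of[v]
        has_edge = frozenset((u, v)) in edges
        if same_block != has_edge:
            cost += 1
    return cost

def cluster_editing(vertices, edge_list):
    """Return (minimum number of edits, one optimal clustering)."""
    edges = {frozenset(e) for e in edge_list}
    scored = ((edit_cost(edges, p), p) for p in partitions(list(vertices)))
    return min(scored, key=lambda pair: pair[0])

# Triangle a-b-c with a pendant vertex d attached to c.
cost, clusters = cluster_editing(
    "abcd", [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
)
print(cost, sorted(sorted(block) for block in clusters))
```

    In this toy graph the cheapest solution deletes the single edge c-d, leaving the triangle and an isolated vertex as the two cliques.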

  8. A proteomic approach to investigating gene cluster expression and secondary metabolite functionality in Aspergillus fumigatus.

    Directory of Open Access Journals (Sweden)

    Rebecca A Owens

    A combined proteomics and metabolomics approach was utilised to advance the identification and characterisation of secondary metabolites in Aspergillus fumigatus. Here, implementation of a shotgun proteomic strategy led to the identification of non-redundant mycelial proteins (n = 414) from A. fumigatus including proteins typically under-represented in 2-D proteome maps: proteins with multiple transmembrane regions, hydrophobic proteins and proteins with extremes of molecular mass and pI. Indirect identification of secondary metabolite cluster expression was also achieved, with proteins (n = 18) from LaeA-regulated clusters detected, including GliT encoded within the gliotoxin biosynthetic cluster. Biochemical analysis then revealed that gliotoxin significantly attenuates H2O2-induced oxidative stress in A. fumigatus (p>0.0001), confirming observations from proteomics data. A complementary 2-D/LC-MS/MS approach further elucidated significantly increased abundance (p<0.05) of proliferating cell nuclear antigen (PCNA), NADH-quinone oxidoreductase and the gliotoxin oxidoreductase GliT, along with significantly attenuated abundance (p<0.05) of a heat shock protein, an oxidative stress protein and an autolysis-associated chitinase, when gliotoxin and H2O2 were present, compared to H2O2 alone. Moreover, gliotoxin exposure significantly reduced the abundance of selected proteins (p<0.05) involved in de novo purine biosynthesis. Significantly elevated abundance (p<0.05) of a key enzyme, xanthine-guanine phosphoribosyl transferase Xpt1, utilised in purine salvage, was observed in the presence of H2O2 and gliotoxin. This work provides new insights into the A. fumigatus proteome and experimental strategies, plus mechanistic data pertaining to gliotoxin functionality in the organism.

  9. Two genetic clusters in swine hemoplasmas revealed by analyses of the 16S rRNA and RNase P RNA genes. (United States)

    Watanabe, Yusaku; Fujihara, Masatoshi; Obara, Hisato; Nagai, Kazuya; Harasawa, Ryô


    Only two hemoplasma species, Eperythrozoon parvum and Mycoplasma suis, have been recognized in pigs. Here we demonstrate the genetic variations among six hemoplasma strains detected from pigs, by analyzing the 16S rRNA and RNase P RNA (rnpB) genes, and propose a novel hemoplasma taxon that has not been described previously. Phylogenetic trees based on the nucleotide sequence of the 16S rRNA gene indicated that these six hemoplasmas were divided into two clusters representing M. suis and a novel taxon. We further examined the primary and secondary structures of the nucleotide sequences of the rnpB gene of the novel taxon, and found it distinct from that of M. suis. In conclusion, we unveiled a genetic cluster distinct from M. suis, suggesting a new swine hemoplasma species or E. parvum. Our findings also suggest that this novel cluster should be included in the genus Mycoplasma.

  10. Isolation of Resistance Gene Candidates (RGCs) and characterization of an RGC cluster in cassava. (United States)

    López, C E; Zuluaga, A P; Cooke, R; Delseny, M; Tohme, J; Verdier, V


    Plant disease resistance genes (R genes) show significant similarity amongst themselves in terms of both their DNA sequences and structural motifs present in their protein products. Oligonucleotide primers designed from NBS (Nucleotide Binding Site) domains encoded by several R-genes have been used to amplify NBS sequences from the genomic DNA of various plant species, which have been called Resistance Gene Analogues (RGAs) or Resistance Gene Candidates (RGCs). Using specific primers from the NBS and TIR (Toll/Interleukin-1 Receptor) regions, we identified twelve classes of RGCs in cassava (Manihot esculenta Crantz). Two classes were obtained from the PCR-amplification of the TIR domain. The other 10 classes correspond to the NBS sequences and were grouped into two subfamilies. Classes RCa1 to RCa5 are part of the first subfamily and were linked to a TIR domain in the N terminus. Classes RCa6 to RCa10 corresponded to non-TIR NBS-LRR encoding sequences. BAC library screening with the 12 RGC classes as probes allowed the identification of 42 BAC clones that were assembled into 10 contigs and 19 singletons. Members of the two TIR and non-TIR NBS-LRR subfamilies occurred together within individual BAC clones. The BAC screening and Southern hybridization analyses showed that all RGCs were single copy sequences except RCa6 that represented a large and diverse gene family. One BAC contained five NBS sequences and sequence analysis allowed the identification of two complete RGCs encoding two highly similar proteins. This BAC was located on linkage group J with three other RGC-containing BACs. At least one of these genes, RGC2, is expressed constitutively in cassava tissues.

  11. Detection of the intercellular adhesion gene cluster (ica) in clinical Staphylococcus aureus isolates

    Directory of Open Access Journals (Sweden)

    Namvar, Amirmorteza Ebrahimzadeh


    Full Text Available Staphylococcus aureus is a major hospital and community pathogen having the aptitude to cause a wide variety of infections in men. The ability of microorganisms to produce biofilm enables them to withstand the host immune response and is recognized as one factor contributing to chronic or persistent infections. It was demonstrated that the ica-encoded genes lead to the biosynthesis of polysaccharide intercellular adhesion (PIA) molecules, and may be involved in the accumulation phase of biofilm formation. Different studies have shown the decisive role of the ica gene as a virulence factor in staphylococcal infections. This study was carried out to demonstrate the relationship between the ica gene and production of slime layer in S. aureus strains. Sixty S. aureus strains were isolated from patients. The isolates were identified morphologically and biochemically following standard laboratory methods. After identification, the staphylococcal isolates were maintained in trypticase soy broth (TSB), to which 15% glycerol was added, and stored at –20°C. Slime formation and biofilm production were monitored. A PCR assay was developed to identify the presence of the ica (intercellular adhesion) gene in all isolates. Thirty-nine slime-producing isolates (65%) formed black colonies on CRA plates; the remaining 21 isolates (35%) were pink. In the quantitative biofilm assay, 35 isolates (58%) produced biofilm while 25 isolates (42%) did not exhibit this property. All isolates were positive for detection of the ica gene by the PCR method. The interaction of the ica genes in the investigated isolates may be important in slime layer formation and biofilm phenomena. We propose PCR detection of the ica gene locus as a rapid and effective method to be used for discrimination between potentially virulent and nonvirulent isolates, with implications for therapeutic and preventive measures pertaining to the management of colonized indwelling catheters.

  12. Oxidative stress enhances the expression of sulfur assimilation genes: preliminary insights on the Enterococcus faecalis iron-sulfur cluster machinery regulation (United States)

    Riboldi, Gustavo Pelicioli; Bierhals, Christine Garcia; de Mattos, Eduardo Preusser; Frazzon, Ana Paula Guedes; d'Azevedo, Pedro Alves; Frazzon, Jeverson


    The Firmicutes bacteria participate extensively in virulence and pathological processes. Enterococcus faecalis is a commensal microorganism; however, it is also a pathogenic bacterium mainly associated with nosocomial infections in immunocompromised patients. Iron-sulfur [Fe-S] clusters are inorganic prosthetic groups involved in diverse biological processes, whose in vivo formation requires several specific protein machineries. Escherichia coli is one of the most frequently studied microorganisms regarding [Fe-S] cluster biogenesis and encodes the iron-sulfur cluster and sulfur assimilation systems. In Firmicutes species, a unique operon composed of the sufCDSUB genes is responsible for [Fe-S] cluster biogenesis. The aim of this study was to investigate the potential of the E. faecalis sufCDSUB system in the [Fe-S] cluster assembly using oxidative stress and iron depletion as adverse growth conditions. Quantitative real-time polymerase chain reaction demonstrated, for the first time, that Gram-positive bacteria possess an OxyR component responsive to oxidative stress conditions, as fully described for E. coli models. Likewise, strong expression of the sufCDSUB genes was observed in low concentrations of hydrogen peroxide, indicating that the lowest concentration of oxygen free radicals inside cells, known to be highly damaging to [Fe-S] clusters, is sufficient to trigger the transcriptional machinery for prompt replacement of [Fe-S] clusters. PMID:24936909

  13. Organization and molecular evolution of a disease-resistance gene cluster in coffee trees

    Directory of Open Access Journals (Sweden)

    Lashermes Philippe


    Full Text Available Abstract Background Most disease-resistance (R) genes in plants encode NBS-LRR proteins and belong to one of the largest and most variable gene families among plant genomes. However, the specific evolutionary routes of NBS-LRR encoding genes remain elusive. Recently in coffee tree (Coffea arabica), a region spanning the SH3 locus that confers resistance to coffee leaf rust, one of the most serious coffee diseases, was identified and characterized. Using comparative sequence analysis, the purpose of the present study was to gain insight into the genomic organization and evolution of the SH3 locus. Results Sequence analysis of the SH3 region in three coffee genomes, Ea and Ca subgenomes from the allotetraploid C. arabica and Cc genome from the diploid C. canephora, revealed the presence of 5, 3 and 4 R genes in Ea, Ca, and Cc genomes, respectively. All these R-gene sequences appeared to be members of a CC-NBS-LRR (CNL) gene family that was only found at the SH3 locus in C. arabica. Furthermore, while homologs were found in several dicot species, comparative genomic analysis failed to find any CNL R-gene in the orthologous regions of other eudicot species. The orthology relationship among the SH3-CNL copies in the three analyzed genomes was determined and the duplication/deletion events that shaped the SH3 locus were traced back. Gene conversion events were detected between paralogs in all three genomes and also between the two sub-genomes of C. arabica. Significant positive selection was detected in the solvent-exposed residues of the SH3-CNL copies. Conclusion The ancestral SH3-CNL copy was inserted in the SH3 locus after the divergence between Solanales and Rubiales lineages. Moreover, the origin of most of the SH3-CNL copies predates the divergence between Coffea species. The SH3-CNL family appeared to evolve following the birth-and-death model, since duplications and deletions were inferred in the evolution of the SH3 locus. Gene conversion

  14. Genome-wide Gene Order Distances Support Clustering The Gram-Positive Bacteria

    Directory of Open Access Journals (Sweden)

    Christopher H House


    Full Text Available Initially using 143 genomes, we developed a method for calculating the pair-wise distance between prokaryotic genomes using a Monte Carlo method to estimate the conservation of gene order. The method was based on repeatedly selecting five or six non-adjacent random orthologs from each of two genomes and determining if the chosen orthologs were in the same order. The raw distances were then corrected for gene order convergence using an adaptation of the Jukes-Cantor model, as well as using the common distance correction D' = -ln(1-D). First, we compared the distances found via the order of six orthologs to distances found based on ortholog gene content and small subunit rRNA sequences. The Jukes-Cantor gene order distances are reasonably well correlated with the divergence of rRNA (R² = 0.24), especially at rRNA Jukes-Cantor distances of less than 0.2 (R² = 0.52). Gene content is only weakly correlated with rRNA divergence (R² = 0.04) over all distances; however, it is especially strongly correlated at rRNA Jukes-Cantor distances of less than 0.1 (R² = 0.67). This initial work suggests that gene order may be useful in conjunction with other methods to help understand the relatedness of genomes. Using the gene order distances in 143 genomes, the relations of prokaryotes were studied using neighbor joining and agreement subtrees. We then repeated our study of the relations of prokaryotes using gene order in 172 complete genomes better representing a wider diversity of prokaryotes. Consistently, our trees show the Actinobacteria as a sister group to the bulk of the Firmicutes. In fact, the robustness of gene order support was found to be considerably greater for uniting these two phyla than for uniting any of the proteobacterial classes together. The results are supportive of the idea that Actinobacteria and Firmicutes are closely related, which in turn implies a single origin for the gram-positive cell.
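
    The sampling procedure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the default of six orthologs, the trial count and the fixed seed are assumptions made for illustration, and the sketch simplifies the method by treating genomes as linear lists and by not enforcing the non-adjacency requirement. Only the logarithmic correction D' = -ln(1-D) is applied; the authors' adaptation of the Jukes-Cantor model is not specified in the abstract and is not reproduced here.

```python
import math
import random


def gene_order_distance(order_a, order_b, k=6, trials=10000, seed=0):
    """Monte Carlo estimate of gene order divergence between two genomes.

    order_a, order_b: lists of ortholog identifiers in genomic order.
    Returns the fraction of trials in which k randomly chosen shared
    orthologs do NOT appear in the same relative order in both genomes.
    (Hypothetical helper; linear genomes and no adjacency filter assumed.)
    """
    rng = random.Random(seed)
    pos_a = {gene: i for i, gene in enumerate(order_a)}
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    shared = sorted(set(pos_a) & set(pos_b))
    if len(shared) < k:
        raise ValueError("not enough shared orthologs to sample")

    mismatches = 0
    for _ in range(trials):
        sample = rng.sample(shared, k)
        # Compare the relative order of the sampled orthologs in each genome.
        if sorted(sample, key=pos_a.get) != sorted(sample, key=pos_b.get):
            mismatches += 1
    return mismatches / trials


def corrected_distance(d):
    """Apply the correction D' = -ln(1 - D); saturated distances map to infinity."""
    if d >= 1.0:
        return math.inf
    return -math.log(1.0 - d)


if __name__ == "__main__":
    genome_a = list(range(200))
    genome_b = genome_a[:100] + genome_a[100:][::-1]  # second half inverted
    raw = gene_order_distance(genome_a, genome_b)
    print(raw, corrected_distance(raw))
```

    The correction grows without bound as the raw distance approaches 1, which is why saturated comparisons are returned as infinity in this sketch rather than raising an error.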

  15. Identification of anrF gene, a homology of admM of andrimid biosynthetic gene cluster related to the antagonistic activity of Enterobacter cloacae B8

    Institute of Scientific and Technical Information of China (English)

    Xu-Ping Yu; Jun-Li Zhu; Xue-Ping Yao; Shi-Cheng He; Hai-Ning Huang; Wei-Liang Chen; Yong-Hao Hu; De-Bao Li


    AIM: To identify the gene(s) related to the antagonistic activity of Enterobacter cloacae B8 and to elucidate its antagonistic mechanism. METHODS: Transposon-mediated mutagenesis and tagging method and cassette PCR-based chromosomal walking method were adopted to isolate the mutant strain(s) of B8 that lost the antagonistic activity and to clone DNA fragments around the Tn5 insertion site. Sequence compiling and open reading frame (ORF) finding were done with the DNAStar program, and homologous sequence and conserved domain searches were performed with BlastN or BlastP programs at www. To verify the gene involved in the antagonistic activity, complementation of the mutant B8F strain with a full-length clone of the anrF gene was used. RESULTS: A 3 321 bp contig around the Tn5 insertion site was obtained and an ORF of 2 634 bp in length, designated as the anrF gene and encoding an 877 aa polyketide synthase-like protein, was identified. It had a homology of 83% at the nucleotide level and 79% ID/87% SIM at the protein level to the admM gene of the Pantoea agglomerans andrimid biosynthetic gene cluster (AY192157). The Tn5 was inserted at 2 420 bp of the gene, corresponding to the COG3319 (the thioesterase domain of type Ⅰ polyketide synthase) coding region on B8F. The antagonistic activity against Xanthomonas oryzae pv. oryzae was restored by complementation of the mutant B8F with the full-length anrF gene. CONCLUSION: The anrF gene obtained is related to the antagonistic activity of B8, and the antagonistic substances produced by B8 are andrimid and/or its analogs.
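
    The ORF and protein lengths reported in this abstract can be cross-checked with simple arithmetic: each amino acid is encoded by one three-base codon, and a complete ORF ends with one stop codon. The short Python sketch below uses a hypothetical helper name and assumes the reported ORF length includes the stop codon.

```python
def orf_length_bp(protein_length_aa, include_stop=True):
    """Length in base pairs of an ORF encoding a protein of the given length.

    Assumes three bases per codon and, optionally, one terminal stop codon.
    (Hypothetical helper for a consistency check, not part of the study.)
    """
    stop_bases = 3 if include_stop else 0
    return 3 * protein_length_aa + stop_bases


if __name__ == "__main__":
    # 877 codons plus one stop codon gives the reported ORF length.
    print(orf_length_bp(877))
```

    Under that assumption, an 877 aa product corresponds to 2 631 coding bases plus 3 bases for the stop codon, which matches the 2 634 bp ORF reported.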

  16. Scab resistance in 'Geneva' apple is conditioned by a resistance gene cluster with complex genetic control. (United States)

    Bastiaanse, Héloïse; Bassett, Heather C M; Kirk, Christopher; Gardiner, Susan E; Deng, Cecilia; Groenworld, Remmelt; Chagné, David; Bus, Vincent G M


    Apple scab, caused by the fungal pathogen Venturia inaequalis, is one of the most severe diseases of apple worldwide. It is the most studied plant-pathogen interaction involving a woody species using modern genetic, genomic, proteomic and bioinformatic approaches in both species. Although 'Geneva' apple was recognized long ago as a potential source of resistance to scab, this resistance has not been characterized previously. Differential interactions between various monoconidial isolates of V. inaequalis and six segregating F1 and F2 populations indicate the presence of at least five loci governing the resistance in 'Geneva'. The 17 chromosomes of apple were screened using genotyping-by-sequencing, as well as single marker mapping, to position loci controlling the V. inaequalis resistance on linkage group 4. Next, we fine mapped a 5-cM region containing five loci conferring both dominant and recessive scab resistance to the distal end of the linkage group. This region corresponds to 2.2 Mbp (from 20.3 to 22.5 Mbp) on the physical map of 'Golden Delicious' containing nine candidate nucleotide-binding site leucine-rich repeat (NBS-LRR) resistance genes. This study increases our understanding of the complex genetic basis of apple scab resistance conferred by 'Geneva', as well as the gene-for-gene (GfG) relationships between the effector genes in the pathogen and resistance genes in the host.

  17. Pseudomonas aeruginosa IscR-Regulated Ferredoxin NADP(+) Reductase Gene (fprB) Functions in Iron-Sulfur Cluster Biogenesis and Multiple Stress Response.

    Directory of Open Access Journals (Sweden)

    Adisak Romsang

    Full Text Available P. aeruginosa (PAO1) has two putative genes encoding ferredoxin NADP(+) reductases, denoted fprA and fprB. Here, the regulation of fprB expression and the protein's physiological roles in [4Fe-4S] cluster biogenesis and stress protection are characterized. The fprB mutant has defects in [4Fe-4S] cluster biogenesis, as shown by reduced activities of [4Fe-4S] cluster-containing enzymes. Inactivation of the gene resulted in increased sensitivity to oxidative, thiol, osmotic and metal stresses compared with the PAO1 wild type. The increased sensitivity could be partially or completely suppressed by high expression of genes from the isc operon, which are involved in [Fe-S] cluster biogenesis, indicating that stress sensitivity in the fprB mutant is partially caused by a reduction in levels of [4Fe-4S] clusters. The pattern and regulation of fprB expression are in agreement with the gene's physiological roles; fprB expression was highly induced by redox cycling drugs and diamide and was moderately induced by peroxides, an iron chelator and salt stress. The stress-induced expression of fprB was abolished by a deletion of the iscR gene. An IscR DNA-binding site close to fprB promoter elements was identified and confirmed by specific binding of purified IscR. Analysis of the regulation of fprB expression supports the role of IscR in directly regulating fprB transcription as a transcription activator. The combination of IscR-regulated expression of fprB and the fprB roles in response to multiple stressors emphasizes the importance of [Fe-S] cluster homeostasis in both gene regulation and stress protection.

  18. RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond (United States)

    Gama-Castro, Socorro; Salgado, Heladia; Santos-Zavaleta, Alberto; Ledezma-Tejeida, Daniela; Muñiz-Rascado, Luis; García-Sotelo, Jair Santiago; Alquicira-Hernández, Kevin; Martínez-Flores, Irma; Pannier, Lucia; Castro-Mondragón, Jaime Abraham; Medina-Rivera, Alejandra; Solano-Lira, Hilda; Bonavides-Martínez, César; Pérez-Rueda, Ernesto; Alquicira-Hernández, Shirley; Porrón-Sotelo, Liliana; López-Fuentes, Alejandra; Hernández-Koutoucheva, Anastasia; Moral-Chávez, Víctor Del; Rinaldi, Fabio; Collado-Vides, Julio


    RegulonDB is one of the most useful and important resources on bacterial gene regulation, as it integrates the scattered scientific knowledge of the best-characterized organism, Escherichia coli K-12, in a database that organizes large amounts of data. Its electronic format enables researchers to compare their results with the legacy of previous knowledge and supports bioinformatics tools and model building. Here, we summarize our progress with RegulonDB since our last Nucleic Acids Research publication describing RegulonDB, in 2013. In addition to maintaining curation up-to-date, we report a collection of 232 interactions with small RNAs affecting 192 genes, and the complete repertoire of 189 Elementary Genetic Sensory-Response units (GENSOR units), integrating the signal, regulatory interactions, and metabolic pathways they govern. These additions represent major progress to a higher level of understanding of regulated processes. We have updated the computationally predicted transcription factors, which total 304 (184 with experimental evidence and 120 from computational predictions); we updated our position-weight matrices and have included tools for clustering them in evolutionary families. We describe our semiautomatic strategy to accelerate curation, including datasets from high-throughput experiments, a novel coexpression distance to search for ‘neighborhood’ genes to known operons and regulons, and computational developments. PMID:26527724

  19. Clinical Fusobacterium mortiferum Isolates Cluster with Undifferentiated Clostridium rectum Species Based on 16S rRNA Gene Phylogenetic Analysis. (United States)

    Lee, Yangsoon; Eun, Chang Soo; Han, Dong Soo


    The most commonly encountered clinical Fusobacterium species are F. nucleatum and F. necrophorum; other Fusobacteria, such as F. mortiferum and F. varium, have occasionally been isolated from human specimens. Clostridium rectum is a gram-positive species characterized as a straight bacillus with oval sub-terminal spores. The close 16S rRNA gene sequence relationship of C. rectum with the genus Fusobacterium is unexpected given their very different phenotypic characteristics. Between 2014 and 2015, a total of 19 Fusobacterium isolates were recovered from the colonic tissue of 10 patients at a university hospital. All isolates were identified based on 16S rRNA gene sequencing. The phylogenetic relationship among these isolates was estimated using the neighbor-joining method and the Molecular Evolutionary Genetic Analysis (MEGA) version 6. Based on phylogenetic analysis, the F. mortiferum isolates clustered into two groups - F. mortiferum DSM 19809 (group I) and F. mortiferum ATCC 25557 (group II) - even though they are of the same species. Furthermore, the F. mortiferum DSM 19809 (group I) showed a close phylogenetic relationship with C. rectum, even though C. rectum is classified as a gram-positive spore-producing bacillus. C. rectum is clearly unrelated to the genus Clostridium, as it shows highest 16S rRNA gene sequence similarity with species from the genus Fusobacterium. Therefore, additional methods such as Gram staining and other biochemical methods should be performed for Fusobacterium identification.

  20. Whole-genome sequencing suggests a chemokine gene cluster that modifies age at onset in familial Alzheimer's disease. (United States)

    Lalli, M A; Bettcher, B M; Arcila, M L; Garcia, G; Guzman, C; Madrigal, L; Ramirez, L; Acosta-Uribe, J; Baena, A; Wojta, K J; Coppola, G; Fitch, R; de Both, M D; Huentelman, M J; Reiman, E M; Brunkow, M E; Glusman, G; Roach, J C; Kao, A W; Lopera, F; Kosik, K S