Kendal Wayne S
Background: Vertebrate genes often appear to cluster within the background of nontranscribed genomic DNA. Here an analysis of the physical distribution of gene structures on human chromosome 7 was performed to confirm the presence of clustering, and to elucidate possible underlying statistical and biological mechanisms. Results: Clustering of genes was confirmed by virtue of a variance of the number of genes per unit physical length that exceeded the respective mean. Further evidence for clustering came from a power function relationship between the variance and mean that possessed an exponent of 1.51. This power function implied that the spatial distribution of genes on chromosome 7 was scale invariant, and that the underlying statistical distribution had a Poisson-gamma (PG) form. A PG distribution for the spatial scattering of genes was validated by stringent comparisons of both the predicted variance to mean power function and its cumulative distribution function to data derived from chromosome 7. Conclusion: The PG distribution was consistent with at least two different biological models. In the microrearrangement model, the number of genes per unit length of chromosome represented the contribution of a random number of smaller chromosomal segments that had originated by random breakage and reconstruction of more primitive chromosomes. Each of these smaller segments would have necessarily contained (on average) a gamma distributed number of genes. In the gene cluster model, genes would be scattered randomly to begin with. Over evolutionary timescales, tandem duplication, mutation, insertion, deletion and rearrangement could act at these gene sites through a stochastic birth, death and immigration process to yield a PG distribution. On the basis of the gene position data alone it was not possible to identify the biological model which best explained the observed clustering. However, the underlying PG statistical model implicated neutral
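The variance-to-mean power function described in this abstract can be illustrated with a short sketch. It is not the author's procedure: the non-overlapping binning scheme, the function names (`bin_counts`, `mean_and_variance`, `power_law_exponent`) and the choice of bin sizes are all assumptions made for illustration. The idea is to count genes in bins of several widths, compute the mean and variance of the counts at each width, and estimate the exponent as the least-squares slope of log(variance) against log(mean). For randomly scattered (Poisson) positions the slope should be close to 1, whereas the abstract reports 1.51 for chromosome 7.

```python
import math
import random


def bin_counts(positions, length, bin_size):
    """Count positions falling in consecutive, non-overlapping bins."""
    n_bins = length // bin_size
    counts = [0] * n_bins
    for p in positions:
        idx = p // bin_size
        if idx < n_bins:  # ignore a trailing partial bin
            counts[idx] += 1
    return counts


def mean_and_variance(counts):
    """Sample mean and unbiased sample variance of a list of counts."""
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return mean, var


def power_law_exponent(positions, length, bin_sizes):
    """Least-squares slope of log(variance) versus log(mean) across bin sizes."""
    xs, ys = [], []
    for b in bin_sizes:
        mean, var = mean_and_variance(bin_counts(positions, length, b))
        if mean > 0 and var > 0:
            xs.append(math.log(mean))
            ys.append(math.log(var))
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical data: 5000 positions scattered uniformly at random.
    rng = random.Random(0)
    length = 1_000_000
    positions = [rng.randrange(length) for _ in range(5000)]
    print(power_law_exponent(positions, length, [1000, 2000, 5000, 10000, 20000]))
```

With real gene start coordinates in place of the random positions, a slope clearly above 1 would indicate the kind of clustering the abstract describes.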
Blom van Assendelft, Margaretha van
The structure and regulation of the human β-like globin gene cluster have been studied extensively. Genetic disorders connected with this gene cluster are responsible for human diseases associated with high levels of morbidity and mortality, such as β-thalassaemia and sickle cell anaemia. The work
Borg, Joseph; Georgitsi, Marianthi; Aleporou-Marinou, Vassiliki; Kollia, Panagoula; Patrinos, George P
Homologous recombination is a frequent phenomenon in multigene families and as such it occurs several times in both the alpha- and beta-like globin gene families. On numerous occasions, genetic recombination has been implicated as a major mechanism that drives mutagenesis in the human globin gene clusters, either in the form of unequal crossover or gene conversion. Unequal crossover results in the increase or decrease of the human globin gene copies, accompanied in the majority of cases by minor phenotypic consequences, while gene conversion contributes either to maintaining sequence homogeneity or generating sequence diversity. The role of genetic recombination, particularly gene conversion, in the evolution of the human globin gene families has been discussed elsewhere. Here, we summarize our current knowledge and review existing experimental evidence outlining the role of genetic recombination in the mutagenic process in the human globin gene families.
Spies, T.; Bresnahan, M.; Strominger, J.L.
A 600-kilobase (kb) DNA segment from the human major histocompatibility complex (MHC) class III region was isolated by extension of a previous 435-kb chromosome walk. The contiguous series of cloned overlapping cosmids contains the entire 555-kb interval between C2 in the complement gene cluster and HLA-B. This region is known to encode the tumor necrosis factors (TNFs) α and β, B144, and the major heat shock protein HSP70. Moreover, a cluster of genes, BAT1-BAT5 (HLA-B-associated transcripts), has been localized in the vicinity of the genes for TNFα and TNFβ. An additional four genes were identified by isolation of corresponding cDNA clones with cosmid DNA probes. These genes, BAT6-BAT9, were mapped near the gene for C2 within a 120-kb region that includes a HSP70 gene pair. These results, together with complementary data from a similar recent study, indicated the presence of a minimum of 19 genes within the C2-HLA-B interval of the MHC class III region. Although the functional properties of most of these genes are as yet unknown, they may be involved in some aspects of immunity. This idea is supported by the genetic mapping of the hematopoietic histocompatibility locus-1 (Hh-1) in recombinant mice between TNFα and H-2S, which is homologous to the complement gene cluster in humans
Martínez-del Campo, Ana; Bodea, Smaranda; Hamer, Hilary A; Marks, Jonathan A; Haiser, Henry J; Turnbaugh, Peter J; Balskus, Emily P
Elucidation of the molecular mechanisms underlying the human gut microbiota's effects on health and disease has been complicated by difficulties in linking metabolic functions associated with the gut community as a whole to individual microorganisms and activities. Anaerobic microbial choline metabolism, a disease-associated metabolic pathway, exemplifies this challenge, as the specific human gut microorganisms responsible for this transformation have not yet been clearly identified. In this study, we established the link between a bacterial gene cluster, the choline utilization (cut) cluster, and anaerobic choline metabolism in human gut isolates by combining transcriptional, biochemical, bioinformatic, and cultivation-based approaches. Quantitative reverse transcription-PCR analysis and in vitro biochemical characterization of two cut gene products linked the entire cluster to growth on choline and supported a model for this pathway. Analyses of sequenced bacterial genomes revealed that the cut cluster is present in many human gut bacteria, is predictive of choline utilization in sequenced isolates, and is widely but discontinuously distributed across multiple bacterial phyla. Given that bacterial phylogeny is a poor marker for choline utilization, we were prompted to develop a degenerate PCR-based method for detecting the key functional gene choline TMA-lyase (cutC) in genomic and metagenomic DNA. Using this tool, we found that new choline-metabolizing gut isolates universally possessed cutC. We also demonstrated that this gene is widespread in stool metagenomic data sets. Overall, this work represents a crucial step toward understanding anaerobic choline metabolism in the human gut microbiota and underscores the importance of examining this microbial community from a function-oriented perspective. Anaerobic choline utilization is a bacterial metabolic activity that occurs in the human gut and is linked to multiple diseases. While bacterial genes responsible for
Liebhaber, S.A.; Weiss, I.; Cash, F.E.; Griese, E.U.; Horst, J.; Ayyub, H.; Higgs, D.R.
Synthesis of normal human hemoglobin A, α2β2, is based upon balanced expression of genes in the α-globin gene cluster on chromosome 16 and the β-globin gene cluster on chromosome 11. Full levels of erythroid-specific activation of the β-globin cluster depend on sequences located at a considerable distance 5' to the β-globin gene, referred to as the locus-activating or dominant control region. The existence of an analogous element(s) upstream of the α-globin cluster has been suggested from observations on naturally occurring deletions and experimental studies. The authors have identified an individual with α-thalassemia in whom structurally normal α-globin genes have been inactivated in cis by a discrete de novo 35-kilobase deletion located ∼30 kilobases 5' from the α-globin gene cluster. They conclude that this deletion inactivates expression of the α-globin genes by removing one or more of the previously identified upstream regulatory sequences that are critical to expression of the α-globin genes
Raghupathy, Narayanan; Durand, Dannie
Identifying genomic regions that descended from a common ancestor is important for understanding the function and evolution of genomes. In distantly related genomes, clusters of homologous gene pairs are evidence of candidate homologous regions. Demonstrating the statistical significance of such "gene clusters" is an essential component of comparative genomic analyses. However, currently there are no practical statistical tests for gene clusters that model the influence of the number of homologs in each gene family on cluster significance. In this work, we demonstrate empirically that failure to incorporate gene family size in gene cluster statistics results in overestimation of significance, leading to incorrect conclusions. We further present novel analytical methods for estimating gene cluster significance that take gene family size into account. Our methods do not require complete genome data and are suitable for testing individual clusters found in local regions, such as contigs in an unfinished assembly. We consider pairs of regions drawn from the same genome (paralogous clusters), as well as regions drawn from two different genomes (orthologous clusters). Determining cluster significance under general models of gene family size is computationally intractable. By assuming that all gene families are of equal size, we obtain analytical expressions that allow fast approximation of cluster probabilities. We evaluate the accuracy of this approximation by comparing the resulting gene cluster probabilities with cluster probabilities obtained by simulating a realistic, power-law distributed model of gene family size, with parameters inferred from genomic data. Surprisingly, despite the simplicity of the underlying assumption, our method accurately approximates the true cluster probabilities. It slightly overestimates these probabilities, yielding a conservative test. We present additional simulation results indicating the best choice of parameter values for data
Background: Gene expression is regulated mainly by transcription factors (TFs) that interact with regulatory cis-elements on DNA sequences. To identify functional regulatory elements, computer searching can predict TF binding sites (TFBS) using position weight matrices (PWMs) that represent positional base frequencies of collected experimentally determined TFBS. A disadvantage of this approach is the large output of results for genomic DNA. One strategy to identify genuine TFBS is to utilize local concentrations of predicted TFBS. It is unclear whether there is a general tendency for TFBS to cluster at promoter regions, although this is the case for certain TFBS. It is also unclear which TFs have TFBS concentrated in promoters, and to what degree. This study aims to answer some of these questions. Results: We developed the cluster score measure to evaluate the correlation between predicted TFBS clusters and promoter sequences for each PWM. Non-promoter sequences were used as a control. Using the cluster score, we identified a PWM group called PWM-PCP, in which TFBS clusters positively correlate with promoters, and another PWM group called PWM-NCP, in which TFBS clusters negatively correlate with promoters. The PWM-PCP group comprises 47% of the 199 vertebrate PWMs, while the PWM-NCP group comprises 11%. After reducing the effect of CpG islands (CGI) against the clusters using partial correlation coefficients among three properties (promoter, CGI and predicted TFBS cluster), we identified two PWM groups including those strongly correlated with CGI and those not correlated with CGI. Conclusion: Not all PWMs predict TFBS correlated with human promoter sequences. Two main PWM groups were identified: (1) those that show TFBS clustered in promoters associated with CGI, and (2) those that show TFBS clustered in promoters independent of CGI. Assessment of PWM matches will allow more positive interpretation of TFBS in
Allcock, Richard J N; Barrow, Alexander D; Forbes, Simon; Beck, Stephan; Trowsdale, John
We have characterized a cluster of single immunoglobulin variable (IgV) domain receptors centromeric of the major histocompatibility complex (MHC) on human chromosome 6. In addition to triggering receptor expressed on myeloid cells (TREM)-1 and TREM2, the cluster contains NKp44, a triggering receptor whose expression is limited to NK cells. We identified three new related genes and two gene fragments within a cluster of approximately 200 kb. Two of the three new genes lack charged residues in their transmembrane domain tails. Further, one of the genes contains two potential immunotyrosine inhibitory motifs in its cytoplasmic tail, suggesting that it delivers inhibitory signals. The human and mouse TREM clusters appear to have diverged such that there are unique sequences in each species. Finally, each gene in the TREM cluster was expressed in a different range of cell types.
Many pragmatic clustering methods have been developed to group data vectors or objects into clusters so that the objects in one cluster are very similar and objects in different clusters are distinct based on some similarity measure. The availability of time course data has motivated researchers to develop methods, such as mixture and mixed-effects modelling approaches, that incorporate the temporal information contained in the shape of the trajectory of the data. However, there is still a need for the development of time-course clustering methods that can adequately deal with inhomogeneous clusters (some clusters are quite large and others are quite small). Here we propose two such methods, hierarchical clustering (IHC) and iterative pairwise-correlation clustering (IPC). We evaluate and compare the proposed methods to the Markov Cluster Algorithm (MCL) and the generalised mixed-effects model (GMM) using simulation studies and an application to a time course gene expression data set from a study containing human subjects who were challenged by a live influenza virus. We identify four types of temporal gene response modules to influenza infection in humans, i.e., single-gene modules (SGM), small-size modules (SSM), medium-size modules (MSM) and large-size modules (LSM). The LSM contain genes that perform various fundamental biological functions that are consistent across subjects. The SSM and SGM contain genes that perform either different or similar biological functions that have complex temporal responses to the virus and are unique to each subject. We show that the temporal responses of the genes in the LSM have either simple patterns with a single peak or trough (a consequence of the transient stimuli) or sustained or state-transitioning patterns pertaining to developmental cues, and that these modules can differentiate the severity of disease outcomes. Additionally, the size of gene response modules follows a power-law distribution with a consistent
Wang, Qian-fei; Liu, Xin; O' Connell, Jeff; Peng, Ze; Krauss, Ronald M.; Rainwater, David L.; VandeBerg, John L.; Rubin, Edward M.; Cheng, Jan-Fang; Pennacchio, Len A.
Genetic studies in non-human primates serve as a potential strategy for identifying genomic intervals where polymorphisms impact upon human disease-related phenotypes. It remains unclear, however, whether independently arising polymorphisms in orthologous regions of non-human primates lead to similar variation in a quantitative trait found in both species. To explore this paradigm, we studied a baboon apolipoprotein gene cluster (APOA1/C3/A4/A5) for which the human gene orthologs have well established roles in influencing plasma HDL-cholesterol and triglyceride concentrations. Our extensive polymorphism analysis of this 68 kb gene cluster in 96 pedigreed baboons identified several haplotype blocks each with limited diversity, consistent with haplotype findings in humans. To determine whether baboons, like humans, also have particular haplotypes associated with lipid phenotypes, we genotyped 634 well characterized baboons using 16 haplotype tagging SNPs. Genetic analysis of single SNPs, as well as haplotypes, revealed an association of APOA5 and APOC3 variants with HDL cholesterol and triglyceride concentrations, respectively. Thus, independent variation in orthologous genomic intervals does associate with similar quantitative lipid traits in both species, supporting the possibility of uncovering human QTL genes in a highly controlled non-human primate model.
Michael B Walker
Arrangements of genes along chromosomes are a product of evolutionary processes, and we can expect that preferable arrangements will prevail over the span of evolutionary time, often being reflected in the non-random clustering of structurally and/or functionally related genes. Such non-random arrangements can arise by two distinct evolutionary processes: duplications of DNA sequences that give rise to clusters of genes sharing both sequence similarity and common sequence features, and the migration together of genes related by function, but not by common descent. To provide a background for distinguishing between the two, which is important for future efforts to unravel the evolutionary processes involved, we here provide a description of the extent to which ancestrally related genes are found in proximity. Towards this purpose, we combined information from five genomic datasets: InterPro, SCOP, PANTHER, Ensembl protein families, and Ensembl gene paralogs. The results are provided in publicly available datasets (http://cgd.jax.org/datasets/clustering/paraclustering.shtml) describing the extent to which ancestrally related genes are in proximity beyond what is expected by chance (i.e. form paraclusters) in the human and nine other vertebrate genomes, as well as the D. melanogaster, C. elegans, A. thaliana, and S. cerevisiae genomes. With the exception of Saccharomyces, paraclusters are a common feature of the genomes we examined. In the human genome they are estimated to include at least 22% of all protein coding genes. Paraclusters are far more prevalent among some gene families than others, are highly species or clade specific and can evolve rapidly, sometimes in response to environmental cues. Altogether, they account for a large portion of the functional clustering previously reported in several genomes.
Roubelakis, Maria G; Zotos, Pantelis; Papachristoudis, Georgios; Michalopoulos, Ioannis; Pappa, Kalliopi I; Anagnou, Nicholas P; Kossida, Sophia
microRNAs (miRNAs) are single-stranded RNA molecules of about 20-23 nucleotides in length found in a wide variety of organisms. miRNAs regulate gene expression by interacting with target mRNAs at specific sites in order to induce cleavage of the message or inhibit translation. Predicting or verifying mRNA targets of specific miRNAs is a difficult process of great importance. GOmir is a novel stand-alone application consisting of two separate tools: JTarget and TAGGO. JTarget integrates miRNA target prediction and functional analysis by combining the predicted target genes from the TargetScan, miRanda, RNAhybrid and PicTar computational tools with the experimentally supported targets from TarBase, and by providing a full gene description and functional analysis for each target gene. The TAGGO application, on the other hand, is designed to automatically group gene ontology annotations, taking advantage of the Gene Ontology (GO), in order to extract the main attributes of sets of proteins. GOmir represents a new tool incorporating two separate Java applications integrated into one stand-alone Java application. GOmir (by using up to five different databases) introduces miRNA predicted targets accompanied by (a) full gene description, (b) functional analysis and (c) detailed gene ontology clustering. Additionally, a reverse search initiated by a potential target can also be conducted. GOmir can freely be downloaded from BRFAA.
Rocha Eduardo PC
Background: Gene clustering plays an important role in the organization of the bacterial chromosome and several mechanisms have been proposed to explain its extent. However, the controversies raised about the validity of each of these mechanisms remind us that the cause of this gene organization remains an open question. Models proposed to explain clustering did not take into account the function of the gene products nor the likely presence or absence of a given gene in a genome. However, genomes harbor two very different categories of genes: those genes present in a majority of organisms – persistent genes – and those present in very few organisms – rare genes. Results: We show that two classes of genes are significantly clustered in bacterial genomes: the highly persistent and the rare genes. The clustering of rare genes is readily explained by the selfish operon theory. Yet, genes persistently present in bacterial genomes are also clustered and we try to understand why. We propose a model accounting specifically for such clustering, and show that indispensability in a genome with frequent gene deletion and insertion leads to the transient clustering of these genes. The model describes how clusters are created via the gene flux that continuously introduces new genes while deleting others. We then test if known selective processes, such as co-transcription, physical interaction or functional neighborhood, account for the stabilization of these clusters. Conclusion: We show that it is the strong selective pressure acting on the function of persistent genes, in bacterial genomes undergoing a permanent flux of genes that keeps their size fairly constant, that drives the clustering of persistent genes. A further selective stabilization process might contribute to maintaining the clustering.
Dhillon, Inderjit S; Marcotte, Edward M; Roshan, Usman
Clustering genes based upon their expression patterns allows us to predict gene function. Most existing clustering algorithms cluster genes together when their expression patterns show high positive correlation. However, it has been observed that genes whose expression patterns are strongly anti-correlated can also be functionally similar. Biologically, this is not unintuitive: genes responding to the same stimuli, regardless of the nature of the response, are more likely to operate in the same pathways. We present a new diametrical clustering algorithm that explicitly identifies anti-correlated clusters of genes. Our algorithm proceeds by iteratively (i) re-partitioning the genes and (ii) computing the dominant singular vector of each gene cluster, with each singular vector serving as the prototype of a 'diametric' cluster. We empirically show the effectiveness of the algorithm in identifying diametrical or anti-correlated clusters. Testing the algorithm on yeast cell cycle data, fibroblast gene expression data, and DNA microarray data from yeast mutants reveals that opposed cellular pathways can be discovered with this method. We present systems whose mRNA expression patterns, and likely their functions, oppose the yeast ribosome and proteasome, along with evidence for the inverse transcriptional regulation of a number of cellular systems.
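The two alternating steps named in this abstract can be sketched in plain Python. This is an illustrative reading of the description, not the authors' implementation: the use of the squared dot product as the assignment criterion, power iteration as the way to obtain the dominant singular vector, the unit-length normalization of each expression vector, and the caller-supplied initial prototypes are all assumptions, as are the function names.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def dominant_direction(rows, iterations=100):
    """Power iteration for the dominant right singular vector of the
    matrix whose rows are the given (non-zero) vectors."""
    v = normalize(rows[0])
    for _ in range(iterations):
        # w = X^T (X v), accumulated row by row
        w = [0.0] * len(v)
        for r in rows:
            coeff = dot(r, v)
            for i, x in enumerate(r):
                w[i] += coeff * x
        v = normalize(w)
    return v


def diametrical_clustering(vectors, prototypes, rounds=20):
    """Alternate (i) re-partitioning by squared similarity to each prototype
    and (ii) recomputing each prototype as the dominant singular vector."""
    data = [normalize(v) for v in vectors]
    protos = [normalize(p) for p in prototypes]
    labels = None
    for _ in range(rounds):
        new_labels = [
            max(range(len(protos)), key=lambda j: dot(x, protos[j]) ** 2)
            for x in data
        ]
        if new_labels == labels:
            break  # partition is stable
        labels = new_labels
        for j in range(len(protos)):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if members:
                protos[j] = dominant_direction(members)
    return labels, protos


if __name__ == "__main__":
    # Hypothetical toy profiles: two anti-correlated pairs.
    genes = [[1, 0.1, 0, 0], [-1, 0.1, 0, 0], [0, 0, 1, 0.1], [0, 0, -1, 0.1]]
    labels, _ = diametrical_clustering(genes, [[1, 0, 0, 0], [0, 0, 1, 0]])
    print(labels)
```

Squaring the dot product is what lets a profile and its mirror image land in the same cluster, since both have a large projection (of opposite sign) onto the same prototype.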
Galiová-Šustáčková, Gabriela; Bártová, Eva; Kozubek, Stanislav
Vol. 33, No. 1 (2004), pp. 4-14. ISSN 1079-9796. R&D Projects: GA ČR GA301/01/0186; GA AV ČR KSK5052113; GA AV ČR IAA5004306; GA ČR GA202/04/0907; GA MŠk ME 565. Institutional research plan: CEZ:AV0Z5004920. Keywords: beta-like globin gene cluster; K-562 cells; nuclear topography. Subject RIV: BO - Biophysics. Impact factor: 2.549 (2004)
Jones, M.H.; Learned, R.M.; Tjian, R.
The authors have mapped the cis regulatory elements required in vivo for initiation at the human rRNA promoter by RNA polymerase I. Transient expression in COS-7 cells was used to evaluate the transcription phenotype of clustered base substitution mutations in the human rRNA promoter. The promoter consists of two major elements: a large upstream region, composed of several domains, that lies between nucleotides -234 and -107 relative to the transcription initiation site and affects transcription up to 100-fold, and a core element that lies between nucleotides -45 and +20 and affects transcription up to 1000-fold. The upstream region is able to retain partial function when positioned within 100-160 nucleotides of the transcription initiation site, but it cannot stimulate transcription from distances of ≥ 600 nucleotides. In addition, they demonstrate, using mouse-human hybrid rRNA promoters, that the sequences responsible for human species-specific transcription in vivo appear to reside in both the core and upstream elements, and sequences from the mouse rRNA promoter cannot be substituted for them
Thomas W. Jeffries; Jennifer R. Headman Van Vleet
Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the...
Molas, Susanna; Gener, Thomas; Güell, Jofre; Martín, Mairena; Ballesteros-Yáñez, Inmaculada; Sanchez-Vives, Maria V; Dierssen, Mara
Addiction involves long-lasting maladaptive changes including development of disruptive drug-stimuli associations. Nicotine-induced neuroplasticity underlies the development of tobacco addiction but also, in regions such as the hippocampus, the ability of this drug to enhance cognitive capabilities. Here, we propose that the genetic locus of susceptibility to nicotine addiction, the CHRNA5/A3/B4 gene cluster, encoding the α5, α3 and β4 subunits of the nicotinic acetylcholine receptors (nAChRs), may influence nicotine-induced neuroadaptations. We have used transgenic mice overexpressing the human cluster (TgCHRNA5/A3/B4) to investigate hippocampal structure and function in genetically susceptible individuals. TgCHRNA5/A3/B4 mice presented a marked reduction in the dendrite complexity of CA1 hippocampal pyramidal neurons along with an increased dendritic spine density. In addition, TgCHRNA5/A3/B4 exhibited increased VGLUT1/VGAT ratio in the CA1 region, suggesting an excitatory/inhibitory imbalance. These hippocampal alterations were accompanied by a significant impairment in short-term novelty recognition memory. Interestingly, chronic infusion of nicotine (3.25 mg/kg/d for 7 d) was able to rescue the reduced dendritic complexity, the excitatory/inhibitory imbalance and the cognitive impairment in TgCHRNA5/A3/B4. Our results suggest that chronic nicotine treatment may represent a compensatory strategy in individuals with altered expression of the CHRNA5/A3/B4 region.
Zhang, Hu; Zheng, Jiajia; Shen, Hongliang; Huang, Yongyi; Liu, Te; Xi, Hao; Chen, Chuan
Curcumin can suppress human prostate cancer (HuPCa) cell proliferation and invasion. However, it is not known whether curcumin can inhibit HuPCa stem cell (HuPCaSC) proliferation and invasion. We used methyl thiazolyl tetrazolium and Transwell assays to examine the proliferation and invasion of the HuPCaSC lines DU145 and 22Rv1 following curcumin or dimethyl sulfoxide (control) treatment. The microRNA (miRNA) expression levels in the DLK1-DIO3 imprinted genomic region in the cells and in tumor tissues from patients with PCa were examined using microarray and quantitative PCR. The median inhibitory concentration of curcumin for HuPCa cells significantly inhibited HuPCaSC proliferation and invasion in vitro. The miR-770-5p and miR-1247 expression levels in the DLK1-DIO3 imprinted gene cluster were significantly different between the curcumin-treated and control HuPCaSCs. Overexpression of these positive miRNAs significantly increased the inhibition rates of miR-770-5p- and miR-1247-transfected HuPCaSCs compared to the control miR-Mut-transfected HuPCaSCs. Lastly, low-tumor grade PCa tissues had higher miR-770-5p and miR-1247 expression levels than high-grade tumor tissues. Curcumin can suppress HuPCaSC proliferation and invasion in vitro by modulating specific miRNAs in the DLK1-DIO3 imprinted gene cluster.
Friedrich, Alexander W; Köck, Robin; Bielaszewska, Martina; Zhang, Wenlan; Karch, Helge; Mathys, Werner
Enterohemorrhagic Escherichia coli (EHEC) O157 strains belong to two closely related major groups, which are differentiated by their sorbitol fermentation phenotypes. Here we studied the conservation of urease genes and their expression in sorbitol-fermenting (SF) and non-SF EHEC O157 isolates. PCR
Ashina, Håkan; Newman, Lawrence; Ashina, Sait
Calcitonin gene-related peptide (CGRP) is a key signaling molecule involved in migraine pathophysiology. Efficacy of CGRP monoclonal antibodies and antagonists in migraine treatment has fueled an increasing interest in the prospect of treating cluster headache (CH) with CGRP antagonism. The exact role of CGRP and its mechanism of action in CH have not been fully clarified. A search for original studies and randomized controlled trials (RCTs) published in English was performed in PubMed and in ClinicalTrials.gov. The search terms used were "cluster headache and calcitonin gene related peptide" and "primary headaches and calcitonin gene related peptide." Reference lists of identified articles were also searched for additional relevant papers. Human experimental studies have reported elevated plasma CGRP levels during both spontaneous and glyceryl trinitrate-induced cluster attacks. CGRP may play...
Wang, Yunli; Pan, Youlian
Background: Simple clustering methods such as hierarchical clustering and k-means are widely used for gene expression data analysis, but they are unable to deal with the noise and high dimensionality associated with microarray gene expression data. Consensus clustering appears to improve the robustness and quality of clustering results. Incorporating prior knowledge in clustering process (semi-supervised clustering) has been shown to improve the consistency between the data partitioning and do...
Background: There has been much evidence recently for a link between transcriptional regulation and chromosomal gene order, but the relationship between genomic organization, regulation and gene function in higher eukaryotes remains to be precisely defined. Results: Here, we present evidence for organization of a large proportion of a human transcriptome into gene clusters throughout the genome, which are partly regulated by the same transcription factors, share biological functions and are characterized by non-housekeeping genes. This analysis was based on the cardiac transcriptome identified by our genome-wide array analysis of 55 human heart samples. We found 37% of these genes to be arranged mainly in adjacent pairs or triplets. A significant number of pairs of adjacent genes are putatively regulated by common transcription factors (p = 0.02). Furthermore, these gene pairs share a significant number of GO functional classification terms. We show that the human cardiac transcriptome is organized into many small clusters across the whole genome, rather than being concentrated in a few larger clusters. Conclusion: Our findings suggest that genes expressed in concert are organized in a linear arrangement for coordinated regulation. Determining the relationship between gene arrangement, regulation and nuclear organization as well as gene function will have broad biological implications.
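One simple way to ask whether expressed genes sit next to each other more often than expected is a permutation test on chromosomal order. The sketch below is only an illustration under stated assumptions and is not the statistical procedure of the study, which the abstract does not describe: genes are represented as a list of booleans in chromosomal order (True meaning expressed), the statistic is the number of adjacent expressed pairs, and the null model shuffles expression labels along the chromosome. The function names are hypothetical.

```python
import random


def count_adjacent_pairs(flags):
    """Number of neighbouring gene pairs in which both genes are flagged."""
    return sum(1 for a, b in zip(flags, flags[1:]) if a and b)


def permutation_p_value(flags, n_permutations=10000, seed=0):
    """Observed adjacent-pair count and a one-sided permutation p-value."""
    rng = random.Random(seed)
    observed = count_adjacent_pairs(flags)
    shuffled = list(flags)
    at_least = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if count_adjacent_pairs(shuffled) >= observed:
            at_least += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (at_least + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical chromosome: 10 expressed genes in one block among 50 genes.
    flags = [True] * 10 + [False] * 40
    print(permutation_p_value(flags, n_permutations=1000))
```

A real analysis would also need to handle chromosome boundaries, gene density and tandem duplicates, which this sketch ignores.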
Santini, Simona; Boore, Jeffrey L.; Meyer, Axel
Due to their high degree of conservation, functional regions in noncoding DNA can be identified by comparing DNA sequences among evolutionarily distantly related genomes. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.
Wotton, Karl R; Weierud, Frida K; Juárez-Morales, José L; Alvares, Lúcia E; Dietrich, Susanne; Lewis, Katharine E
Nk homeobox genes are important regulators of many different developmental processes including muscle, heart, central nervous system and sensory organ development. They are thought to have arisen as part of the ANTP megacluster, which also gave rise to Hox and ParaHox genes, and at least some NK genes remain tightly linked in all animals examined so far. The protostome-deuterostome ancestor probably contained a cluster of nine Nk genes: (Msx)-(Nk4/tinman)-(Nk3/bagpipe)-(Lbx/ladybird)-(Tlx/c15)-(Nk7)-(Nk6/hgtx)-(Nk1/slouch)-(Nk5/Hmx). Of these genes, only NKX2.6-NKX3.1, LBX1-TLX1 and LBX2-TLX2 remain tightly linked in humans. However, it is currently unclear whether this is unique to the human genome as we do not know which of these Nk genes are clustered in other vertebrates. This makes it difficult to assess whether the remaining linkages are due to selective pressures or because chance rearrangements have "missed" certain genes. In this paper, we identify all of the paralogs of these ancestrally clustered NK genes in several distinct vertebrates. We demonstrate that tight linkages of Lbx1-Tlx1, Lbx2-Tlx2 and Nkx3.1-Nkx2.6 have been widely maintained in both the ray-finned and lobe-finned fish lineages. Moreover, the recently duplicated Hmx2-Hmx3 genes are also tightly linked. Finally, we show that Lbx1-Tlx1 and Hmx2-Hmx3 are flanked by highly conserved noncoding elements, suggesting that shared regulatory regions may have resulted in evolutionary pressure to maintain these linkages. Consistent with this, these pairs of genes have overlapping expression domains. In contrast, Lbx2-Tlx2 and Nkx3.1-Nkx2.6, which do not seem to be coexpressed, are also not associated with conserved noncoding sequences, suggesting that an alternative mechanism may be responsible for the continued clustering of these genes.
Schulz, Tizian; Stoye, Jens; Doerr, Daniel
Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. The model, called δ-teams with families, is particularly suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also a few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.
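As an illustration of the sequence-based δ-teams model that the abstract above generalizes, the following is a minimal sketch, not the GraphTeams implementation. It assumes each genome is given as a hypothetical mapping from gene name to an integer position, that every gene occurs at most once per genome (no families), and it uses the simple recursive-splitting idea: a candidate set is split wherever two consecutive genes lie more than δ apart in some genome, until no genome forces a split.

```python
from typing import Dict, List, Set


def split_by_gaps(genes: Set[str], positions: Dict[str, int], delta: int) -> List[Set[str]]:
    """Split genes into maximal runs whose consecutive positions differ by at most delta."""
    ordered = sorted(genes, key=positions.__getitem__)
    runs: List[Set[str]] = []
    current = [ordered[0]]
    for prev, gene in zip(ordered, ordered[1:]):
        if positions[gene] - positions[prev] <= delta:
            current.append(gene)
        else:
            runs.append(set(current))
            current = [gene]
    runs.append(set(current))
    return runs


def delta_teams(genomes: List[Dict[str, int]], delta: int) -> List[Set[str]]:
    """Return the maximal gene sets that form a delta-chain in every genome."""
    shared = set(genomes[0])
    for positions in genomes[1:]:
        shared &= set(positions)  # only genes present in all genomes can be in a team
    teams: List[Set[str]] = []
    stack: List[Set[str]] = [shared] if shared else []
    while stack:
        candidate = stack.pop()
        for positions in genomes:
            parts = split_by_gaps(candidate, positions, delta)
            if len(parts) > 1:
                stack.extend(parts)  # some genome breaks the candidate: refine it
                break
        else:
            teams.append(candidate)  # no genome splits it: it is a team
    return teams


# Hypothetical toy genomes (gene -> position).
genome_1 = {"a": 1, "b": 2, "c": 3, "d": 10, "e": 11}
genome_2 = {"a": 5, "b": 4, "c": 20, "d": 1, "e": 2}
teams = delta_teams([genome_1, genome_2], delta=2)
```

Singleton sets are reported as trivial teams here; a practical pipeline would probably filter them out.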
Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K
Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.
Despres, Jordane; Forano, Evelyne; Lepercq, Pascale; Comtet-Marre, Sophie; Jubelin, Gregory; Chambon, Christophe; Yeoman, Carl J; Berg Miller, Margaret E; Fields, Christopher J; Martens, Eric; Terrapon, Nicolas; Henrissat, Bernard; White, Bryan A; Mosoni, Pascale
Plant cell wall (PCW) polysaccharides and especially xylans constitute an important part of human diet. Xylans are not degraded by human digestive enzymes in the upper digestive tract and therefore reach the colon where they are subjected to extensive degradation by some members of the symbiotic microbiota. Xylanolytic bacteria are the first degraders of these complex polysaccharides and they release breakdown products that can have beneficial effects on human health. In order to understand better how these bacteria metabolize xylans in the colon, this study was undertaken to investigate xylan breakdown by the prominent human gut symbiont Bacteroides xylanisolvens XB1A(T). Transcriptomic analyses of B. xylanisolvens XB1A(T) grown on insoluble oat-spelt xylan (OSX) at mid- and late-log phases highlighted genes in a polysaccharide utilization locus (PUL), hereafter called PUL 43, and genes in a fragmentary remnant of another PUL, hereafter referred to as rPUL 70, which were highly overexpressed on OSX relative to glucose. Proteomic analyses supported the up-regulation of several genes belonging to PUL 43 and showed the important over-production of a CBM4-containing GH10 endo-xylanase. We also show that PUL 43 is organized in two operons and that the knockout of the PUL 43 sensor/regulator HTCS gene blocked the growth of the mutant on insoluble OSX and soluble wheat arabinoxylan (WAX). The mutation not only repressed gene expression in the PUL 43 operons but also repressed gene expression in rPUL 70. This study shows that xylan degradation by B. xylanisolvens XB1A(T) is orchestrated by one PUL and one PUL remnant that are linked at the transcriptional level. Coupled to studies on other xylanolytic Bacteroides species, our data emphasize the importance of one peculiar CBM4-containing GH10 endo-xylanase in xylan breakdown and that this modular enzyme may be used as a functional marker of xylan degradation in the human gut. Our results also suggest that B. xylanisolvens
Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa
Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
Simon, Eric J.
Describes the latest advancements and setbacks in human gene therapy to provide reference material for biology teachers to use in their science classes. Focuses on basic concepts such as recombinant DNA technology, and provides examples of human gene therapy such as severe combined immunodeficiency syndrome, familial hypercholesterolemia, and…
Ballouz, Sara; Francis, Andrew R.; Lan, Ruiting; Tanaka, Mark M.
Genes encoding proteins in a common pathway are often found near each other along bacterial chromosomes. Several explanations have been proposed to account for the evolution of these structures. For instance, natural selection may directly favour gene clusters through a variety of mechanisms, such as increased efficiency of coregulation. An alternative and controversial hypothesis is the selfish operon model, which asserts that clustered arrangements of genes are more easily transferred to other species, thus improving the prospects for survival of the cluster. According to another hypothesis (the persistence model), genes that are in close proximity are less likely to be disrupted by deletions. Here we develop computational models to study the conditions under which gene clusters can evolve and persist. First, we examine the selfish operon model by re-implementing the simulation and running it under a wide range of conditions. Second, we introduce and study a Moran process in which there is natural selection for gene clustering and rearrangement occurs by genome inversion events. Finally, we develop and study a model that includes selection and inversion, which tracks the occurrence and fixation of rearrangements. Surprisingly, gene clusters fail to evolve under a wide range of conditions. Factors that promote the evolution of gene clusters include a low number of genes in the pathway, a high population size, and in the case of the selfish operon model, a high horizontal transfer rate. The computational analysis here has shown that the evolution of gene clusters can occur under both direct and indirect selection as long as certain conditions hold. Under these conditions the selfish operon model is still viable as an explanation for the evolution of gene clusters. PMID:20168992
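To make the notion of a Moran process with selection more concrete, here is a deliberately simplified sketch. It is not the authors' model: it ignores inversions and genome structure entirely and assumes only two hypothetical genome types ("clustered" and "scattered"), with the clustered type given an assumed relative fitness advantage. In each step one individual is chosen to reproduce with probability proportional to fitness, and its offspring replaces a uniformly chosen individual, so the population size stays constant.

```python
import random
from typing import Dict, List, Tuple


def moran_step(population: List[str], fitness: Dict[str, float], rng: random.Random) -> None:
    """One birth-death event: fitness-weighted birth, uniformly random death."""
    weights = [fitness[kind] for kind in population]
    parent = rng.choices(population, weights=weights, k=1)[0]
    dead_index = rng.randrange(len(population))
    population[dead_index] = parent


def simulate_until_fixation(
    n_clustered: int, n_total: int, advantage: float, rng: random.Random
) -> Tuple[str, int]:
    """Run the process until one type has taken over; return (winner, steps)."""
    population = ["clustered"] * n_clustered + ["scattered"] * (n_total - n_clustered)
    fitness = {"clustered": 1.0 + advantage, "scattered": 1.0}  # assumed fitness values
    steps = 0
    while 0 < population.count("clustered") < n_total:
        moran_step(population, fitness, rng)
        steps += 1
    return population[0], steps


rng = random.Random(0)  # fixed seed so that runs are repeatable
winner, steps = simulate_until_fixation(n_clustered=5, n_total=10, advantage=0.5, rng=rng)
```

Repeating the simulation over many seeds and parameter values gives an estimate of how often the clustered type fixes, which is the kind of question the abstract above explores with a much richer model.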
Noar, Roslyn D; Daub, Margaret E
Mycosphaerella fijiensis, causal agent of black Sigatoka disease of banana, is a Dothideomycete fungus closely related to fungi that produce polyketides important for plant pathogenicity. We utilized the M. fijiensis genome sequence to predict PKS genes and their gene clusters and make bioinformatics predictions about the types of compounds produced by these clusters. Eight PKS gene clusters were identified in the M. fijiensis genome, placing M. fijiensis into the 23rd percentile for the number of PKS genes compared to other Dothideomycetes. Analysis of the PKS domains identified three of the PKS enzymes as non-reducing and two as highly reducing. Gene clusters contained types of genes frequently found in PKS clusters including genes encoding transporters, oxidoreductases, methyltransferases, and non-ribosomal peptide synthases. Phylogenetic analysis identified a putative PKS cluster encoding melanin biosynthesis. None of the other clusters were closely aligned with genes encoding known polyketides; however, three of the PKS genes fell into clades with clusters encoding alternapyrone, fumonisin, and solanapyrone produced by Alternaria and Fusarium species. A search for homologs among available genomic sequences from 103 Dothideomycetes identified close homologs (>80% similarity) for six of the PKS sequences. One of the PKS sequences was not similar (<60% similarity) to sequences in any of the 103 genomes, suggesting that it encodes a unique compound. Comparison of the M. fijiensis PKS sequences with those of two other banana pathogens, M. musicola and M. eumusae, showed that these two species have close homologs to five of the M. fijiensis PKS sequences, but three others were not found in either species. RT-PCR and RNA-Seq analysis showed that the melanin PKS cluster was down-regulated in infected banana as compared to growth in culture. Three other clusters, however, were strongly upregulated during disease development in banana, suggesting that they may encode
Reynolds, Hannah T; Slot, Jason C; Divon, Hege H; Lysøe, Erik; Proctor, Robert H; Brown, Daren W
In fungi, distribution of secondary metabolite (SM) gene clusters is often associated with host- or environment-specific benefits provided by SMs. In the plant pathogen Alternaria brassicicola (Dothideomycetes), the DEP cluster confers an ability to synthesize the SM depudecin, a histone deacetylase inhibitor that contributes weakly to virulence. The DEP cluster includes genes encoding enzymes, a transporter, and a transcription regulator. We investigated the distribution and evolution of the DEP cluster in 585 fungal genomes and found a wide but sporadic distribution among Dothideomycetes, Sordariomycetes, and Eurotiomycetes. We confirmed DEP gene expression and depudecin production in one fungus, Fusarium langsethiae. Phylogenetic analyses suggested 6-10 horizontal gene transfers (HGTs) of the cluster, including a transfer that led to the presence of closely related cluster homologs in Alternaria and Fusarium. The analyses also indicated that HGTs were frequently followed by loss/pseudogenization of one or more DEP genes. Independent cluster inactivation was inferred in at least four fungal classes. Analyses of transitions among functional, pseudogenized, and absent states of DEP genes among Fusarium species suggest enzyme-encoding genes are lost at higher rates than the transporter (DEP3) and regulatory (DEP6) genes. The phenotype of an experimentally-induced DEP3 mutant of Fusarium did not support the hypothesis that selective retention of DEP3 and DEP6 protects fungi from exogenous depudecin. Together, the results suggest that HGT and gene loss have contributed significantly to DEP cluster distribution, and that some DEP genes provide a greater fitness benefit possibly due to a differential tendency to form network connections. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution 2017. This work is written by US Government employees and is in the public domain in the US.
Olszewski Kellen L
Background: The availability of microarrays measuring thousands of genes simultaneously across hundreds of biological conditions represents an opportunity to understand both individual biological pathways and the integrated workings of the cell. However, translating this amount of data into biological insight remains a daunting task. An important initial step in the analysis of microarray data is clustering of genes with similar behavior. A number of classical techniques are commonly used to perform this task, particularly hierarchical and K-means clustering, and many novel approaches have been suggested recently. While these approaches are useful, they are not without drawbacks; these methods can find clusters in purely random data, and even clusters enriched for biological functions can be skewed towards a small number of processes (e.g. ribosomes). Results: We developed Nearest Neighbor Networks (NNN), a graph-based algorithm to generate clusters of genes with similar expression profiles. This method produces clusters based on overlapping cliques within an interaction network generated from mutual nearest neighborhoods. This focus on nearest neighbors rather than on absolute distance measures allows us to capture clusters with high connectivity even when they are spatially separated, and requiring mutual nearest neighbors allows genes with no sufficiently similar partners to remain unclustered. We compared the clusters generated by NNN with those generated by eight other clustering methods. NNN was particularly successful at generating functionally coherent clusters with high precision, and these clusters generally represented a much broader selection of biological processes than those recovered by other methods. Conclusion: The Nearest Neighbor Networks algorithm is a valuable clustering method that effectively groups genes that are likely to be functionally related. It is particularly attractive due to its simplicity, its success in the
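The mutual-nearest-neighbor idea in the abstract above can be sketched in a few lines. This is only the first stage (building the mutual neighborhood graph), not the NNN algorithm itself, which goes on to find overlapping cliques. Euclidean distance and the hypothetical parameter k are assumptions made for the illustration; the abstract does not state which similarity measure is used.

```python
import math
from typing import Dict, List, Sequence, Set


def k_nearest(profiles: List[Sequence[float]], k: int) -> Dict[int, Set[int]]:
    """For each profile index, return the indices of its k closest other profiles."""
    neighbours: Dict[int, Set[int]] = {}
    for i, profile in enumerate(profiles):
        others = [j for j in range(len(profiles)) if j != i]
        others.sort(key=lambda j: math.dist(profile, profiles[j]))
        neighbours[i] = set(others[:k])
    return neighbours


def mutual_nn_graph(profiles: List[Sequence[float]], k: int) -> Dict[int, Set[int]]:
    """Keep an edge i-j only if j is among i's neighbours AND i is among j's."""
    nn = k_nearest(profiles, k)
    return {i: {j for j in nn[i] if i in nn[j]} for i in nn}


# Hypothetical expression profiles: two tight groups and one outlier (index 5).
profiles = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)]
graph = mutual_nn_graph(profiles, k=2)
```

In this toy input the outlier ends up with no edges, because none of the profiles it selects as neighbours selects it back. That is the property the abstract describes as allowing genes without sufficiently similar partners to remain unclustered.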
Johnson, Timothy A; Stedtfeld, Robert D; Wang, Qiong; Cole, James R; Hashsham, Syed A; Looft, Torey; Zhu, Yong-Guan; Tiedje, James M
Antibiotic resistance is a worldwide health risk, but the influence of animal agriculture on the genetic context and enrichment of individual antibiotic resistance alleles remains unclear. Using quantitative PCR followed by amplicon sequencing, we quantified and sequenced 44 genes related to antibiotic resistance, mobile genetic elements, and bacterial phylogeny in microbiomes from U.S. laboratory swine and from swine farms from three Chinese regions. We identified highly abundant resistance clusters: groups of resistance and mobile genetic element alleles that cooccur. For example, the abundance of genes conferring resistance to six classes of antibiotics together with class 1 integrase and the abundance of IS6100-type transposons in three Chinese regions are directly correlated. These resistance cluster genes likely colocalize in microbial genomes in the farms. Resistance cluster alleles were dramatically enriched (up to 1 to 10% as abundant as 16S rRNA) and indicate that multidrug-resistant bacteria are likely the norm rather than an exception in these communities. This enrichment largely occurred independently of phylogenetic composition; thus, resistance clusters are likely present in many bacterial taxa. Furthermore, resistance clusters contain resistance genes that confer resistance to antibiotics independently of their particular use on the farms. Selection for these clusters is likely due to the use of only a subset of the broad range of chemicals to which the clusters confer resistance. The scale of animal agriculture and its wastes, the enrichment and horizontal gene transfer potential of the clusters, and the vicinity of large human populations suggest that managing this resistance reservoir is important for minimizing human risk. Agricultural antibiotic use results in clusters of cooccurring resistance genes that together confer resistance to multiple antibiotics. The use of a single antibiotic could select for an entire suite of resistance genes if
Edberg Jeffrey C
Background: Copy number variations (CNVs) of the gene CC chemokine ligand 3-like1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings: Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. By use of MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for only CCL3L1. The four available assays measured median copies of 2 and 3-4 in European and African Americans, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting individual discordant calls and inconsistencies with the assays from the expected gene coverage from the known sequence. Conclusions: This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine CNV of the three genes separately would make it possible to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.
Fungi that have the enzymes cyanase and carbonic anhydrase show a limited capacity to detoxify cyanate, a fungicide employed by both plants and humans. Here, we describe a novel two-gene cluster that comprises duplicated cyanase and carbonic anhydrase copies, which we name the CCA gene cluster, trac...
Bailey, Andy M.; Alberti, Fabrizio; Kilaru, Sreedhar; Collins, Catherine M.; de Mattos-Shipley, Kate; Hartley, Amanda J.; Hayes, Patrick; Griffin, Alison; Lazarus, Colin M.; Cox, Russell J.; Willis, Christine L.; O'Dwyer, Karen; Spence, David W.; Foster, Gary D.
Semi-synthetic derivatives of the tricyclic diterpene antibiotic pleuromutilin from the basidiomycete Clitopilus passeckerianus are important in combatting bacterial infections in human and veterinary medicine. These compounds belong to the only new class of antibiotics for human applications, with novel mode of action and lack of cross-resistance, representing a class with great potential. Basidiomycete fungi, being dikaryotic, are not generally amenable to strain improvement. We report identification of the seven-gene pleuromutilin gene cluster and verify that using various targeted approaches aimed at increasing antibiotic production in C. passeckerianus, no improvement in yield was achieved. The seven-gene pleuromutilin cluster was reconstructed within Aspergillus oryzae giving production of pleuromutilin in an ascomycete, with a significant increase (2106%) in production. This is the first gene cluster from a basidiomycete to be successfully expressed in an ascomycete, and paves the way for the exploitation of a metabolically rich but traditionally overlooked group of fungi.
Pezzani, Lidia; Milani, Donatella; Manzoni, Francesca; Baccarin, Marco; Silipigni, Rosamaria; Guerneri, Silvana; Esposito, Susanna
Background: The HOXA gene cluster plays a fundamental role in embryologic development. Deletion of the entire cluster is known to cause a clinically recognizable syndrome with mild developmental delay, characteristic facies, small feet with unusually short and big halluces, abnormal thumbs, and urogenital malformations. The clinical manifestations may vary with different ranges of deletions of the HOXA cluster and flanking regions. Case presentation: We report a girl with the smallest deletion reporte...
Medema, Marnix H; Kottmann, Renzo; Yilmaz, Pelin; Cummings, Matthew; Biggins, John B; Blin, Kai; de Bruijn, Irene; Chooi, Yit Heng; Claesen, Jan; Coates, R Cameron; Cruz-Morales, Pablo; Duddela, Srikanth; Dusterhus, Stephanie; Edwards, Daniel J; Fewer, David P; Garg, Neha; Geiger, Christoph; Gomez-Escribano, Juan Pablo; Greule, Anja; Hadjithomas, Michalis; Haines, Anthony S; Helfrich, Eric J N; Hillwig, Matthew L; Ishida, Keishi; Jones, Adam C; Jones, Carla S; Jungmann, Katrin; Kegler, Carsten; Kim, Hyun Uk; Kotter, Peter; Krug, Daniel; Masschelein, Joleen; Melnik, Alexey V; Mantovani, Simone M; Monroe, Emily A; Moore, Marcus; Moss, Nathan; Nutzmann, Hans-Wilhelm; Pan, Guohui; Pati, Amrita; Petras, Daniel; Reen, F Jerry; Rosconi, Federico; Rui, Zhe; Tian, Zhenhua; Tobias, Nicholas J; Tsunematsu, Yuta; Wiemann, Philipp; Wyckoff, Elizabeth; Yan, Xiaohui; Yim, Grace; Yu, Fengan; Xie, Yunchang; Aigle, Bertrand; Apel, Alexander K; Balibar, Carl J; Balskus, Emily P; Barona-Gomez, Francisco; Bechthold, Andreas; Bode, Helge B; Borriss, Rainer; Brady, Sean F; Brakhage, Axel A; Caffrey, Patrick; Cheng, Yi-Qiang; Clardy, Jon; Cox, Russell J; De Mot, Rene; Donadio, Stefano; Donia, Mohamed S; van der Donk, Wilfred A; Dorrestein, Pieter C; Doyle, Sean; Driessen, Arnold J M; Ehling-Schulz, Monika; Entian, Karl-Dieter; Fischbach, Michael A; Gerwick, Lena; Gerwick, William H; Gross, Harald; Gust, Bertolt; Hertweck, Christian; Hofte, Monica; Jensen, Susan E; Ju, Jianhua; Katz, Leonard; Kaysser, Leonard; Klassen, Jonathan L; Keller, Nancy P; Kormanec, Jan; Kuipers, Oscar P; Kuzuyama, Tomohisa; Kyrpides, Nikos C; Kwon, Hyung-Jin; Lautru, Sylvie; Lavigne, Rob; Lee, Chia Y; Linquan, Bai; Liu, Xinyu; Liu, Wen; Luzhetskyy, Andriy; Mahmud, Taifo; Mast, Yvonne; Mendez, Carmen; Metsa-Ketela, Mikko; Micklefield, Jason; Mitchell, Douglas A; Moore, Bradley S; Moreira, Leonilde M; Muller, Rolf; Neilan, Brett A; Nett, Markus; Nielsen, Jens; O'Gara, Fergal; Oikawa, Hideaki; Osbourn, Anne; Osburne, Marcia S; Ostash, Bohdan; Payne, Shelley M; Pernodet, Jean-Luc; Petricek, Miroslav; Piel, Jorn; Ploux, Olivier; Raaijmakers, Jos M; Salas, Jose A; Schmitt, Esther K; Scott, Barry; Seipke, Ryan F; Shen, Ben; Sherman, David H; Sivonen, Kaarina; Smanski, Michael J; Sosio, Margherita; Stegmann, Evi; Sussmuth, Roderich D; Tahlan, Kapil; Thomas, Christopher M; Tang, Yi; Truman, Andrew W; Viaud, Muriel; Walton, Jonathan D; Walsh, Christopher T; Weber, Tilmann; van Wezel, Gilles P; Wilkinson, Barrie; Willey, Joanne M; Wohlleben, Wolfgang; Wright, Gerard D; Ziemert, Nadine; Zhang, Changsheng; Zotchev, Sergey B; Breitling, Rainer; Takano, Eriko; Glockner, Frank Oliver
A wide variety of enzymatic pathways that produce specialized metabolites in bacteria, fungi and plants are known to be encoded in biosynthetic gene clusters. Information about these clusters, pathways and metabolites is currently dispersed throughout the literature, making it difficult to exploit.
Koh, Esther G. L.; Lam, Kevin; Christoffels, Alan; Erdmann, Mark V.; Brenner, Sydney; Venkatesh, Byrappa
The Hox genes encode transcription factors that play a key role in specifying body plans of metazoans. They are organized into clusters that contain up to 13 paralogue group members. The complex morphology of vertebrates has been attributed to the duplication of Hox clusters during vertebrate evolution. In contrast to the single Hox cluster in the amphioxus (Branchiostoma floridae), an invertebrate-chordate, mammals have four clusters containing 39 Hox genes. Ray-finned fishes (Actinopterygii) such as zebrafish and fugu possess more than four Hox clusters. The coelacanth occupies a basal phylogenetic position among lobe-finned fishes (Sarcopterygii), which gave rise to the tetrapod lineage. The lobe fins of sarcopterygians are considered to be the evolutionary precursors of tetrapod limbs. Thus, the characterization of Hox genes in the coelacanth should provide insights into the origin of tetrapod limbs. We have cloned the complete second exon of 33 Hox genes from the Indonesian coelacanth, Latimeria menadoensis, by extensive PCR survey and genome walking. Phylogenetic analysis shows that 32 of these genes have orthologs in the four mammalian HOX clusters, including three genes (HoxA6, D1, and D8) that are absent in ray-finned fishes. The remaining coelacanth gene is an ortholog of hoxc1 found in zebrafish but absent in mammals. Our results suggest that coelacanths have four Hox clusters bearing a gene complement more similar to mammals than to ray-finned fishes, but with an additional gene, HoxC1, which has been lost during the evolution of mammals from lobe-finned fishes. PMID:12547909
Cooper James B
Background: Clustering the information content of large high-dimensional gene expression datasets has widespread application in "omics" biology. Unfortunately, the underlying structure of these natural datasets is often fuzzy, and the computational identification of data clusters generally requires knowledge about cluster number and geometry. Results: We integrated strategies from machine learning, cartography, and graph theory into a new informatics method for automatically clustering self-organizing map ensembles of high-dimensional data. Our new method, called AutoSOME, readily identifies discrete and fuzzy data clusters without prior knowledge of cluster number or structure in diverse datasets including whole genome microarray data. Visualization of AutoSOME output using network diagrams and differential heat maps reveals unexpected variation among well-characterized cancer cell lines. Co-expression analysis of data from human embryonic and induced pluripotent stem cells using AutoSOME identifies >3400 up-regulated genes associated with pluripotency, and indicates that a recently identified protein-protein interaction network characterizing pluripotency was underestimated by a factor of four. Conclusions: By effectively extracting important information from high-dimensional microarray data without prior knowledge or the need for data filtration, AutoSOME can yield systems-level insights from whole genome microarray expression studies. Due to its generality, this new method should also have practical utility for a variety of data-intensive applications, including the results of deep sequencing experiments. AutoSOME is available for download at http://jimcooperlab.mcdb.ucsb.edu/autosome.
Background: Ensemble attribute profile clustering is a novel, text-based strategy for analyzing a user-defined list of genes and/or proteins. The strategy exploits annotation data present in gene-centered corpora and utilizes ideas from statistical information retrieval to discover and characterize properties shared by subsets of the list. The practical utility of this method is demonstrated by employing it in a retrospective study of two non-overlapping sets of genes defined by a published investigation as markers for normal human breast luminal epithelial cells and myoepithelial cells. Results: Each genetic locus was characterized using a finite set of biological properties and represented as a vector of features indicating attributes associated with the locus (a gene attribute profile). In this study, the vector space models for a pre-defined list of genes were constructed from the Gene Ontology (GO) terms and the Conserved Domain Database (CDD) protein domain terms assigned to the loci by the gene-centered corpus LocusLink. This data set of GO- and CDD-based gene attribute profiles, vectors of binary random variables, was used to estimate multiple finite mixture models, and each ensuing model was utilized to partition the profiles into clusters. The resultant partitionings were combined using a unanimous voting scheme to produce consensus clusters, sets of profiles that co-occurred consistently in the same cluster. Attributes that were important in defining the genes assigned to a consensus cluster were identified. The clusters and their attributes were inspected to ascertain the GO and CDD terms most associated with subsets of genes and, in conjunction with external knowledge such as chromosomal location, used to gain functional insights into human breast biology. The 52 luminal epithelial cell markers and 89 myoepithelial cell markers are disjoint sets of genes. Ensemble attribute profile clustering-based analysis indicated that both lists
In the genome of the biotrophic plant pathogen Ustilago maydis, many of the genes coding for secreted protein effectors modulating virulence are arranged in gene clusters. The vast majority of these genes encode novel proteins whose expression is coupled to plant colonization. The largest of these gene clusters, cluster 19A, encodes 24 secreted effectors. Deletion of the entire cluster results in severe attenuation of virulence. Here we present the functional analysis of this genomic region. We show that a 19A deletion mutant behaves like an endophyte, i.e. is still able to colonize plants and complete the infection cycle. However, tumors, the most conspicuous symptoms of maize smut disease, are only rarely formed and fungal biomass in infected tissue is significantly reduced. The generation and analysis of strains carrying sub-deletions identified several genes significantly contributing to tumor formation after seedling infection. Another of the effectors could be linked specifically to anthocyanin induction in the infected tissue. As the individual contributions of these genes to tumor formation were small, we studied the response of maize plants to the whole cluster mutant as well as to several individual mutants by array analysis. This revealed distinct plant responses, demonstrating that the respective effectors have discrete plant targets. We propose that the analysis of plant responses to effector mutant strains that lack a strong virulence phenotype may be a general way to visualize differences in effector function.
Background: Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. Results: We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Conclusion: Our R-based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram, genes sharing a GO term can be identified, and their differences in gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
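The idea of grouping genes by the semantic similarity of their ontology annotations can be illustrated with a small sketch. It is not the package described in the abstract: the toy ontology, the gene names, the Jaccard-on-ancestors similarity, the best-match average and the single-linkage threshold are all assumptions chosen for brevity, and Python is used here rather than R.

```python
from itertools import combinations

# Hypothetical toy ontology: term -> set of parent terms (not real GO IDs).
PARENTS = {
    "root": set(),
    "metabolism": {"root"},
    "signaling": {"root"},
    "lipid_metabolism": {"metabolism"},
    "sugar_metabolism": {"metabolism"},
    "kinase_signaling": {"signaling"},
}


def ancestors(term):
    """Return the term together with all of its ancestors in the toy DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def term_similarity(a, b):
    """Jaccard overlap of ancestor sets (an assumed stand-in measure)."""
    anc_a, anc_b = ancestors(a), ancestors(b)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def gene_similarity(terms_a, terms_b):
    """Best-match average of term similarities between two annotated genes."""
    best_a = [max(term_similarity(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(term_similarity(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


def cluster_genes(annotations, threshold):
    """Single-linkage grouping: join genes whose similarity >= threshold."""
    clusters = {gene: {gene} for gene in annotations}
    for g1, g2 in combinations(sorted(annotations), 2):
        if gene_similarity(annotations[g1], annotations[g2]) >= threshold:
            merged = clusters[g1] | clusters[g2]
            for gene in merged:
                clusters[gene] = merged
    return sorted({tuple(sorted(c)) for c in clusters.values()})


# Hypothetical annotations for three made-up genes.
ANNOTATIONS = {
    "geneA": ["lipid_metabolism"],
    "geneB": ["sugar_metabolism"],
    "geneC": ["kinase_signaling"],
}

if __name__ == "__main__":
    print(cluster_genes(ANNOTATIONS, threshold=0.4))
```

In this toy setting the two metabolism genes share two of four ancestor terms and fall into one group, while the signaling gene stays apart. A published tool would more likely use an information-content measure and a full hierarchical clustering, so the sketch only conveys the shape of the computation.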
Dai, Zhimin; Guo, Xue; Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan
Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community.
Yin, Huaqun; Liang, Yili; Cong, Jing; Liu, Xueduan
Biological nitrogen fixation is an essential function of acid mine drainage (AMD) microbial communities. However, most acidophiles in AMD environments are uncultured microorganisms and little is known about the diversity of nitrogen-fixing genes and structure of nif gene cluster in AMD microbial communities. In this study, we used metagenomic sequencing to isolate nif genes in the AMD microbial community from Dexing Copper Mine, China. Meanwhile, a metagenome microarray containing 7,776 large-insertion fosmids was constructed to screen novel nif gene clusters. Metagenomic analyses revealed that 742 sequences were identified as nif genes including structural subunit genes nifH, nifD, nifK and various additional genes. The AMD community is massively dominated by the genus Acidithiobacillus. However, the phylogenetic diversity of nitrogen-fixing microorganisms is much higher than previously thought in the AMD community. Furthermore, a 32.5-kb genomic sequence harboring nif, fix and associated genes was screened by metagenome microarray. Comparative genome analysis indicated that most nif genes in this cluster are most similar to those of Herbaspirillum seropedicae, but the organization of the nif gene cluster had significant differences from H. seropedicae. Sequence analysis and reverse transcription PCR also suggested that distinct transcription units of nif genes exist in this gene cluster. nifQ gene falls into the same transcription unit with fixABCX genes, which have not been reported in other diazotrophs before. All of these results indicated that more novel diazotrophs survive in the AMD community. PMID:24498417
Pi, Borui; Yu, Dongliang; Dai, Fangwei; Song, Xiaoming; Zhu, Congyi; Li, Hongye; Yu, Yunsong
Secondary metabolites (SMs) produced by Aspergillus have been extensively studied for their crucial roles in human health, medicine and industrial production. However, the resulting information is almost exclusively derived from a few model organisms, including A. nidulans and A. fumigatus, and little is known about rare pathogens. In this study, we performed a genomics-based discovery of SM biosynthetic gene clusters in Aspergillus ustus, a rare human pathogen. A total of 52 gene clusters were identified in the draft genome of A. ustus 3.3904, including the sterigmatocystin biosynthesis pathway that is commonly found in Aspergillus species. In addition, several SM biosynthetic gene clusters were identified in Aspergillus for the first time and were possibly acquired by horizontal gene transfer, including the vrt cluster that is responsible for viridicatumtoxin production. Comparative genomics revealed that A. ustus shared the largest number of SM biosynthetic gene clusters with A. nidulans, but far fewer with other Aspergilli such as A. niger and A. oryzae. These findings would help to understand the diversity and evolution of SM biosynthesis pathways in the genus Aspergillus, and we hope they will also promote the development of fungal identification methodology in the clinic. PMID:25706180
Bloom, Mark V.; Cutter, Mary Ann; Davidson, Ronald; Dougherty, Michael J.; Drexler, Edward; Gelernter, Joel; McCullough, Laurence B.; McInerney, Joseph D.; Murray, Jeffrey C.; Vogler, George P.; Zola, John
This curriculum module explores genes, environment, and human behavior. This book provides materials to teach about the nature and methods of studying human behavior, raise some of the ethical and public policy dilemmas emerging from the Human Genome Project, and provide professional development for teachers. An extensive Teacher Background…
Egan, Muireann; Jiang, Hao; O'Connell Motherway, Mary; Oscarson, Stefan; van Sinderen, Douwe
Bifidobacteria constitute a specific group of commensal bacteria typically found in the gastrointestinal tract (GIT) of humans and other mammals. Bifidobacterium breve strains are numerically prevalent among the gut microbiota of many healthy breastfed infants. In the present study, we investigated glycosulfatase activity in a bacterial isolate from a nursling stool sample, B. breve UCC2003. Two putative sulfatases were identified on the genome of B. breve UCC2003. The sulfated monosaccharide N-acetylglucosamine-6-sulfate (GlcNAc-6-S) was shown to support the growth of B. breve UCC2003, while N-acetylglucosamine-3-sulfate, N-acetylgalactosamine-3-sulfate, and N-acetylgalactosamine-6-sulfate did not support appreciable growth. By using a combination of transcriptomic and functional genomic approaches, a gene cluster designated ats2 was shown to be specifically required for GlcNAc-6-S metabolism. Transcription of the ats2 cluster is regulated by a repressor open reading frame kinase (ROK) family transcriptional repressor. This study represents the first description of glycosulfatase activity within the Bifidobacterium genus. Bifidobacteria are saccharolytic organisms naturally found in the digestive tract of mammals and insects. Bifidobacterium breve strains utilize a variety of plant- and host-derived carbohydrates that allow them to be present as prominent members of the infant gut microbiota as well as being present in the gastrointestinal tract of adults. In this study, we introduce a previously unexplored area of carbohydrate metabolism in bifidobacteria, namely, the metabolism of sulfated carbohydrates. B. breve UCC2003 was shown to metabolize N-acetylglucosamine-6-sulfate (GlcNAc-6-S) through one of two sulfatase-encoding gene clusters identified on its genome. GlcNAc-6-S can be found in terminal or branched positions of mucin oligosaccharides, the glycoprotein component of the mucous layer that covers the digestive tract. The results of this study provide
Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing
Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect disease severity or to provide expression profiles of gene sets for individuals. We developed an application called IGSA that combines gene set enrichment with sample clustering. IGSA calculates gene set expression scores for each sample and uses an accumulating clustering strategy that lets samples gather into sets according to the progression of disease from mild to severe. We evaluated the performance of IGSA on gastric, pancreatic and ovarian cancer data sets. We also compared the results of IGSA in KEGG pathway enrichment with those of DAVID, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering under different similarity measurement methods. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate changes in pathways related to the severity of disease. In addition, IGSA provides a profile of significant gene sets for each sample.
Kaushik, Mahima; Kukreti, Shrikant
Our previous work on structural polymorphism shown at a single nucleotide polymorphism (SNP) (A → G) site located on HS4 region of locus control region (LCR) of β-globin gene has established a hairpin → duplex equilibrium corresponding to A → B like DNA transition (Kaushik M, Kukreti, R., Grover, D., Brahmachari, S.K. and Kukreti S. Nucleic Acids Res. 2003; Kaushik M, Kukreti S. Nucleic Acids Res. 2006). The G-allele of A → G SNP has been shown to be significantly associated with the occurrence of β-thalassemia. Considering the significance of this 11-nt long quasi-palindromic sequence [5'-TGGGG(G/A)CCCCA; HP(G/A)11] of β-globin gene LCR, we further explored the differential behavior of the same DNA sequence with its RNA counterpart, using various biophysical and biochemical techniques. In contrast to its DNA counterpart exhibiting a A → B structural transition and an equilibrium between duplex and hairpin forms, the studied RNA oligonucleotide sequence [5'-UGGGG(G/A)CCCCA; RHP(G/A)11] existed only in duplex form (A-conformation) and did not form hairpin. The single residue difference from A to G led to the unusual thermal stability of the RNA structure formed by the studied sequence. Since, naturally occurring mutations and various SNP sites may stabilize or destabilize the local DNA/RNA secondary structures, these structural transitions may affect the gene expression by a change in the protein-DNA recognition patterns.
Lorenz, N.; Haarmann, T.; Pažoutová, Sylvie; Jung, M.; Tudzynski, P.
Vol. 70, No. 15-16 (2009), pp. 1822-1832 ISSN 0031-9422 Institutional research plan: CEZ:AV0Z50200510 Keywords: Claviceps purpurea * Ergot fungus * Ergot alkaloid gene cluster Subject RIV: EE - Microbiology, Virology Impact factor: 3.104, year: 2009
Adelson David L
Background: A key open question in biology is whether genes are physically clustered with respect to their known functions or phenotypic effects. This is of particular interest for Quantitative Trait Loci (QTL), where a QTL region could contain a number of genes that contribute to the trait being measured. Results: We observed a significant increase in gene density within QTL regions compared to non-QTL regions and/or the entire bovine genome. By grouping QTL from the Bovine QTL Viewer database into 8 categories of non-redundant regions, we have been able to analyze gene density and gene function distribution, based on Gene Ontology (GO), in relation to their location within QTL regions, outside of QTL regions and across the entire bovine genome. We identified a number of GO terms that were significantly over-represented within particular QTL categories. Furthermore, select GO terms expected to be associated with the QTL category based on common biological knowledge have also proved to be significantly over-represented in QTL regions. Conclusion: Our analysis provides evidence of over-represented GO terms in QTL regions. This increased GO term density indicates possible clustering of gene functions within QTL regions of the bovine genome. Genes with similar functions may be grouped in specific locales and could be contributing to QTL traits. Moreover, we have identified over-represented GO terminology that, from a biological standpoint, makes sense with respect to QTL category type.
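A common way to test whether a GO term is over-represented among the genes of a region is an upper-tail hypergeometric test, alongside a plain genes-per-megabase density. The sketch below shows both under stated assumptions: the abstract does not say which test the authors used, and the function names and all counts are hypothetical.

```python
from math import comb


def gene_density(n_genes, length_mb):
    """Genes per megabase for a region of the given length."""
    return n_genes / length_mb


def hypergeom_upper_tail(k, n_region, n_term, n_total):
    """P(X >= k) for X = number of term-annotated genes among n_region genes
    drawn without replacement from n_total genes, n_term of which carry
    the term."""
    upper = min(n_region, n_term)
    favourable = sum(
        comb(n_term, i) * comb(n_total - n_term, n_region - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(n_total, n_region)


if __name__ == "__main__":
    # Hypothetical counts: 40 of 200 QTL-region genes carry a term that
    # annotates 300 of 10000 genes genome-wide.
    p_value = hypergeom_upper_tail(k=40, n_region=200, n_term=300, n_total=10000)
    print(f"density in region: {gene_density(200, 25.0):.1f} genes/Mb")
    print(f"over-representation p-value: {p_value:.3g}")
```

When many GO terms and several QTL categories are tested, the resulting p-values would normally be corrected for multiple testing before any term is called over-represented.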
Gardiner Donald M
Background: Genes responsible for biosynthesis of fungal secondary metabolites are usually tightly clustered in the genome and co-regulated with metabolite production. Epipolythiodioxopiperazines (ETPs) are a class of secondary metabolite toxins produced by disparate ascomycete fungi and implicated in several animal and plant diseases. Gene clusters responsible for their production have previously been defined in only two fungi. Fungal genome sequence data have been surveyed for the presence of putative ETP clusters and cluster data have been generated from several fungal taxa where genome sequences are not available. Phylogenetic analysis of cluster genes has been used to investigate the assembly and heredity of these gene clusters. Results: Putative ETP gene clusters are present in 14 ascomycete taxa, but absent in numerous other ascomycetes examined. These clusters are discontinuously distributed in ascomycete lineages. Gene content is not absolutely fixed; however, common genes are identified and phylogenies of six of these are separately inferred. In each phylogeny almost all cluster genes form monophyletic clades with non-cluster fungal paralogues being the nearest outgroups. This relatedness of cluster genes suggests that a progenitor ETP gene cluster assembled within an ancestral taxon. Within each of the cluster clades, the cluster genes group together in consistent subclades; however, these relationships do not always reflect the phylogeny of ascomycetes. Micro-synteny of several of the genes within the clusters provides further support for these subclades. Conclusion: ETP gene clusters appear to have a single origin and have been inherited relatively intact rather than assembling independently in the different ascomycete lineages. This progenitor cluster has given rise to a small number of distinct phylogenetic classes of clusters that are represented in a discontinuous pattern throughout ascomycetes. The disjunct heredity of
Krystal, M; D'Eustachio, P; Ruddle, F H; Arnheim, N
The distributions of three human ribosomal gene polymorphisms among individual chromosomes containing nucleolus organizers were analyzed by using mouse-human hybrid cells. Different nucleolus organizers can contain the same variant, suggesting the occurrence of genetic exchanges among ribosomal gene clusters on nonhomologous chromosomes. Such exchanges appear to occur less frequently in mice. This difference is discussed in terms of the nucleolar organization and chromosomal location of ribosomal gene clusters in humans and mice. PMID:6272316
The egcSEs comprise five genetically linked staphylococcal enterotoxins, SEG, SEI, SElM, SElN and SElO, and two pseudotoxins, which constitute an operon present in up to 80% of Staphylococcus aureus isolates. A preparation containing these proteins was recently used to treat advanced lung cancer with pleural effusion. We investigated the hypothesis that egcSEs induce nitric oxide (NO) and associated cytokine production and that these agents may be involved in tumoricidal effects against a broad panel of clinically relevant human tumor cells. Preliminary studies showed that egcSEs and SEA activated T cells (range: 11-25%) in a concentration-dependent manner. Peripheral blood mononuclear cells (PBMCs) stimulated with equimolar quantities of egcSEs expressed NO synthase and generated robust levels of nitrite (range: 200-250 µM), a breakdown product of NO; this reaction was inhibited by NG-monomethyl-L-arginine (L-NMMA) (0.3 mM), an NO synthase antagonist. Cell-free supernatants (CFSs) of all egcSE-stimulated PBMCs were also equally effective in inducing concentration-dependent tumor cell apoptosis in a broad panel of human tumor cells. The latter effect was due in part to the generation of NO and TNF-α, since it was significantly abolished by L-NMMA and anti-TNF-α antibodies, respectively, and by a combination thereof. The hierarchy of tumor cell sensitivity to these CFSs was as follows: lung carcinoma > osteogenic sarcoma > melanoma > breast carcinoma > neuroblastoma. Notably, SEG induced robust activation of NO/TNF-α-dependent tumor cell apoptosis comparable to the other egcSEs and SEA, despite TNF-α and IFN-γ levels that were 2- and 8-fold lower, respectively, than those of the other egcSEs and SEA. Thus, egcSEs produced by S. aureus induce NO synthase, and the increased NO formation together with TNF-α appears to contribute to egcSE-mediated apoptosis against a broad panel of human tumor cells.
Abreu, G C G; Pinheiro, A; Drummond, R D
DNA array data without a corresponding statistical error measure. We propose an easy-to-implement and simple-to-use technique that uses bootstrap re-sampling to evaluate the statistical error of the nodes provided by SOM-based clustering. Comparisons between SOM and parametric clustering are presented...... for simulated as well as for two real data sets. We also implement a bootstrap-based pre-processing procedure for SOM, that improves the false discovery ratio of differentially expressed genes. Code in Matlab is freely available, as well as some supplementary material, at the following address: https...
Jakobek Judy L
Background: The biosynthesis of aflatoxin (AF) involves over 20 enzymatic reactions in a complex polyketide pathway that converts acetate and malonate to the intermediates sterigmatocystin (ST) and O-methylsterigmatocystin (OMST), the respective penultimate and ultimate precursors of AF. Although these precursors are chemically and structurally very similar, their accumulation differs at the species level for Aspergilli. Notable examples are A. nidulans that synthesizes only ST, A. flavus that makes predominantly AF, and A. parasiticus that generally produces either AF or OMST. Whether these differences are important in the evolutionary/ecological processes of species adaptation and diversification is unknown. Equally unknown are the specific genomic mechanisms responsible for ordering and clustering of genes in the AF pathway of Aspergillus. Results: To elucidate the mechanisms that have driven formation of these clusters, we performed systematic searches of aflatoxin cluster homologs across five Aspergillus genomes. We found a high level of gene duplication and identified seven modules consisting of highly correlated gene pairs (aflA/aflB, aflR/aflS, aflX/aflY, aflF/aflE, aflT/aflQ, aflC/aflW, and aflG/aflL). With the exception of A. nomius, contrasts of mean Ka/Ks values across all cluster genes showed significant differences in selective pressure between section Flavi and non-section Flavi species. A. nomius mean Ka/Ks values were more similar to partial clusters in A. fumigatus and A. terreus. Overall, mean Ka/Ks values were significantly higher for section Flavi than for non-section Flavi species. Conclusion: Our results implicate several genomic mechanisms in the evolution of ST, OMST and AF cluster genes. Gene modules may arise from duplications of a single gene, whereby the function of the pre-duplication gene is retained in the copy (aflF/aflE) or the copies may partition the ancestral function (aflA/aflB). In some gene modules, the
Jeggo, P.A.; Carr, A.M.; Lehmann, A.R.
Many human genes involved in the repair of UV damage have been cloned using different procedures and they have been of great value in assisting the understanding of the mechanism of nucleotide excision-repair. Genes involved in repair of ionizing radiation damage have proved more difficult to isolate. Positional cloning has localized the XRCC5 gene to a small region of chromosome 2q33-35, and a series of yeast artificial chromosomes covering this region have been isolated. Very recent work has shown that the XRCC5 gene encodes the 80 kDa subunit of the Ku DNA-binding protein. The Ku80 gene also maps to this region. Studies with fission yeast have shown that radiation sensitivity can result not only from defective DNA repair but also from abnormal cell cycle control following DNA damage. Several genes involved in this 'check-point' control in fission yeast have been isolated and characterized in detail. It is likely that a similar checkpoint control mechanism exists in human cells. (author)
One of the main challenges in modern medicine is to stratify patient groups in terms of underlying molecular disease mechanisms so as to develop a more personalized approach to therapy. Here we propose a novel method for disease subtyping based on analysis of activated expression regulators on a sample-by-sample basis. Our approach relies on the Sub-Network Enrichment Analysis (SNEA) algorithm, which identifies gene subnetworks with significant concordant changes in expression between two conditions. Each subnetwork consists of a central regulator and downstream genes connected by relations extracted from a global literature-derived regulation database. Regulators found in each patient separately are clustered together and assigned activity scores, which are used for the final grouping of patients. We show that our approach performs well compared to other related methods and at the same time provides researchers with a complementary level of understanding of the pathway-level biology behind a disease through the identification of significant expression regulators. We observed a reasonable grouping of neuromuscular disorders (triggered by structural damage vs. triggered by unknown mechanisms) that was not revealed using standard expression profile clustering. In another experiment we were able to suggest clusters of regulators responsible for discriminating colorectal carcinoma from adenoma, and to identify frequently genetically altered regulators that could be of specific importance for the individual characteristics of cancer development. The proposed approach can be regarded as biologically meaningful feature selection, reducing tens of thousands of genes down to dozens of clusters of regulators. The resulting clusters of regulators make it possible to generate valuable biological hypotheses about molecular mechanisms related to the clinical outcome of an individual patient.
Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia
Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.
Our previous study demonstrated that the human KIAA0100 gene is a novel acute monocytic leukemia-associated antigen (MLAA) gene. However, the human KIAA0100 gene has not been functionally characterized to date. Here, first, bioinformatic prediction for the human KIAA0100 gene was carried out using online software; second, human KIAA0100 gene expression was downregulated by the clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) 9 system in U937 cells. Cell proliferation and apoptosis were then evaluated in KIAA0100-knockdown U937 cells. The bioinformatic prediction showed that the human KIAA0100 gene is located on 17q11.2, and that the human KIAA0100 protein is located in the secretory pathway. In addition, the human KIAA0100 protein contains a signal peptide, a transmembrane region, three types of secondary structure (alpha helix, extended strand, and random coil), and four domains from mitochondrial protein 27 (FMP27). Functional characterization of the human KIAA0100 gene revealed that its downregulation inhibited cell proliferation and promoted cell apoptosis in U937 cells. To summarize, these results suggest that the human KIAA0100 gene possibly comes within the mitochondrial genome; moreover, it is a novel anti-apoptotic factor related to carcinogenesis or progression in acute monocytic leukemia, and may be a potential target for immunotherapy against acute monocytic leukemia.
Plant pathogenic fungi in the Fusarium genus cause severe damage to crops, resulting in great financial losses and health hazards. Specialized metabolites synthesized by these fungi are known to play key roles in the infection process, and to provide survival advantages inside and outside the host. However, systematic studies of the evolution of specialized metabolite-coding potential across Fusarium have been scarce. Here, we apply a combination of bioinformatic approaches to identify biosynthetic gene clusters (BGCs) across publicly available genomes from Fusarium, to group them into annotated families and to study gain/loss events of BGC families throughout the history of the genus. Comparison with MIBiG reference BGCs allowed assignment of 29 gene cluster families (GCFs) to pathways responsible for the production of known compounds, while for 57 GCFs, the molecular products remain unknown. Comparative analysis of BGC repertoires using ancestral state reconstruction raised several new hypotheses on how BGCs contribute to Fusarium pathogenicity or host specificity, sometimes surprisingly so: for example, a gene cluster for the biosynthesis of hexadehydro-astechrome was identified in the genome of the biocontrol strain Fusarium oxysporum Fo47, while being absent in that of the tomato pathogen F. oxysporum f.sp. lycopersici. Several BGCs were also identified on supernumerary chromosomes; heterologous expression of genes for three terpene synthases encoded on the Fusarium poae supernumerary chromosome and subsequent GC/MS analysis showed that these genes are functional and encode enzymes that each are able to synthesize koraiol; this observed functional redundancy supports the hypothesis that localization of copies of BGCs on supernumerary chromosomes provides freedom for evolutionary innovations to occur, while the original function remains conserved. Altogether, this systematic overview of biosynthetic diversity in Fusarium paves the way for
Kjærbølling, Inge; Vesth, Tammi Camilla; Frisvad, Jens Christian
Secondary metabolite gene cluster evolution is mainly driven by two events: gene duplication and annexation and horizontal gene transfer. Here we use comparative genomics of Aspergillus species to investigate the evolution of secondary metabolite (SM) gene clusters across a wide spectrum of species. We investigate the dynamic evolutionary relationship between the cluster and the host by examining the genes within the cluster and the number of homologous genes found within the host and in closely related species.
Li, Sheng; Li, Peng; Fu, Yun
Discovering the variations in human torso shape plays a key role in many design-oriented applications, such as suit design. With recent advances in 3D surface imaging technologies, people can obtain 3D human torso data that provide more information than traditional measurements. However, how to find different human shapes from 3D torso data is still an open problem. In this paper, we propose to use a spectral clustering approach on the torso manifold to address this problem. We first represent high-dimensional torso data in a low-dimensional space using a manifold learning algorithm. Then the spectral clustering method is performed to obtain several disjoint clusters. Experimental results show that the clusters discovered by our approach can describe the discrepancies in both genders and human shapes, and our approach achieves better performance than the compared clustering method.
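The spectral clustering step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the manifold learning stage is omitted, the Gaussian similarity with width `sigma`, the two-way split, and the function name `spectral_bipartition` are all assumptions made for illustration. The sketch builds a similarity graph, forms the graph Laplacian L = D - W, approximates its second-smallest eigenvector (the Fiedler vector) by power iteration, and splits the points by sign.

```python
import math

def spectral_bipartition(points, sigma=1.0, n_iter=200):
    """Split points into two groups using the sign of the Fiedler vector.

    Hypothetical helper for illustration; assumes a Gaussian similarity graph.
    """
    n = len(points)
    # Pairwise Gaussian similarities (no self-loops).
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    deg = [sum(row) for row in w]
    # All eigenvalues of L = D - W are at most 2 * max degree, so the matrix
    # c*I - L is positive semidefinite and power iteration on it favors the
    # smallest eigenvalues of L.
    c = 2 * max(deg)
    # Arbitrary deterministic start vector (an assumption of this sketch).
    v = [float(i) for i in range(n)]
    for _ in range(n_iter):
        # Remove the constant component (eigenvalue 0 of L).
        mean = sum(v) / n
        v = [x - mean for x in v]
        # Apply (c*I - L) v = c*v - D*v + W*v.
        v = [(c - deg[i]) * v[i] + sum(w[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    mean = sum(v) / n
    v = [x - mean for x in v]
    return [0 if x < 0 else 1 for x in v]

# Toy data: two well-separated groups of four 2D points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels = spectral_bipartition(points)
```

For more than two clusters, the usual approach is to take several of the smallest Laplacian eigenvectors and run k-means on their rows; that extension is left out here to keep the sketch short.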
Lee Bernett TK
Abstract Background Genes are not randomly distributed on a chromosome, as they were once thought to be, even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and has recently been reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.
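The comparison of the observed number of clusters with the number expected by chance can be sketched as a permutation test. This is only an illustration under stated assumptions: a cluster is taken to be a run of at least two tissue-specific genes that are adjacent in gene order, the null model shuffles the tissue-specific labels along one chromosome, and the function names and toy data are hypothetical rather than taken from the study.

```python
import random

def count_clusters(flags, min_size=2):
    """Count runs of at least `min_size` consecutive flagged genes."""
    clusters, run = 0, 0
    for flagged in flags:
        if flagged:
            run += 1
        else:
            if run >= min_size:
                clusters += 1
            run = 0
    if run >= min_size:  # a run may end at the chromosome end
        clusters += 1
    return clusters

def permutation_p_value(flags, n_perm=1000, seed=0):
    """Empirical p-value for seeing at least the observed number of clusters."""
    rng = random.Random(seed)
    observed = count_clusters(flags)
    shuffled = list(flags)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random placement of the same number of genes
        if count_clusters(shuffled) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (at_least + 1) / (n_perm + 1)

# Toy chromosome: 200 genes in order, 20 flagged as tissue-specific,
# deliberately arranged as 10 adjacent pairs.
flags = [False] * 200
for start in range(0, 200, 20):
    flags[start] = flags[start + 1] = True

observed, p_value = permutation_p_value(flags)
```

A real analysis would also need to handle tandem duplicates before counting, as the abstract notes, and would run over every chromosome rather than a single toy one.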
To further understand the potential expression relationships of miRNAs in miRNA gene clusters and gene families, a global analysis was performed in 4 paired tumor (breast cancer) and adjacent normal tissue samples using deep sequencing datasets. The compositions of miRNA gene clusters and families are not random, and clustered and homologous miRNAs may have close relationships with overlapped miRNA species. Members in the miRNA group always had various expression levels, and some even showed larger expression divergence. Despite the dynamic expression as well as individual differences, these miRNAs always indicated consistent or similar deregulation patterns. The consistent deregulation of expression may contribute to dynamic and coordinated interaction between different miRNAs in the regulatory network. Further, we found that those clustered or homologous miRNAs that were also identified as sense and antisense miRNAs showed larger expression divergence. miRNA gene clusters and families indicated important biological roles, and the specific distribution and expression further enrich and ensure the flexible and robust regulatory network.
Zeng, Lin; Martino, Nicole C.
Streptococcus gordonii is an early colonizer of the human oral cavity and an abundant constituent of oral biofilms. Two tandemly arranged gene clusters, designated lac and gal, were identified in the S. gordonii DL1 genome, which encode genes of the tagatose pathway (lacABCD) and sugar phosphotransferase system (PTS) enzyme II permeases. Genes encoding a predicted phospho-β-galactosidase (LacG), a DeoR family transcriptional regulator (LacR), and a transcriptional antiterminator (LacT) were also present in the clusters. Growth and PTS assays supported that the permease designated EIILac transports lactose and galactose, whereas EIIGal transports galactose. The expression of the gene for EIIGal was markedly upregulated in cells growing on galactose. Using promoter-cat fusions, a role for LacR in the regulation of the expressions of both gene clusters was demonstrated, and the gal cluster was also shown to be sensitive to repression by CcpA. The deletion of lacT caused an inability to grow on lactose, apparently because of its role in the regulation of the expression of the genes for EIILac, but had little effect on galactose utilization. S. gordonii maintained a selective advantage over Streptococcus mutans in a mixed-species competition assay, associated with its possession of a high-affinity galactose PTS, although S. mutans could persist better at low pHs. Collectively, these results support the concept that the galactose and lactose systems of S. gordonii are subject to complex regulation and that a high-affinity galactose PTS may be advantageous when S. gordonii is competing against the caries pathogen S. mutans in oral biofilms. PMID:22660715
Background A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
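The idea that a cause must precede its consequence can be made concrete with a small pairwise Granger test. The sketch below is not the set-based method of the study: it assumes a single lag, two series, centered data with no intercept, and ordinary least squares solved by hand; `granger_f` and the simulated series are hypothetical. It compares how well the past of `effect` alone predicts its present against a model that also uses the past of `cause`.

```python
import random

def granger_f(cause, effect):
    """F statistic for lag-1 Granger causality from `cause` to `effect`."""
    mean_c = sum(cause) / len(cause)
    mean_e = sum(effect) / len(effect)
    x = [v - mean_c for v in cause]
    y = [v - mean_e for v in effect]
    target, y_lag, x_lag = y[1:], y[:-1], x[:-1]
    n = len(target)

    syy = sum(v * v for v in y_lag)
    sxx = sum(v * v for v in x_lag)
    sxy = sum(u * v for u, v in zip(x_lag, y_lag))
    syt = sum(u * v for u, v in zip(y_lag, target))
    sxt = sum(u * v for u, v in zip(x_lag, target))

    # Restricted model: target = a * y_lag.
    a_r = syt / syy
    rss_r = sum((t - a_r * u) ** 2 for t, u in zip(target, y_lag))

    # Unrestricted model: target = a * y_lag + b * x_lag
    # (2x2 normal equations solved with Cramer's rule).
    det = syy * sxx - sxy * sxy
    a_u = (syt * sxx - sxt * sxy) / det
    b_u = (sxt * syy - syt * sxy) / det
    rss_u = sum((t - a_u * u - b_u * v) ** 2
                for t, u, v in zip(target, y_lag, x_lag))

    # One extra parameter in the unrestricted model, n - 2 residual d.o.f.
    return (rss_r - rss_u) / (rss_u / (n - 2))

# Simulated pair: y follows x with a one-step delay plus small noise.
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]

f_x_to_y = granger_f(x, y)
f_y_to_x = granger_f(y, x)
```

In this toy setup the statistic from x to y should be far larger than the one from y to x, which is the asymmetry a Granger-based clustering would exploit when building directed links between genes.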
DeBernardi, M.A.; Crowe, R.R.; Mocchetti, I.; Shows, T.B.; Eddy, R.L.; Costa, E.
The authors have used in situ chromosome hybridization and human-mouse somatic cell hybrids to map the gene(s) for human diazepam binding inhibitor (DBI), an endogenous putative modulator of the γ-aminobutyric acid receptor acting at the allosteric regulatory center of this receptor that includes the benzodiazepine recognition site. In 784 chromosome spreads hybridized with human DBI cDNA, the distribution of 1,476 labeled sites revealed a significant clustering of autoradiographic grains (11.3% of total label) on the long arm of chromosome 2 (2q). Furthermore, 63.5% of the grains found on 2q were located on 2q12-21, suggesting regional mapping of DBI gene(s) to this segment. Secondary hybridization signals were frequently observed on other chromosomes and they were statistically significant mainly for chromosomes 5, 6, 11, and 14. In addition, DNA from 32 human-mouse cell hybrids was digested with BamHI and probed with human DBI cDNA. A 3.5-kilobase band, which probably represents the human DBI gene, was assigned to chromosome 2. Four higher molecular weight bands, also detected in BamHI digests, could not be unequivocally assigned. A chromosome 2 location was excluded for the 27-, 13-, and 10-kilobase bands. These results assign a human DBI gene to chromosome 2 (2q12-21) and indicate that three of the four homologous sequences detected by the human DBI probe are located on three other chromosomes
Retinoic acid (RA) can induce growth arrest and neuronal differentiation of neuroblastoma cells and has been used in the clinic for treatment of neuroblastoma. It has been reported that RA induces the expression of several HOXD genes in human neuroblastoma cell lines, but their roles in RA action are largely unknown. The HOXD cluster contains nine genes (HOXD1, HOXD3, HOXD4, and HOXD8-13) that are positioned sequentially from 3' to 5', with HOXD1 at the 3' end and HOXD13 at the 5' end. Here we show that all HOXD genes are induced by RA in human neuroblastoma BE(2)-C cells, with the genes located at the 3' end being activated generally earlier than those positioned more 5' within the cluster. Individual induction of HOXD8, HOXD9, HOXD10 or HOXD12 is sufficient to induce both growth arrest and neuronal differentiation, which is associated with downregulation of cell cycle-promoting genes and upregulation of neuronal differentiation genes. However, induction of other HOXD genes either has no effect (HOXD1) or has partial effects (HOXD3, HOXD4, HOXD11 and HOXD13) on BE(2)-C cell proliferation or differentiation. We further show that knockdown of HOXD8 expression, but not that of HOXD9 expression, significantly inhibits the differentiation-inducing activity of RA. HOXD8 directly activates the transcription of HOXC9, a key effector of RA action in neuroblastoma cells. These findings highlight the distinct functions of HOXD genes in RA induction of neuroblastoma cell differentiation.
Abstract Background Searching for optima is one of the most challenging tasks in clustering genes from available experimental data or given functions. SA, GA, PSO and other similarly efficient global optimization methods are used by biotechnologists. All these algorithms are based on the imitation of natural phenomena. Results This paper proposes a novel search optimization algorithm called the Gravitation Field Algorithm (GFA), which is derived from the well-known astronomical theory of planetary formation, the Solar Nebular Disk Model (SNDM). GFA simulates the gravitation field and outperforms GA and SA on some multimodal function optimization problems. GFA can also be applied to unimodal functions. GFA clusters the dataset from the Gene Expression Omnibus well. Conclusions The mathematical proof demonstrates that GFA converges to the global optimum with probability 1 under three conditions for mass functions of one independent variable. In addition to these results, the fundamental optimization concept in this paper is used to analyze how SA and GA affect the global search and the inherent defects in SA and GA. Some results and source code (in Matlab) are publicly available at http://ccst.jlu.edu.cn/CSBG/GFA.
Rechtsteiner, A. (Andreas); Rocha, L. M. (Luis Mateus)
associated with MeSH terms. MeSH terms can be associated with genes through co-occurrence of these in MEDLINE citations, i.e. the genes occur in titles or abstracts and the MeSH terms are assigned by experts. To identify MeSH terms associated with a group of genes we used the tool MESHGENE developed at the Information Dynamics Lab at HP Labs (http://www-idl.hpl.hp.com/meshgene/). When presented with a list of human genes, MESHGENE uses some sophisticated techniques to search for these gene symbols in the titles and abstracts of all MEDLINE citations. MeSH terms and the number of co-occurrences can be retrieved. Gene symbols that are aliases of each other are pooled from several databases. This addresses the problem of synonymy, the fact that several symbols can refer to the same gene. MESHGENE employs some sophisticated algorithms that disregard symbols that are likely to be acronyms for concepts other than a gene. This addresses the problem of polysemy, i.e. possible multiple meanings of a gene symbol. We applied our approach to gene expression data from herpes virus infected human fibroblast cells. The data contain 12 time-points, between 1/2 hrs and 48 hrs after infection. Singular Value Decomposition was used to identify the dominant modes of expression. 75% of the variance in the expression data was captured by the first two modes, the first exhibiting a monotonically increasing expression pattern and the second a more transient pattern. Projection of the gene expression vectors onto these first two modes identified 3 statistically significant clusters of co-expressed genes. 500 genes from cluster 1 and 300 genes from clusters 2 and 3 each were uploaded to MESHGENE and the MeSH terms and co-occurrence values were retrieved. MeSH terms were also obtained for 5 groups of randomly selected genes with similar numbers of genes.
The log was taken of the co-occurrence values and for each MeSH term these log co-occurrence values were summed for each group over the genes in that
Zhang, Zhang; Liu, Jingxing; Wu, Jiayan; Yu, Jun
The regulation of gene expression is essential for eukaryotes, as it drives the processes of cellular differentiation and morphogenesis, leading to the creation of different cell types in multicellular organisms. RNA-Sequencing (RNA-Seq) provides researchers with a powerful toolbox for characterization and quantification of the transcriptome. Many different human tissue/cell transcriptome datasets coming from RNA-Seq technology are available in public data resources. The fundamental issue here is how to develop an effective analysis method to estimate expression pattern similarities between different tumor tissues and their corresponding normal tissues. We define the gene expression pattern from three directions: 1) expression breadth, which reflects gene expression on/off status, and mainly concerns ubiquitously expressed genes; 2) low/high or constant/variable expression genes, based on gene expression level and variation; and 3) the regulation of gene expression at the gene structure level. The cluster analysis indicates that gene expression pattern is more closely related to physiological condition than to tissue spatial distance. Two sets of human housekeeping (HK) genes are defined according to cell/tissue types, respectively. To characterize the gene expression pattern in gene expression level and variation, we first apply an improved K-means algorithm and a gene expression variance model. We find that cancer-associated HK genes (HK genes that are specific to the cancer group but not to the normal group) are expressed at higher and more variable levels in the cancer condition than in the normal condition. Cancer-associated HK genes tend to be AT-rich genes, and they are enriched in cell cycle regulation related functions and constitute some cancer signatures. The expression of large genes is also avoided in the cancer group. These studies will help us understand which cell type-specific patterns of gene expression differ among different cell types, and particularly for cancer. PMID:23382867
Shaer, Anahita; Azarpira, Negar; Vahdati, Akbar; Karimi, Mohammad Hosein; Shariati, Mehrdad
In diabetes mellitus type 1, beta cells are mostly destroyed; while in diabetes mellitus type 2, beta cells are reduced by 40% to 60%. We hope that soon, stem cells can be used in diabetes therapy via pancreatic beta cell replacement. Induced pluripotent stem cells are a kind of stem cell taken from an adult somatic cell by "stimulating" certain genes. These induced pluripotent stem cells may be a promising source of cell therapy. This study sought to produce isletlike clusters of insulin-producing cells taken from induced pluripotent stem cells. A human-induced pluripotent stem cell line was induced into isletlike clusters via a 4-step protocol, by adding insulin, transferrin, and selenium (ITS), N2, B27, fibroblast growth factor, and nicotinamide. During differentiation, expression of pancreatic β-cell genes was evaluated by reverse transcriptase-polymerase chain reaction; the morphologic changes of induced pluripotent stem cells toward isletlike clusters were observed by a light microscope. Dithizone staining was used to stain these isletlike clusters. Insulin produced by these clusters was evaluated by radio immunosorbent assay, and the secretion capacity was analyzed with a glucose challenge test. Differentiation was evaluated by analyzing the morphology, dithizone staining, real-time quantitative polymerase chain reaction, and immunocytochemistry. Gene expression of insulin, glucagon, PDX1, NGN3, PAX4, PAX6, NKX6.1, KIR6.2, and GLUT2 were documented by analyzing real-time quantitative polymerase chain reaction. Dithizone-stained cellular clusters were observed after 23 days. The isletlike clusters significantly produced insulin. The isletlike clusters could increase insulin secretion after a glucose challenge test. This work provides a model for studying the differentiation of human-induced pluripotent stem cells to insulin-producing cells.
One of the most intriguing dynamics in biological systems is the emergence of clustering, in the sense that individuals self-organize into separate agglomerations in physical or behavioral space. Several theories have been developed to explain clustering in, for instance, multi-cellular organisms, ant colonies, bee hives, flocks of birds, schools of fish, and animal herds. A persistent puzzle, however, is the clustering of opinions in human populations, particularly when opinions vary continuously, such as the degree to which citizens are in favor of or against a vaccination program. Existing continuous opinion formation models predict "monoculture" in the long run, unless subsets of the population are perfectly separated from each other. Yet, social diversity is a robust empirical phenomenon, although perfect separation is hardly possible in an increasingly connected world. Considering randomness has not overcome the theoretical shortcomings so far. Small perturbations of individual opinions trigger social influence cascades that inevitably lead to monoculture, while larger noise disrupts opinion clusters and results in rampant individualism without any social structure. Our solution to the puzzle builds on recent empirical research, combining the integrative tendencies of social influence with the disintegrative effects of individualization. A key element of the new computational model is an adaptive kind of noise. We conduct computer simulation experiments demonstrating that with this kind of noise a third phase besides individualism and monoculture becomes possible, characterized by the formation of metastable clusters with diversity between and consensus within clusters. When clusters are small, individualization tendencies are too weak to prohibit a fusion of clusters. When clusters grow too large, however, individualization increases in strength, which promotes their splitting. In summary, the new model can explain cultural clustering in
The patenting of human genes has been the subject of debate for decades. While China has gradually come to play an important role in the global genomics-based testing and treatment market, little is known about Chinese scholars' perspectives on patent protection for human genes. A content analysis of academic literature was conducted to identify Chinese scholars' concerns regarding gene patents, including benefits and risks of patenting human genes, attitudes that researchers hold towards gene patenting, and any legal and policy recommendations offered for the gene patent regime in China. 57.2% of articles were written by law professors, but scholars from health sciences, liberal arts, and ethics also participated in discussions on gene patent issues. While discussions of benefits and risks were relatively balanced in the articles, 63.5% of the articles favored gene patenting in general and, of the articles (n = 41) that explored gene patents in the Chinese context, 90.2% supported patent protections for human genes in China. The patentability of human genes was discussed in 33 articles, and 75.8% of these articles reached the conclusion that human genes are patentable. Chinese scholars view the patent regime as an important legal tool to protect the interests of inventors and inventions as well as the genetic resources of China. As such, many scholars support a gene patent system in China. These attitudes towards gene patents remain unchanged following the court ruling in the Myriad case in 2013, but arguments have been raised about the scope of gene patents, in particular that the increasing numbers of gene patents may negatively impact public health in China.
Abstract Background Microcystins are small cyclic heptapeptide toxins produced by a range of distantly related cyanobacteria. Microcystins are synthesized on large NRPS-PKS enzyme complexes. Many structural variants of microcystins are produced simultaneously. A recombination event between the first module of mcyB (mcyB1) and mcyC in the microcystin synthetase gene cluster is linked to the simultaneous production of microcystin variants in strains of the genus Microcystis. Results Here we undertook a phylogenetic study to investigate the order and timing of recombination between the mcyB1 and mcyC genes in a diverse selection of microcystin producing cyanobacteria. Our results provide support for complex evolutionary processes taking place at the mcyB1 and mcyC adenylation domains which recognize and activate the amino acids found at X and Z positions. We find evidence for recent recombination between mcyB1 and mcyC in strains of the genera Anabaena, Microcystis, and Hapalosiphon. We also find clear evidence for independent adenylation domain conversion of mcyB1 by unrelated peptide synthetase modules in strains of the genera Nostoc and Microcystis. The recombination events replace only the adenylation domain in each case and the condensation domains of mcyB1 and mcyC are not transferred together with the adenylation domain. Our findings demonstrate that the mcyB1 and mcyC adenylation domains are recombination hotspots in the microcystin synthetase gene cluster. Conclusion Recombination is thought to be one of the main mechanisms driving the diversification of NRPSs. However, there is very little information on how recombination takes place in nature. This study demonstrates that functional peptide synthetases are created in nature through transfer of adenylation domains without the concomitant transfer of condensation domains.
Sekigami, Yuka; Kobayashi, Takuya; Omi, Ai; Nishitsuji, Koki; Ikuta, Tetsuro; Fujiyama, Asao; Satoh, Noriyuki; Saiga, Hidetoshi
Hox gene clusters with at least 13 paralog group (PG) members are common in vertebrate genomes and in that of amphioxus. Ascidians, which belong to the subphylum Tunicata (Urochordata), are phylogenetically positioned between vertebrates and amphioxus, and traditionally divided into two groups: the Pleurogona and the Enterogona. An enterogonan ascidian, Ciona intestinalis (Ci), possesses nine Hox genes localized on two chromosomes; thus, the Hox gene cluster is disintegrated. We investigated the Hox gene cluster of a pleurogonan ascidian, Halocynthia roretzi (Hr), to investigate whether Hox gene cluster disintegration is common among ascidians, and if so, how such disintegration occurred during ascidian or tunicate evolution. Our phylogenetic analysis reveals that the Hr Hox gene complement comprises nine members, including one with a relatively divergent Hox homeodomain sequence. Eight of nine Hr Hox genes were orthologous to Ci-Hox1, 2, 3, 4, 5, 10, 12 and 13. Following the phylogenetic classification into 13 PGs, we designated Hr Hox genes as Hox1, 2, 3, 4, 5, 10, 11/12/13.a, 11/12/13.b and HoxX. To address the chromosomal arrangement of the nine Hox genes, we performed two-color chromosomal fluorescent in situ hybridization, which revealed that the nine Hox genes are localized on a single chromosome in Hr, distinct from their arrangement in Ci. We further examined the order of the nine Hox genes on the chromosome by chromosome/scaffold walking. This analysis suggested a gene order of Hox1, 11/12/13.b, 11/12/13.a, 10, 5, X, followed by either Hox4, 3, 2 or Hox2, 3, 4 on the chromosome. Based on the present results and those previously reported in Ci, we discuss the establishment of the Hox gene complement and disintegration of Hox gene clusters during the course of ascidian or tunicate evolution. The Hox gene cluster and the genome must have experienced extensive reorganization during the course of evolution from the ancestral tunicate to Hr and Ci.
After the radiation of eukaryotes, the NUO operon, controlling the transcription of the NADH dehydrogenase complex of the oxidative phosphorylation system (OXPHOS complex I), was broken down and genes encoding this protein complex were dispersed across the nuclear genome. Seven genes, however, were retained in the genome of the mitochondrion, the ancient symbiont of eukaryotes. This division, in combination with the three-fold increase in subunit number from bacteria (N = approximately 14) to man (N = 45), renders the transcriptional regulation of OXPHOS complex I a challenge. Recently, bioinformatics analysis of the promoter regions of all OXPHOS genes in mammals supported patterns of co-regulation, suggesting that natural selection favored a mechanism facilitating the transcriptional regulatory control of genes encoding subunits of these large protein complexes. Here, using real time PCR of mitochondrial DNA (mtDNA)- and nuclear DNA (nDNA)-encoded transcripts in a panel of 13 different human tissues, we show that the expression pattern of OXPHOS complex I genes is regulated in several clusters. Firstly, all mtDNA-encoded complex I subunits (N = 7) share a similar expression pattern, distinct from all tested nDNA-encoded subunits (N = 10). Secondly, two sub-clusters of nDNA-encoded transcripts with significantly different expression patterns were observed. Thirdly, the expression patterns of two nDNA-encoded genes, NDUFA4 and NDUFA5, notably diverged from the rest of the nDNA-encoded subunits, suggesting a certain degree of tissue specificity. Finally, the expression pattern of the mtDNA-encoded ND4L gene diverged from the rest of the tested mtDNA-encoded transcripts that are regulated by the same promoter, consistent with post-transcriptional regulation. These findings suggest, for the first time, that the regulation of complex I subunits expression in humans is complex rather than reflecting global co-regulation.
Oncogenic transformation of normal cells often involves epigenetic alterations, including histone modification and DNA methylation. We conducted whole-genome bisulfite sequencing to determine the DNA methylomes of normal breast, fibroadenoma, invasive ductal carcinomas and MCF7. The emergence, disappearance, expansion and contraction of kilobase-sized hypomethylated regions (HMRs) and the hypomethylation of the megabase-sized partially methylated domains (PMDs) are the major forms of methylation changes observed in breast tumor samples. Hierarchical clustering of HMRs revealed tumor-specific hypermethylated clusters and differentially methylated enhancers specific to normal or breast cancer cell lines. Joint analysis of gene expression and DNA methylation data of normal breast and breast cancer cells identified differentially methylated and expressed genes associated with breast and/or ovarian cancers in cancer-specific HMR clusters. Furthermore, aberrant patterns of X-chromosome inactivation (XCI) were found in breast cancer cell lines as well as breast tumor samples in the TCGA BRCA (breast invasive carcinoma) dataset. They were characterized by a differentially hypermethylated XIST promoter, reduced expression of XIST, and over-expression of hypomethylated X-linked genes. High expression of these genes was significantly associated with lower survival rates in breast cancer patients. Comprehensive analysis of the normal and breast tumor methylomes suggests selective targeting of DNA methylation changes during breast cancer progression. The weak causal relationship between DNA methylation and gene expression observed in this study is evidence of a more complex role of DNA methylation in the regulation of gene expression in human epigenetics that deserves further investigation.
McDowell, Ian C; Manandhar, Dinesh; Vockley, Christopher M; Schmid, Amy K; Reddy, Timothy E; Engelhardt, Barbara E
Transcriptome-wide time series expression profiling is used to characterize the cellular response to environmental perturbations. The first step to analyzing transcriptional response data is often to cluster genes with similar responses. Here, we present a nonparametric model-based method, Dirichlet process Gaussian process mixture model (DPGP), which jointly models data clusters with a Dirichlet process and temporal dependencies with Gaussian processes. We demonstrate the accuracy of DPGP in comparison to state-of-the-art approaches using hundreds of simulated data sets. To further test our method, we apply DPGP to published microarray data from a microbial model organism exposed to stress and to novel RNA-seq data from a human cell line exposed to the glucocorticoid dexamethasone. We validate our clusters by examining local transcription factor binding and histone modifications. Our results demonstrate that jointly modeling cluster number and temporal dependencies can reveal shared regulatory mechanisms. DPGP software is freely available online at https://github.com/PrincetonUniversity/DP_GP_cluster.
Background: The radiation bystander effect is an important component of the overall biological response of tissues and organisms to ionizing radiation, but the signaling mechanisms between irradiated and non-irradiated bystander cells are not fully understood. In this study, we measured a time-series of gene expression after α-particle irradiation and applied the Feature Based Partitioning around medoids Algorithm (FBPA), a new clustering method suitable for sparse time series, to identify signaling modules that act in concert in the response to direct irradiation and bystander signaling. We compared our results with those of an alternate clustering method, Short Time series Expression Miner (STEM). Results: While computational evaluations of both clustering results were similar, FBPA provided more biological insight. After irradiation, gene clusters were enriched for signal transduction, cell cycle/cell death and inflammation/immunity processes; but only FBPA separated clusters by function. In bystanders, gene clusters were enriched for cell communication/motility, signal transduction and inflammation processes; but biological functions did not separate as clearly with either clustering method as they did in irradiated samples. Network analysis confirmed p53 and NF-κB transcription factor-regulated gene clusters in irradiated and bystander cells and suggested novel regulators, such as KDM5B/JARID1B (lysine (K)-specific demethylase 5B) and HDACs (histone deacetylases), which could epigenetically coordinate gene expression after irradiation. Conclusions: In this study, we have shown that a new time series clustering method, FBPA, can provide new leads to the mechanisms regulating the dynamic cellular response to radiation. The findings implicate epigenetic control of gene expression in addition to transcription factor networks.
Lovell, Peter V; Wirthlin, Morgan; Wilhelm, Larry; Minx, Patrick; Lazar, Nathan H; Carbone, Lucia; Warren, Wesley C; Mello, Claudio V
Birds are one of the most highly successful and diverse groups of vertebrates, having evolved a number of distinct characteristics, including feathers and wings, a sturdy lightweight skeleton and unique respiratory and urinary/excretion systems. However, the genetic basis of these traits is poorly understood. Using comparative genomics based on extensive searches of 60 avian genomes, we have found that birds lack approximately 274 protein coding genes that are present in the genomes of most vertebrate lineages and are for the most part organized in conserved syntenic clusters in non-avian sauropsids and in humans. These genes are located in regions associated with chromosomal rearrangements, and are largely present in crocodiles, suggesting that their loss occurred subsequent to the split of dinosaurs/birds from crocodilians. Many of these genes are associated with lethality in rodents, human genetic disorders, or biological functions targeting various tissues. Functional enrichment analysis combined with orthogroup analysis and paralog searches revealed enrichments that were shared by non-avian species, present only in birds, or shared between all species. Together these results provide a clearer definition of the genetic background of extant birds, extend the findings of previous studies on missing avian genes, and provide clues about molecular events that shaped avian evolution. They also have implications for fields that largely benefit from avian studies, including development, immune system, oncogenesis, and brain function and cognition. With regards to the missing genes, birds can be considered ‘natural knockouts’ that may become invaluable model organisms for several human diseases.
Baltussen, Tim J H; Coolen, Jordy P M; Zoll, Jan; Verweij, Paul E; Melchers, Willem J G
Aspergillus fumigatus is a saprophytic fungus that extensively produces conidia. These microscopic asexually reproductive structures are small enough to reach the lungs. Germination of conidia followed by hyphal growth inside human lungs is a key step in the establishment of infection in immunocompromised patients. RNA-Seq was used to analyze the transcriptome of dormant and germinating A. fumigatus conidia. Construction of a gene co-expression network revealed four gene clusters (modules) correlated with a growth phase (dormant, isotropic growth, polarized growth). Transcript levels of genes encoding secondary metabolites were high in dormant conidia. During isotropic growth, transcript levels of genes involved in cell wall modifications increased. Two modules encoding growth and cell cycle/DNA processing were associated with polarized growth. In addition, the co-expression network was used to identify highly connected intermodular hub genes. These genes may have a pivotal role in the respective module and could therefore be compelling therapeutic targets. Generally, cell wall remodeling is an important process during isotropic and polarized growth, characterized by an increase of transcripts coding for hyphal growth and cell cycle/DNA processing when polarized growth is initiated. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.
Emily J. Parker
The indole-diterpene paxilline is an abundant secondary metabolite synthesized by Penicillium paxilli. In total, 21 genes have been identified at the PAX locus of which six have been previously confirmed to have a functional role in paxilline biosynthesis. A combination of bioinformatics, gene expression and targeted gene replacement analyses was used to define the boundaries of the PAX gene cluster. Targeted gene replacement identified seven genes, paxG, paxA, paxM, paxB, paxC, paxP and paxQ, that were all required for paxilline production, with one additional gene, paxD, required for regular prenylation of the indole ring post paxilline synthesis. The two putative transcription factors, PP104 and PP105, were not co-regulated with the pax genes and, based on targeted gene replacement, including the double knockout, did not have a role in paxilline production. The indole dimethylallyl transferases involved in prenylation of indole-diterpenes such as paxilline or lolitrem B fall into two disparate clades that are not supported by prenylation type (e.g., regular or reverse). This paper provides insight into the P. paxilli indole-diterpene locus and reviews the recent advances identified in paxilline biosynthesis.
Gautier, Aude; Le Gac, Florence; Lareyre, Jean-Jacques
display a different cellular localization compared to that of the gsdf gene indicating that the latter gene is not co-regulated. Interestingly, our study identifies new clustered genes that are specifically expressed in previtellogenic oocytes (nup54, aff1, klhl8, sdad1). Copyright © 2010 Elsevier B.V. All rights reserved.
Pyeon, Hye-Rim; Nah, Hee-Ju; Kang, Seung-Hoon; Choi, Si-Sun; Kim, Eung-Soo
Heterologous expression of biosynthetic gene clusters of natural microbial products has become an essential strategy for titer improvement and pathway engineering of various potentially-valuable natural products. A Streptomyces artificial chromosomal conjugation vector, pSBAC, was previously successfully applied for precise cloning and tandem integration of a large polyketide tautomycetin (TMC) biosynthetic gene cluster (Nah et al. in Microb Cell Fact 14(1):1, 2015), implying that this strategy could be employed to develop a custom overexpression scheme of natural product pathway clusters present in actinomycetes. To validate the pSBAC system as a generally-applicable heterologous overexpression system for a large-sized polyketide biosynthetic gene cluster in Streptomyces, another model polyketide compound, the pikromycin biosynthetic gene cluster, was precisely cloned and heterologously expressed using the pSBAC system. A unique HindIII restriction site was precisely inserted at one of the border regions of the pikromycin biosynthetic gene cluster within the chromosome of Streptomyces venezuelae, followed by site-specific recombination of pSBAC into the flanking region of the pikromycin gene cluster. Unlike the previous cloning process, one HindIII site integration step was skipped through pSBAC modification. pPik001, a pSBAC containing the pikromycin biosynthetic gene cluster, was directly introduced into two heterologous hosts, Streptomyces lividans and Streptomyces coelicolor, resulting in the production of 10-deoxymethynolide, a major pikromycin derivative. When two entire pikromycin biosynthetic gene clusters were tandemly introduced into the S. lividans chromosome, overproduction of 10-deoxymethynolide and the presence of pikromycin, which was previously not detected, were both confirmed. Moreover, comparative qRT-PCR results confirmed that the transcription of pikromycin biosynthetic genes was significantly upregulated in S. lividans containing tandem
Botía, Juan A; Vandrovcova, Jana; Forabosco, Paola; Guelfi, Sebastian; D'Sa, Karishma; Hardy, John; Lewis, Cathryn M; Ryten, Mina; Weale, Michael E
Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn). We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (x3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. The results obtained from our investigations indicate that our k-means method, applied as an adjunct to standard WGCNA, results in better network partitions. These improved partitions enable more fruitful downstream analyses, as gene modules are more biologically meaningful.
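The idea of refining an existing module partition with k-means-style iterations can be sketched as follows. This is a minimal illustration, not the km2gcn implementation: the function names, the use of the mean expression profile as a module centroid, and Pearson correlation as the similarity measure are all assumptions made for the example.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def centroid(profiles):
    """Column-wise mean of a list of profiles (assumed stand-in for a module summary)."""
    return [mean(col) for col in zip(*profiles)]


def refine_modules(expr, modules, max_iter=50):
    """Reassign genes to the most correlated module centroid until assignments stabilize.

    expr:    dict mapping gene name -> expression profile (list of numbers)
    modules: dict mapping gene name -> initial module label
    """
    modules = dict(modules)
    for _ in range(max_iter):
        members = {}
        for gene, label in modules.items():
            members.setdefault(label, []).append(expr[gene])
        centroids = {label: centroid(p) for label, p in members.items()}
        updated = {
            gene: max(centroids, key=lambda lab: pearson(expr[gene], centroids[lab]))
            for gene in modules
        }
        if updated == modules:
            break
        modules = updated
    return modules


# Hypothetical toy data: g4 starts in module "A" but follows the "B" pattern.
expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],
    "g4": [8, 6, 5, 2],
}
initial = {"g1": "A", "g2": "A", "g3": "B", "g4": "A"}
refined = refine_modules(expr, initial)
```

In this toy case the misplaced gene g4 moves to module "B" after one pass, which mirrors the "few or zero misplaced genes" property described in the abstract.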
Jacobson, M R; Brigle, K E; Bennett, L T; Setterquist, R A; Wilson, M S; Cash, V L; Beynon, J; Newton, W E; Dean, D R
Determination of a 28,793-base-pair DNA sequence of a region from the Azotobacter vinelandii genome that includes and flanks the nitrogenase structural gene region was completed. This information was used to revise the previously proposed organization of the major nif cluster. The major nif cluster from A. vinelandii encodes 15 nif-specific genes whose products bear significant structural identity to the corresponding nif-specific gene products from Klebsiella pneumoniae. These genes include ...
Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is
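Since the evaluation above scores each clustering against known classes with the adjusted Rand index, a short sketch of that index may help. The function name and the example labels are illustrative assumptions; the formula is the standard pair-counting definition, not code taken from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have equal length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: partitions agree trivially
    # Pairs of items placed together in both partitions (contingency table cells).
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs placed together within each partition separately.
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / total_pairs
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions have a single cluster
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical example: known classes versus two candidate clusterings.
known = ["tumor", "tumor", "normal", "normal"]
same_up_to_renaming = [1, 1, 0, 0]
crossed = [0, 1, 0, 1]
ari_same = adjusted_rand_index(known, same_up_to_renaming)
ari_crossed = adjusted_rand_index(known, crossed)
```

The index is 1 when the partitions agree up to a relabeling of clusters, and it can fall below 0 when agreement is worse than expected by chance, which is why it is preferred over the raw Rand index for comparing methods that produce different numbers of clusters.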
Showe Louise C
Background: Classification studies using gene expression datasets are usually based on small numbers of samples and tens of thousands of genes. The selection of those genes that are important for distinguishing the different sample classes being compared poses a challenging problem in high dimensional data analysis. We describe a new procedure for selecting significant genes as recursive cluster elimination (RCE) rather than recursive feature elimination (RFE). We have tested this algorithm on six datasets and compared its performance with that of two related classification procedures with RFE. Results: We have developed a novel method for selecting significant genes in comparative gene expression studies. This method, which we refer to as SVM-RCE, combines K-means, a clustering method, to identify correlated gene clusters, and Support Vector Machines (SVMs), a supervised machine learning classification method, to identify and score (rank) those gene clusters for the purpose of classification. K-means is used initially to group genes into clusters. Recursive cluster elimination (RCE) is then applied to iteratively remove those clusters of genes that contribute the least to the classification performance. SVM-RCE identifies the clusters of correlated genes that are most significantly differentially expressed between the sample classes. Utilization of gene clusters, rather than individual genes, enhances the supervised classification accuracy of the same data as compared to the accuracy when either SVM or Penalized Discriminant Analysis (PDA) with recursive feature elimination (SVM-RFE and PDA-RFE) are used to remove genes based on their individual discriminant weights. Conclusion: SVM-RCE provides improved classification accuracy with complex microarray data sets when it is compared to the classification accuracy of the same datasets using either SVM-RFE or PDA-RFE. SVM-RCE identifies clusters of correlated genes that when considered together
Ivan G. Costa
This work performs a data driven comparative study of clustering methods used in the analysis of gene expression time courses (or time series). Five clustering methods found in the literature of gene expression analysis are compared: agglomerative hierarchical clustering, CLICK, dynamical clustering, k-means and self-organizing maps. In order to evaluate the methods, a k-fold cross-validation procedure adapted to unsupervised methods is applied. The accuracy of the results is assessed by the comparison of the partitions obtained in these experiments with gene annotation, such as protein function and series classification.
Dehal, Paramvir S.; Boore, Jeffrey L.
We present here the PhIGs database, a phylogenomic resource for sequenced genomes. Although many methods exist for clustering gene families, very few attempt to create truly orthologous clusters sharing descent from a single ancestral gene across a range of evolutionary depths. Although these non-phylogenetic gene family clusters have been used broadly for gene annotation, errors are known to be introduced by the artifactual association of slowly evolving paralogs and lack of annotation for those more rapidly evolving. A full phylogenetic framework is necessary for accurate inference of function and for many studies that address pattern and mechanism of the evolution of the genome. The automated generation of evolutionary gene clusters, creation of gene trees, determination of orthology and paralogy relationships, and the correlation of this information with gene annotations, expression information, and genomic context is an important resource to the scientific community.
Santos, dos F.; Vera, J.L.; Heijden, van der R.; Valdez, G.F.; Vos, de W.M.; Sesma, F.; Hugenholtz, J.
The coenzyme B12 production pathway in Lactobacillus reuteri has been deduced using a combination of genetic, biochemical and bioinformatics approaches. The coenzyme B12 gene cluster of Lb. reuteri CRL1098 has the unique feature of clustering together the cbi, cob and hem genes. It consists of 29
Proctor, R.H.; Hove, van F.; Susca, A.; Stea, A.; Busman, M.; Lee, van der T.A.J.; Waalwijk, C.; Moretti, A.
In Fusarium, the ability to produce fumonisins is governed by a 17-gene fumonisin biosynthetic gene (FUM) cluster. Here, we examined the cluster in F. oxysporum strain O-1890 and nine other species selected to represent a wide range of the genetic diversity within the GFSC.
Botka, C. W.; Wittig, T. W.; Graul, R. C.
The proton-dependent oligopeptide transporters (POT) gene family currently consists of approximately 70 cloned cDNAs derived from diverse organisms. In mammals, two genes encoding peptide transporters, PepT1 and PepT2 have been cloned in several species including humans, in addition to a rat...... histidine/peptide transporter (rPHT1). Because the Caenorhabditis elegans genome contains five putative POT genes, we searched the available protein and nucleic acid databases for additional mammalian/human POT genes, using iterative BLAST runs and the human expressed sequence tags (EST) database. The apparent...... and introns of the likely human orthologue (termed hPHT2). Northern analyses with EST clones indicated that hPHT1 is primarily expressed in skeletal muscle and spleen, whereas hPHT2 is found in spleen, placenta, lung, leukocytes, and heart. These results suggest considerable complexity of the human POT gene...
The skin microbial community is regarded as essential for human health and well-being, but likewise plays an important role in the formation of body odor in, for instance, the axillae. Little molecular-based research has been done on the axillary microbiome. This study typified the axillary microbiome of a group of 53 healthy subjects. A profound view was obtained of the interpersonal, intrapersonal and temporal diversity of the human axillary microbiota. Denaturing gradient gel electrophoresis (DGGE) and next generation sequencing on the 16S rRNA gene region were combined and used to complement each other. Two important clusters were characterized, where Staphylococcus and Corynebacterium species were the abundant species. Females predominantly clustered within the Staphylococcus cluster (87%, n = 17), whereas males clustered more in the Corynebacterium cluster (39%, n = 36). The axillary microbiota was unique to each individual. Left-right asymmetry occurred in about half of the human population. For the first time, an elaborate study was performed on the dynamics of the axillary microbiome. A relatively stable axillary microbiome was noticed, although a few subjects evolved towards another stable community. The deodorant usage had a proportional linear influence on the species diversity of the axillary microbiome.
Mattila-Sandholm, T.; Blaut, M.; Daly, C.; Vuyst, de L.; Dore, J.; Gibson, G.; Goossens, H.; Knorr, D.; Lucas, J.; Lahteenmaki, L.; Mercenier, A.M.E.; Saarela, M.; Shanahan, F.; Vos, de W.M.
The Food, GI-tract Functionality and Human Health (PROEUHEALTH) Cluster brings together eight complementary, multicentre interdisciplinary research projects. All have the common aim of improving the health and quality of life of European consumers. The collaboration involves 64 different research
Landolfo, Sara; Ianiri, Giuseppe; Camiolo, Salvatore; Porceddu, Andrea; Mulas, Giuliana; Chessa, Rossella; Zara, Giacomo; Mannazzu, Ilaria
A molecular approach was applied to the study of the carotenoid biosynthetic pathway of Rhodotorula mucilaginosa. At first, functional annotation of the genome of R. mucilaginosa C2.5t1 was carried out and gene ontology categories were assigned to 4033 predicted proteins. Then, a set of genes involved in different steps of carotenogenesis was identified and those coding for phytoene desaturase, phytoene synthase/lycopene cyclase and carotenoid dioxygenase (CAR genes) proved to be clustered within a region of ~10 kb. Quantitative PCR of the genes involved in carotenoid biosynthesis showed that genes coding for 3-hydroxy-3-methylglutaryl-CoA reductase and mevalonate kinase are induced during exponential phase while no clear trend of induction was observed for phytoene synthase/lycopene cyclase and phytoene dehydrogenase encoding genes. Thus, in R. mucilaginosa the induction of genes involved in the early steps of carotenoid biosynthesis is transient and accompanies the onset of carotenoid production, while that of CAR genes does not correlate with the amount of carotenoids produced. The transcript levels of genes coding for carotenoid dioxygenase, superoxide dismutase and catalase A increased during the accumulation of carotenoids, thus suggesting the activation of a mechanism aimed at the protection of cell structures from oxidative stress during carotenoid biosynthesis. The data presented herein, besides being suitable for the elucidation of the mechanisms that underlie carotenoid biosynthesis, will contribute to boosting the biotechnological potential of this yeast by improving the outcome of further research efforts aimed at also exploring other features of interest.
Background Lateral Gene Transfer (LGT) has recently gained recognition as an important contributor to some eukaryote proteomes, but the mechanisms of acquisition and fixation in eukaryotic genomes are still uncertain. A previously defined norm for LGTs in microbial eukaryotes states that the majority are genes involved in metabolism, the LGTs are typically localized one by one, surrounded by vertically inherited genes on the chromosome, and phylogenetics shows that a broad collection of bacterial lineages have contributed to the transferome. Results A unique 34 kbp long fragment with 27 clustered genes (TvLF) of prokaryote origin was identified in the sequenced genome of the protozoan parasite Trichomonas vaginalis. Using a PCR based approach we confirmed the presence of the orthologous fragment in four additional T. vaginalis strains. Detailed sequence analyses unambiguously suggest that TvLF is the result of one single, recent LGT event. The proposed donor is a close relative to the firmicute bacterium Peptoniphilus harei. High nucleotide sequence similarity between T. vaginalis strains, as well as to P. harei, and the absence of homologs in other Trichomonas species, suggests that the transfer event took place after the radiation of the genus Trichomonas. Some genes have undergone pseudogenization and degradation, indicating that they may not be retained in the future. Functional annotations reveal that genes involved in informational processes are particularly prone to degradation. Conclusions We conclude that, although the majority of eukaryote LGTs are single gene occurrences, they may be acquired in clusters of several genes that are subsequently cleansed of evolutionarily less advantageous genes. PMID:24898731
Dufresne, Karine; Saulnier-Bellemare, Julie; Daigle, France
The human-specific pathogen Salmonella enterica serovar Typhi causes typhoid, a major public health issue in developing countries. Several aspects of its pathogenesis are still poorly understood. S. Typhi possesses 14 fimbrial gene clusters including 12 chaperone-usher fimbriae (stg, sth, bcf, fim, saf, sef, sta, stb, stc, std, ste, and tcf). These fimbriae are weakly expressed in laboratory conditions and only a few are actually characterized. In this study, expression of all S. Typhi chaperone-usher fimbriae and their potential roles in pathogenesis such as interaction with host cells, motility, or biofilm formation were assessed. All S. Typhi fimbriae were better expressed in minimal broth. Each system was overexpressed and only the fimbrial gene clusters without pseudogenes demonstrated a putative major subunit of about 17 kDa on SDS-PAGE. Six of these (Fim, Saf, Sta, Stb, Std, and Tcf) also show extracellular structure by electron microscopy. The impact of fimbrial deletion in a wild-type strain or addition of each individual fimbrial system to an S. Typhi afimbrial strain was tested for interactions with host cells, biofilm formation and motility. Several fimbriae modified bacterial interactions with human cells (THP-1 and INT-407) and biofilm formation. However, only Fim fimbriae had a deleterious effect on motility when overexpressed. Overall, chaperone-usher fimbriae seem to be an important part of the balance between the different steps (motility, adhesion, host invasion and persistence) of S. Typhi pathogenesis.
Background: Clustering is a widely used technique for analysis of gene expression data. Most clustering methods group genes based on the distances, while few methods group genes according to the similarities of the distributions of the gene expression levels. Furthermore, as biological annotation resources have accumulated, an increasing number of genes have been annotated into functional categories. As a result, evaluating the performance of clustering methods in terms of the functional consistency of the resulting clusters is of great interest. Results: In this paper, we proposed the WDCM (Weibull Distribution-based Clustering Method), a robust approach for clustering gene expression data, in which the gene expressions of individual genes are considered as the random variables following unique Weibull distributions. Our WDCM is based on the concept that the genes with similar expression profiles have similar distribution parameters, and thus the genes are clustered via the Weibull distribution parameters. We used the WDCM to cluster three cancer gene expression data sets from lung cancer, B-cell follicular lymphoma and bladder carcinoma and obtained well-clustered results. We compared the performance of WDCM with k-means and Self Organizing Map (SOM) using functional annotation information given by the Gene Ontology (GO). The results showed that the functional annotation ratios of WDCM are higher than those of the other methods. We also utilized the external measure Adjusted Rand Index to validate the performance of the WDCM. The comparative results demonstrate that the WDCM provides better clustering performance than the k-means and SOM algorithms. The merit of the proposed WDCM is that it can be applied to cluster incomplete gene expression data without imputing the missing values. Moreover, the robustness of WDCM is also evaluated on the incomplete data sets. Conclusions: The results demonstrate that our WDCM produces clusters
Liu, Ying; Ciliax, Brian J; Borges, Karin; Dasigi, Venu; Ram, Ashwin; Navathe, Shamkant B; Dingledine, Ray
One of the key challenges of microarray studies is to derive biological insights from the unprecedented quantities of data on gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the nature of the functional links among genes within the derived clusters. However, the quality of the keyword lists extracted from biomedical literature for each gene significantly affects the clustering results. We extracted keywords from MEDLINE that describe the most prominent functions of the genes, and used the resulting weights of the keywords as feature vectors for gene clustering. By analyzing the resulting cluster quality, we compared two keyword weighting schemes: normalized z-score and term frequency-inverse document frequency (TFIDF). The best combination of background comparison set, stop list and stemming algorithm was selected based on precision and recall metrics. In a test set of four known gene groups, a hierarchical algorithm correctly assigned 25 of 26 genes to the appropriate clusters based on keywords extracted by the TFIDF weighting scheme, but only 23 of 26 with the z-score method. To evaluate the effectiveness of the weighting schemes for keyword extraction for gene clusters from microarray profiles, 44 yeast genes that are differentially expressed during the cell cycle were used as a second test set. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords had higher purity, lower entropy, and higher mutual information than those produced from normalized z-score weighted keywords. The optimized algorithms should be useful for sorting genes from microarray lists into functionally discrete clusters.
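To make the TFIDF weighting concrete, the sketch below turns per-gene keyword lists into weighted feature vectors and compares them by cosine similarity. It assumes one common TFIDF variant (relative term frequency times the natural log of N over document frequency), which may differ from the exact formula used in the study; the gene names, keywords and function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(keyword_lists):
    """Map each gene to a {keyword: weight} dict using tf * ln(N / df).

    keyword_lists: dict mapping gene name -> list of keywords (repeats allowed)
    """
    n_genes = len(keyword_lists)
    doc_freq = Counter()
    for words in keyword_lists.values():
        doc_freq.update(set(words))  # count each keyword once per gene
    vectors = {}
    for gene, words in keyword_lists.items():
        counts = Counter(words)
        total = sum(counts.values())
        vectors[gene] = {
            w: (c / total) * math.log(n_genes / doc_freq[w])
            for w, c in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical keyword lists for three genes.
keywords = {
    "geneA": ["kinase", "cell", "cycle", "kinase"],
    "geneB": ["kinase", "cell", "mitosis"],
    "geneC": ["transport", "cell", "membrane"],
}
vectors = tfidf_vectors(keywords)
sim_ab = cosine(vectors["geneA"], vectors["geneB"])
sim_ac = cosine(vectors["geneA"], vectors["geneC"])
```

A keyword shared by every gene ("cell" here) receives weight zero, so it cannot drive two genes into the same cluster; a pairwise similarity of this kind could then be passed to a hierarchical clustering algorithm.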
Do, Jin Hwan; Choi, Dong-Kug
The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and metrics for dissimilarity measures used, as well as on user-selectable parameters such as desired number of clusters and initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering of DNA microarray data from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.
Zhang, Han; Rokas, Antonis; Slot, Jason C
Dermatophyte fungi of the family Arthrodermataceae (Eurotiomycetes) colonize keratinized tissue, such as skin, frequently causing superficial mycoses in humans and other mammals, reptiles, and birds. Competition with native microflora likely underlies the propensity of these dermatophytes to produce a diversity of antibiotics and compounds for scavenging iron, which is extremely scarce, as well as the presence of an unusually large number of putative secondary metabolism gene clusters, most of which contain non-ribosomal peptide synthetases (NRPS), in their genomes. To better understand the historical origins and diversification of NRPS-containing gene clusters we examined the evolution of a variable locus (VL) that exists in one of three alternative conformations among the genomes of seven dermatophyte species. The first conformation of the VL (termed VLA) contains only 539 base pairs of sequence and lacks protein-coding genes, whereas the other two conformations (termed VLB and VLC) span 36 Kb and 27 Kb and contain 12 and 10 genes, respectively. Interestingly, both VLB and VLC appear to contain distinct secondary metabolism gene clusters; VLB contains a NRPS gene as well as four porphyrin metabolism genes never found to be physically linked in the genomes of 128 other fungal species, whereas VLC also contains a NRPS gene as well as several others typically found associated with secondary metabolism gene clusters. Phylogenetic evidence suggests that the VL locus was present in the ancestor of all seven species achieving its present distribution through subsequent differential losses or retentions of specific conformations. We propose that the existence of variable loci, similar to the one we studied, in fungal genomes could potentially explain the dramatic differences in secondary metabolic diversity between closely related species of filamentous fungi, and contribute to host adaptation and the generation of metabolic diversity.
Živković, J.; Tadić, B.; Wick, N.; Thurner, S.
We analyze gene expression time-series data of yeast (S. cerevisiae) measured along two full cell-cycles. We quantify these data by using q-exponentials, gene expression ranking and a temporal mean-variance analysis. We construct gene interaction networks based on correlation coefficients and study the formation of the corresponding giant components and minimum spanning trees. By coloring genes according to their cell function we find functional clusters in the correlation networks and functional branches in the associated trees. Our results suggest that a percolation point of functional clusters can be identified on these gene expression correlation networks.
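As a rough illustration of how a correlation network and its largest connected component might be computed, the sketch below links two genes when the Pearson correlation of their expression profiles reaches a threshold, then measures the largest component. The threshold rule, the function names, and the toy profiles are assumptions made for illustration; the abstract does not state how edges were defined.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant input assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def largest_component_size(profiles, threshold):
    """Size of the largest connected component when genes with r >= threshold are linked."""
    names = list(profiles)
    adj = {g: set() for g in names}
    for i, g in enumerate(names):
        for h in names[i + 1:]:
            if pearson(profiles[g], profiles[h]) >= threshold:
                adj[g].add(h)
                adj[h].add(g)

    seen, best = set(), 0
    for g in names:
        if g in seen:
            continue
        # Depth-first traversal of one component.
        stack, size = [g], 0
        seen.add(g)
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

# Hypothetical expression time series for four genes.
profiles = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8.5],
    "g3": [4, 3, 2, 1],
    "g4": [1, 3, 2, 4],
}
for t in (0.9, 0.75):
    print(t, largest_component_size(profiles, t))
```

Sweeping the threshold downward and watching the largest component grow is one simple way to look for the kind of percolation-like transition the abstract describes.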
Motivation: Bi-clustering algorithms aim to identify sets of genes sharing similar expression patterns across a subset of conditions. However, direct interpretation or prediction of gene regulatory mechanisms may be difficult as only gene expression data is used. Information about gene regulators may also be available, most commonly about which transcription factors may bind to the promoter region and thus control the expression level of a gene. Thus a method to integrate gene expression and gene regulation information is desirable for clustering and analyzing. Methods: By incorporating gene regulatory information with gene expression data, we define regulated expression values (REV) as indicators of how a gene is regulated by a specific factor. Existing bi-clustering methods are extended to a three dimensional data space by developing a heuristic TRI-Clustering algorithm. An additional approach named Automatic Boundary Searching algorithm (ABS) is introduced to automatically determine the boundary threshold. Results: Results based on incorporating ChIP-chip data representing transcription factor-gene interactions show that the algorithms are efficient and robust for detecting tri-clusters. Detailed analysis of the tri-cluster extracted from yeast sporulation REV data shows genes in this cluster exhibited significant differences during the middle and late stages. The implicated regulatory network was then reconstructed for further study of defined regulatory mechanisms. Topological and statistical analysis of this network demonstrated evidence of significant changes of TF activities during the different stages of yeast sporulation, and suggests this approach might be a general way to study regulatory networks undergoing transformations.
Pattern recognition receptors are crucial in initiating and shaping innate and adaptive immune responses and often belong to families of structurally and evolutionarily related proteins. The human C-type lectin-like receptors encoded in the DECTIN-1 cluster within the NK gene complex contain prominent receptors with pattern recognition function, such as DECTIN-1 and LOX-1. All members of this cluster share significant homology and are considered to have arisen from subsequent gene duplications. Recent developments in sequencing and the availability of comprehensive sequence data comprising many species showed that the receptors of the DECTIN-1 cluster are not only homologous to each other but also highly conserved between species. Even in Caenorhabditis elegans, genes displaying homology to the mammalian C-type lectin-like receptors have been detected. In this paper, we conduct a comprehensive phylogenetic survey and give an up-to-date overview of the currently available data on the evolutionary emergence of the DECTIN-1 cluster genes.
Host-pathogen interactions are of prime importance to modern agriculture. Plants utilize various types of resistance genes to mitigate pathogen damage. Identification of the specific gene responsible for a specific resistance can be difficult due to duplication and clustering within R-gene families....
Ma, Yuanyuan; Hu, Xiaohua; He, Tingting; Jiang, Xingpeng
Nonnegative matrix factorization (NMF) has received considerable attention due to its interpretation of observed samples as combinations of different components, and has been successfully used as a clustering method. As an extension of NMF, Symmetric NMF (SNMF) inherits the advantages of NMF. Unlike NMF, however, SNMF takes a nonnegative similarity matrix as an input, and two lower rank nonnegative matrices (H, H^T) are computed as an output to approximate the original similarity matrix. Laplacian regularization has improved the clustering performance of NMF and SNMF. However, Laplacian regularization (LR), as a classic manifold regularization method, suffers some problems because of its weak extrapolating ability. In this paper, we propose a novel variant of SNMF, called Hessian regularization based symmetric nonnegative matrix factorization (HSNMF), for this purpose. In contrast to Laplacian regularization, Hessian regularization fits the data perfectly and extrapolates nicely to unseen data. We conduct extensive experiments on several datasets including text data, gene expression data and HMP (Human Microbiome Project) data. The results show that the proposed method outperforms other methods, which suggests the potential application of HSNMF in biological data clustering. Copyright © 2016. Published by Elsevier Inc.
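The factorization described in this abstract can be written compactly. The first expression is the standard SNMF objective for a nonnegative similarity matrix A (n × n) and a nonnegative factor H (n × k). The second adds a generic manifold regularizer with weight λ; the symbols λ and R are introduced here for illustration, with R standing for the graph Laplacian under Laplacian regularization and, as an assumption about the paper's formulation, for a Hessian energy matrix under HSNMF.

```latex
\min_{H \ge 0} \; \lVert A - H H^{T} \rVert_F^{2}
\qquad \text{(plain SNMF)}

\min_{H \ge 0} \; \lVert A - H H^{T} \rVert_F^{2}
  + \lambda \, \operatorname{tr}\!\left( H^{T} R \, H \right)
\qquad \text{(regularized SNMF; } R \text{ is an assumed placeholder)}
```

Cluster assignments are typically read from H by taking, for each row, the column with the largest entry.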
Ishida, Miho; Moore, Gudrun E.
Detailed comprehensive molecular analysis using families and multiple matched tissues is essential to determine whether imprinted genes have a functional role in humans. See research article: http://genomebiology.com/2011/12/3/R25
Background: Gene expression technologies have opened up new ways to diagnose and treat cancer and other diseases. Clustering algorithms are a useful approach with which to analyze genome expression data. They attempt to partition the genes into groups exhibiting similar patterns of variation in expression level. An important problem associated with gene classification is to discern whether the clustering process can find a relevant partition as well as the identification of new gene classes. There are two key aspects to classification: the estimation of the number of clusters, and the decision as to whether a new unit (gene, tumor sample...) belongs to one of these previously identified clusters or to a new group. Results: ICGE is a user-friendly R package which provides many functions related to this problem: identify the number of clusters using mixed variables, usually found by applied biomedical researchers; detect whether the data have a cluster structure; identify whether a new unit belongs to one of the pre-identified clusters or to a novel group, and classify new units into the corresponding cluster. The functions in the ICGE package are accompanied by help files and easy examples to facilitate its use. Conclusions: We demonstrate the utility of ICGE by analyzing simulated and real data sets. The results show that ICGE could be very useful to a broad research community.
Jason C Slot
High affinity nitrate assimilation genes in fungi occur in a cluster (fHANT-AC) that can be coordinately regulated. The clustered genes include nrt2, which codes for a high affinity nitrate transporter; euknr, which codes for nitrate reductase; and NAD(P)H-nir, which codes for nitrite reductase. Homologs of genes in the fHANT-AC occur in other eukaryotes and prokaryotes, but they have only been found clustered in the oomycete Phytophthora (heterokonts). We performed independent and concatenated phylogenetic analyses of homologs of all three genes in the fHANT-AC. Phylogenetic analyses limited to fungal sequences suggest that the fHANT-AC has been transferred horizontally from a basidiomycete (mushrooms and smuts) to an ancestor of the ascomycetous mold Trichoderma reesei. Phylogenetic analyses of sequences from diverse eukaryotes and eubacteria, and cluster structure, are consistent with a hypothesis that the fHANT-AC was assembled in a lineage leading to the oomycetes and was subsequently transferred to the Dikarya (Ascomycota+Basidiomycota), which is a derived fungal clade that includes the vast majority of terrestrial fungi. We propose that the acquisition of high affinity nitrate assimilation contributed to the success of Dikarya on land by allowing exploitation of nitrate in aerobic soils, and the subsequent transfer of a complete assimilation cluster improved the fitness of T. reesei in a new niche. Horizontal transmission of this cluster of functionally integrated genes supports the "selfish operon" hypothesis for maintenance of gene clusters.
Immunity-related GTPases (IRG) play an important role in defense against intracellular pathogens. One member of this gene family in humans, IRGM, has been recently implicated as a risk factor for Crohn's disease. We analyzed the detailed structure of this gene family among primates and showed that most of the IRG gene cluster was deleted early in primate evolution, after the divergence of the anthropoids from prosimians (about 50 million years ago). Comparative sequence analysis of New World and Old World monkey species shows that the single-copy IRGM gene became pseudogenized as a result of an Alu retrotransposition event in the anthropoid common ancestor that disrupted the open reading frame (ORF). We find that the ORF was reestablished as a part of a polymorphic stop codon in the common ancestor of humans and great apes. Expression analysis suggests that this change occurred in conjunction with the insertion of an endogenous retrovirus, which altered the transcription initiation, splicing, and expression profile of IRGM. These data argue that the gene became pseudogenized and was then resurrected through a series of complex structural events and suggest remarkable functional plasticity where alleles experience diverse evolutionary pressures over time. Such dynamism in structure and evolution may be critical for a gene family locked in an arms race with an ever-changing repertoire of intracellular parasites.
Louw Abraham I
Background: Microarray technology makes it possible to identify changes in gene expression of an organism under various conditions. Data mining is thus essential for deducing significant biological information such as the identification of new biological mechanisms or putative drug targets. While many algorithms and software have been developed for analysing gene expression, the extraction of relevant information from experimental data is still a substantial challenge, requiring significant time and skill. Description: MADIBA (MicroArray Data Interface for Biological Annotation) facilitates the assignment of biological meaning to gene expression clusters by automating the post-processing stage. A relational database has been designed to store the data from gene to pathway for Plasmodium, rice and Arabidopsis. Tools within the web interface allow rapid analyses for the identification of the Gene Ontology terms relevant to each cluster; visualising the metabolic pathways where the genes are implicated, their genomic localisations, putative common transcriptional regulatory elements in the upstream sequences, and an analysis specific to the organism being studied. Conclusion: MADIBA is an integrated, online tool that will assist researchers in interpreting their results and understanding the meaning of the co-expression of a cluster of genes. Functionality of MADIBA was validated by analysing a number of gene clusters from several published experiments: expression profiling of the Plasmodium life cycle, and salt stress treatments of Arabidopsis and rice. In most of the cases, the same conclusions found by the authors were quickly and easily obtained after analysing the gene clusters with MADIBA.
Duffy, Michael F; Tang, Jingyi; Sumardy, Fransisca; Nguyen, Hanh H T; Selvarajah, Shamista A; Josling, Gabrielle A; Day, Karen P; Petter, Michaela; Brown, Graham V
The Plasmodium falciparum var multigene family encodes the cytoadhesive, variant antigen PfEMP1. P. falciparum antigenic variation and cytoadhesion specificity are controlled by epigenetic switching between the single, or few, simultaneously expressed var genes. Most var genes are maintained in perinuclear clusters of heterochromatic telomeres. The active var gene(s) occupy a single, perinuclear var expression site. It is unresolved whether the var expression site forms in situ at a telomeric cluster or whether it is an extant compartment to which single chromosomes travel, thus controlling var switching. Here we show that transcription of a var gene did not require decreased colocalisation with clusters of telomeres, supporting var expression site formation in situ. However following recombination within adjacent subtelomeric sequences, the same var gene was persistently activated and did colocalise less with telomeric clusters. Thus, participation in stable, heterochromatic, telomere clusters and var switching are independent but are both affected by subtelomeric sequences. The var expression site colocalised with the euchromatic mark H3K27ac to a greater extent than it did with heterochromatic H3K9me3. H3K27ac was enriched within the active var gene promoter even when the var gene was transiently repressed in mature parasites and thus H3K27ac may contribute to var gene epigenetic memory. © 2016 Federation of European Biochemical Societies.
Li, Yongxin; Li, Zhongrui; Yamanaka, Kazuya; Xu, Ying; Zhang, Weipeng; Vlamakis, Hera; Kolter, Roberto; Moore, Bradley S.; Qian, Pei-Yuan
validating this direct cloning plug-and-play approach with surfactin, we genetically interrogated amicoumacin biosynthetic gene cluster from the marine isolate Bacillus subtilis 1779. Its heterologous expression allowed us to explore an unusual maturation
Susca, Antonia; Proctor, Robert H; Butchko, Robert A E; Haidukowski, Miriam; Stea, Gaetano; Logrieco, Antonio; Moretti, Antonio
The ability to produce fumonisin mycotoxins varies among members of the black aspergilli. Previously, analyses of selected genes in the fumonisin biosynthetic gene (fum) cluster in black aspergilli from California grapes indicated that fumonisin-nonproducing isolates of Aspergillus welwitschiae lack six fum genes, but nonproducing isolates of Aspergillus niger do not. In the current study, analyses of black aspergilli from grapes from the Mediterranean Basin indicate that the genomic context of the fum cluster is the same in isolates of A. niger and A. welwitschiae regardless of fumonisin-production ability and that full-length clusters occur in producing isolates of both species and nonproducing isolates of A. niger. In contrast, the cluster has undergone an eight-gene deletion in fumonisin-nonproducing isolates of A. welwitschiae. Phylogenetic analyses suggest each species consists of a mixed population of fumonisin-producing and nonproducing individuals, and that existence of both production phenotypes may provide a selective advantage to these species. Differences in gene content of fum cluster homologues and phylogenetic relationships of fum genes suggest that the mutation(s) responsible for the nonproduction phenotype differs, and therefore arose independently, in the two species. Partial fum cluster homologues were also identified in genome sequences of four other black Aspergillus species. Gene content of these partial clusters and phylogenetic relationships of fum sequences indicate that non-random partial deletion of the cluster has occurred multiple times among the species. This in turn suggests that an intact cluster and fumonisin production were once more widespread among black aspergilli. Copyright © 2014 Elsevier Inc. All rights reserved.
Aspergillus niger and A. awamori strains isolated from grapes cultivated in Mediterranean basin were examined for fumonisin B2 (FB2) production and presence/absence of sequences within the fumonisin biosynthetic gene (fum) cluster. Presence of 13 regions in the fum cluster was evaluated by PCR assay...
Marenholz, Ingo; Grosche, Sarah; Kalb, Birgit; Rüschendorf, Franz; Blümchen, Katharina; Schlags, Rupert; Harandi, Neda; Price, Mareike; Hansen, Gesine; Seidenberg, Jürgen; Röblitz, Holger; Yürek, Songül; Tschirner, Sebastian; Hong, Xiumei; Wang, Xiaobin; Homuth, Georg; Schmidt, Carsten O; Nöthen, Markus M; Hübner, Norbert; Niggemann, Bodo; Beyer, Kirsten; Lee, Young-Ae
Genetic factors and mechanisms underlying food allergy are largely unknown. Due to heterogeneity of symptoms a reliable diagnosis is often difficult to make. Here, we report a genome-wide association study on food allergy diagnosed by oral food challenge in 497 cases and 2387 controls. We identify five loci at genome-wide significance, the clade B serpin (SERPINB) gene cluster at 18q21.3, the cytokine gene cluster at 5q31.1, the filaggrin gene, the C11orf30/LRRC32 locus, and the human leukocyte antigen (HLA) region. Stratifying the results for the causative food demonstrates that association of the HLA locus is peanut allergy-specific whereas the other four loci increase the risk for any food allergy. Variants in the SERPINB gene cluster are associated with SERPINB10 expression in leukocytes. Moreover, SERPINB genes are highly expressed in the esophagus. All identified loci are involved in immunological regulation or epithelial barrier function, emphasizing the role of both mechanisms in food allergy.
Casey, Céline; Stölting, Kai N.; Barbará, Thelma; González-Martínez, Santiago C.; Lexer, Christian
Resistance genes (R-genes) are essential for long-lived organisms such as forest trees, which are exposed to diverse herbivores and pathogens. In short-lived model species, R-genes have been shown to be involved in species isolation. Here, we studied more than 400 trees from two natural hybrid zones of the European Populus species Populus alba and Populus tremula for microsatellite markers located in three R-gene clusters, including one cluster situated in the incipient sex chromosome region....
Data Analysis and Visualization (IDAV) and the Department of Computer Science, University of California, Davis, One Shields Avenue, Davis CA 95616, USA; International Research Training Group "Visualization of Large and Unstructured Data Sets," University of Kaiserslautern, Germany; Computational Research Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA; Genomics Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Life Sciences Division, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley CA 94720, USA; Computer Science Division, University of California, Berkeley, CA, USA; Computer Science Department, University of California, Irvine, CA, USA; All authors are with the Berkeley Drosophila Transcription Network Project, Lawrence Berkeley National Laboratory; Rubel, Oliver; Weber, Gunther H.; Huang, Min-Yu; Bethel, E. Wes; Biggin, Mark D.; Fowlkes, Charless C.; Hendriks, Cris L. Luengo; Keranen, Soile V. E.; Eisen, Michael B.; Knowles, David W.; Malik, Jitendra; Hagen, Hans; Hamann, Bernd
The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clustering to aid exploration of these new complex datasets. The interplay of data visualization and clustering-based data classification leads to improved visualization and enables a more detailed analysis than previously possible. We discuss (i) integration of data clustering and visualization into one framework; (ii) application of data clustering to 3D gene expression data; (iii) evaluation of the number of clusters k in the context of 3D gene expression clustering; and (iv) improvement of overall analysis quality via dedicated post-processing of clustering results based on visualization. We discuss the use of this framework to objectively define spatial pattern boundaries and temporal profiles of genes and to analyze how mRNA patterns are controlled by their regulatory transcription factors.
Boutanaev, Alexander M; Kalmykova, Alla I; Shevelyov, Yuri Y; Nurminsky, Dmitry I
Clustering of co-expressed, non-homologous genes on chromosomes implies their co-regulation. In lower eukaryotes, co-expressed genes are often found in pairs. Clustering of genes that share aspects of transcriptional regulation has also been reported in higher eukaryotes. To advance our understanding of the mode of coordinated gene regulation in multicellular organisms, we performed a genome-wide analysis of the chromosomal distribution of co-expressed genes in Drosophila. We identified a total of 1,661 testes-specific genes, one-third of which are clustered on chromosomes. The number of clusters of three or more genes is much higher than expected by chance. We observed a similar trend for genes upregulated in the embryo and in the adult head, although the expression pattern of individual genes cannot be predicted on the basis of chromosomal position alone. Our data suggest that the prevalent mechanism of transcriptional co-regulation in higher eukaryotes operates with extensive chromatin domains that comprise multiple genes.
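One way to check whether clusters of co-expressed genes are more frequent than expected by chance is a permutation test on gene order. The sketch below is only an illustration under stated assumptions: a "cluster" is taken to be a run of at least three adjacent flagged genes along a chromosome, and the null model shuffles the flags among gene positions. The abstract does not specify the authors' cluster definition or null model, and the function names and example data are hypothetical.

```python
import random

def count_clusters(flags, min_size=3):
    """Count runs of at least min_size consecutive flagged (co-expressed) genes."""
    clusters, run = 0, 0
    for f in flags:
        if f:
            run += 1
        else:
            if run >= min_size:
                clusters += 1
            run = 0
    if run >= min_size:  # close a run that reaches the end of the chromosome
        clusters += 1
    return clusters

def permutation_p_value(flags, n_perm=1000, min_size=3, seed=0):
    """Return (observed cluster count, one-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = count_clusters(flags, min_size)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_clusters(shuffled, min_size) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical chromosome: 1 marks a testes-specific gene, 0 any other gene.
chromosome = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
observed, p = permutation_p_value(chromosome)
print(observed, p)
```

A small p-value would indicate that the observed number of clusters is unlikely under random placement of the same number of flagged genes.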
Richardson, Paul M.; Lucas, Susan; Cameron, R. Andrew; Rowen, Lee; Nesbitt, Ryan; Bloom, Scott; Rast, Jonathan P.; Berney, Kevin; Arenas-Mena, Cesar; Martinez, Pedro; Davidson, Eric H.; Peterson, Kevin J.; Hood, Leroy
The highly consistent gene order and axial colinear expression patterns found in vertebrate hox gene clusters are less well conserved across the rest of bilaterians. We report the first deuterostome instance of an intact hox cluster with a unique gene order where the paralog groups are not expressed in a sequential manner. The finished sequence from BAC clones from the genome of the sea urchin, Strongylocentrotus purpuratus, reveals a gene order wherein the anterior genes (Hox1, Hox2 and Hox3) lie nearest the posterior genes in the cluster such that the most 3' gene is Hox5. (The gene order is: 5'-Hox1, 2, 3, 11/13c, 11/13b, 11/13a, 9/10, 8, 7, 6, 5-3'.) The finished sequence result is corroborated by restriction mapping evidence and BAC-end scaffold analyses. Comparisons with a putative ancestral deuterostome Hox gene cluster suggest that the rearrangements leading to the sea urchin gene order were many and complex.
Zhang, Huixian; Ravi, Vydianathan; Tay, Boon-Hui; Tohari, Sumanty; Pillai, Nisha E; Prasad, Aravind; Lin, Qiang; Brenner, Sydney; Venkatesh, Byrappa
ParaHox genes (Gsx, Pdx, and Cdx) are an ancient family of developmental genes closely related to the Hox genes. They play critical roles in the patterning of brain and gut. The basal chordate, amphioxus, contains a single ParaHox cluster comprising one member of each family, whereas nonteleost jawed vertebrates contain four ParaHox genomic loci with six or seven ParaHox genes. Teleosts, which have experienced an additional whole-genome duplication, contain six ParaHox genomic loci with six ParaHox genes. Jawless vertebrates, represented by lampreys and hagfish, are the most ancient group of vertebrates and are crucial for understanding the origin and evolution of vertebrate gene families. We have previously shown that lampreys contain six Hox gene loci. Here we report that lampreys contain only two ParaHox gene clusters (designated as α- and β-clusters) bearing five ParaHox genes (Gsxα, Pdxα, Cdxα, Gsxβ, and Cdxβ). The order and orientation of the three genes in the α-cluster are identical to that of the single cluster in amphioxus. However, the orientation of Gsxβ in the β-cluster is inverted. Interestingly, Gsxβ is expressed in the eye, unlike its homologs in jawed vertebrates, which are expressed mainly in the brain. The lamprey Pdxα is expressed in the pancreas similar to jawed vertebrate Pdx genes, indicating that the pancreatic expression of Pdx was acquired before the divergence of jawless and jawed vertebrate lineages. It is likely that the lamprey Pdxα plays a crucial role in pancreas specification and insulin production similar to the Pdx of jawed vertebrates.
Wang, S-N; Shan, S; Zheng, Y; Peng, Y; Lu, Z-Y; Yang, Y-Q; Li, R-J; Zhang, Y-J; Guo, Y-Y
Odorant receptors (ORs) expressed in the antennae of parasitoid wasps are responsible for detection of various lipophilic airborne molecules. In the present study, 107 novel OR genes were identified from Microplitis mediator antennal transcriptome data. Phylogenetic analysis of the set of OR genes from M. mediator and Microplitis demolitor revealed that M. mediator OR (MmedOR) genes can be classified into different subfamilies, and the majority of MmedORs in each subfamily shared high sequence identities and clear orthologous relationships to M. demolitor ORs. Within a subfamily, six MmedOR genes, MmedOR98, 124, 125, 126, 131 and 155, shared a similar gene structure and were tightly linked in the genome. To evaluate whether the clustered MmedOR genes share common regulatory features, the transcription profile and expression characteristics of the six closely related OR genes were investigated in M. mediator. Rapid amplification of cDNA ends-PCR experiments revealed that the OR genes within the cluster were transcribed as single mRNAs, and a bicistronic mRNA for two adjacent genes (MmedOR124 and MmedOR98) was also detected in female antennae by reverse transcription PCR. In situ hybridization experiments indicated that each OR gene within the cluster was expressed in a different number of cells. Moreover, there was no co-expression of the two highly related OR genes, MmedOR124 and MmedOR98, which appeared to be individually expressed in a distinct population of neurons. Overall, there were distinct expression profiles of closely related MmedOR genes from the same cluster in M. mediator. These data provide a basic understanding of the olfactory coding in parasitoid wasps. © 2017 The Royal Entomological Society.
The inherently high-dimensional, low-sample-size nature of gene expression data makes the classification task more challenging. Feature (gene) selection therefore becomes a clear need. Selecting meaningful and relevant genes for a classifier not only decreases computational time and cost, but also improves classification performance. However, most existing feature selection methods suffer from problems such as lack of robustness and validation issues. Here, we present a new feature selection technique that takes advantage of clustering both samples and genes. Materials and methods: We used a leukemia gene expression dataset. The effectiveness of the selected features was evaluated by four different classification methods: support vector machines, k-nearest neighbor, random forest, and linear discriminant analysis. The method evaluates the importance and relevance of each gene cluster by summing the expression levels of the genes belonging to that cluster. A gene cluster is considered important if it satisfies conditions that depend on thresholds and a percentage; otherwise it is eliminated. Results: Initial analysis identified 7120 differentially expressed genes of leukemia (Fig. 15a); after applying our feature selection methodology we ended up with 1117 specific genes discriminating two classes of leukemia (Fig. 15b). Further applying the same method with more stringent higher positive and lower negative threshold conditions reduced the number to 58 genes, which were tested to evaluate the effectiveness of the method (Fig. 15c). The results of the four classification methods are summarized in Table 11. Conclusions: The feature selection method gave good results with minimum classification error. Our heat-map result shows a distinct pattern of refined genes discriminating between the two classes of leukemia.
Selecting the most suitable clustering algorithm for a given set of gene expression data is difficult because there are a huge number of algorithms and a huge number of gene expression datasets. At present many researchers prefer to use hierarchical clustering in its various forms, but this is not always optimal. Cluster ensemble research can address this problem by automatically merging multiple data partitions from a wide range of different clusterings of any dimension to improve both the quality and robustness of the clustering result. However, many existing ensemble approaches use an association matrix to condense sample-cluster and co-occurrence statistics, so relations within the ensemble are captured only at a raw level while the relations existing among clusters are completely neglected. Finding these missing associations can greatly expand the capability of those ensemble methodologies for microarray data clustering. We propose a general K-means cluster ensemble approach for clustering general categorical data into the required number of partitions.
BACKGROUND: There is increasing interest in the evolution of protein-protein interactions because this should ultimately be informative of the patterns of evolution of new protein functions within the cell. One model proposes that the evolution of new protein-protein interactions and protein complexes proceeds through the duplication of self-interacting genes. This model is supported by data from yeast. We examined the relationship between gene duplication and self-interaction in the human genome. RESULTS: We investigated the patterns of self-interaction and duplication among 34808 interactions encoded by 8881 human genes, and show that self-interacting proteins are encoded by genes with higher duplicability than genes whose proteins lack this type of interaction. We show that this result is robust against the system used to define duplicate genes. Finally we compared the presence of self-interactions amongst proteins whose genes have duplicated either through whole-genome duplication (WGD) or small-scale duplication (SSD), and show that the former tend to have more interactions in general. After controlling for age differences between the two sets of duplicates this result can be explained by the time since the gene duplication. CONCLUSIONS: Genes encoding self-interacting proteins tend to have higher duplicability than proteins lacking self-interactions. Moreover these duplicate genes have more often arisen through whole-genome rather than small-scale duplication. Finally, self-interacting WGD genes tend to have more interaction partners in general in the PIN, which can be explained by their overall greater age. This work adds to our growing knowledge of the importance of contextual factors in gene duplicability.
Thompson, L.H.; Weber, C.A.; Jones, N.J.
Several genes involved in mammalian DNA repair pathways were identified by complementation analysis and chromosomal mapping based on hybrid cells. Eight complementation groups of rodent mutants defective in the repair of UV radiation damage are now identified. At least seven of these genes are probably essential for repair and at least six of them control the incision step. The many genes required for repair of DNA cross-linking damage show overlap with those involved in the repair of UV damage, but some of these genes appear to be unique for cross-link repair. Two genes residing on human chromosome 19 were cloned from genomic transformants using a cosmid vector, and near full-length cDNA clones of each gene were isolated and sequenced. Gene ERCC2 efficiently corrects the defect in CHO UV5, a nucleotide excision repair mutant. Gene XRCC1 normalizes repair of strand breaks and the excessive sister chromatid exchange in CHO mutant EM9. ERCC2 shows a remarkable overall homology of approximately 52% at both the amino acid and nucleotide levels with the yeast RAD3 gene. Evidence based on mutation induction frequencies suggests that ERCC2, like RAD3, might also be an essential gene for viability.
Sassa, Akira; Kamoshita, Nagisa; Kanemaru, Yuki; Honma, Masamitsu; Yasui, Manabu
Clustered DNA damage is defined as multiple sites of DNA damage within one or two helical turns of the duplex DNA. This complex damage is often formed by exposure of the genome to ionizing radiation and is difficult to repair. The mutagenic potential and repair mechanisms of clustered DNA damage in human cells remain to be elucidated. In this study, we investigated the involvement of nucleotide excision repair (NER) in clustered oxidative DNA adducts. To identify the in vivo protective roles of NER, we established a human cell line lacking the NER gene xeroderma pigmentosum group A (XPA). XPA knockout (KO) cells were generated from TSCER122 cells derived from the human lymphoblastoid TK6 cell line. To analyze the mutagenic events in DNA adducts in vivo, we previously employed a system of tracing DNA adducts in the targeted mutagenesis (TATAM), in which DNA adducts were site-specifically introduced into intron 4 of thymidine kinase genes. Using the TATAM system, one or two tandem 7,8-dihydro-8-oxoguanine (8-oxoG) adducts were introduced into the genomes of TSCER122 or XPA KO cells. In XPA KO cells, the proportion of mutants induced by a single 8-oxoG (7.6%) was comparable with that in TSCER122 cells (8.1%). In contrast, the lack of XPA significantly enhanced the mutant proportion of tandem 8-oxoG in the transcribed strand (12%) compared with that in TSCER122 cells (7.4%) but not in the non-transcribed strand (12% and 11% in XPA KO and TSCER122 cells, respectively). By sequencing the tandem 8-oxoG-integrated loci in the transcribed strand, we found that the proportion of tandem mutations was markedly increased in XPA KO cells. These results indicate that NER is involved in repairing clustered DNA adducts in the transcribed strand in vivo. PMID:26559182
Dyslexia is a heritable neurodevelopmental disorder characterized by difficulties in reading and writing. In this study, we describe the identification of a set of 17 polymorphisms located across a 1.9 Mb region on chromosome 5q31.3, encompassing genes of the PCDHG cluster, TAF7, PCDH1 and ARHGAP26, dominantly inherited with dyslexia in a multi-incident family. Strikingly, the non-risk forms of seven variations of the PCDHG cluster are preponderant in the human lineage, while the risk alleles are ancestral and conserved across Neanderthals to non-human primates. Four of these seven ancestral variations (c.460A > C [p.Ile154Leu], c.541G > A [p.Ala181Thr], c.2036G > C [p.Arg679Pro] and c.2059A > G [p.Lys687Glu]) result in amino acid alterations. p.Ile154Leu and p.Ala181Thr are present at the EC2:EC3 interacting interface of γA3-PCDH and γA4-PCDH, respectively, and might affect trans-homophilic interaction and hence neuronal connectivity. p.Arg679Pro and p.Lys687Glu are present within the linker region connecting the trans-membrane domain to the extracellular domain. Sequence analysis indicated the importance of p.Ile154, p.Arg679 and p.Lys687 in maintaining class specificity. Thus the observed association of PCDHG genes encoding neural adhesion proteins reinforces the hypothesis of aberrant neuronal connectivity in the pathophysiology of dyslexia. Additionally, the striking conservation of the identified variants indicates a role of PCDHG in the evolution of highly specialized cognitive skills critical to reading.
Glenn, Anthony E.; Davis, C. Britton; Gao, Minglu; Gold, Scott E.; Mitchell, Trevor R.; Proctor, Robert H.; Stewart, Jane E.; Snook, Maurice E.
Microbes encounter a broad spectrum of antimicrobial compounds in their environments and often possess metabolic strategies to detoxify such xenobiotics. We have previously shown that Fusarium verticillioides, a fungal pathogen of maize known for its production of fumonisin mycotoxins, possesses two unlinked loci, FDB1 and FDB2, necessary for detoxification of antimicrobial compounds produced by maize, including the γ-lactam 2-benzoxazolinone (BOA). In support of these earlier studies, microarray analysis of F. verticillioides exposed to BOA identified the induction of multiple genes at FDB1 and FDB2, indicating the loci consist of gene clusters. One of the FDB1 cluster genes encoded a protein having domain homology to the metallo-β-lactamase (MBL) superfamily. Deletion of this gene (MBL1) rendered F. verticillioides incapable of metabolizing BOA and thus unable to grow on BOA-amended media. Deletion of other FDB1 cluster genes, in particular AMD1 and DLH1, did not affect BOA degradation. Phylogenetic analyses and topology testing of the FDB1 and FDB2 cluster genes suggested two horizontal transfer events among fungi, one being transfer of FDB1 from Fusarium to Colletotrichum, and the second being transfer of the FDB2 cluster from Fusarium to Aspergillus. Together, the results suggest that plant-derived xenobiotics have exerted evolutionary pressure on these fungi, leading to horizontal transfer of genes that enhance fitness or virulence. PMID:26808652
Wu, Joseph C.; Yla-Herttuala, Seppo
This review discusses the basics of cardiovascular gene therapy, the results of recent human clinical trials, and the rapid progress in imaging techniques in cardiology. Improved understanding of the molecular and genetic basis of coronary heart disease has made gene therapy a potential new alternative for the treatment of cardiovascular diseases. Experimental studies have established the proof-of-principle that gene transfer to the cardiovascular system can achieve therapeutic effects. First human clinical trials provided initial evidence of feasibility and safety of cardiovascular gene therapy. However, phase II/III clinical trials have so far been rather disappointing and one of the major problems in cardiovascular gene therapy has been the inability to verify gene expression in the target tissue. New imaging techniques could significantly contribute to the development of better gene therapeutic approaches. Although the exact choice of imaging modality will depend on the biological question asked, further improvement in image resolution and detection sensitivity will be needed for all modalities as we move from imaging of organs and tissues to imaging of cells and genes.
Bushley, Kathryn E.; Raja, Rajani; Jaiswal, Pankaj; Cumbie, Jason S.; Nonogaki, Mariko; Boyd, Alexander E.; Owensby, C. Alisha; Knaus, Brian J.; Elser, Justin; Miller, Daniel; Di, Yanming; McPhail, Kerry L.; Spatafora, Joseph W.
The ascomycete fungus Tolypocladium inflatum, a pathogen of beetle larvae, is best known as the producer of the immunosuppressant drug cyclosporin. The draft genome of T. inflatum strain NRRL 8044 (ATCC 34921), the isolate from which cyclosporin was first isolated, is presented along with comparative analyses of the biosynthesis of cyclosporin and other secondary metabolites in T. inflatum and related taxa. Phylogenomic analyses reveal previously undetected and complex patterns of homology between the nonribosomal peptide synthetase (NRPS) that encodes for cyclosporin synthetase (simA) and those of other secondary metabolites with activities against insects (e.g., beauvericin, destruxins, etc.), and demonstrate the roles of module duplication and gene fusion in diversification of NRPSs. The secondary metabolite gene cluster responsible for cyclosporin biosynthesis is described. In addition to genes necessary for cyclosporin biosynthesis, it harbors a gene for a cyclophilin, which is a member of a family of immunophilins known to bind cyclosporin. Comparative analyses support a lineage specific origin of the cyclosporin gene cluster rather than horizontal gene transfer from bacteria or other fungi. RNA-Seq transcriptome analyses in a cyclosporin-inducing medium delineate the boundaries of the cyclosporin cluster and reveal high levels of expression of the gene cluster cyclophilin. In medium containing insect hemolymph, weaker but significant upregulation of several genes within the cyclosporin cluster, including the highly expressed cyclophilin gene, was observed. T. inflatum also represents the first reference draft genome of Ophiocordycipitaceae, a third family of insect pathogenic fungi within the fungal order Hypocreales, and supports parallel and qualitatively distinct radiations of insect pathogens. The T. inflatum genome provides additional insight into the evolution and biosynthesis of cyclosporin and lays a foundation for further investigations of the role
Apolipoprotein A1 (APOA1) is the major protein component of high-density lipoprotein (HDL) in plasma. We have identified an endogenously expressed long noncoding natural antisense transcript, APOA1-AS, which acts as a negative transcriptional regulator of APOA1 both in vitro and in vivo. Inhibition of APOA1-AS in cultured cells resulted in the increased expression of APOA1 and two neighboring genes in the APO cluster. Chromatin immunoprecipitation (ChIP) analyses of a ∼50 kb chromatin region flanking the APOA1 gene demonstrated that APOA1-AS can modulate distinct histone methylation patterns that mark active and/or inactive gene expression through the recruitment of histone-modifying enzymes. Targeting APOA1-AS with short antisense oligonucleotides also enhanced APOA1 expression in both human and monkey liver cells and induced an increase in hepatic RNA and protein expression in African green monkeys. Furthermore, the results presented here highlight the significant local modulatory effects of long noncoding antisense RNAs and demonstrate the therapeutic potential of manipulating the expression of these transcripts both in vitro and in vivo.
Wada, Masayoshi; Takahashi, Hiroki; Altaf-Ul-Amin, Md; Nakamura, Kensuke; Hirai, Masami Y; Ohta, Daisaku; Kanaya, Shigehiko
Operon-like arrangements of genes occur in eukaryotes ranging from yeasts and filamentous fungi to nematodes, plants, and mammals. In plants, several examples of operon-like gene clusters involved in metabolic pathways have recently been characterized, e.g. the cyclic hydroxamic acid pathways in maize, the avenacin biosynthesis gene clusters in oat, the thalianol pathway in Arabidopsis thaliana, and the diterpenoid momilactone cluster in rice. Such operon-like gene clusters are defined by their co-regulation or by neighboring positions in the immediate vicinity of one another on the chromosome. A comprehensive analysis of the expression of neighboring genes is therefore a crucial step in revealing the complete set of operon-like gene clusters within a genome. Genome-wide prediction of operon-like gene clusters should contribute to functional annotation efforts and also provide novel insight into the evolutionary aspects of acquiring certain biological functions. We predicted co-expressed gene clusters by comparing the Pearson correlation coefficients of neighboring genes with those of randomly selected gene pairs, based on a statistical method that takes the false discovery rate (FDR) into consideration, for 1469 microarray gene expression datasets of A. thaliana. We estimated that A. thaliana contains 100 operon-like gene clusters in total. We predicted 34 statistically significant gene clusters consisting of 3 to 22 genes each, based on a stringent FDR threshold of 0.1. Functional relationships among genes in individual clusters were estimated by sequence similarity and functional annotation of genes. Duplicated gene pairs (determined based on BLAST with a cutoff of E [...] Operon-like clusters tend to include genes encoding bio-machinery associated with ribosomes, the ubiquitin/proteasome system, secondary metabolic pathways, lipid and fatty-acid metabolism, and the lipid transfer system. Copyright © 2012 Elsevier B.V. All rights reserved.
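The comparison of neighboring-gene correlations with random gene pairs can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say which FDR procedure was used, so the Benjamini-Hochberg step below is a swapped-in standard technique, and the function names, the number of random pairs and the empirical p-value formula are all hypothetical. Expression profiles are assumed to be non-constant, since a constant profile has zero variance and no defined correlation.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def neighbor_p_values(expr, n_random=1000, seed=0):
    """Empirical p-value for each adjacent gene pair.

    expr: list of expression profiles ordered by chromosomal position.
    The null distribution is built from randomly drawn gene pairs.
    """
    rng = random.Random(seed)
    n = len(expr)
    null = []
    for _ in range(n_random):
        i, j = rng.sample(range(n), 2)
        null.append(pearson(expr[i], expr[j]))
    p_values = []
    for i in range(n - 1):
        r = pearson(expr[i], expr[i + 1])
        exceed = sum(1 for v in null if v >= r)
        p_values.append((exceed + 1) / (n_random + 1))
    return p_values


def benjamini_hochberg(p_values, fdr=0.1):
    """Indices of tests declared significant at the given FDR (assumed procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    cutoff = 0
    for rank, k in enumerate(order, start=1):
        if p_values[k] <= fdr * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])
```

Runs of consecutive significant pairs would then be merged into candidate clusters; how the original study did this is not stated in the abstract.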
MacGregor, James N.
Most models of human performance on the traveling salesperson problem involve clustering of nodes, but few empirical studies have examined effects of clustering in the stimulus array. A recent exception varied degree of clustering and concluded that the more clustered a stimulus array, the easier a TSP is to solve (Dry, Preiss, & Wagemans,…
In accordance with the concept of the book and the assigned scope of the contribution, this chapter describes the European law with respect to the patent-eligibility of isolated DNA sequences. This chapter will further include a brief comparison with recent developments from the US and Australia....... It will, however, not focus on the important debates regarding the patent-eligibility of other biological material, diagnostic methods patents (as data aggregators) or abstract ideas which will be addressed by other contributions. Moreover, the analysis will merely concentrate on patent-eligibility. Other...... patentability requirement will only be briefly touched upon in the discussion part. The paper starts out in section 1.5.2 by discussing the patent-eligibility of isolated human DNA sequences on the European national level and under the Biotechnology Directive. Then the patent-eligibility of isolated human DNA...
An unsupervised data clustering method, called the local maximum clustering (LMC) method, is proposed for identifying clusters in experimental data sets based on research interest. A magnitude property is defined according to research purposes, and data sets are clustered around each local maximum of the magnitude property. By properly defining a magnitude property, this method can overcome many difficulties in microarray data clustering such as reduced projection in similarities, noise, and arbitrary gene distribution. To critically evaluate the performance of this clustering method in comparison with other methods, we designed three model data sets with known cluster distributions and applied the LMC method as well as the hierarchical clustering method, the k-means clustering method, and the self-organized map method to these model data sets. The results show that the LMC method produces the most accurate clustering results. As an example of application, we applied the method to cluster the leukemia samples reported in the microarray study of Golub et al. (1999).
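The idea of clustering around local maxima of a magnitude property can be sketched as a simple hill-climbing assignment. The abstract does not give the algorithm's details, so everything below is an assumption made for illustration: the Euclidean neighborhood of fixed `radius`, the rule that each point moves to the highest-magnitude point in its neighborhood, and the function name `local_maximum_clusters`.

```python
def local_maximum_clusters(points, magnitude, radius):
    """Label each point with the index of the local maximum it climbs to.

    points:    list of coordinate tuples
    magnitude: list of magnitude values, one per point
    radius:    neighborhood size (an illustrative assumption)
    """
    n = len(points)

    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

    # uphill[i] is the highest-magnitude point within `radius` of point i.
    # It equals i when point i is itself a local maximum.
    uphill = []
    for i in range(n):
        best = i
        for j in range(n):
            if dist(points[i], points[j]) <= radius and magnitude[j] > magnitude[best]:
                best = j
        uphill.append(best)

    # Follow the uphill links; magnitude strictly increases, so this terminates.
    labels = []
    for i in range(n):
        peak = i
        while uphill[peak] != peak:
            peak = uphill[peak]
        labels.append(peak)
    return labels


# Five one-dimensional points with made-up magnitudes.
points = [(0,), (1,), (2,), (10,), (11,)]
magnitude = [1, 3, 2, 5, 4]
labels = local_maximum_clusters(points, magnitude, radius=1.5)
# labels == [1, 1, 1, 3, 3]: two clusters, centered on points 1 and 3
```

The number of clusters is not fixed in advance here; it follows from how many local maxima the chosen magnitude property has at the chosen neighborhood size.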
Ehrlich, Kenneth C.; Mack, Brian M.
Fifty six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help ...
Sura Zaki Alrashid; Muhammad Arifur Rahman; Nabeel H Al-Aaraji; Neil D Lawrence; Paul R Heath
Clustering of gene expression time series gives insight into which genes may be co-regulated, allowing us to discern the activity of pathways in a given microarray experiment. Of particular interest is how a given group of genes varies with different conditions or genetic background. This paper develops a new clustering method that allows each cluster to be parameterised according to whether the behaviour of the genes across conditions is correlated or anti-correlated. By specifying correlati...
BACKGROUND: The identification and study of proteins from metagenomic datasets can shed light on the roles and interactions of the source organisms in their communities. However, metagenomic datasets are characterized by the presence of organisms with varying GC composition, codon usage biases, etc., and consequently gene identification is challenging. The vast amount of sequence data also requires faster protein family classification tools. RESULTS: We present a computational improvement to a sequence clustering approach that we developed previously to identify and classify protein coding genes in large microbial metagenomic datasets. The clustering approach can be used to identify protein coding genes in prokaryotes, viruses, and intron-less eukaryotes. The computational improvement is based on an incremental clustering method that does not require the expensive all-against-all computation that was required by the original approach, while still preserving the remote homology detection capabilities. We present evaluations of the clustering approach in protein-coding gene identification and classification, and also present the results of updating the protein clusters from our previous work with recent genomic and metagenomic sequences. The clustering results are available via CAMERA (http://camera.calit2.net). CONCLUSION: The clustering paradigm is shown to be a very useful tool in the analysis of microbial metagenomic data. The incremental clustering method is shown to be much faster than the original approach in identifying genes, grouping sequences into existing protein families, and also identifying novel families that have multiple members in a metagenomic dataset. These clusters provide a basis for further studies of protein families.
Tessema, Sofonias K; Monk, Stephanie L; Schultz, Mark B; Tavul, Livingstone; Reeder, John C; Siba, Peter M; Mueller, Ivo; Barry, Alyssa E
Plasmodium falciparum malaria is a major global health problem that is being targeted for progressive elimination. Knowledge of local disease transmission patterns in endemic countries is critical to these elimination efforts. To investigate fine-scale patterns of malaria transmission, we have compared repertoires of rapidly evolving var genes in a highly endemic area. A total of 3680 high-quality DBLα-sequences were obtained from 68 P. falciparum isolates from ten villages spread over two distinct catchment areas on the north coast of Papua New Guinea (PNG). Modelling of the extent of var gene diversity in the two parasite populations predicts more than twice as many var gene alleles circulating within each catchment (Mugil = 906; Wosera = 1094) than previously recognized in PNG (Amele = 369). In addition, there were limited levels of var gene sharing between populations, consistent with local parasite population structure. Phylogeographic analyses demonstrate that while neutrally evolving microsatellite markers identified population structure only at the catchment level, var gene repertoires reveal further fine-scale geospatial clustering of parasite isolates. The clustering of parasite isolates by village in Mugil, but not in Wosera was consistent with the physical and cultural isolation of the human populations in the two catchments. The study highlights the microheterogeneity of P. falciparum transmission in highly endemic areas and demonstrates the potential of var genes as markers of local patterns of parasite population structure. © 2014 John Wiley & Sons Ltd.
The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organization, transcription, various post-transcriptional processes, and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighboring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronized cascade of gene expression in functionally linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular organisms too.
Pan, Hai-Yan; Zhu, Jun; Han, Dan-Fu
Microarray analysis has become a popular technology in biological and medical research. However, systematic and stochastic variability in microarray data is expected and unavoidable, with the result that the raw measurements from microarray experiments carry inherent "noise". Currently, logarithmic ratios are usually analyzed directly by various clustering methods, which may lead to biased interpretation when identifying groups of genes or samples. In this paper, a statistical method based on mixed model approaches is proposed for microarray data cluster analysis. The underlying rationale of this method is to partition the observed total gene expression level into variations caused by different factors using an ANOVA model, and to predict the differential effects of GV (gene by variety) interaction using the adjusted unbiased prediction (AUP) method. The predicted GV interaction effects can then be used as the inputs of cluster analysis. We illustrated the application of our method with a gene expression dataset and elucidated the utility of our approach using an external validation.
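The rationale of separating gene, variety and interaction effects before clustering can be shown with a deliberately simplified sketch. It is not the mixed-model AUP method of the abstract: it assumes a balanced gene-by-variety table with one value per cell and uses the plain fixed-effects decomposition, in which the interaction term is what remains after subtracting the gene mean and the variety mean and adding back the grand mean. The function name `interaction_effects` and the data are hypothetical.

```python
def interaction_effects(y):
    """Gene-by-variety interaction residuals of a balanced two-way table.

    y[g][v] is the (log) expression of gene g in variety v.
    Returns gv[g][v] = y[g][v] - gene_mean[g] - variety_mean[v] + grand_mean.
    """
    n_genes, n_var = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (n_genes * n_var)
    gene_mean = [sum(row) / n_var for row in y]
    var_mean = [sum(y[g][v] for g in range(n_genes)) / n_genes for v in range(n_var)]
    return [
        [y[g][v] - gene_mean[g] - var_mean[v] + grand for v in range(n_var)]
        for g in range(n_genes)
    ]


# Made-up table: two genes measured in two varieties.
y = [[1.0, 2.0], [3.0, 6.0]]
gv = interaction_effects(y)
# gv == [[0.5, -0.5], [-0.5, 0.5]]; each row and each column sums to zero
```

The rows of the resulting table, rather than the raw log ratios, would then be passed to a clustering routine, so that genes are grouped by how their response differs across varieties and not by their overall expression level.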
Rahman, Muhammad Arifur; Heath, Paul R.; Lawrence, Neil D.
Clustering of gene expression time series gives insight into which genes may be coregulated, allowing us to discern the activity of pathways in a given microarray experiment. Of particular interest is how a given group of genes varies with different model conditions or genetic background. Amyotrophic lateral sclerosis (ALS), an irreversible diverse neurodegenerative disorder showed consistent phenotypic differences and the disease progression is heterogeneous with significant variability. Thi...
Powdery mildew caused by (DC. f. sp. ( is a globally devastating foliar disease of wheat ( L.. More than a dozen genes against this disease, identified from wheat germplasms of different ploidy levels, have been mapped to the region surrounding the locus on the long arm of chromosome 7A, which forms a resistance (-gene cluster. and from einkorn wheat ( L. were two of the genes belonging to this cluster. This study was initiated to fine map these two genes toward map-based cloning. Comparative genomics study showed that macrocolinearity exists between L. chromosome 1 (Bd1 and the – region, which allowed us to develop markers based on the wheat sequences orthologous to genes contained in the Bd1 region. With these and other newly developed and published markers, high-resolution maps were constructed for both and using large F populations. Moreover, a physical map of was constructed through chromosome walking with bacterial artificial chromosome (BAC) clones and comparative mapping. Eventually, and were restricted to a 0.12- and 0.86-cM interval, respectively. Based on the closely linked common markers, , , and (another powdery mildew resistance gene in the cluster were not allelic to one another. Severe recombination suppression and disruption of synteny were noted in the region encompassing . These results provided useful information for map-based cloning of the genes in the cluster and interpretation of their evolution.
Roberts, S Craig; Little, Anthony C
The past decade has witnessed a rapidly growing interest in the biological basis of human mate choice. Here we review recent studies that demonstrate preferences for traits which might reveal genetic quality to prospective mates, with potential but still largely unknown influence on offspring fitness. These include studies assessing visual, olfactory and auditory preferences for potential good-gene indicator traits, such as dominance or bilateral symmetry. Individual differences in these robust preferences mainly arise through within and between individual variation in condition and reproductive status. Another set of studies have revealed preferences for traits indicating complementary genes, focussing on discrimination of dissimilarity at genes in the major histocompatibility complex (MHC). As in animal studies, we are only just beginning to understand how preferences for specific traits vary and inter-relate, how consideration of good and compatible genes can lead to substantial variability in individual mate choice decisions and how preferences expressed in one sensory modality may reflect those in another. Humans may be an ideal model species in which to explore these interesting complexities.
BACKGROUND: The superfamily of serine proteinase inhibitors (serpins) is involved in numerous fundamental biological processes such as inflammation, blood coagulation and apoptosis. Our interest is focused on the SERPINA3 sub-family. The major human plasma protease inhibitor, α1-antichymotrypsin, encoded by the SERPINA3 gene, is homologous to genes organized in clusters in several mammalian species. However, although there is a similar genic organization with a high degree of sequence conservation, the reactive-centre-loop domains, which are responsible for the protease specificity, show significant divergences. RESULTS: We provide additional information by analyzing the situation of SERPINA3 in the bovine genome. A cluster of eight genes and one pseudogene sharing a high degree of identity and the same structural organization was characterized. Bovine SERPINA3 genes were localized by radiation hybrid mapping on 21q24 and spanned only 235 kilobases. For all these genes, we propose a new nomenclature from SERPINA3-1 to SERPINA3-8. They share approximately 70% identity with the human SERPINA3 homologue. In the cluster, we described an original sub-group of six members with an unexpectedly high degree of conservation for the reactive-centre-loop domain, suggesting a similar peptidase inhibitory pattern. Preliminary expression analyses of these bovSERPINA3s showed different tissue-specific patterns and diverse states of glycosylation and phosphorylation. Finally, in the context of phylogenetic analyses, we improved our knowledge of mammalian SERPINA evolution. CONCLUSION: Our experimental results update data from the bovine genome sequencing, substantially increase the bovSERPINA3 sub-family and enrich the phylogenetic tree of serpins. We provide new opportunities for future investigations of the biological functions of this unusual subset of serine proteinase inhibitors.
Stach, Christopher S; Vu, Bao G; Merriman, Joseph A; Herrera, Alfa; Cahill, Michael P; Schlievert, Patrick M; Salgado-Pabón, Wilmara
Superantigens are indispensable virulence factors for Staphylococcus aureus in disease causation. Superantigens stimulate massive immune cell activation, leading to toxic shock syndrome (TSS) and contributing to other illnesses. However, superantigens differ in their capacities to induce body-wide effects. For many, their production, at least as tested in vitro, is not high enough to reach the circulation, or the proteins are not efficient in crossing epithelial and endothelial barriers, thus remaining within tissues or localized on mucosal surfaces where they exert only local effects. In this study, we address the role of TSS toxin-1 (TSST-1) and most importantly the enterotoxin gene cluster (egc) in infective endocarditis and sepsis, gaining insights into the body-wide versus local effects of superantigens. We examined S. aureus TSST-1 gene (tstH) and egc deletion strains in the rabbit model of infective endocarditis and sepsis. Importantly, we also assessed the ability of commercial human intravenous immunoglobulin (IVIG) plus vancomycin to alter the course of infective endocarditis and sepsis. TSST-1 contributed to infective endocarditis vegetations and lethal sepsis, while superantigens of the egc, a cluster with uncharacterized functions in S. aureus infections, promoted vegetation formation in infective endocarditis. IVIG plus vancomycin prevented lethality and stroke development in infective endocarditis and sepsis. Our studies support the local tissue effects of egc superantigens for establishment and progression of infective endocarditis providing evidence for their role in life-threatening illnesses. In contrast, TSST-1 contributes to both infective endocarditis and lethal sepsis. IVIG may be a useful adjunct therapy for infective endocarditis and sepsis.
Silaghi Gheorghe Cosmin
Previously we employed the Gene Trajectory Clustering methodology to search for different associations of the stocks composing the DJA index, with the aim of finding different, logical clusters, supported by economic reasons, preferably different than the
Abstract Background The human 6–16 and ISG12 genes are transcriptionally upregulated in a variety of cell types in response to type I interferon (IFN). The predicted products of these genes are small (12.9 and 11.5 kDa respectively), hydrophobic proteins that share 36% overall amino acid identity. Gene disruption and over-expression studies have so far failed to reveal any biochemical or cellular roles for these proteins. Results We have used in silico analyses to identify a novel family of genes (the ISG12 gene family) related to both the human 6–16 and ISG12 genes. Each ISG12 family member codes for a small hydrophobic protein containing a conserved ~80 amino-acid motif (the ISG12 motif). So far we have detected 46 family members in 25 organisms, ranging from unicellular eukaryotes to humans. Humans have four ISG12 genes: the 6–16 gene at chromosome 1p35 and three genes (ISG12(a), ISG12(b) and ISG12(c)) clustered at chromosome 14q32. Mice have three family members (ISG12(a), ISG12(b1) and ISG12(b2)) clustered at chromosome 12F1 (syntenic with human chromosome 14q32). There does not appear to be a murine 6–16 gene. On the basis of phylogenetic analyses, genomic organisation and intron-alignments we suggest that this family has arisen through divergent inter- and intra-chromosomal gene duplication events. The transcripts from human and mouse genes are detectable, all but two (human ISG12(b) and ISG12(c)) being upregulated in response to type I IFN in the cell lines tested. Conclusions Members of the eukaryotic ISG12 gene family encode a small hydrophobic protein with at least one copy of a newly defined motif of ~80 amino-acids (the ISG12 motif). In higher eukaryotes, many of the genes have acquired a responsiveness to type I IFN during evolution suggesting that a role in resisting cellular or environmental stress may be a unifying property of all family members. Analysis of gene-function in higher eukaryotes is complicated by the possibility of
Shen, K A; Meyers, B C; Islam-Faridi, M N; Chin, D B; Stelly, D M; Michelmore, R W
The recent cloning of genes for resistance against diverse pathogens from a variety of plants has revealed that many share conserved sequence motifs. This provides the possibility of isolating numerous additional resistance genes by polymerase chain reaction (PCR) with degenerate oligonucleotide primers. We amplified resistance gene candidates (RGCs) from lettuce with multiple combinations of primers with low degeneracy designed from motifs in the nucleotide binding sites (NBSs) of RPS2 of Arabidopsis thaliana and N of tobacco. Genomic DNA, cDNA, and bacterial artificial chromosome (BAC) clones were successfully used as templates. Four families of sequences were identified that had the same similarity to each other as to resistance genes from other species. The relationship of the amplified products to resistance genes was evaluated by several sequence and genetic criteria. The amplified products contained open reading frames with additional sequences characteristic of NBSs. Hybridization of RGCs to genomic DNA and to BAC clones revealed large numbers of related sequences. Genetic analysis demonstrated the existence of clustered multigene families for each of the four RGC sequences. This parallels classical genetic data on clustering of disease resistance genes. Two of the four families mapped to known clusters of resistance genes; these two families were therefore studied in greater detail. Additional evidence that these RGCs could be resistance genes was gained by the identification of leucine-rich repeat (LRR) regions in sequences adjoining the NBS similar to those in RPM1 and RPS2 of A. thaliana. Fluorescent in situ hybridization confirmed the clustered genomic distribution of these sequences. The use of PCR with degenerate oligonucleotide primers is therefore an efficient method to identify numerous RGCs in plants.
McManus Donald P
Abstract Background The schistosome blood flukes are complex trematodes and cause schistosomiasis, a chronic parasitic disease of significant public health importance worldwide. Their life cycle is characterised by distinct parasitic and free-living phases involving mammalian and snail hosts and freshwater. Microarray analysis was used to profile developmental gene expression in the Asian species, Schistosoma japonicum. Total RNAs were isolated from the three distinct environmental phases of the lifecycle: aquatic/snail (eggs, miracidia, sporocysts, cercariae), juvenile (lung schistosomula and paired but pre-egg laying adults) and adult (paired, mature males and egg-producing females, both examined separately). Advanced analyses including ANOVA, principal component analysis, and hierarchical clustering provided a global synopsis of gene expression relationships among the different developmental stages of the schistosome parasite. Results Gene expression profiles were linked to the major environmental settings to which the developmental stages of the fluke have to adapt during the course of its life cycle. Gene ontologies of the differentially expressed genes revealed a wide range of functions and processes. In addition, stage-specific, differentially expressed genes were identified that were involved in numerous biological pathways and functions including calcium signalling, sphingolipid metabolism and parasite defence. Conclusion The findings provide a comprehensive database of gene expression in an important human pathogen, including transcriptional changes in genes involved in evasion of the host immune response, nutrient acquisition, energy production, calcium signalling, sphingolipid metabolism, egg production and tegumental function during development. This resource should help facilitate the identification and prioritization of new anti-schistosome drug and vaccine targets for the control of schistosomiasis.
Lin, S.D.; Cooper, P.; Fung, J.; Weier, H.U.G.; Rubin, E.M.
Genetic factors affecting post-natal γ-globin expression, a major modifier of the severity of both β-thalassemia and sickle cell anemia, have been difficult to study. This is especially so in mice, an organism lacking a globin gene with an expression pattern equivalent to that of human γ-globin. To model the human β-cluster in mice, with the goal of screening for loci affecting human γ-globin expression in vivo, we introduced a human β-globin cluster YAC transgene into the genome of FVB mice. The β-cluster contained a Greek hereditary persistence of fetal hemoglobin (HPFH) γ allele resulting in postnatal expression of human γ-globin in transgenic mice. The level of human γ-globin for various F1 hybrids derived from crosses between the FVB transgenics and other inbred mouse strains was assessed. The γ-globin level of the C3HeB/FVB transgenic mice was noted to be significantly elevated. To map genes affecting postnatal γ-globin expression, a 20 centiMorgan (cM) genome scan of a C3HeB/FVB transgenics × FVB backcross was performed, followed by high-resolution marker analysis of promising loci. From this analysis we mapped a locus within a 2.2 cM interval of mouse chromosome 1 at a LOD score of 4.2 that contributes 10.4% of variation in γ-globin expression level. Combining transgenic modeling of the human β-globin gene cluster with quantitative trait analysis, we have identified and mapped a murine locus that impacts on human γ-globin expression in vivo.
Diana X Zhou
Alcohol consumption affects human health in part by compromising the immune system. In this study, we examined the expression of the Cd14 (cluster of differentiation 14) gene, which is involved in the immune system through a proinflammatory cascade. Expression was evaluated in BXD mice treated with saline or acute 1.8 g/kg i.p. ethanol (12.5% v/v). Hippocampal gene expression data were generated to examine differential expression and to perform systems genetics analyses. The Cd14 gene expression showed significant changes among the BXD strains after ethanol treatment, and eQTL mapping revealed that Cd14 is a cis-regulated gene. We also identified eighteen ethanol-related phenotypes correlated with Cd14 expression related to either ethanol responses or ethanol consumption. Pathway analysis was performed to identify possible biological pathways involved in the response to ethanol and Cd14. We also constructed a genetic network for Cd14 using the top 20 correlated genes and present several genes possibly involved in Cd14 and ethanol responses based on differential gene expression. In conclusion, we found Cd14, along with several other genes and pathways, to be involved in ethanol responses in the hippocampus, such as increased susceptibility to lipopolysaccharides and neuroinflammation.
Abstract Background The hierarchical clustering tree (HCT) with a dendrogram [1] and the singular value decomposition (SVD) with a dimension-reduced representative map [2] are popular methods for two-way sorting the gene-by-array matrix map employed in gene expression profiling. While HCT dendrograms tend to optimize local coherent clustering patterns, SVD leading eigenvectors usually identify better global grouping and transitional structures. Results This study proposes a flipping mechanism for a conventional agglomerative HCT using a rank-two ellipse (R2E), an improved SVD algorithm for sorting purpose seriation by Chen [3], as an external reference. While HCTs always produce permutations with good local behaviour, the rank-two ellipse seriation gives the best global grouping patterns and smooth transitional trends. The resulting algorithm automatically integrates the desirable properties of each method so that users have access to a clustering and visualization environment for gene expression profiles that preserves coherent local clusters and identifies global grouping trends. Conclusion We demonstrate, through four examples, that the proposed method not only possesses better numerical and statistical properties, it also provides more meaningful biomedical insights than other sorting algorithms. We suggest that sorted proximity matrices for genes and arrays, in addition to the gene-by-array expression matrix, can greatly aid in the search for comprehensive understanding of gene expression structures. Software for the proposed methods can be obtained at http://gap.stat.sinica.edu.tw/Software/GAP.
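The idea that a leading singular vector captures a global ordering of rows can be illustrated with a minimal sketch. This is a simplified rank-one seriation computed by power iteration, not the rank-two ellipse (R2E) method or the flipping mechanism described in the abstract; the function names `leading_left_singular_vector` and `seriate_rows` are hypothetical.

```python
import math

def leading_left_singular_vector(a, iters=100):
    """Approximate the leading left singular vector of matrix a
    (a list of rows) by power iteration on A * A^T."""
    n_rows, n_cols = len(a), len(a[0])
    u = [1.0] * n_rows
    for _ in range(iters):
        # v = A^T u
        v = [sum(a[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        # u = A v, then normalise to unit length
        u = [sum(a[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        norm = math.sqrt(sum(x * x for x in u)) or 1.0
        u = [x / norm for x in u]
    return u

def seriate_rows(a):
    """Return row indices sorted by their coordinate on the leading
    left singular vector (a crude global ordering of genes)."""
    u = leading_left_singular_vector(a)
    return sorted(range(len(a)), key=lambda i: u[i])

# Toy gene-by-array matrix whose rows follow one shared gradient.
expression = [[3, 6, 9], [1, 2, 3], [2, 4, 6]]
row_order = seriate_rows(expression)
```

A single vector only orders rows along one dominant trend; preserving local cluster structure as well, as the abstract proposes, requires combining such an ordering with a dendrogram.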
Liu, Xiao; Shi, Jun; Wang, Congzhi
Since a key step in the analysis of gene expression data is to detect groups of genes that have similar expression patterns, clustering techniques are commonly used to analyze such data. Data representation plays an important role in clustering analysis. Non-negative matrix factorization (NMF) is a widely used data representation method with great success in machine learning. Although the traditional manifold regularization method, Laplacian regularization (LR), can improve the performance of NMF, LR still suffers from weak extrapolating power. Hessian regularization (HR) is a newly developed manifold regularization method whose natural properties give it stronger extrapolating power, especially for small-sample data. In this work, we propose the HR-based NMF (HR-NMF) algorithm, and then apply it to represent gene expression data for a subsequent clustering task. The clustering experiments are conducted on five commonly used gene datasets, and the results indicate that the proposed HR-NMF outperforms LR-based NMF and the original NMF, which suggests the potential application of HR-NMF to gene expression data.
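For orientation, the unregularized baseline that the abstract compares against can be sketched as follows. This is plain NMF with the standard multiplicative updates for the squared-error objective, not the HR-NMF or LR-based variants; the helper names (`nmf`, `frobenius_error`, `cluster_labels`) and the toy matrix are assumptions made for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (genes x samples) as w * h using
    multiplicative updates; eps guards against division by zero."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    """Frobenius norm of the reconstruction residual v - w * h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5

def cluster_labels(w):
    """Assign each gene (row of w) to its largest factor."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in w]

# Toy expression matrix: 4 genes x 4 samples with two obvious blocks.
toy = [[5, 5, 0, 0], [4, 4, 0, 0], [0, 0, 3, 3], [0, 0, 6, 6]]
w, h = nmf(toy, rank=2)
labels = cluster_labels(w)
```

Reading cluster membership off the largest entry in each row of `w` is one common convention; a regularized variant such as the one in the abstract would add a manifold penalty term to the objective and change the update rules accordingly.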
Abstract Background First identified in fruit flies with temperature-sensitive paralysis phenotypes, the Drosophila melanogaster TipE locus encodes four voltage-gated sodium (NaV) channel auxiliary subunits. This cluster of TipE-like genes on chromosome 3L, and a fifth family member on chromosome 3R, are important for the optimal expression and functionality of the Para NaV channel but appear quite distinct from auxiliary subunits in vertebrates. Here, we exploited available arthropod genomic resources to trace the origin of TipE-like genes by mapping their evolutionary histories and examining their genomic architectures. Results We identified a remarkably conserved synteny block of TipE-like orthologues with well-maintained local gene arrangements from 21 insect species. Homologues in the water flea, Daphnia pulex, suggest an ancestral pancrustacean repertoire of four TipE-like genes; a subsequent gene duplication may have generated functional redundancy allowing gene losses in the silk moth and mosquitoes. Intronic nesting of the insect TipE gene cluster probably occurred following the divergence from crustaceans, but in the flour beetle and silk moth genomes the clusters apparently escaped from nesting. Across Pancrustacea, TipE gene family members have experienced intronic nesting, escape from nesting, retrotransposition, translocation, and gene loss events while generally maintaining their local gene neighbourhoods. D. melanogaster TipE-like genes exhibit coordinated spatial and temporal regulation of expression distinct from their host gene but well-correlated with their regulatory target, the Para NaV channel, suggesting that functional constraints may preserve the TipE gene cluster. We identified homology between TipE-like NaV channel regulators and vertebrate Slo-beta auxiliary subunits of big-conductance calcium-activated potassium (BKCa) channels, which suggests that ion channel regulatory partners have evolved distinct lineage
Full Text Available Abstract Background Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Results Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. Conclusions This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper
Ryge, Jesper; Winther, Ole; Wienecke, Jacob; Sandelin, Albin; Westerdahl, Ann-Charlotte; Hultborn, Hans; Kiehn, Ole
Spinal cord injury leads to neurological dysfunctions affecting the motor, sensory as well as the autonomic systems. Increased excitability of motor neurons has been implicated in injury-induced spasticity, where the reappearance of self-sustained plateau potentials in the absence of modulatory inputs from the brain correlates with the development of spasticity. Here we examine the dynamic transcriptional response of motor neurons to spinal cord injury as it evolves over time to unravel common gene expression patterns and their underlying regulatory mechanisms. For this we use a rat-tail-model with complete spinal cord transection causing injury-induced spasticity, where gene expression profiles are obtained from labeled motor neurons extracted with laser microdissection 0, 2, 7, 21 and 60 days post injury. Consensus clustering identifies 12 gene clusters with distinct time expression profiles. Analysis of these gene clusters identifies early immunological/inflammatory and late developmental responses as well as a regulation of genes relating to neuron excitability that support the development of motor neuron hyper-excitability and the reappearance of plateau potentials in the late phase of the injury response. Transcription factor motif analysis identifies differentially expressed transcription factors involved in the regulation of each gene cluster, shaping the expression of the identified biological processes and their associated genes underlying the changes in motor neuron excitability. This analysis provides important clues to the underlying mechanisms of transcriptional regulation responsible for the increased excitability observed in motor neurons in the late chronic phase of spinal cord injury suggesting alternative targets for treatment of spinal cord injury. Several transcription factors were identified as potential regulators of gene clusters containing elements related to motor neuron hyper-excitability, the manipulation of which potentially could be
Chen, Dengkai; Ding, Jingjing; Gao, Minzhuo; Ma, Danping; Liu, Donghui
The use of form knowledge for pan-ethnic-group products depends primarily on a designer's subjective experience, without user participation. The majority of studies focus on detecting the perceptual demands of consumers for the target product category. A form gene clustering method for pan-ethnic-group products, based on emotional semantics, is constructed. Consumers' perceptual images of the pan-ethnic-group products are obtained by means of product form gene extraction and coding and computer-aided product form clustering technology. A case study of form gene clustering for typical pan-ethnic-group products is investigated, which indicates that the method is feasible. This paper opens up a new direction for the future development of product form design, which improves the agility of the product design process in the era of Industry 4.0.
Fletcher, J C; Richter, G
This paper examines some key ethical issues raised by trials of human gene therapy in the perinatal period--i.e., in infants, young children, and the human fetus. It describes five resources in ethics for researchers' considerations prior to such trials: (1) the history of ethical debate about gene therapy, (2) a literature on the relevance of major ethical principles for clinical research, (3) a body of widely accepted norms and practices, (4) knowledge of paradigm cases, and (5) researchers' own professional integrity. The paper also examines ethical concerns that must be met prior to any trial: benefits to and safety of subjects, informed assent of children and informed parental permission, informed consent of pregnant women in fetal gene therapy, protection of privacy, and concerns about fairness in the selection of subjects. The paper criticizes the position that cases of fetal gene therapy should be restricted only to those where the pregnant woman has explicitly refused abortion. Additional topics include concerns about genetic enhancement and germ-line gene therapy.
Abstract Background The definition of a distance measure plays a key role in the evaluation of different clustering solutions of gene expression profiles. In this empirical study we compare different clustering solutions when using the Mutual Information (MI) measure versus the well-known Euclidean distance and Pearson correlation coefficient. Results Relying on several public gene expression datasets, we evaluate the homogeneity and separation scores of different clustering solutions. It was found that the use of the MI measure yields a more significant differentiation among erroneous clustering solutions. The proposed measure was also used to analyze the performance of several known clustering algorithms. A comparative study of these algorithms reveals that their "best solutions" are ranked almost oppositely when using different distance measures, despite the correspondence found between these measures when analysing the averaged scores of groups of solutions. Conclusion In view of the results, further attention should be paid to the selection of a proper distance measure for analyzing the clustering of gene expression data.
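The three measures being compared can behave very differently on the same pair of profiles, which a minimal sketch makes concrete. The mutual information estimate below uses simple equal-width binning, which is an assumption of this sketch and not necessarily the estimator used in the study; the function names and the toy profiles are hypothetical.

```python
import math
from collections import Counter

def euclidean(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson(x, y):
    """Pearson correlation coefficient (undefined for constant profiles)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def discretize(x, bins):
    """Map values to equal-width bin indices 0 .. bins-1."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0
    return [min(int((v - lo) / width), bins - 1) for v in x]

def mutual_information(x, y, bins=3):
    """Histogram-based mutual information estimate, in bits."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    n = len(x)
    joint, px, py = Counter(zip(dx, dy)), Counter(dx), Counter(dy)
    return sum((c / n) * math.log2(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())

# A U-shaped dependence: linear correlation vanishes, MI does not.
gene_a = [1, 2, 3, 4, 5, 6]
gene_b = [(v - 3.5) ** 2 for v in gene_a]
scores = (euclidean(gene_a, gene_b), pearson(gene_a, gene_b),
          mutual_information(gene_a, gene_b))
```

In this toy case the Pearson coefficient is essentially zero even though one profile is a deterministic function of the other, while the binned mutual information stays clearly positive. Binned estimates are sensitive to the number of bins and to sample size, so the choice of `bins` matters in practice.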
Background Secondary metabolite production, a hallmark of filamentous fungi, is an expanding area of research for the Aspergilli. These compounds are potent chemicals, ranging from deadly toxins to therapeutic antibiotics to potential anti-cancer drugs. The genome sequences for multiple Aspergilli have been determined, and provide a wealth of predictive information about secondary metabolite production. Sequence analysis and gene overexpression strategies have enabled the discovery of novel secondary metabolites and the genes involved in their biosynthesis. The Aspergillus Genome Database (AspGD) provides a central repository for gene annotation and protein information for Aspergillus species. These annotations include Gene Ontology (GO) terms, phenotype data, gene names and descriptions and they are crucial for interpreting both small- and large-scale data and for aiding in the design of new experiments that further Aspergillus research. Results We have manually curated Biological Process GO annotations for all genes in AspGD with recorded functions in secondary metabolite production, adding new GO terms that specifically describe each secondary metabolite. We then leveraged these new annotations to predict roles in secondary metabolism for genes lacking experimental characterization. As a starting point for manually annotating Aspergillus secondary metabolite gene clusters, we used antiSMASH (antibiotics and Secondary Metabolite Analysis SHell) and SMURF (Secondary Metabolite Unknown Regions Finder) algorithms to identify potential clusters in A. nidulans, A. fumigatus, A. niger and A. oryzae, which we subsequently refined through manual curation. Conclusions This set of 266 manually curated secondary metabolite gene clusters will facilitate the investigation of novel Aspergillus secondary metabolites. PMID:23617571
gene order is nonrandomly distributed in eukaryote genomes. (Lercher et al. 2002 ... Birth in a birth-and-death process relates to the origin of paralogues, presumably ... are small, or the rate of concerted evolution is very slow (Nei et al. 2000).
Salmond, G P; Lutkenhaus, J F; Donachie, W D
We report the identification, cloning, and mapping of a new cell envelope gene, murG. This lies in a group of five genes of similar phenotype (in the order murE murF murG murC ddl) all concerned with peptidoglycan biosynthesis. This group is in a larger cluster of at least 10 genes, all of which are involved in some way with cell envelope growth. PMID:6998962
Lin, Dong; Shi, Y.; Miller, W.L.
Adrenodoxin reductase is a flavoprotein mediating electron transport to all mitochondrial forms of cytochrome P450. The authors cloned the human adrenodoxin reductase gene and characterized it by restriction endonuclease mapping and DNA sequencing. The entire gene is approximately 12 kilobases long and consists of 12 exons. The first exon encodes the first 26 of the 32 amino acids of the signal peptide, and the second exon encodes the remainder of signal peptide and the apparent FAD binding site. The remaining 10 exons are clustered in a region of only 4.3 kilobases, separated from the first two exons by a large intron of about 5.6 kilobases. Two forms of human adrenodoxin reductase mRNA, differing by the presence or absence of 18 bases in the middle of the sequence, arise from alternate splicing at the 5' end of exon 7. This alternately spliced region is directly adjacent to the NADPH binding site, which is entirely contained in exon 6. The immediate 5' flanking region lacks TATA and CAAT boxes; however, this region is rich in G+C and contains six copies of the sequence GGGCGGG, resembling promoter sequences of housekeeping genes. RNase protection experiments show that transcription is initiated from multiple sites in the 5' flanking region, located about 21-91 base pairs upstream from the AUG translational initiation codon
Nederbragt Alexander J
Abstract Background Cyanobacteria often produce several different oligopeptides, with unknown biological functions, by nonribosomal peptide synthetases (NRPS). Although some cyanobacterial NRPS gene cluster types are well described, the entire NRPS genomic content within a single cyanobacterial strain has never been investigated. Here we have combined a genome-wide analysis using massive parallel pyrosequencing ("454") and mass spectrometry screening of oligopeptides produced in the strain Planktothrix rubescens NIVA CYA 98 in order to identify all putative gene clusters for oligopeptides. Results Thirteen types of oligopeptides were uncovered by mass spectrometry (MS) analyses. Microcystin, cyanopeptolin and aeruginosin synthetases, highly similar to already characterized NRPS, were present in the genome. Two novel NRPS gene clusters were associated with production of anabaenopeptins and microginins, respectively. Sequence-depth of the genome and real-time PCR data revealed three copies of the microginin gene cluster. Since NRPS gene cluster candidates for microviridin and oscillatorin synthesis could not be found, putative (gene-encoded) precursor peptide sequences to microviridin and oscillatorin were found in the genes mdnA and oscA, respectively. The genes flanking the microviridin and oscillatorin precursor genes encode putative modifying enzymes of the precursor oligopeptides. We therefore propose ribosomal pathways involving modifications and cyclisation for microviridin and oscillatorin. The microviridin, anabaenopeptin and cyanopeptolin gene clusters are situated in close proximity to each other, constituting an oligopeptide island. Conclusion Altogether seven nonribosomal peptide synthetase (NRPS) gene clusters and two gene clusters putatively encoding ribosomal oligopeptide biosynthetic pathways were revealed. Our results demonstrate that whole genome shotgun sequencing combined with MS-directed determination of oligopeptides successfully
Tannous, J.; El Khoury, R.; El Khoury, A.; Lteif, R.; Snini, S.; Lippi, Y.; Oswald, I.; Olivier, P.; Atoui, A.
Patulin is a polyketide-derived mycotoxin produced by numerous filamentous fungi. Among them, Penicillium expansum is by far the most problematic species. This fungus is a destructive phytopathogen capable of growing on fruit, provoking the blue mold decay of apples and producing significant amounts of patulin. The biosynthetic pathway of this mycotoxin is chemically well characterized, but its genetic bases remain largely unknown, with only a few characterized genes in less economically relevant species. The present study consisted of the identification and positional organization of the patulin gene cluster in P. expansum strain NRRL 35695. Several amplification reactions were performed with degenerate primers that were designed based on sequences from the orthologous genes available in other species. An improved genome walking approach was used in order to sequence the remaining adjacent genes of the cluster. RACE-PCR was also carried out from mRNAs to determine the start and stop codons of the coding sequences. The patulin gene cluster in P. expansum consists of 15 genes in the following order: patH, patG, patF, patE, patD, patC, patB, patA, patM, patN, patO, patL, patI, patJ, and patK. These genes share 60–70% identity with orthologous genes grouped differently, within a putative patulin cluster described in a non-producing strain of Aspergillus clavatus. The kinetics of patulin cluster gene expression was studied under patulin-permissive conditions (natural apple-based medium) and patulin-restrictive conditions (Eagle's minimal essential medium), and demonstrated a significant association between gene expression and patulin production. In conclusion, the sequence of the patulin cluster in P. expansum constitutes a key step for a better understanding of the mechanisms leading to patulin production in this fungus. It will allow the role of each gene to be elucidated, and help to define strategies to reduce patulin production in apple-based products
Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda) that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera). We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx) from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.
Vasala, A; Dupont, L; Baumann, M; Ritzenthaler, P; Alatossava, T
Virulent phage LL-H and temperate phage mv4 are two related bacteriophages of Lactobacillus delbrueckii. The gene clusters encoding structural proteins of these two phages have been sequenced and further analyzed. Six open reading frames (ORF-1 to ORF-6) were detected. Protein sequencing and Western immunoblotting experiments confirmed that ORF-3 (g34) encoded the main capsid protein Gp34. The presence of a putative late promoter in front of the phage LL-H g34 gene was suggested by primer extension experiments. Comparative sequence analysis between phage LL-H and phage mv4 revealed striking similarities in the structure and organization of this gene cluster, suggesting that the genes encoding phage structural proteins belong to a highly conservative module. PMID:8497043
Ghosh, Swagata; Rao, K Hanumantha; Sengupta, Manjistha; Bhattacharya, Sujit K; Datta, Asis
Pathogenic microorganisms like Vibrio cholerae are capable of adapting to diverse living conditions, especially when they transit from their environmental reservoirs to the human host. V. cholerae attaches to N-acetylglucosamine (GlcNAc) residues in glycoproteins and lipids present in the intestinal epithelium and on the chitinous surface of zoo- and phytoplankton in the aquatic environment for its survival and colonization. GlcNAc utilization thus appears to be important for the pathogen to reach sufficient titres in the intestine for producing clinical symptoms of cholera. We report here the involvement of a second cluster of genes working in combination with the classical genes of GlcNAc catabolism, suggesting the occurrence of a novel variant of the process of biochemical conversion of GlcNAc to fructose-6-phosphate as has been described in other organisms. Colonization was severely attenuated in mutants that were incapable of utilizing GlcNAc. It was also shown that the N-acetylglucosamine-specific repressor (NagC) performs a dual role: while the classical GlcNAc catabolic genes are under its negative control, the genes belonging to the second cluster are positively regulated by it. Further application of tandem affinity purification to NagC revealed its interaction with a novel partner. Our results provide a genetic program that probably enables V. cholerae to successfully utilize amino sugars and also highlight a new mode of transcriptional regulation, not described in this organism. © 2011 Blackwell Publishing Ltd.
Sack, G.H.; Lease, J.J.
Three clones containing human genes for serum amyloid A protein (SAA) have been isolated and characterized. Each of two clones, GSAA 1 and 2 (of 12.8 and 15.9 kilobases, respectively), contains two exons, accounting for amino acids 12-58 and 58-103 of mature SAA; the extreme 5' termini and 5' untranslated regions have not yet been defined but are anticipated to be close based on studies of murine SAA genes. Initial amino acid sequence comparisons show 78/89 identical residues. At 4 of the 11 discrepant residues, the amino acid specified by the codon is the same as the corresponding residue in murine SAA. Identification of regions containing coding regions has permitted use of selected subclones for blot hybridization studies of larger human SAA chromosomal gene organization. The third clone, GSAA 3, also contains SAA coding information by DNA sequence analysis but has a different organization which has not yet been fully described. We have reported the isolation of clones of human DNA hybridizing with pRS48, a plasmid containing a complementary DNA (cDNA) clone for murine serum amyloid A (SAA; 1, 2). We now present more detailed data confirming the identity and defining some of the organizational features of these clones
Andersen, Mikael Rørdam; Nielsen, Jakob Blæsbjerg; Klitgaard, Andreas
Biosynthetic pathways of secondary metabolites from fungi are currently subject to an intense effort to elucidate the genetic basis for these compounds due to their large potential within pharmaceutics and synthetic biochemistry. The preferred method is methodical gene deletions to identify...... used A. nidulans for our method development and validation due to the wealth of available biochemical data, but the method can be applied to any fungus with a sequenced and assembled genome, thus supporting further secondary metabolite pathway elucidation in the fungal kingdom....
Sutherland, Betsy M.; Bennett, Paula V.; Cintron-Torres, Nela; Hada, Megumi; Trunk, John; Monteleone, Denise; Sutherland, John C.; Laval, Jacques; Stanislaus, Marisha; Gewirtz, Alan
Ionizing radiation induces clusters of DNA damages (oxidized bases, abasic sites and strand breaks) on opposing strands within a few helical turns. Such damages have been postulated to be difficult to repair, as are double strand breaks (DSBs, one type of cluster). We have shown that low doses of low and high linear energy transfer (LET) radiation induce such damage clusters in human cells. In human cells, DSBs are about 30% of the total of complex damages, and the levels of DSBs and oxidized pyrimidine clusters are similar. The dose responses for cluster induction in cells can be described by a linear relationship, implying that even low doses of ionizing radiation can produce clustered damages. Studies are in progress to determine whether clusters can be produced by mechanisms other than ionizing radiation, as well as the levels of various cluster types formed by low and high LET radiation.
Kautsar, Satria A.; Suarez Duran, Hernando G.; Blin, Kai
exploration of the nature and dynamics of gene clustering in plant metabolism. Moreover, spurred by the continuing decrease in costs of plant genome sequencing, they will allow genome mining technologies to be applied to plant natural product discovery. The plantiSMASH web server, precalculated results...
Waalwijk, C.; Lee, van der T.A.J.; Vries, de P.M.; Hesselink, T.; Arts, J.; Kema, G.H.J.
A comparative genomic approach was used to study the mating type locus and the gene cluster involved in toxin production (fumonisin) in Fusarium proliferatum, a pathogen with a wide host range and a complex toxin profile. A BAC library, generated from F. proliferatum isolate ITEM 2287, was used to
Cimermancic, P.; Medema, Marnix; Claesen, J.; Kurika, K.; Wieland Brown, L.C.; Mavrommatis, K.; Pati, A.; Godfrey, P.A.; Koehrsen, M.; Clardy, J.; Birren, B. W.; Takano, Eriko; Sali, A.; Linington, R.G.; Fischbach, M.A.
Although biosynthetic gene clusters (BGCs) have been discovered for hundreds of bacterial metabolites, our knowledge of their diversity remains limited. Here, we used a novel algorithm to systematically identify BGCs in the extensive extant microbial sequencing data. Network analysis of the
Moynihan, J.A.; Morrissey, J.P.; Coppoolse, E.; Stiekema, W.J.; O'Gara, F.; Boyd, E.F.
Pseudomonas fluorescens is of agricultural and economic importance as a biological control agent largely because of its plant-association and production of secondary metabolites, in particular 2,4-diacetylphloroglucinol (2,4-DAPG). This polyketide, which is encoded by the eight-gene phl cluster,
We suggest that the demographic history (bottleneck and admixture of genetically differentiated populations) is the major factor shaping the pattern of nucleotide polymorphism in the -esterase gene cluster. However there are some 'footprints' of directional and balancing selection shaping specific distribution of nucleotide ...
Wolf Yuri I; Novichkov Pavel S; Sorokin Alexander V; Makarova Kira S; Koonin Eugene V
Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs ...
A large number of proteins are specifically synthesized in the hepatocyte. Only the adult liver expresses the complete repertoire of functions which are required at various stages during development. There is therefore a complex series of regulatory mechanisms responsible for the maintenance of the differentiated state and for the developmental and physiological variations in the pattern of gene expression. Human hepatoma cell lines HepG2 and Hep3B display a pattern of gene expression similar to adult and fetal liver, respectively; in contrast, cultured fibroblasts or HeLa cells do not express most of the liver-specific genes. They have used these cell lines for transfection experiments with cloned human liver-specific genes. DNA segments coding for alpha1-antitrypsin and retinol binding protein (two proteins synthesized both in fetal and adult liver) are expressed in the hepatoma cell lines HepG2 and Hep3B, but not in HeLa cells or fibroblasts. A DNA segment coding for haptoglobin (a protein synthesized only after birth) is expressed only in the hepatoma cell line HepG2, not in Hep3B or in non-hepatic cell lines. The information for tissue-specific expression is located in the 5' flanking region of all three genes. In vivo competition experiments show that these DNA segments bind to a common, apparently limiting, trans-acting factor. Conventional techniques (Bal deletions, site-directed mutagenesis, etc.) have been used to precisely identify the DNA sequences responsible for these effects. The emerging picture is complex: they have identified multiple, separate transcriptional signals, essential for maximal promoter activation and tissue-specific expression. Some of these signals show a negative effect on transcription in fibroblast cell lines.
Background: Clustering is a key step in the analysis of gene expression data, and in fact, many classical clustering algorithms are used, or more innovative ones have been designed and validated for the task. Despite the widespread use of artificial intelligence techniques in bioinformatics and, more generally, data analysis, there are very few clustering algorithms based on the genetic paradigm, yet that paradigm has great potential in finding good heuristic solutions to a difficult optimization problem such as clustering. Results: GenClust is a new genetic algorithm for clustering gene expression data. It has two key features: (a) a novel coding of the search space that is simple, compact and easy to update; (b) it can be used naturally in conjunction with data driven internal validation methods. We have experimented with the FOM methodology, specifically conceived for validating clusters of gene expression data. The validity of GenClust has been assessed experimentally on real data sets, both with the use of validation measures and in comparison with other algorithms, i.e., Average Link, Cast, Click and K-means. Conclusion: Experiments show that none of the algorithms we have used is markedly superior to the others across data sets and validation measures; i.e., in many cases the observed differences between the worst and best performing algorithm may be statistically insignificant and they could be considered equivalent. However, there are cases in which an algorithm may be better than others and therefore worthwhile. In particular, experiments for GenClust show that, although simple in its data representation, it converges very rapidly to a local optimum and that its ability to identify meaningful clusters is comparable, and sometimes superior, to that of more sophisticated algorithms. In addition, it is well suited for use in conjunction with data driven internal validation measures and, in particular, the FOM methodology.
Peltier, Johann; Courtin, Pascal; El Meouche, Imane; Catel-Ferreira, Manuella; Chapot-Chartier, Marie-Pierre; Lemée, Ludovic; Pons, Jean-Louis
Primary antibiotic treatment of Clostridium difficile intestinal diseases requires metronidazole or vancomycin therapy. A cluster of genes homologous to enterococcal glycopeptide resistance vanG genes was found in the genome of C. difficile 630, although this strain remains sensitive to vancomycin. This vanG-like gene cluster was found to consist of five ORFs: the regulatory region consisting of vanR and vanS and the effector region consisting of vanG, vanXY and vanT. We found that 57 out of 83 C. difficile strains, representative of the main lineages of the species, harbour this vanG-like cluster. The cluster is expressed as an operon and, when present, is found at the same genomic location in all strains. The vanG, vanXY and vanT homologues in C. difficile 630 are co-transcribed and expressed to a low level throughout the growth phases in the absence of vancomycin. Conversely, the expression of these genes is strongly induced in the presence of subinhibitory concentrations of vancomycin, indicating that the vanG-like operon is functional at the transcriptional level in C. difficile. Hydrophilic interaction liquid chromatography (HILIC-HPLC) and MS analysis of cytoplasmic peptidoglycan precursors of C. difficile 630 grown without vancomycin revealed the exclusive presence of a UDP-MurNAc-pentapeptide with an alanine at the C terminus. UDP-MurNAc-pentapeptide [d-Ala] was also the only peptidoglycan precursor detected in C. difficile grown in the presence of vancomycin, corroborating the lack of vancomycin resistance. Peptidoglycan structures of a vanG-like mutant strain and of a strain lacking the vanG-like cluster did not differ from the C. difficile 630 strain, indicating that the vanG-like cluster also has no impact on cell-wall composition.
Background: Genes specifically expressed in the oocyte play key roles in oogenesis, ovarian folliculogenesis, fertilization and/or early embryonic development. In an attempt to identify novel oocyte-specific genes in the mouse, we have used an in silico subtraction methodology, and we have focused our attention on genes that are organized in genomic clusters. Results: In the present work, five clusters have been studied: a cluster of thirteen genes characterized by an F-box domain localized on chromosome 9, a cluster of six genes related to T-cell leukaemia/lymphoma protein 1 (Tcl1) on chromosome 12, a cluster composed of a SPErm-associated glutamate (E)-Rich (Speer) protein expressed in the oocyte in the vicinity of four unknown genes specifically expressed in the testis on chromosome 14, a cluster composed of the oocyte secreted protein-1 (Oosp-1) gene and two Oosp-related genes on chromosome 19, all three being characterized by a partial N-terminal zona pellucida-like domain, and another small cluster of two genes, also on chromosome 19, composed of a TWIK-Related spinal cord K+ channel-encoding gene and an unknown gene predicted in silico to be testis-specific. The specificity of expression was confirmed by RT-PCR and in situ hybridization for eight and five of them, respectively. Finally, by comparing all of the isolated and clustered oocyte-specific genes identified so far in the mouse genome, we showed that the oocyte-specific clusters are significantly closer to telomeres than isolated oocyte-specific genes are. Conclusion: We have studied five clusters of genes specifically expressed in female germ cells, some of which are also expressed in male germ cells. Moreover, in contrast to non-clustered oocyte-specific genes, those that are organized in clusters tend to map near chromosome ends, suggesting that this specific near-telomere position of oocyte clusters in rodents could constitute an evolutionary advantage. Understanding the biological
Weber, Tilmann; Blin, Kai; Duddela, Srikanth
Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we...... introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration...... of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products...
The paulomycins are a group of glycosylated compounds featuring a unique paulic acid moiety. To locate their biosynthetic gene clusters, the genomes of two paulomycin producers, Streptomyces paulus NRRL 8115 and Streptomyces sp. YN86, were sequenced. The paulomycin biosynthetic gene clusters were defined by comparative analyses of the two genomes together with the genome of the third paulomycin producer Streptomyces albus J1074. Subsequently, the identity of the paulomycin biosynthetic gene cluster was confirmed by inactivation of two genes involved in biosynthesis of the paulomycose branched chain (pau11) and the ring A moiety (pau18) in Streptomyces paulus NRRL 8115. After determining the gene cluster boundaries, a convergent biosynthetic model was proposed for paulomycin based on the deduced functions of the pau genes. Finally, a paulomycin high-producing strain was constructed by expressing an activator-encoding gene (pau13) in S. paulus, setting the stage for future investigations.
Sutherland, Tara D.; Campbell, Peter M.; Weisman, Sarah; Trueman, Holly E.; Sriskantha, Alagacone; Wanjura, Wolfgang J.; Haritos, Victoria S.
The pupal cocoon of the domesticated silk moth Bombyx mori is the best known and most extensively studied insect silk. It is not widely known that Apis mellifera larvae also produce silk. We have used a combination of genomic and proteomic techniques to identify four honey bee fiber genes (AmelFibroin1–4) and two silk-associated genes (AmelSA1 and 2). The four fiber genes are small, comprise a single exon each, and are clustered on a short genomic region where the open reading frames are GC-r...
Randise-Hinchliff, Carlo; Coukos, Robert; Sood, Varun; Sumner, Michael Chas; Zdraljevic, Stefan; Meldi Sholl, Lauren; Garvey Brickner, Donna; Ahmed, Sara; Watchmaker, Lauren; Brickner, Jason H
In budding yeast, targeting of active genes to the nuclear pore complex (NPC) and interchromosomal clustering is mediated by transcription factor (TF) binding sites in the gene promoters. For example, the binding sites for the TFs Put3, Ste12, and Gcn4 are necessary and sufficient to promote positioning at the nuclear periphery and interchromosomal clustering. However, in all three cases, gene positioning and interchromosomal clustering are regulated. Under uninducing conditions, local recruitment of the Rpd3(L) histone deacetylase by transcriptional repressors blocks Put3 DNA binding. This is a general function of yeast repressors: 16 of 21 repressors blocked Put3-mediated subnuclear positioning; 11 of these required Rpd3. In contrast, Ste12-mediated gene positioning is regulated independently of DNA binding by mitogen-activated protein kinase phosphorylation of the Dig2 inhibitor, and Gcn4-dependent targeting is up-regulated by increasing Gcn4 protein levels. These different regulatory strategies provide either qualitative switch-like control or quantitative control of gene positioning over different time scales. © 2016 Randise-Hinchliff et al.
Scherer Stephen W
Background: Several statistical tests have been developed for analyzing genome-wide association data by incorporating gene pathway information in terms of gene sets. Using these methods, hundreds of gene sets are typically tested, and the tested gene sets often overlap. This overlapping greatly increases the probability of generating false positives, and the results obtained are difficult to interpret, particularly when many gene sets show statistical significance. Results: We propose a flexible statistical framework to circumvent these problems. Inspired by spatial scan statistics for detecting clustering of disease occurrence in the field of epidemiology, we developed a scan statistic to extract disease-associated gene clusters from a whole gene pathway. Extracting one or a few significant gene clusters from a global pathway limits the overall false positive probability, which results in increased statistical power, and facilitates the interpretation of test results. In the present study, we applied our method to genome-wide association data for rare copy-number variations, which have been strongly implicated in common diseases. Application of our method to a simulated dataset demonstrated the high accuracy of this method in detecting disease-associated gene clusters in a whole gene pathway. Conclusions: The scan statistic approach proposed here shows a high level of accuracy in detecting gene clusters in a whole gene pathway. This study has provided a sound statistical framework for analyzing genome-wide rare CNV data by incorporating topological information on the gene pathway.
Scheuermann, Markus O.; Tajbakhsh, Jian; Kurz, Anette; Saracoglu, Kaan; Eils, Roland; Lichter, Peter
Knowledge about the functional impact of the topological organization of DNA sequences within interphase chromosome territories is still sparse. Of the few analyzed single copy genomic DNA sequences, the majority had been found to localize preferentially at the chromosome periphery or to loop out from chromosome territories. By means of dual-color fluorescence in situ hybridization (FISH), immunolabeling, confocal microscopy, and three-dimensional (3D) image analysis, we analyzed the intraterritorial and nuclear localization of 10 genomic fragments of different sequence classes in four different human cell types. The localization of three muscle-specific genes FLNA, NEB, and TTN, the oncogene BCL2, the tumor suppressor gene MADH4, and five putatively nontranscribed genomic sequences was predominantly in the periphery of the respective chromosome territories, independent from transcriptional status and from GC content. In interphase nuclei, the noncoding sequences were only rarely found associated with heterochromatic sites marked by the satellite III DNA D1Z1 or clusters of mammalian heterochromatin proteins (HP1α, HP1β, HP1γ). However, the nontranscribed sequences were found predominantly at the nuclear periphery or at the nucleoli, whereas genes tended to localize on chromosome surfaces exposed to the nuclear interior
Hahn, F M; Baker, J A; Poulter, C D
Isopentenyl diphosphate (IPP) isomerase catalyzes an essential activation step in the isoprenoid biosynthetic pathway. A database search based on probes from the highly conserved regions in three eukaryotic IPP isomerases revealed substantial similarity with ORF176 in the photosynthesis gene cluster in Rhodobacter capsulatus. The open reading frame was cloned into an Escherichia coli expression vector. The encoded 20-kDa protein, which was purified in two steps by ion exchange and hydrophobic...
McFadyen, D A; Addison, W; Locke, J
The alpha 2u-globulins are a group of similar proteins, belonging to the lipocalin superfamily of proteins, that are synthesized in a subset of secretory tissues in rats. The many alpha 2u-globulin isoforms are encoded by a multigene family that exhibits extensive homology. Despite a high degree of sequence identity, individual family members show diverse expression patterns involving complex hormonal, tissue-specific, and developmental regulation. Analysis suggests that there are approximately 20 alpha 2u-globulin genes in the rat genome. We have used fluorescence in situ hybridization (FISH) to show that the alpha 2u-globulin genes are clustered at a single site on rat Chromosome (Chr) 5 (5q22-24). Southern blots of rat genomic DNA separated by pulsed field gel electrophoresis indicated that the alpha 2u-globulin genes are contained on two NruI fragments with a total size of 880 kbp. Analysis of three P1 clones containing alpha 2u-globulin genes indicated that the alpha 2u-globulin genes are tandemly arranged in a head-to-tail fashion. The organization of the alpha 2u-globulin genes in the rat as a tandem array of single genes differs from the homologous major urinary protein genes in the mouse, which are organized as tandem arrays of divergently oriented gene pairs. The structure of these gene clusters may have consequences for the proposed function, as a pheromone transporter, for the protein products encoded by these genes.
Booma, P M; Prabhakaran, S; Dhanalakshmi, R
Microarray gene expression datasets have attracted great attention among molecular biologists, statisticians, and computer scientists. Data mining that extracts the hidden and usual information from datasets fails to identify the most significant biological associations between genes. A heuristic search for standard biological processes measures only the gene expression level, threshold, and response time. Heuristic search identifies and mines the best biological solution, but the association process is not efficiently addressed. To monitor higher rates of expression levels between genes, a hierarchical clustering model is proposed, in which the biological association between genes is measured simultaneously using a proximity measure based on an improved Pearson's correlation (PCPHC). Additionally, the Seed Augment algorithm adopts average linkage methods on rows and columns in order to expand a seed PCPHC model into a maximal global PCPHC (GL-PCPHC) model and to identify associations between the clusters. Moreover, GL-PCPHC applies a pattern growing method to mine the PCPHC patterns. Compared to existing gene expression analyses, the PCPHC model achieves better performance. Experimental evaluations are conducted for the GL-PCPHC model with standard benchmark gene expression datasets extracted from the UCI repository and the GenBank database in terms of execution time, size of pattern, significance level, biological association efficiency, and pattern quality.
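The abstract above combines a Pearson-correlation proximity measure with average linkage. The following is a minimal sketch of that general idea only, using plain Pearson correlation rather than the authors' "improved" variant, which the abstract does not specify. The function names and the toy expression profiles are hypothetical, and the sketch assumes no profile is constant (a constant profile would make the correlation undefined).

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Distance between genes is 1 - Pearson r; distance between clusters
    is the mean of all pairwise gene distances (average linkage).
    """
    names = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [[g] for g in names]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sum(dist[a, b] for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return [sorted(c) for c in clusters]

# Hypothetical toy expression profiles (not data from the study).
toy = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.2, 3.9, 2.0],
}
groups = average_linkage(toy, 2)
```

In this toy case the two rising profiles end up in one cluster and the two falling profiles in the other, because anticorrelated genes have a distance close to 2.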
Liu, Yutao; Munro, Drew; Layfield, David; Dellinger, Andrew; Walter, Jeffrey; Peterson, Katherine; Rickman, Catherine Bowes; Allingham, R Rand; Hauser, Michael A
This study aimed to identify the genes expressed in normal human trabecular meshwork tissue, a tissue critical to the pathogenesis of glaucoma. Total RNA was extracted from human trabecular meshwork (HTM) harvested from 3 different donors. Extracted RNA was used to synthesize individual SAGE (serial analysis of gene expression) libraries using the I-SAGE Long kit from Invitrogen. Libraries were analyzed using SAGE 2000 software to extract the 17 base pair sequence tags. The extracted sequence tags were mapped to the genome using SAGE Genie map. A total of 298,834 SAGE tags were identified from all HTM libraries (96,842, 88,126, and 113,866 tags, respectively). Collectively, there were 107,325 unique tags. There were 10,329 unique tags with a minimum of 2 counts from a single library. These tags were mapped to known unique Unigene clusters. Approximately 29% of the tags (orphan tags) did not map to a known Unigene cluster. Thirteen percent of the tags mapped to at least 2 Unigene clusters. Sequence tags from many glaucoma-related genes, including myocilin, optineurin, and WD repeat domain 36, were identified. This is the first time SAGE analysis has been used to characterize the gene expression profile in normal HTM. SAGE analysis provides an unbiased sampling of gene expression of the target tissue. These data will provide new and valuable information to improve understanding of the biology of human aqueous outflow.
Anastasia V Shindyapina
Methanol (MeOH) is considered to be a poison in humans because of the alcohol dehydrogenase (ADH)-mediated conversion of MeOH to formaldehyde (FA), which is toxic. Our recent genome-wide analysis of the mouse brain demonstrated that an increase in endogenous MeOH after ADH inhibition led to a significant increase in the plasma MeOH concentration and a modification of mRNA synthesis. These findings suggest endogenous MeOH involvement in homeostasis regulation by controlling mRNA levels. Here, we demonstrate directly that study volunteers displayed increasing concentrations of MeOH and FA in their blood plasma when consuming citrus pectin, ethanol and red wine. A microarray analysis of white blood cells (WBC) from volunteers after pectin intake showed various responses for 30 significantly differentially regulated mRNAs, most of which were somehow involved in the pathogenesis of Alzheimer's disease (AD). There was also a decreased synthesis of hemoglobin mRNA, HBA and HBB, the presence of which in WBC RNA was not a result of red blood cell contamination because erythrocyte-specific marker genes were not significantly expressed. A qRT-PCR analysis of volunteer WBCs after pectin and red wine intake confirmed the complicated relationship between the plasma MeOH content and the mRNA accumulation of both genes that were previously identified, namely, GAPDH and SNX27, and genes revealed in this study, including MME, SORL1, DDIT4, HBA and HBB. We hypothesized that human plasma MeOH has an impact on the WBC mRNA levels of genes involved in cell signaling.
Nidheesh, N; Abdul Nazeer, K A; Ameer, P M
Clustering algorithms with steps involving randomness usually give different results on different executions for the same dataset. This non-deterministic nature of algorithms such as the K-Means clustering algorithm limits their applicability in areas such as cancer subtype prediction using gene expression data. It is hard to sensibly compare the results of such algorithms with those of other algorithms. The non-deterministic nature of K-Means is due to its random selection of data points as initial centroids. We propose an improved, density based version of K-Means, which involves a novel and systematic method for selecting initial centroids. The key idea of the algorithm is to select data points which belong to dense regions and which are adequately separated in feature space as the initial centroids. We compared the proposed algorithm to a set of eleven widely used single clustering algorithms and a prominent ensemble clustering algorithm which is being used for cancer data classification, based on the performances on a set of datasets comprising ten cancer gene expression datasets. The proposed algorithm has shown better overall performance than the others. There is a pressing need in the Biomedical domain for simple, easy-to-use and more accurate Machine Learning tools for cancer subtype prediction. The proposed algorithm is simple, easy-to-use and gives stable results. Moreover, it provides comparatively better predictions of cancer subtypes from gene expression data. Copyright © 2017 Elsevier Ltd. All rights reserved.
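The key idea stated above, choosing initial centroids that lie in dense regions and are well separated, can be illustrated with a small deterministic seeding routine. This is a sketch under assumptions, not the authors' algorithm: the density estimate (inverse mean distance to the nearest neighbours), the scoring rule, the function name `density_seeds`, and the toy points are all hypothetical choices made for illustration.

```python
from math import dist

def density_seeds(points, k, n_neighbors=3):
    """Pick k initial centroids deterministically (illustrative heuristic).

    Density of a point = 1 / (mean distance to its n_neighbors nearest
    neighbours). The first seed is the densest point; each later seed
    maximises density * distance-to-nearest-chosen-seed, which favours
    dense points that are well separated from earlier seeds.
    """
    density = []
    for i, p in enumerate(points):
        nearest = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = nearest[:n_neighbors]
        density.append(1.0 / (sum(nearest) / len(nearest) + 1e-12))
    seeds = [max(range(len(points)), key=lambda i: density[i])]
    while len(seeds) < k:
        def score(i):
            return density[i] * min(dist(points[i], points[s]) for s in seeds)
        seeds.append(max(range(len(points)), key=score))
    return [points[i] for i in seeds]

# Hypothetical toy data: two tight groups plus one isolated outlier.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11),
              (5, 20)]
initial_centroids = density_seeds(toy_points, k=2)
```

Because no random numbers are involved, repeated calls return the same seeds, so a K-Means run started from them would give the same result on every execution. The isolated outlier is not picked here because its low density outweighs its large distance from the first seed.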
Khaitovich, Philipp; Tang, Kun; Franz, Henriette
Recent work has shown that the expression levels of genes transcribed in the brains of humans and chimpanzees have changed less than those of genes transcribed in other tissues. However, when gene expression changes are mapped onto the evolutionary lineage in which they occurred, the brain...... shows more changes than other tissues in the human lineage compared to the chimpanzee lineage. There are two possible explanations for this: either positive selection drove more gene expression changes to fixation in the human brain than in the chimpanzee brain, or genes expressed...... in the brain experienced less purifying selection in humans than in chimpanzees, i.e. gene expression in the human brain is functionally less constrained. The first scenario would be supported if genes that changed their expression in the brain in the human lineage showed more selective sweeps than other genes...
Introduction: The DNA microarray technique is one of the most important topics in bioinformatics. It makes it possible to monitor thousands of expressed genes and has recently led to the creation of very large databases of gene expression data. Statistical analysis of such databases includes normalization, clustering, classification, etc. Materials and Methods: Golub et al. (1999) collected leukemia data using an oligonucleotide-based method. The data are available on the internet. In this paper, we analyzed these gene expression data. They were clustered by several methods, including multi-dimensional scaling and hierarchical and non-hierarchical clustering. The data set included 20 Acute Lymphoblastic Leukemia (ALL) patients and 14 Acute Myeloid Leukemia (AML) patients. The results of two clustering methods were compared with the real grouping (ALL and AML). R software was used for data analysis. Results: The specificity and sensitivity of divisive hierarchical clustering in diagnosing ALL patients were 75% and 92%, respectively. The specificity and sensitivity of partitioning around medoids in diagnosing ALL patients were 90% and 93%, respectively. These results showed that both clustering methods performed well. Notably, the clustering methods placed one sample in the ALL group although it was in the AML group according to the clinical test. Conclusion: Given the concordance of the results with the real grouping of the data, these methods can be used in cases where accurate information on the real grouping is not available. Moreover, the results of clustering might distinguish subgroups of data that would then need to be compared with clinical outcomes, laboratory results and so on.
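The comparison of cluster assignments with the real grouping rests on sensitivity and specificity. As a minimal sketch, the function below computes both from two label lists. It assumes each cluster has already been mapped to a class name (for example by majority vote), which the abstract does not describe; the function name and the eight-sample example are hypothetical and are not the study's data.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive):
    """Return (sensitivity, specificity) treating `positive` as the target class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1   # positive sample placed in the positive cluster
            else:
                fn += 1   # positive sample missed
        else:
            if pred == positive:
                fp += 1   # negative sample wrongly placed in the positive cluster
            else:
                tn += 1   # negative sample correctly excluded
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical labels: 4 ALL and 4 AML samples, one error in each class.
truth = ["ALL"] * 4 + ["AML"] * 4
clustered = ["ALL", "ALL", "ALL", "AML", "AML", "AML", "AML", "ALL"]
sens, spec = sensitivity_specificity(truth, clustered, positive="ALL")
```

With these made-up labels, three of four ALL samples and three of four AML samples are placed correctly, so both measures equal 0.75.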
Somanath Bhat; Xi Luo; Zhiqiang Xu; Lixia Liu; Ren Zhang
Contamination of soil and water by arsenic is a global problem. In Australia, the dipping of cattle in arsenic-containing solution to control cattle ticks in the last century has left many sites heavily contaminated with arsenic and other toxicants. We had previously isolated five soil bacterial strains (CDB1-5) highly resistant to arsenic. To understand the resistance mechanism, molecular studies have been carried out. Two chromosome-encoded arsenic resistance (ars) gene clusters have been cloned from CDB3 (Bacillus sp.). They both function in Escherichia coli, and cluster 1 confers a much higher resistance to the toxic metalloid. Cluster 2 is smaller, possessing four open reading frames (ORFs), arsRorf2BC, similar to that identified in the Bacillus subtilis Skin element. Among the eight ORFs in cluster 1, five are analogs of common ars genes found in other bacteria, but they are organized in a unique order, arsRBCDA instead of arsRDABC. Three other putative genes are located directly downstream and designated arsTIP based on the homologies of their theoretical translation sequences to thioredoxin reductases, iron-sulphur cluster proteins and protein phosphatases, respectively. The latter two are novel to any known ars operon. The arsD gene from a Bacillus species was cloned for the first time, and the predicted protein differs from the well-studied E. coli ArsD by lacking two pairs of C-terminal cysteine residues. Its functional involvement in arsenic resistance has been confirmed by a deletion experiment. There is also an inverted repeat in the intergenic region between arsC and arsD, implying some unknown transcriptional regulation.
Vertebrates require tremendous molecular diversity to defend against numerous small hydrophobic chemicals. UDP-glucuronosyltransferases (UGTs) are a large family of detoxification enzymes that glucuronidate xenobiotics and endobiotics, facilitating their excretion from the body. The UGT1 gene cluster contains a tandem array of variable first exons, each preceded by a specific promoter, and a common set of downstream constant exons, similar to the genomic organization of the protocadherin (Pcdh), immunoglobulin, and T-cell receptor gene clusters. To assist pharmacogenomics studies in Chinese, we sequenced nine first exons, promoter and intronic regions, and five common exons of the UGT1 gene cluster in a population sample of 253 unrelated Chinese individuals. We identified 101 polymorphisms and found 15 novel SNPs. We then computed allele frequencies for each polymorphism and reconstructed their linkage disequilibrium (LD) map. The UGT1 cluster can be divided into five linkage blocks: Block 9 (UGT1A9), Block 9/7/6 (UGT1A9, UGT1A7, and UGT1A6), Block 5 (UGT1A5), Block 4/3 (UGT1A4 and UGT1A3), and Block 3' UTR. Furthermore, we inferred haplotypes and selected their tagSNPs. Finally, comparing our data with those of three other populations of the HapMap project revealed ethnic specificity of the UGT1 genetic diversity in Chinese. These findings have important implications for future molecular genetic studies of the UGT1 gene cluster as well as for personalized medical therapies in Chinese.
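To make the allele-frequency and LD step concrete, the following Python sketch computes allele frequencies, the disequilibrium coefficient D and the pairwise r² for two biallelic sites. It assumes phased two-locus haplotype counts are already available; the counts, the key names and the function name are hypothetical, and the abstract does not say which LD measure or software the authors used.

```python
def ld_stats(hap_counts):
    """Return (p_A, p_B, D, r2) from two-locus haplotype counts.

    hap_counts maps 'AB', 'Ab', 'aB', 'ab' to observed haplotype counts,
    where A/a and B/b are the alleles at the first and second site.
    """
    n = sum(hap_counts.values())
    p_ab = hap_counts["AB"] / n
    p_a = (hap_counts["AB"] + hap_counts["Ab"]) / n  # frequency of allele A
    p_b = (hap_counts["AB"] + hap_counts["aB"]) / n  # frequency of allele B
    d = p_ab - p_a * p_b                             # disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")  # undefined if a site is monomorphic
    return p_a, p_b, d, r2


# Hypothetical counts for 100 phased haplotypes (illustration only).
p_a, p_b, d, r2 = ld_stats({"AB": 40, "Ab": 10, "aB": 10, "ab": 40})
print(f"p_A={p_a:.2f} p_B={p_b:.2f} D={d:.3f} r2={r2:.2f}")
```

With unphased genotype data, as in a population sample, haplotype frequencies would first have to be estimated (for example with an EM algorithm); that step is outside this sketch.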
Calles-Enríquez, Marina; Hjort, Benjamin Benn; Andersen, Pia Skov
to produce histamine. The hdc clusters of S. thermophilus CHCC1524 and CHCC6483 were sequenced, and the factors that affect histamine biosynthesis and histidine-decarboxylating gene (hdcA) expression were studied. The hdc cluster began with the hdcA gene, was followed by a transporter (hdcP), and ended...... with the hdcB gene, which is of unknown function. The three genes were orientated in the same direction. The genetic organization of the hdc cluster showed a unique organization among the lactic acid bacterial group and resembled those of Staphylococcus and Clostridium species, thus indicating possible...... acquisition through a horizontal transfer mechanism. Transcriptional analysis of the hdc cluster revealed the existence of a polycistronic mRNA covering the three genes. The histidine-decarboxylating gene (hdcA) of S. thermophilus demonstrated maximum expression during the stationary growth phase, with high...
Targeting an exogenous gene to a favorable gene locus for expression under endogenous regulators is an ideal method in mammary gland bioreactor research. For this purpose, a gene targeting vector was constructed to target the human lysozyme gene to the bovine αs1-casein gene locus. In this case, the ...
Maes, Michael; Flache, Andreas; Helbing, Dirk
One of the most intriguing dynamics in biological systems is the emergence of clustering, in the sense that individuals self-organize into separate agglomerations in physical or behavioral space. Several theories have been developed to explain clustering in, for instance, multi-cellular organisms,
Raphael, Brian H; Luquez, Carolina; McCroskey, Loretta M; Joseph, Lavin A; Jacobson, Mark J; Johnson, Eric A; Maslanka, Susan E; Andreadis, Joanne D
A group of five clonally related Clostridium botulinum type A strains isolated from different sources over a period of nearly 40 years harbored several conserved genetic properties. These strains contained a variant bont/A1 with five nucleotide polymorphisms compared to the gene in C. botulinum strain ATCC 3502. The strains also had a common toxin gene cluster composition (ha-/orfX+) similar to that associated with bont/A in type A strains containing an unexpressed bont/B [termed A(B) strains]. However, bont/B was not identified in the strains examined. Comparative genomic hybridization demonstrated identical genomic content among the strains relative to C. botulinum strain ATCC 3502. In addition, microarray data demonstrated the absence of several genes flanking the toxin gene cluster among the ha-/orfX+ A1 strains, suggesting the presence of genomic rearrangements with respect to this region compared to the C. botulinum ATCC 3502 strain. All five strains were shown to have identical flaA variable region nucleotide sequences. The pulsed-field gel electrophoresis patterns of the strains were indistinguishable when digested with SmaI, and a shift in the size of at least one band was observed in a single strain when digested with XhoI. These results demonstrate surprising genomic homogeneity among a cluster of unique C. botulinum type A strains of diverse origin.
Arenas-Mena, C.; Cameron, A. R.; Davidson, E. H.
The Hox cluster of the sea urchin Strongylocentrotus purpuratus contains ten genes in a 500 kb span of the genome. Only two of these genes are expressed during embryogenesis, while all eight genes tested are expressed during development of the adult body plan in the larval stage. We report the spatial expression during larval development of the five 'posterior' genes of the cluster: SpHox7, SpHox8, SpHox9/10, SpHox11/13a and SpHox11/13b. The five genes exhibit a dynamic, largely mesodermal program of expression. Only SpHox7 displays extensive expression within the pentameral rudiment itself. A spatially sequential and colinear arrangement of expression domains is found in the somatocoels, the paired posterior mesodermal structures that will become the adult perivisceral coeloms. No such sequential expression pattern is observed in endodermal, epidermal or neural tissues of either the larva or the presumptive juvenile sea urchin. The spatial expression patterns of the Hox genes illuminate the evolutionary process by which the pentameral echinoderm body plan emerged from a bilateral ancestor.
The cysteine-rich prostate and testis expressed (Pate) proteins identified to date are thought to resemble the three-fingered protein/urokinase-type plasminogen activator receptor proteins. In this study, for the first time, we report the identification, cloning and characterization of the rat Pate gene cluster and also determine its expression pattern. The rat Pate genes are clustered on chromosome 8 and their predicted proteins retained the ten-cysteine signature characteristic of the TFP/Ly-6 protein family. The three-dimensional protein structure of PATE and PATE-F was found to be similar to that of the toxin bucandin. Though Pate gene expression is thought to be prostate and testis specific, we observed that rat Pate genes are also expressed in the seminal vesicle and epididymis and in tissues beyond the male reproductive tract. In developing rats (20-60 days old), expression of Pate genes seems to be androgen dependent in the epididymis and testis. In the adult rat, androgen ablation resulted in down-regulation of the majority of Pate genes in the epididymides. PATE and PATE-F proteins were found to be expressed abundantly in the male reproductive tract of rats and on the sperm. Recombinant PATE protein exhibited potent antibacterial activity, whereas PATE-F did not exhibit any antibacterial activity. Pate expression was induced in the epididymides when challenged with LPS. Based on our results, we conclude that rat PATE proteins may contribute to reproductive and defense functions.
Jones, Lauren B; Ghosh, Pallab; Lee, Jung-Hyun; Chou, Chia-Ni; Kunz, Daniel A
A genetic linkage between a conserved gene cluster (Nit1C) and the ability of bacteria to utilize cyanide as the sole nitrogen source was demonstrated for nine different bacterial species. These included three strains whose cyanide nutritional ability has formerly been documented (Pseudomonas fluorescens Pf11764, Pseudomonas putida BCN3 and Klebsiella pneumoniae BCN33), and six not previously known to have this ability [Burkholderia (Paraburkholderia) xenovorans LB400, Paraburkholderia phymatum STM815, Paraburkholderia phytofirmans PsJN, Cupriavidus (Ralstonia) eutropha H16, Gluconoacetobacter diazotrophicus PA1 5 and Methylobacterium extorquens AM1]. For all bacteria, growth on or exposure to cyanide led to the induction of the canonical nitrilase (NitC) linked to the gene cluster, and in the case of Pf11764 in particular, transcript levels of cluster genes (nitBCDEFGH) were raised, and a nitC knock-out mutant failed to grow. Further studies demonstrated that the highly conserved nitB gene product was also significantly elevated. Collectively, these findings provide strong evidence for a genetic linkage between Nit1C and bacterial growth on cyanide, supporting use of the term cyanotrophy in describing what may represent a new nutritional paradigm in microbiology. A broader search of Nit1C genes in presently available genomes revealed its presence in 270 different bacteria, all contained within the domain Bacteria, including Gram-positive Firmicutes and Actinobacteria, and Gram-negative Proteobacteria and Cyanobacteria. Absence of the cluster in the Archaea is congruent with events that may have led to the inception of Nit1C occurring coincidentally with the first appearance of cyanogenic species on Earth, dating back 400-500 million years.
Sutherland, Tara D; Campbell, Peter M; Weisman, Sarah; Trueman, Holly E; Sriskantha, Alagacone; Wanjura, Wolfgang J; Haritos, Victoria S
The pupal cocoon of the domesticated silk moth Bombyx mori is the best known and most extensively studied insect silk. It is not widely known that Apis mellifera larvae also produce silk. We have used a combination of genomic and proteomic techniques to identify four honey bee fiber genes (AmelFibroin1-4) and two silk-associated genes (AmelSA1 and 2). The four fiber genes are small, comprise a single exon each, and are clustered on a short genomic region where the open reading frames are GC-rich amid low GC intergenic regions. The genes encode similar proteins that are highly helical and predicted to form unusually tight coiled coils. Despite the similarity in size, structure, and composition of the encoded proteins, the genes have low primary sequence identity. We propose that the four fiber genes have arisen from gene duplication events but have subsequently diverged significantly. The silk-associated genes encode proteins likely to act as a glue (AmelSA1) and involved in silk processing (AmelSA2). Although the silks of honey bees and silkmoths both originate in larval labial glands, the silk proteins are completely different in their primary, secondary, and tertiary structures as well as the genomic arrangement of the genes encoding them. This implies independent evolutionary origins for these functionally related proteins.
genes in circulating and resident human immune cells can be studied in mice after the transplantation and engraftment of human hemato-lymphoid immune...Martinek J, Strowig T, Gearty SV, Teichmann LL, et al. Development and function of human innate immune cells in a humanized mouse model. Nat Bio...normal wound repair and regeneration, we hypothesize that the preponderance of human-specific genes expressed in human inflammatory cells is commensurate
Nedwin, G.E.; Jarrett-Nedwin, J.; Smith, D.H.; Naylor, S.L.; Sakaguchi, A.Y.; Goeddel, D.V.; Gray, P.W.
The authors have isolated, sequenced, and determined the chromosomal localization of the gene encoding human lymphotoxin (LT). The single copy gene was isolated from a human genomic library using a ³²P-labeled 116 bp synthetic DNA fragment whose sequence was based on the NH₂-terminal amino acid sequence of LT. The gene spans 3 kb of DNA and is interrupted by three intervening sequences. The LT gene is located on human chromosome 6, as determined by Southern blot analysis of human-murine hybrid DNA. Putative transcriptional control regions and areas of homology with the promoters of interferon and other genes are identified.
Nordén, Rickard; Samuelsson, Ebba; Nyström, Kristina
Herpes simplex virus type 1 has the ability to induce expression of a human gene cluster located on chromosome 19 upon infection. This gene cluster contains three fucosyltransferases (encoded by FUT3, FUT5 and FUT6) with the ability to add a fucose to an N-acetylglucosamine residue. Little is known regarding the transcriptional activation of these three genes in human cells. Intriguingly, herpes simplex virus type 1 activates all three genes simultaneously during infection, a situation not observed in uninfected tissue, pointing towards a virus specific mechanism for transcriptional activation. The aim of this study was to define the underlying mechanism for the herpes simplex virus type 1 activation of FUT3, FUT5 and FUT6 transcription. The transcriptional activation of the FUT-gene cluster on chromosome 19 in fibroblasts was specific, not involving adjacent genes. Moreover, inhibition of NFκB signaling through panepoxydone treatment significantly decreased the induction of FUT3, FUT5 and FUT6 transcriptional activation, as did siRNA targeting of p65, in herpes simplex virus type 1 infected fibroblasts. NFκB and p65 signaling appears to play an important role in the regulation of FUT3, FUT5 and FUT6 transcriptional activation by herpes simplex virus type 1 although additional, unidentified, viral factors might account for part of the mechanism as direct interferon mediated stimulation of NFκB was not sufficient to induce the fucosyltransferase encoding gene cluster in uninfected cells. © The Author 2017. Published by Oxford University Press. All rights reserved.
Sep 27, 2017 ... Author for correspondence (firstname.lastname@example.org). MS received 15 ... lic clusters using density functional theory (DFT)-GGA of the DMOL3 package. ... In the process of geometric optimization, convergence thresholds ..... and Postgraduate Research & Practice Innovation Program of Jiangsu Province ...
environmental as well as technical problems during fuel gas utilization. ... adsorption on some alloys of Pd, namely PdAu, PdAg ... ried out on small neutral and charged Au24,26,27, Cu,28 ... study of Zanti et al.29 on Pdn (n = 1–9) clusters.
Peña, Alejandro; Del Carratore, Francesco; Cummings, Matthew; Takano, Eriko; Breitling, Rainer
The rapid increase of publicly available microbial genome sequences has highlighted the presence of hundreds of thousands of biosynthetic gene clusters (BGCs) encoding valuable secondary metabolites. The experimental characterization of new BGCs is extremely laborious and struggles to keep pace with the in silico identification of potential BGCs. Therefore, the prioritisation of promising candidates among computationally predicted BGCs represents a pressing need. Here, we propose an output ordering and prioritisation system (OOPS) which helps sorting identified BGCs by a wide variety of custom-weighted biological and biochemical criteria in a flexible and user-friendly interface. OOPS facilitates a judicious prioritisation of BGCs using G+C content, coding sequence length, gene number, cluster self-similarity and codon bias parameters, as well as enabling the user to rank BGCs based upon BGC type, novelty, and taxonomic distribution. Effective prioritisation of BGCs will help to reduce experimental attrition rates and improve the breadth of bioactive metabolites characterized.
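The idea of ordering clusters by custom-weighted criteria can be sketched as a simple weighted sum over normalised features. This Python example is only an illustration of that general idea: it is not the scoring scheme used by OOPS, which the abstract does not specify, and the field names, weights and values are all hypothetical.

```python
def rank_clusters(clusters, weights):
    """Rank records by a weighted sum of min-max normalised criteria.

    clusters: list of dicts holding a numeric value for every key in weights.
    weights:  dict mapping criterion name -> user-chosen weight
              (a negative weight penalises high values).
    """
    names = list(weights)
    lo = {k: min(c[k] for c in clusters) for k in names}
    hi = {k: max(c[k] for c in clusters) for k in names}

    def score(cluster):
        total = 0.0
        for k in names:
            span = hi[k] - lo[k]
            # Scale each criterion to [0, 1]; constant criteria contribute 0.
            scaled = (cluster[k] - lo[k]) / span if span else 0.0
            total += weights[k] * scaled
        return total

    return sorted(clusters, key=score, reverse=True)


# Hypothetical BGC records (illustration only).
toy = [
    {"id": "bgc1", "gc_content": 0.72, "gene_number": 12},
    {"id": "bgc2", "gc_content": 0.55, "gene_number": 30},
    {"id": "bgc3", "gc_content": 0.60, "gene_number": 8},
]
ranked = rank_clusters(toy, {"gc_content": 1.0, "gene_number": -0.5})
print([c["id"] for c in ranked])
```

Normalising each criterion before weighting keeps features with large numeric ranges, such as sequence length, from dominating the ranking.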
Hensman, James; Lawrence, Neil D; Rattray, Magnus
Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications. We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method's capacity for missing data imputation, data fusion and clustering. The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method's ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications. The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in Python, and are available from the authors' website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.
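One common way to build a hierarchical GP for replicated time series is to add a replicate-specific kernel on top of a kernel shared by all replicates of a gene. The Python sketch below shows only that covariance construction, under stated assumptions: the squared-exponential kernel, the hyperparameter values and the function names are illustrative choices, not details taken from the abstract or the authors' code.

```python
import math


def rbf(t1, t2, variance, lengthscale):
    """Squared-exponential kernel between two time points."""
    return variance * math.exp(-0.5 * ((t1 - t2) / lengthscale) ** 2)


def hierarchical_cov(times, replicate_ids, shared=(1.0, 2.0), replicate=(0.3, 1.0)):
    """Covariance matrix for observations from several replicates.

    Every pair of observations shares the gene-level kernel; pairs from the
    same replicate additionally share the replicate-level kernel.
    shared / replicate are (variance, lengthscale) pairs (assumed values).
    """
    n = len(times)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            k = rbf(times[i], times[j], *shared)
            if replicate_ids[i] == replicate_ids[j]:
                k += rbf(times[i], times[j], *replicate)
            cov[i][j] = k
    return cov


# Two replicates sampled at different, unevenly spaced times.
times = [0.0, 1.5, 0.0, 4.0]
replicate_ids = [0, 0, 1, 1]
K = hierarchical_cov(times, replicate_ids)
for row in K:
    print(["%.3f" % v for v in row])
```

Because the covariance depends only on the time stamps and replicate labels, nothing in the construction requires replicates to share a sampling grid, which matches the flexibility the abstract describes.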
The Ouro Negro common bean cultivar contains the Co-34/Phg-3 gene cluster that confers resistance to the anthracnose (ANT) and angular leaf spot (ALS) pathogens. These genes are tightly linked on chromosome 4. Ouro Negro also has the Ur-14 rust resistance gene, reportedly in the vicinity of Co-34; ...
Abbas, Tariq; Younus, Muhammad; Muhammad, Sayyad Aun
Crimean Congo hemorrhagic fever (CCHF) is a tick-borne viral zoonotic disease that has been reported in almost all geographic regions of Pakistan. The aim of this study was to identify spatial clusters of human cases of CCHF reported in the country. Kulldorff's spatial scan statistic, Anselin's Local Moran's I and Getis-Ord Gi* tests were applied to the data (i.e. the number of laboratory-confirmed cases reported from each district during 2013). The analyses revealed a large multi-district cluster of high CCHF incidence in the uplands of Balochistan province near its border with Afghanistan. The cluster comprised the following districts: Qilla Abdullah, Qilla Saifullah, Loralai, Quetta, Sibi, Chagai, and Mastung. Another cluster was detected in Punjab and included Rawalpindi district and a part of Islamabad. We provide empirical evidence of spatial clustering of human CCHF cases in the country. The districts in the clusters should be given priority in surveillance, control programs, and further research.
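Of the three tests named above, Anselin's Local Moran's I is the simplest to write out. The Python sketch below computes the statistic for each area from case counts and a spatial weights matrix; the four-district layout, the counts and the row-standardised adjacency weights are hypothetical and do not come from the study, and significance testing (usually by permutation) is omitted.

```python
def local_morans_i(values, weights):
    """Local Moran's I for each area.

    I_i = (x_i - mean) / m2 * sum_j w_ij * (x_j - mean),
    with m2 = sum_i (x_i - mean)^2 / n.
    weights[i][j] is the spatial weight between areas i and j.
    Assumes the values are not all identical (otherwise m2 is zero).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d * d for d in dev) / n
    return [
        dev[i] / m2 * sum(weights[i][j] * dev[j] for j in range(n) if j != i)
        for i in range(n)
    ]


# Hypothetical example: four districts in a row, neighbours share a border.
cases = [10, 8, 1, 1]
w = [
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
]
scores = local_morans_i(cases, w)
print([round(s, 3) for s in scores])
```

Positive values indicate that a district resembles its neighbours (high next to high, or low next to low); negative values would flag spatial outliers.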
Ehrlich, Kenneth C; Mack, Brian M
Fifty six secondary metabolite biosynthesis gene clusters are predicted to be in the Aspergillus flavus genome. In spite of this, the biosyntheses of only seven metabolites, including the aflatoxins, kojic acid, cyclopiazonic acid and aflatrem, have been assigned to a particular gene cluster. We used RNA-seq to compare expression of secondary metabolite genes in gene clusters for the closely related fungi A. parasiticus, A. oryzae, and A. flavus S and L sclerotial morphotypes. The data help to refine the identification of probable functional gene clusters within these species. Our results suggest that A. flavus, a prevalent contaminant of maize, cottonseed, peanuts and tree nuts, is capable of producing metabolites which, besides aflatoxin, could be an underappreciated contributor to its toxicity.
Gnonlonfin, G. J. B.; Adjovi, Y. C.; Tokpo, A. F.
Fungal infection and aflatoxin contamination were evaluated on 114 samples of dried and milled spices such as ginger, garlic and black pepper from southern Benin and Togo collected in November 2008 -January 2009. These products are dried to preserve them for lean periods available throughout...... of Aspergillus were dominant on all marketed dried and milled spices irrespective of country. Gene characterization and amplification analysis showed that most of the Aspergillus flavus isolates possess the cluster genes for aflatoxin production. Aflatoxin B1 assessment by Thin Layer Chromatography showed...... further for other products such as dried and milled spices. Crown Copyright (C) 2013 Published by Elsevier Ltd. All rights reserved....
Spiering, Martin J.; Moon, Christina D.; Wilkinson, Heather H.; Schardl, Christopher L.
Loline alkaloids are produced by mutualistic fungi symbiotic with grasses, and they protect the host plants from insects. Here we identify in the fungal symbiont, Neotyphodium uncinatum, two homologous gene clusters (LOL-1 and LOL-2) associated with loline-alkaloid production. Nine genes were identified in a 25-kb region of LOL-1 and designated (in order) lolF-1, lolC-1, lolD-1, lolO-1, lolA-1, lolU-1, lolP-1, lolT-1, and lolE-1. LOL-2 contained the homologs lolC-2 through lolE-2 in the same ...
Dian Anggraini Suroto
Phthoxazolin A, an oxazole-containing polyketide, has a broad spectrum of anti-oomycete activity and herbicidal activity. We recently identified phthoxazolin A as a cryptic metabolite of Streptomyces avermitilis, which produces the important anthelmintic agent avermectin. Even though genome data of S. avermitilis is publicly available, no plausible biosynthetic gene cluster for phthoxazolin A is apparent in the sequence data. Here, we identified and characterized the phthoxazolin A (ptx) biosynthetic gene cluster through genome sequencing, comparative genomic analysis, and gene disruption. Sequence analysis revealed that the putative ptx biosynthetic genes are located in an extra genomic region that is not found in the public database, and 8 open reading frames in the extra genomic region could be assigned roles in the biosynthesis of the oxazole ring, triene polyketide and carbamoyl moieties. Disruption of the ptxA gene encoding a discrete acyltransferase resulted in a complete loss of phthoxazolin A production, confirming that the trans-AT type I PKS system is responsible for phthoxazolin A biosynthesis. Based on the predicted functional domains in the ptx assembly line, we propose the biosynthetic pathway of phthoxazolin A.
Background: Animal societies are diverse, ranging from small family-based groups to extraordinarily large social networks in which many unrelated individuals interact. At the extreme of this continuum, some ant species form unicolonial populations in which workers and queens can move among multiple interconnected nests without eliciting aggression. Although unicoloniality has been mostly studied in invasive ants, it also occurs in some native non-invasive species. Unicoloniality is commonly associated with very high queen number, which may result in levels of relatedness among nestmates being so low as to raise the question of the maintenance of altruism by kin selection in such systems. However, the actual relatedness among cooperating individuals critically depends on effective dispersal and the ensuing pattern of genetic structuring. In order to better understand the evolution of unicoloniality in native non-invasive ants, we investigated the fine-scale population genetic structure and gene flow in three unicolonial populations of the wood ant F. paralugubris. Results: The analysis of geo-referenced microsatellite genotypes and mitochondrial haplotypes revealed the presence of cryptic clusters of genetically-differentiated nests in the three populations of F. paralugubris. Because of this spatial genetic heterogeneity, members of the same clusters were moderately but significantly related. The comparison of nuclear (microsatellite) and mitochondrial differentiation indicated that effective gene flow was male-biased in all populations. Conclusion: The three unicolonial populations exhibited male-biased and mostly local gene flow. The high number of queens per nest, exchanges among neighbouring nests and restricted long-distance gene flow resulted in large clusters of genetically similar nests. The positive relatedness among clustermates suggests that kin selection may still contribute to the maintenance of altruism in unicolonial
Background: The recent increase in bacterial resistance to antibiotics has promoted the exploration of novel antibacterial materials. As a result, many researchers are undertaking work to identify new lantibiotics because of their potent antimicrobial activities. The objective of this study was to provide details of a lantibiotic-like gene cluster in Paenibacillus elgii B69 and to produce the antibacterial substances coded by this gene cluster based on culture screening. Results: Analysis of the P. elgii B69 genome sequence revealed the presence of a lantibiotic-like gene cluster composed of five open reading frames (elgT1, elgC, elgT2, elgB, and elgA). Screening of culture extracts for active substances possessing the predicted properties of the encoded product led to the isolation of four novel peptides (elgicins AI, AII, B, and C) with a broad inhibitory spectrum. The molecular weights of these peptides were 4536, 4593, 4706, and 4820 Da, respectively. The N-terminal sequence of elgicin B was Leu-Gly-Asp-Tyr, which corresponded to the partial sequence of the peptide ElgA encoded by elgA. Edman degradation suggested that the product elgicin B is derived from ElgA. By correlating the results of electrospray ionization-mass spectrometry analyses of elgicins AI, AII, and C, these peptides are deduced to have originated from the same precursor, ElgA. Conclusions: A novel lantibiotic-like gene cluster was shown to be present in P. elgii B69. Four new lantibiotics with a broad inhibitory spectrum were isolated, and these appear to be promising antibacterial agents.
Woods Donald E
Background: Rhamnolipids are surface active molecules composed of rhamnose and β-hydroxydecanoic acid. These biosurfactants are produced mainly by Pseudomonas aeruginosa and have been thoroughly investigated since their early discovery. Recently, they have attracted renewed attention because of their involvement in various multicellular behaviors. Despite this high interest, only very few studies have focused on the production of rhamnolipids by Burkholderia species. Results: Orthologs of rhlA, rhlB and rhlC, which are responsible for the biosynthesis of rhamnolipids in P. aeruginosa, have been found in the non-infectious Burkholderia thailandensis, as well as in the genetically similar important pathogen B. pseudomallei. In contrast to P. aeruginosa, both Burkholderia species contain these three genes necessary for rhamnolipid production within a single gene cluster. Furthermore, two identical, paralogous copies of this gene cluster are found on the second chromosome of these bacteria. Both Burkholderia spp. produce rhamnolipids containing 3-hydroxy fatty acid moieties with longer side chains than those described for P. aeruginosa. Additionally, the rhamnolipids produced by B. thailandensis contain a much larger proportion of dirhamnolipids versus monorhamnolipids when compared to P. aeruginosa. The rhamnolipids produced by B. thailandensis reduce the surface tension of water to 42 mN/m while displaying a critical micelle concentration value of 225 mg/L. Separate mutations in both rhlA alleles, which are responsible for the synthesis of the rhamnolipid precursor 3-(3-hydroxyalkanoyloxy)alkanoic acid, prove that both copies of the rhl gene cluster are functional, but one contributes more to the total production than the other. Finally, a double ΔrhlA mutant that is completely devoid of rhamnolipid production is incapable of swarming motility, showing that both gene clusters contribute to this phenotype. Conclusions: Collectively, these
Wolf Yuri I
Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. Results: New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover ~88% of the genes in a genome compared to a ~76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; ~40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile
Naumenko, Olesya I; Guo, Xi; Senchenkova, Sof'ya N; Geng, Peng; Perepelov, Andrei V; Shashkov, Alexander S; Liu, Bin; Knirel, Yuriy A
Mild acid hydrolysis of the lipopolysaccharide of Escherichia coli O54 afforded an O-polysaccharide, which was studied by sugar analysis, solvolysis with anhydrous trifluoroacetic acid, and ¹H and ¹³C NMR spectroscopy. Solvolysis cleaved predominantly the linkage of β-d-Ribf and, to a lesser extent, that of β-d-GlcpNAc, whereas the other linkages, including the linkage of α-l-Rhap, were stable under selected conditions (40 °C, 5 h). The following structure of the O-polysaccharide was established: →4)-α-d-GalpA-(1 → 2)-α-l-Rhap-(1 → 2)-β-d-Ribf-(1 → 4)-β-d-Galp-(1 → 3)-β-d-GlcpNAc-(1→ The O-antigen gene cluster of E. coli O54 was analyzed and found to be consistent in general with the O-polysaccharide structure established but there were two exceptions: i) in the cluster, there were genes for phosphoserine phosphatase and serine transferase, which have no apparent role in the O-polysaccharide synthesis, and ii) no ribofuranosyltransferase gene was present in the cluster. Both uncommon features are shared by some other enteric bacteria. Copyright © 2018 Elsevier Ltd. All rights reserved.
Valdmanis, P N; Kabashi, E; Dyck, A; Hince, P; Lee, J; Dion, P; D'Amour, M; Souchon, F; Bouchard, J-P; Salachas, F; Meininger, V; Andersen, P M; Camu, W; Dupré, N; Rouleau, G A
The paraoxonase gene cluster on chromosome 7 comprising the PON1-3 genes is an attractive candidate for association in amyotrophic lateral sclerosis (ALS) given the role of paraoxonase genes during the response to oxidative stress and their contribution to the enzymatic breakdown of nerve toxins. Oxidative stress is considered one of the mechanisms involved in ALS pathogenesis. Evidence for this includes the fact that mutations of SOD1, which normally reduces the production of toxic superoxide anion, account for 12% to 23% of familial cases in ALS. In addition, PON variants were shown to be associated with susceptibility to ALS in several North American and European populations. We extended this analysis to examine 20 single nucleotide polymorphisms (SNPs) across the PON gene cluster in a set of patients from France (480 cases, 475 controls), Quebec (159 cases, 95 controls), and Sweden (558 cases, 506 controls). Although individual SNPs were not considered associated on their own, a haplotype of SNPs at the C-terminal portion of PON2 that includes the PON2 C311S amino acid change was significant in the French (p value 0.0075) and Quebec (p value 0.026) populations as well as all three populations combined (p value 1.69 × 10⁻⁶). Stratification of the samples showed that this variation was pertinent to ALS susceptibility as a whole, and not to a particular subset of patients. These findings contribute to the increasing weight of evidence that genetic variants in the paraoxonase gene cluster are associated with amyotrophic lateral sclerosis.
Wolf Yuri I
Background: Collections of Clusters of Orthologous Genes (COGs) provide indispensable tools for comparative genomic analysis, evolutionary reconstruction and functional annotation of new genomes. Initially, COGs were made for all complete genomes of cellular life forms that were available at the time. However, with the accumulation of thousands of complete genomes, construction of a comprehensive COG set has become extremely computationally demanding and prone to error propagation, necessitating the switch to taxon-specific COG collections. Previously, we reported the collection of COGs for 41 genomes of Archaea (arCOGs). Here we present a major update of the arCOGs and describe evolutionary reconstructions to reveal general trends in the evolution of Archaea. Results: The updated version of the arCOG database incorporates 91% of the pangenome of 120 archaea (251,032 protein-coding genes altogether) into 10,335 arCOGs. Using this new set of arCOGs, we performed maximum likelihood reconstruction of the genome content of archaeal ancestral forms and gene gain and loss events in archaeal evolution. This reconstruction shows that the last Common Ancestor of the extant Archaea was an organism of greater complexity than most of the extant archaea, probably with over 2,500 protein-coding genes. The subsequent evolution of almost all archaeal lineages was apparently dominated by gene loss resulting in genome streamlining. Overall, in the evolution of Archaea as well as a representative set of bacteria that was similarly analyzed for comparison, gene losses are estimated to outnumber gene gains at least 4 to 1. Analysis of specific patterns of gene gain in Archaea shows that, although some groups, in particular Halobacteria, acquire substantially more genes than others, on the whole, gene exchange between major groups of Archaea appears to be largely random, with no major ‘highways’ of horizontal gene transfer. Conclusions: The updated collection
Ye, Zhongfeng; Yamazaki, Kohei; Minoda, Hiromi; Miyamoto, Koji; Miyazaki, Sho; Kawaide, Hiroshi; Yajima, Arata; Nojiri, Hideaki; Yamane, Hisakazu; Okada, Kazunori
In response to environmental stressors such as blast fungal infections, rice produces phytoalexins, antimicrobial diterpenoid compounds. Together with momilactones, phytocassanes are among the major diterpenoid phytoalexins. The biosynthetic genes of diterpenoid phytoalexins are organized on the chromosome in functional gene clusters, comprising diterpene cyclase, dehydrogenase, and cytochrome P450 monooxygenase genes. Their functions have been studied extensively using in vitro enzyme assay systems. Specifically, P450 genes (CYP71Z6, Z7; CYP76M5, M6, M7, M8) on rice chromosome 2 have multifunctional activities associated with ent-copalyl diphosphate-related diterpene hydrocarbons, but the in planta contribution of these genes to diterpenoid phytoalexin production remains unknown. Here, we characterized a cyp71z7 T-DNA mutant and CYP76M7/M8 RNAi lines and found that potential phytoalexin intermediates accumulated in these P450-suppressed rice plants. The results suggested that in planta, CYP71Z7 is responsible for C2-hydroxylation of phytocassanes and that CYP76M7/M8 is involved in C11α-hydroxylation of 3-hydroxy-cassadiene. Based on these results, we proposed potential routes of phytocassane biosynthesis in planta.
Jiang, Chunyan; Wang, Hougen; Kang, Qianjin; Liu, Jing
Salinomycin is widely used in animal husbandry as a food additive due to its antibacterial and anticoccidial activities. However, its biosynthesis had only been studied by feeding experiments with isotope-labeled precursors. A strategy with degenerate primers based on the polyether-specific epoxidase sequences was successfully developed to clone the salinomycin gene cluster. Using this strategy, a putative epoxidase gene, slnC, was cloned from the salinomycin producer Streptomyces albus XM211. The targeted replacement of slnC and subsequent trans-complementation proved its involvement in salinomycin biosynthesis. A 127-kb DNA region containing slnC was sequenced, including genes for polyketide assembly and release, oxidative cyclization, modification, export, and regulation. In order to gain insight into the salinomycin biosynthesis mechanism, 13 gene replacements and deletions were conducted. Including slnC, 7 genes were identified as essential for salinomycin biosynthesis and putatively responsible for polyketide chain release, oxidative cyclization, modification, and regulation. Moreover, 6 genes were found to be relevant to salinomycin biosynthesis and possibly involved in precursor supply, removal of aberrant extender units, and regulation. Sequence analysis and a series of gene replacements suggest a proposed pathway for the biosynthesis of salinomycin. The information presented here expands the understanding of polyether biosynthesis mechanisms and paves the way for targeted engineering of salinomycin activity and productivity. PMID:22156425
Mathor, Monica Beatriz.
Taking advantage of recent progress in recombinant DNA techniques and of the potential of primary cultures of normal human keratinocytes to reconstitute the epidermis, it was decided to genetically transform these keratinocytes to produce human growth hormone under controllable conditions, for use in gene therapy of patients deficient in this hormone. The first step toward this goal was to standardize the infection of keratinocytes with retrovirus producer cells containing a construct that included the bacterial β-galactosidase gene. The best result was obtained by cultivating the keratinocytes for 3 days in a 2:1 mixture of retrovirus producer cells and 3T3-J2 fibroblasts irradiated with 60 Gy, and then splitting the infected keratinocytes onto a 3T3-J2 fibroblast feeder layer. Another preliminary experiment was to infect normal human keratinocytes with the interleukin-6 gene (hIL-6), whose product, in pathologic conditions, can be produced by keratinocytes and secreted into the blood stream. We verified that infected keratinocytes secrete an average of 500 ng/10^6 cells/day of the cytokine throughout their in vitro lifetime, which confirms the stable character of the infection. These keratinocytes, when grafted onto mice, secrete hIL-6 into the blood stream, reaching levels of 40 pg/ml of serum. After these preliminary experiments, we constructed a retroviral vector with the human growth hormone gene (hGH) driven by the human metallothionein promoter (hPMT), designated DChPMTGH. Normal human keratinocytes were infected with DChPMTGH producer cells, following the previously standardized protocol, yielding infected keratinocytes that secrete 340 ng hGH/10^6 cells/day into the culture medium without promoter activation. This is the highest level of hGH secretion in primary cultures of human keratinocytes described in the literature. The hGH value increases approximately 10-fold after activation with 100 μM Zn2+ for 8-12 hours. (author). 158 refs., 42 figs., 6 tabs
Huang, Wenze; Tsai, Lillian; Li, Yulong; Hua, Nan; Sun, Chen; Wei, Chaochun
A fundamental concept in biology is that heritable material is passed from parents to offspring, a process called vertical gene transfer. An alternative mechanism of gene acquisition is through horizontal gene transfer (HGT), which involves movement of genetic materials between different species. Horizontal gene transfer has been found to be prevalent in prokaryotes but very rare in eukaryotes. In this paper, we investigate horizontal gene transfer in the human genome. From the pair-wise alignments between the human genome and 53 vertebrate genomes, 1,467 human genome regions (2.6 M bases) from all chromosomes were found to be more conserved with non-mammals than with most mammals. These human genome regions involve 642 known genes, which are enriched for ion binding. There were few overlaps with known horizontal gene transfer regions in the human genome, which indicated that horizontal gene transfer is more common than we expected in the human genome. Horizontal gene transfer impacts hundreds of human genes and this study provided insight into potential mechanisms of HGT in the human genome.
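The screening criterion described above (regions more conserved with non-mammals than with most mammals) can be illustrated with a minimal sketch. The percent-identity inputs, the function name `is_hgt_candidate`, and the 50% majority cutoff are assumptions made for illustration; the abstract does not state which score or threshold the authors used.

```python
def is_hgt_candidate(mammal_identity, non_mammal_identity, majority=0.5):
    """Flag a region whose best non-mammal identity beats most mammals.

    Both arguments map species name -> percent identity of the aligned
    region (hypothetical inputs; the abstract gives no score or cutoff).
    """
    best_non_mammal = max(non_mammal_identity.values())
    # Count mammals that are less conserved than the best non-mammal hit.
    beaten = sum(1 for v in mammal_identity.values() if v < best_non_mammal)
    return beaten / len(mammal_identity) > majority


# Toy numbers for illustration only, not data from the study.
region = {
    "mammals": {"mouse": 62.0, "dog": 65.0, "cow": 64.0, "chimp": 99.0},
    "non_mammals": {"zebrafish": 80.0, "chicken": 71.0},
}
flagged = is_hgt_candidate(region["mammals"], region["non_mammals"])
```

In this toy region, three of the four mammals align less well than the best non-mammal, so the region would be flagged for closer inspection.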
Harris, Abigail K P; Williamson, Neil R; Slater, Holly
The prodigiosin biosynthesis gene cluster (pig cluster) from two strains of Serratia (S. marcescens ATCC 274 and Serratia sp. ATCC 39006) has been cloned, sequenced and expressed in heterologous hosts. Sequence analysis of the respective pig clusters revealed 14 ORFs in S. marcescens ATCC 274 and 15 ORFs in Serratia sp. ATCC 39006. In each Serratia species, predicted gene products showed similarity to polyketide synthases (PKSs), non-ribosomal peptide synthases (NRPSs) and the Red proteins of Streptomyces coelicolor A3(2). Comparisons between the two Serratia pig clusters and the red cluster from Str. coelicolor A3(2) revealed some important differences. A modified scheme for the biosynthesis of prodigiosin, based on the pathway recently suggested for the synthesis of undecylprodigiosin, is proposed. The distribution of the pig cluster within several Serratia sp. isolates is demonstrated...
Background: Pelgipeptin, a potent antibacterial and antifungal agent, is a non-ribosomally synthesised lipopeptide antibiotic. This compound consists of a β-hydroxy fatty acid and nine amino acids. To date, there is no information about its biosynthetic pathway. Results: A potential pelgipeptin synthetase gene cluster (plp) was identified from Paenibacillus elgii B69 through genome analysis. The gene cluster spans 40.8 kb with eight open reading frames. Among the genes in this cluster, three large genes, plpD, plpE, and plpF, were shown to encode non-ribosomal peptide synthetases (NRPSs), with one, seven, and one module(s), respectively. Bioinformatic analysis of the substrate specificity of all nine adenylation domains indicated that the sequence of the NRPS modules is well collinear with the order of amino acids in pelgipeptin. Additional biochemical analysis of four recombinant adenylation domains (PlpD A1, PlpE A1, PlpE A3, and PlpF A1) provided further evidence that the plp gene cluster is involved in pelgipeptin biosynthesis. Conclusions: In this study, a gene cluster (plp) responsible for the biosynthesis of pelgipeptin was identified from the genome sequence of Paenibacillus elgii B69. The identification of the plp gene cluster provides an opportunity to develop novel lipopeptide antibiotics by genetic engineering.
Li, Jun; Tai, Cui; Deng, Zixin; Zhong, Weihong; He, Yongqun; Ou, Hong-Yu
VRprofile is a Web server that facilitates rapid investigation of virulence and antibiotic resistance genes, and of the genetic contexts related to the transfer of these traits, in newly sequenced pathogenic bacterial genomes. The backend database, MobilomeDB, was first built on sets of known gene cluster loci of bacterial type III/IV/VI/VII secretion systems and mobile genetic elements, including integrative and conjugative elements, prophages, class I integrons, IS elements and pathogenicity/antibiotic resistance islands. VRprofile is thus able to co-localize the homologs of these conserved gene clusters using HMMer or BLASTp searches. With the integration of the homologous gene cluster search module with a sequence composition module, VRprofile has exhibited better performance for island-like region predictions than the other widely used methods. In addition, VRprofile also provides an integrated Web interface for aligning and visualizing identified gene clusters with MobilomeDB-archived gene clusters, or a diverse set of bacterial genomes. VRprofile might help to meet the increasing demand for re-annotation of bacterial variable regions, and aid in the real-time definition of disease-relevant gene clusters in pathogenic bacteria of interest. VRprofile is freely available at http://bioinfo-mml.sjtu.edu.cn/VRprofile. © The Author 2017. Published by Oxford University Press. All rights reserved.
Spiering, Martin J; Moon, Christina D; Wilkinson, Heather H; Schardl, Christopher L
Loline alkaloids are produced by mutualistic fungi symbiotic with grasses, and they protect the host plants from insects. Here we identify in the fungal symbiont, Neotyphodium uncinatum, two homologous gene clusters (LOL-1 and LOL-2) associated with loline-alkaloid production. Nine genes were identified in a 25-kb region of LOL-1 and designated (in order) lolF-1, lolC-1, lolD-1, lolO-1, lolA-1, lolU-1, lolP-1, lolT-1, and lolE-1. LOL-2 contained the homologs lolC-2 through lolE-2 in the same order and orientation. Also identified was lolF-2, but its possible linkage with either cluster was undetermined. Most lol genes were regulated in N. uncinatum and N. coenophialum, and all were expressed concomitantly with loline-alkaloid biosynthesis. A lolC-2 RNA-interference (RNAi) construct was introduced into N. uncinatum, and in two independent transformants, RNAi significantly decreased lolC expression (P lol-gene products indicate that the pathway has evolved from various different primary and secondary biosynthesis pathways.
Geib, Elena; Brock, Matthias
Fungi are treasure chests for yet unexplored natural products. However, exploitation of their real potential remains difficult as a significant proportion of biosynthetic gene clusters appears silent under standard laboratory conditions. Therefore, elucidation of novel products requires gene activation or heterologous expression. For heterologous gene expression, we previously developed an expression platform in Aspergillus niger that is based on the transcriptional regulator TerR and its target promoter PterA. In this study, we extended this system by regulating expression of terR by the doxycycline inducible Tet-on system. Reporter genes cloned under the control of the target promoter PterA remained silent in the absence of doxycycline, but were strongly expressed when doxycycline was added. Reporter quantification revealed that the coupled system results in about five times higher expression rates compared to gene expression under direct control of the Tet-on system. As production of secondary metabolites generally requires the expression of several biosynthetic genes, the suitability of the self-cleaving viral peptide sequence P2A was tested in this optimised expression system. P2A allowed polycistronic expression of genes required for Asp-melanin formation in combination with the gene coding for the red fluorescent protein tdTomato. Gene expression and Asp-melanin formation were prevented in the absence of doxycycline and strongly induced by addition of doxycycline. Fluorescence studies confirmed the correct subcellular localisation of the respective enzymes. This tightly regulated but strongly inducible expression system enables high level production of secondary metabolites, most likely even those with toxic potential. Furthermore, this system is compatible with polycistronic gene expression and, thus, suitable for the discovery of novel natural products.
Xanthomonas is a large genus of plant-associated and plant-pathogenic bacteria. Collectively, members cause diseases on over 392 plant species. Individually, they exhibit marked host- and tissue-specificity. The determinants of this specificity are unknown. To assess potential contributions to host- and tissue-specificity, pathogenesis-associated gene clusters were compared across genomes of eight Xanthomonas strains representing vascular or non-vascular pathogens of rice, brassicas, pepper and tomato, and citrus. The gum cluster for extracellular polysaccharide is conserved except for gumN and sequences downstream. The xcs and xps clusters for type II secretion are conserved, except in the rice pathogens, in which xcs is missing. In the otherwise conserved hrp cluster, sequences flanking the core genes for type III secretion vary with respect to insertion sequence element and putative effector gene content. Variation at the rpf (regulation of pathogenicity factors) cluster is more pronounced, though genes with established functional relevance are conserved. A cluster for synthesis of lipopolysaccharide varies highly, suggesting multiple horizontal gene transfers and reassortments, but this variation does not correlate with host- or tissue-specificity. Phylogenetic trees based on amino acid alignments of gum, xps, xcs, hrp, and rpf cluster products generally reflect strain phylogeny. However, amino acid residues at four positions correlate with tissue specificity, revealing hpaA and xpsD as candidate determinants. Examination of genome sequences of xanthomonads Xylella fastidiosa and Stenotrophomonas maltophilia revealed that the hrp, gum, and xcs clusters are recent acquisitions in the Xanthomonas lineage. Our results provide insight into the ancestral Xanthomonas genome and indicate that differentiation with respect to host- and tissue-specificity involved not major modifications or wholesale exchange of clusters, but subtle changes in a small
Kaczkowski, Bogumil; Þórarinsson, Elfar; Reiche, Kristin
secondary structure already predicted, little is known about the patterns of structural conservation among pre-miRNAs. We address this issue by clustering the human pre-miRNA sequences based on pairwise, sequence and secondary structure alignment using FOLDALIGN, followed by global multiple alignment of obtained clusters by WAR. As a result, the common secondary structure was successfully determined for four FOLDALIGN clusters: the RF00027 structural family of the Rfam database and three clusters with previously undescribed consensus structures. Availability: http://genome.ku.dk/resources/mirclust...
Nielsen, Morten Thrane; Nielsen, Jakob Blæsbjerg; Anyaogu, Dianna Chinyere
...of solid methodology for genetic manipulation of most species severely hampers pathway characterization. Here we present a simple PCR based approach for heterologous reconstitution of intact gene clusters. Specifically, the putative gene cluster responsible for geodin production from Aspergillus terreus was transferred in a two step procedure to an expression platform in A. nidulans. The individual cluster fragments were generated by PCR and assembled via efficient USER fusion prior to transformation and integration via re-iterative gene targeting. A total of 13 open reading frames contained in 25 kb of DNA were successfully transferred between the two species enabling geodin synthesis in A. nidulans. Subsequently, functions of three genes in the cluster were validated by genetic and chemical analyses. Specifically, ATEG_08451 (gedC) encodes a polyketide synthase, ATEG_08453 (gedR) encodes a transcription factor...
Waldman, Abraham J; Pechersky, Yakov; Wang, Peng; Wang, Jennifer X; Balskus, Emily P
Diazo groups are found in a range of natural products that possess potent biological activities. Despite longstanding interest in these metabolites, diazo group biosynthesis is not well understood, in part because of difficulties in identifying specific genes linked to diazo formation. Here we describe the discovery of the gene cluster that produces the o-diazoquinone natural product cremeomycin and its heterologous expression in Streptomyces lividans. We used stable isotope feeding experiments and in vitro characterization of biosynthetic enzymes to decipher the order of events in this pathway and establish that diazo construction involves late-stage N-N bond formation. This work represents the first successful production of a diazo-containing metabolite in a heterologous host, experimentally linking a set of genes with diazo formation. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Kutil, Brandi L; Greenwald, Charles; Liu, Gang; Spiering, Martin J; Schardl, Christopher L; Wilkinson, Heather H
LOL, a fungal secondary metabolite gene cluster found in Epichloë and Neotyphodium species, is responsible for production of insecticidal loline alkaloids. To analyze the genetic architecture and to predict the evolutionary history of LOL, we compared five clusters from four fungal species (single clusters from Epichloë festucae, Neotyphodium sp. PauTG-1, Neotyphodium coenophialum, and two clusters we previously characterized in Neotyphodium uncinatum). Using PhyloCon to compare putative lol gene promoter regions, we have identified four motifs conserved across the lol genes in all five clusters. Each motif has significant similarity to known fungal transcription factor binding sites in the TRANSFAC database. Conservation of these motifs is further support for the hypothesis that the lol genes are co-regulated. Interestingly, the history of asexual Neotyphodium spp. includes multiple interspecific hybridization events. Comparing clusters from three Neotyphodium species and E. festucae allowed us to determine which Epichloë ancestors are the most likely contributors of LOL in these asexual species. For example, while no present day Epichloë typhina isolates are known to produce lolines, our data support the hypothesis that the E. typhina ancestor(s) of three asexual endophyte species contained a LOL gene cluster. Thus, these data support a model of evolution in which the polymorphism in loline alkaloid production phenotypes among endophyte species is likely due to the loss of the trait over time.
Shwan, Nzar A A; Louzada, Sandra; Yang, Fengtang; Armour, John A L
The human amylase gene cluster includes the human salivary (AMY1) and pancreatic amylase genes (AMY2A and AMY2B), and is a highly variable and dynamic region of the genome. Copy number variation (CNV) of AMY1 has been implicated in human dietary adaptation, and in population association with obesity, but neither of these findings has been independently replicated. Despite these functional implications, the structural genomic basis of CNV has only been defined in detail very recently. In this work, we use high-resolution analysis of copy number, and analysis of segregation in trios, to define new, independent allelic series of amylase CNVs in sub-Saharan Africans, including a series of higher-order expansions of a unit consisting of one copy each of AMY1, AMY2A, and AMY2B. We use fiber-FISH (fluorescence in situ hybridization) to define unexpected complexity in the accompanying rearrangements. These findings demonstrate recurrent involvement of the amylase gene region in genomic instability, involving at least five independent rearrangements of the pancreatic amylase genes (AMY2A and AMY2B). Structural features shared by fundamentally distinct lineages strongly suggest that the common ancestral state for the human amylase cluster contained more than one, and probably three, copies of AMY1. © 2017 WILEY PERIODICALS, INC.
The full nucleotide sequences of two additional human metallothionein (hMT) genes have been determined. These genes, hMT-IB and hMT-IF, are located within the MT-I gene cluster we have described originally. The hMT-IF gene is the first hMT-I gene whose amino acid sequence is in complete agreement with the published sequence of the human MT-I proteins. Therefore it is likely to be an active gene encoding a functional protein. However, since we have just completed the sequence analysis, we have not yet characterized this gene further. The hMT-IB gene is closely linked to the hMT-IA gene, and two pseudogenes, hMT-IC and hMT-ID, separate the two. From its nucleotide sequence hMT-IB seems to be an active gene, encoding a functional protein even though it differs in four positions from the published sequence of human MT-I proteins. This gene is expressed in a human hepatoma cell line, HepG2, and its expression is stimulated by Cd2+. Using gene fusions to the viral thymidine-kinase gene we find that hMT-IB, like the hMT-IA and hMT-IIA genes, contains a heavy metal responsive promoter/regulatory element within its 5' flanking region. We analyzed the level of hMT-IB mRNA in a variety of human cell lines by the S1 nuclease technique, and compared it to the expression of the hMT-IIA gene. While the hMT-IIA gene was expressed in all of the cell lines analyzed, the hMT-IB gene was expressed in liver- and kidney-derived cell lines. This suggests that the expression of the hMT-IB gene is controlled in a tissue specific manner. 13 refs
Wermter, Anne-Kathrin; Reichwald, Kathrin; Büch, Thomas
The importance of the melanin-concentrating hormone (MCH) system for regulation of energy homeostasis and body weight has been demonstrated in rodents. We analysed the human MCH receptor 1 gene (MCHR1) with respect to human obesity....
Roosendaal, B; Damoiseaux, J; Jordi, W; de Graaf, F K
The transcriptional organization of the K99 gene cluster was investigated in two ways. First, the DNA region containing the transcriptional signals was analyzed using a transcription vector system with Escherichia coli galactokinase (GalK) as an assayable marker; second, an in vitro transcription system was employed. A detailed analysis of the transcription signals revealed that a strong promoter PA and a moderate promoter PB are located upstream of fanA and fanB, respectively. No promoter activity was detected in the intercistronic region between fanB and fanC. Factor-dependent terminators of transcription were detected and are probably located in the intercistronic region between fanA and fanB (T1), and between fanB and fanC (T2). A third terminator (T3) was observed between fanC and fanD and has an efficiency of 90%. Analysis of the regulatory region in an in vitro transcription system confirmed the location of the respective transcription signals. A model for the transcriptional organization of the K99 cluster is presented. Indications were obtained that the trans-acting regulatory polypeptides FanA and FanB both function as anti-terminators. A model for the regulation of expression of the K99 gene cluster is postulated.
ten Asbroek, A. L.; Ouellette, M.; Borst, P.
Kinetoplastids are unicellular eukaryotes that include important parasites of man, such as trypanosomes and leishmanias. The study of these organisms received a recent boost from the development of transient transformation allowing the short-term expression of genes reintroduced into parasites like
Reading, N. S.; Shooter, C.; Song, J.; Miller, R.; Agarwal, A.; Láníková, Lucie; Clark, B.; Thein, S.L.; Divoký, V.; Prchal, J.T.
Vol. 37, No. 11 (2016), pp. 1153-1156 ISSN 1059-7794 R&D Projects: GA MŠk(CZ) LH15223 Institutional support: RVO:68378050 Keywords: globin genes * regulation * sickle cell disease * HBB duplication Subject RIV: EB - Genetics; Molecular Biology Impact factor: 4.601, year: 2016
Background: The four heterogeneous childhood cancers, neuroblastoma, non-Hodgkin lymphoma, rhabdomyosarcoma, and Ewing sarcoma, present a similar histology of small round blue cell tumor (SRBCT), which often leads to misdiagnosis. Identification of biomarkers for distinguishing these cancers is a well studied problem. Existing methods typically evaluate each gene separately and do not take into account the nonlinear interaction between genes and the tools that are used to design the diagnostic prediction system. Consequently, more genes are usually identified as necessary for prediction. We propose a general scheme for finding a small set of biomarkers to design a diagnostic system for accurate classification of the cancer subgroups. We use multilayer networks with online gene selection ability and relational fuzzy clustering to identify a small set of biomarkers for accurate classification of the training and blind test cases of a well studied data set. Results: Our method discerned just seven biomarkers that precisely categorized the four subgroups of cancer both in training and blind samples. For the same problem, others suggested 19–94 genes. These seven biomarkers include three novel genes (NAB2, LSP1 and EHD1), not identified by others, with distinct class-specific signatures and important roles in cancer biology, including cellular proliferation, transendothelial migration and trafficking of MHC class antigens. Interestingly, NAB2 is downregulated in other tumors including non-Hodgkin lymphoma and neuroblastoma but we observed moderate to high upregulation in a few cases of Ewing sarcoma and rhabdomyosarcoma, suggesting that NAB2 might be mutated in these tumors. These genes can discover the subgroups correctly with unsupervised learning, can differentiate non-SRBCT samples and they perform equally well with other machine learning tools including support vector machines. These biomarkers lead to four simple human interpretable
Sura Zaki Alrashid
Clustering of gene expression time series gives insight into which genes may be co-regulated, allowing us to discern the activity of pathways in a given microarray experiment. Of particular interest is how a given group of genes varies with different conditions or genetic background. This paper develops a new clustering method that allows each cluster to be parameterised according to whether the behaviour of the genes across conditions is correlated or anti-correlated. By specifying correlation between such genes, more information is gained within the cluster about how the genes interrelate. Amyotrophic lateral sclerosis (ALS) is an irreversible neurodegenerative disorder that kills the motor neurons and results in death within 2 to 3 years from symptom onset. The speed of progression differs between patients, with significant variability. SOD1G93A transgenic mice from different backgrounds (129Sv and C57) showed consistent phenotypic differences in disease progression. A hierarchy of Gaussian processes is used to model condition-specific and gene-specific temporal covariances. This study identified significant gene expression profiles and clusters of associated or co-regulated genes from four groups of data (SOD1G93A and Ntg from 129Sv and C57 backgrounds). Our study shows the effectiveness of sharing information between replicates and different model conditions when modelling gene expression time series. Further gene enrichment score analysis and ontology pathway analysis of some specified clusters for a particular group may lead toward identifying features underlying the differential speed of disease progression.
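The idea of a hierarchy of Gaussian processes with shared and gene-specific temporal covariances can be sketched as an additive kernel. This is a minimal illustration under stated assumptions, not the paper's model: the squared-exponential kernel, the two-level structure (cluster level plus gene level, with the condition level omitted), the function names and all hyperparameter values are hypothetical.

```python
import math


def rbf(t1, t2, variance, lengthscale):
    """Squared-exponential covariance between two time points."""
    return variance * math.exp(-0.5 * ((t1 - t2) / lengthscale) ** 2)


def hierarchical_cov(obs, shared=(1.0, 2.0), gene=(0.3, 1.0), noise=0.01):
    """Covariance matrix for observations given as (gene_id, time) pairs.

    Every pair of observations shares the cluster-level kernel; observations
    of the same gene additionally share a gene-level kernel. The kernel
    parameters are (variance, lengthscale) and are illustrative values only.
    """
    n = len(obs)
    K = [[0.0] * n for _ in range(n)]
    for i, (gi, ti) in enumerate(obs):
        for j, (gj, tj) in enumerate(obs):
            k = rbf(ti, tj, *shared)          # cluster-level trend
            if gi == gj:
                k += rbf(ti, tj, *gene)       # gene-specific deviation
            if i == j:
                k += noise                    # independent measurement noise
            K[i][j] = k
    return K


# Two hypothetical genes, each observed at two time points.
obs = [("geneA", 0.0), ("geneA", 1.0), ("geneB", 0.0), ("geneB", 1.0)]
K = hierarchical_cov(obs)
```

Under this construction, two time points of the same gene covary more strongly than the same two time points taken from different genes, which is how information is shared across the cluster while each gene keeps its own deviation.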
Ryu, Ji-Young; Seo, Jiyoung; Unno, Tatsuya; Ahn, Joong-Hoon; Yan, Tao; Sadowsky, Michael J; Hur, Hor-Gil
The plant-derived phenylpropanoids eugenol and isoeugenol have been proposed as useful precursors for the production of natural vanillin. Genes involved in the metabolism of eugenol and isoeugenol were clustered in a region of about 30 kb in Pseudomonas nitroreducens Jin1. Two of the 23 ORFs in this region, ORFs 26 (iemR) and 27 (iem), were predicted to be involved in the conversion of isoeugenol to vanillin. The deduced amino acid sequence of isoeugenol monooxygenase (Iem) of strain Jin1 had 81.4% identity to isoeugenol monooxygenase from Pseudomonas putida IE27, which also transforms isoeugenol to vanillin. Iem was expressed in E. coli BL21(DE3) and was found to transform isoeugenol to vanillin. Deletion and cloning analyses indicated that the gene iemR, located upstream of iem, is required for expression of iem in the presence of isoeugenol, suggesting it to be the iem regulatory gene. Reverse transcription real-time PCR analyses indicated that the genes involved in the metabolism of eugenol and isoeugenol were differently induced by isoeugenol, eugenol, and vanillin.
Katoh, Masuko; Katoh, Masaru
Drosophila Guanylate-kinase holder (Gukh) is an adaptor molecule bridging Discs large (Dlg) and Scribble (Scrib), which are implicated in the establishment and maintenance of epithelial polarity. Here, we searched for human homologs of Drosophila gukh by using bioinformatics, and identified GUKH1 and GUKH2 genes. GUKH1 was identical to Nance-Horan syndrome (NHS) gene, while GUKH2 was a novel gene. FLJ35425 (AK092744.1), DKFZp686P1949 (BX647246.1) and KIAA1357 (AB037778.1) cDNAs were derived from human GUKH2 gene. Nucleotide sequence of GUKH2 cDNA was determined by assembling 5'-part of FLJ35425 cDNA and entire region of DKFZp686P1949 cDNA. Human GUKH2 gene consists of 8 exons. Exon 5 (132 bp) of GUKH2 gene was spliced out in GUKH2 cDNA due to alternative splicing. GUKH2-REPS1 locus at human chromosome 6q24.1 and GUKH1-REPS2 locus at human chromosome Xp22.22-p22.13 are paralogous regions within the human genome. Mouse Gukh2 and zebrafish gukh2 genes were also identified. N-terminal part of human GUKH2, mouse Gukh2 and zebrafish gukh2 proteins were completely divergent from human GUKH1 protein. Human GUKH2 and GUKH1, consisting of eight GUKH homology (GKH1-GKH8) domains and Proline-rich domain, showed 28.5% total-amino-acid identity. GKH1, GKH4, GKH5, GKH7 and GKH8 domains were conserved among human GUKH1, human GUKH2 and Drosophila Gukh. Because human homologs of Drosophila dlg (DLG1-DLG7) as well as human homologs of Drosophila scrib (SCRIB, ERBB2IP and Densin-180) are cancer-associated genes, human homologs of Drosophila gukh (GUKH1 and GUKH2) are predicted cancer-associated genes.
Full Text Available Secondary metabolites are produced mostly by clustered genes that are essential to their biosynthesis. The transcriptional expression of these genes is often cooperatively regulated by a transcription factor located inside or close to a cluster. Most of the secondary metabolism biosynthesis (SMB) gene clusters identified to date contain so-called core genes with distinctive sequence features, such as polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS). Recent efforts in sequencing fungal genomes have revealed far more SMB gene clusters than expected based on the number of core genes in the genomes. Several bioinformatics tools have been developed to survey SMB gene clusters using the sequence motif information of the core genes, including SMURF and antiSMASH. More recently, as sequencing techniques have made large-scale genomic and transcriptomic data available, motif-independent methods for predicting SMB gene clusters, including MIDDAS-M, have been developed. Most of these methods detect clusters in which the genes are cooperatively regulated at the transcriptional level, thus allowing the identification of novel SMB gene clusters regardless of the presence of core genes. Another type of method, MIPS-CG, uses the characteristic enrichment of SMB genes in non-syntenic blocks (NSBs), enabling prediction even without transcriptome data, although the results have not been evaluated in detail. Considering that a large portion of SMB gene clusters might be sufficiently expressed only under limited, uncommon conditions, it seems that prediction of SMB gene clusters by bioinformatics followed by experimental validation is the only way to efficiently uncover hidden SMB gene clusters. Here, we describe and discuss possible novel approaches for the determination of SMB gene clusters that have not been identified using conventional methods.
Schneider, E; Jensen, L R; Farcas, R; Kondova, I; Bontrop, R E; Navarro, B; Fuchs, E; Kuss, A W; Haaf, T
The human brain is distinguished by its remarkable size, high energy consumption, and cognitive abilities compared to all other mammals and non-human primates. However, little is known about what has accelerated brain evolution in the human lineage. One possible explanation is that the appearance of advanced communication skills and language has been a driving force of human brain development. The phenotypic adaptations in brain structure and function which occurred on the way to modern humans may be associated with specific molecular signatures in today's human genome and/or transcriptome. Genes that have been linked to language, reading, and/or autism spectrum disorders are prime candidates when searching for genes for human-specific communication abilities. The database and genome-wide expression analyses we present here revealed a clustering of such communication-associated genes (COAG) on human chromosomes X and 7, in particular chromosome 7q31-q36. Compared to the rest of the genome, we found a high number of COAG to be differentially expressed in the cortices of humans and non-human primates (chimpanzee, baboon, and/or marmoset). The role of X-linked genes for the development of human-specific cognitive abilities is well known. We now propose that chromosome 7q31-q36 also represents a hot spot for the evolution of human-specific communication abilities. Selective pressure on the T cell receptor beta locus on chromosome 7q34, which plays a pivotal role in the immune system, could have led to rapid dissemination of positive gene variants in hitchhiking COAG. Copyright © 2012 S. Karger AG, Basel.
Oct 26, 2011 ... assignment (Ewing and Green, 1998a; Ewing et al., 1998b). The trace files were trimmed with trim-alt 0.05 (P-score>20). In addition, vector trimming was conducted with cross-match software. Each gene expression pattern was analyzed by clustering. (30 bp or more 94% homology) and assembly.
Davis, Elizabeth; Sloan, Tyler; Aurelius, Krista; Barbour, Angela; Bodey, Elijah; Clark, Brigette; Dennis, Celeste; Drown, Rachel; Fleming, Megan; Humbert, Allison; Glasgo, Elizabeth; Kerns, Trent; Lingro, Kelly; McMillin, MacKenzie; Meyer, Aaron; Pope, Breanna; Stalevicz, April; Steffen, Brittney; Steindl, Austin; Williams, Carolyn; Wimberley, Carmen; Zenas, Robert; Butela, Kristen; Wildschutte, Hans
The emergence of bacterial pathogens resistant to all known antibiotics is a global health crisis. Adding to this problem is that major pharmaceutical companies have shifted away from antibiotic discovery due to low profitability. As a result, the pipeline of new antibiotics is essentially dry and many bacteria now resist the effects of most commonly used drugs. To address this global health concern, citizen science through the Small World Initiative (SWI) was formed in 2012. As part of SWI, students isolate bacteria from their local environments, characterize the strains, and assay for antibiotic production. During the 2015 fall semester at Bowling Green State University, students isolated 77 soil-derived bacteria and genetically characterized strains using the 16S rRNA gene, identified strains exhibiting antagonistic activity, and performed an expanded SWI workflow using transposon mutagenesis to identify a biosynthetic gene cluster involved in toxigenic compound production. We identified one mutant with loss of antagonistic activity and through subsequent whole-genome sequencing and linker-mediated PCR identified a 24.9 kb biosynthetic gene locus likely involved in inhibitory activity in that mutant. Further assessment against human pathogens demonstrated the inhibition of Bacillus cereus, Listeria monocytogenes, and methicillin-resistant Staphylococcus aureus in the presence of this compound, thus supporting our molecular strategy as an effective research pipeline for SWI antibiotic discovery and genetic characterization. © 2017 The Authors. MicrobiologyOpen published by John Wiley & Sons Ltd.
Full Text Available Abstract Since 1998, the bioinformatics, systems biology, genomics and medical communities have enjoyed a synergistic relationship with the GeneCards database of human genes (http://www.genecards.org). This human gene compendium was created to help to introduce order into the increasing chaos of information flow. As a consequence of viewing details and deep links related to specific genes, users have often requested enhanced capabilities, such that, over time, GeneCards has blossomed into a suite of tools (including GeneDecks, GeneALaCart, GeneLoc, GeneNote and GeneAnnot) for a variety of analyses of both single human genes and sets thereof. In this paper, we focus on inhouse and external research activities which have been enabled, enhanced, complemented and, in some cases, motivated by GeneCards. In turn, such interactions have often inspired and propelled improvements in GeneCards. We describe here the evolution and architecture of this project, including examples of synergistic applications in diverse areas such as synthetic lethality in cancer, the annotation of genetic variations in disease, omics integration in a systems biology approach to kidney disease, and bioinformatics tools.
Koonin Eugene V
Full Text Available Abstract Background A genome-wide comparative analysis of human and mouse gene expression patterns was performed in order to evaluate the evolutionary divergence of mammalian gene expression. Tissue-specific expression profiles were analyzed for 9,105 human-mouse orthologous gene pairs across 28 tissues. Expression profiles were resolved into species-specific coexpression networks, and the topological properties of the networks were compared between species. Results At the global level, the topological properties of the human and mouse gene coexpression networks are, essentially, identical. For instance, both networks have topologies with small-world and scale-free properties as well as closely similar average node degrees, clustering coefficients, and path lengths. However, the human and mouse coexpression networks are highly divergent at the local level: only a small fraction ( Conclusion The dissonance between global versus local network divergence suggests that the interspecies similarity of the global network properties is of limited biological significance, at best, and that the biologically relevant aspects of the architectures of gene coexpression are specific and particular, rather than universal. Nevertheless, there is substantial evolutionary conservation of the local network structure which is compatible with the notion that gene coexpression networks are subject to purifying selection.
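The global network properties named in the abstract above (node degree, clustering coefficient) can be illustrated with a minimal sketch. It assumes that a coexpression edge is drawn whenever the Pearson correlation between two tissue-expression profiles reaches a cutoff; the cutoff value, the function names and the toy profiles are hypothetical and are not taken from the study.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def coexpression_network(profiles, threshold=0.7):
    """Build an undirected network as a dict: gene -> set of neighbours.

    The threshold of 0.7 is an illustrative assumption, not a published value.
    """
    adjacency = {gene: set() for gene in profiles}
    for g, h in combinations(profiles, 2):
        if pearson(profiles[g], profiles[h]) >= threshold:
            adjacency[g].add(h)
            adjacency[h].add(g)
    return adjacency


def clustering_coefficient(adjacency, gene):
    """Fraction of pairs of neighbours of `gene` that are themselves linked."""
    neighbours = adjacency[gene]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adjacency[a])
    return 2 * links / (k * (k - 1))


# Toy profiles across four tissues (made-up numbers for illustration only).
profiles = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [1, 3, 5, 8],
    "geneD": [4, 3, 2, 1],
}
network = coexpression_network(profiles)
degrees = {gene: len(nbrs) for gene, nbrs in network.items()}
```

Building one such network per species and comparing the degree and clustering-coefficient summaries would correspond to the global comparison, whereas comparing the neighbour sets of each orthologous gene pair would correspond to the local comparison.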
Xie, Pingyuan; Sun, Yi; Ouyang, Qi; Hu, Liang; Tan, Yueqiu; Zhou, Xiaoying; Xiong, Bo; Zhang, Qianjun; Yuan, Ding; Pan, Yi; Liu, Tiancheng; Liang, Ping; Lu, Guangxiu; Lin, Ge
Genetic and epigenetic alterations are observed in long-term culture (>30 passages) of human embryonic stem cells (hESCs); however, little information is available in early cultures. Through a large-scale gene expression analysis between initial-passage hESCs (ihESCs, cell derivatives, possibly through attenuation of the expression and phosphorylation of p53. Furthermore, we demonstrated that 5% oxygen, instead of the commonly used 20% oxygen, is required for preserving the expression of the DLK1-DIO3 cluster. Overall, the data suggest that active expression of the DLK1-DIO3 cluster represents a new biomarker for epigenetic stability of hESCs and indicates the importance of using a proper physiological oxygen level during the derivation and culture of hESCs. © AlphaMed Press.
Komaki, Hisayuki; Ichikawa, Natsuko; Hosoyama, Akira; Takahashi-Nakaguchi, Azusa; Matsuzawa, Tetsuhiro; Suzuki, Ken-ichiro; Fujita, Nobuyuki; Gonoi, Tohru
Actinobacteria of the genus Nocardia usually live in soil or water and play saprophytic roles, but they also opportunistically infect the respiratory system, skin, and other organs of humans and animals. Primarily because of the clinical importance of the strains, some Nocardia genomes have been sequenced, and genome sequences have accumulated. Genome sizes of Nocardia strains are similar to those of Streptomyces strains, the producers of most antibiotics. In the present work, we compared secondary metabolite biosynthesis gene clusters of type-I polyketide synthase (PKS-I) and nonribosomal peptide synthetase (NRPS) among genomes of representative Nocardia species/strains based on domain organization and amino acid sequence homology. Draft genome sequences of Nocardia asteroides NBRC 15531(T), Nocardia otitidiscaviarum IFM 11049, Nocardia brasiliensis NBRC 14402(T), and N. brasiliensis IFM 10847 were read and compared with published complete genome sequences of Nocardia farcinica IFM 10152, Nocardia cyriacigeorgica GUH-2, and N. brasiliensis HUJEG-1. Genome sizes are as follows: N. farcinica, 6.0 Mb; N. cyriacigeorgica, 6.2 Mb; N. asteroides, 7.0 Mb; N. otitidiscaviarum, 7.8 Mb; and N. brasiliensis, 8.9 - 9.4 Mb. Predicted numbers of PKS-I, NRPS, and PKS-I/NRPS hybrid clusters ranged between 4-11, 7-13, and 1-6, respectively, depending on strains, and tended to increase with increasing genome size. Domain and module structures of representative or unique clusters are discussed in the text. We conclude the following: 1) genomes of Nocardia strains carry as many PKS-I and NRPS gene clusters as those of Streptomyces strains, 2) the number of PKS-I and NRPS gene clusters in Nocardia strains varies substantially depending on species, and N. brasiliensis strains carry the largest numbers of clusters among the species studied, 3) the seven Nocardia strains studied in the present work have seven common PKS-I and/or NRPS clusters, some of whose products are yet to be studied
Takeda, Itaru; Umemura, Myco; Koike, Hideaki; Asai, Kiyoshi; Machida, Masayuki
Despite their biological importance, a significant number of genes for secondary metabolite biosynthesis (SMB) remain undetected due largely to the fact that they are highly diverse and are not expressed under a variety of cultivation conditions. Several software tools including SMURF and antiSMASH have been developed to predict fungal SMB gene clusters by finding core genes encoding polyketide synthase, nonribosomal peptide synthetase and dimethylallyltryptophan synthase as well as several others typically present in the cluster. In this work, we have devised a novel comparative genomics method to identify SMB gene clusters that is independent of motif information of the known SMB genes. The method detects SMB gene clusters by searching for a similar order of genes and their presence in nonsyntenic blocks. With this method, we were able to identify many known SMB gene clusters with the core genes in the genomic sequences of 10 filamentous fungi. Furthermore, we have also detected SMB gene clusters without core genes, including the kojic acid biosynthesis gene cluster of Aspergillus oryzae. By varying the detection parameters of the method, a significant difference in the sequence characteristics was detected between the genes residing inside the clusters and those outside the clusters. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Weber, Tilmann; Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko; Medema, Marnix H
Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Serganova, Inna [Department of Neurology, Memorial Sloan-Kettering Cancer Center, New York, NY 10021 (United States); Ponomarev, Vladimir [Department of Radiology, Memorial Sloan-Kettering Cancer Center, New York, NY 10021 (United States); Blasberg, Ronald [Department of Neurology, Memorial Sloan-Kettering Cancer Center, New York, NY 10021 (United States); Department of Radiology, Memorial Sloan-Kettering Cancer Center, New York, NY 10021 (United States)]
The clinical application of positron-emission-tomography-based reporter gene imaging will expand over the next several years. The translation of reporter gene imaging technology into clinical applications is the focus of this review, with emphasis on the development and use of human reporter genes. Human reporter genes will play an increasingly important role in this development, and it is likely that one or more reporter systems (human gene and complementary radiopharmaceutical) will take leading roles. Three classes of human reporter genes are discussed and compared: receptors, transporters and enzymes. Examples of highly expressed cell membrane receptors include specific membrane somatostatin receptors (hSSTrs). The transporter group includes the sodium iodide symporter (hNIS) and the norepinephrine transporter (hNET). The endogenous enzyme classification includes human mitochondrial thymidine kinase 2 (hTK2). In addition, we also discuss the nonhuman dopamine 2 receptor and two viral reporter genes, the wild-type herpes simplex virus 1 thymidine kinase (HSV1-tk) gene and the HSV1-tk mutant (HSV1-sr39tk). Initial applications of reporter gene imaging in patients will be developed within two different clinical disciplines: (a) gene therapy and (b) adoptive cell-based therapies. These studies will benefit from the availability of efficient human reporter systems that can provide critical monitoring information for adenoviral-based, retroviral-based and lentiviral-based gene therapies, oncolytic bacterial and viral therapies, and adoptive cell-based therapies. Translational applications of noninvasive in vivo reporter gene imaging are likely to include: (a) quantitative monitoring of gene therapy vectors for targeting and transduction efficacy in clinical protocols by imaging the location, extent and duration of transgene expression; (b) monitoring of cell trafficking, targeting, replication and activation in adoptive T-cell and stem/progenitor cell therapies
Viguerie, Nathalie; Montastier, Emilie; Maoret, Jean-José
weight maintenance diets. For 175 genes, opposite regulation was observed during calorie restriction and weight maintenance phases, independently of variations in body weight. Metabolism and immunity genes showed inverse profiles. During the dietary intervention, network-based analyses revealed strong...... interconnection between expression of genes involved in de novo lipogenesis and components of the metabolic syndrome. Sex had a marked influence on AT expression of 88 transcripts, which persisted during the entire dietary intervention and after control for fat mass. In women, the influence of body mass index...... on expression of a subset of genes persisted during the dietary intervention. Twenty-two genes revealed a metabolic syndrome signature common to men and women. Genetic control of AT gene expression by cis signals was observed for 46 genes. Dietary intervention, sex, and cis genetic variants independently...
Blin, Kai; Wolf, Thomas; Chevrette, Marc G.
Many antibiotics, chemotherapeutics, crop protection agents and food preservatives originate from molecules produced by bacteria, fungi or plants. In recent years, genome mining methodologies have been widely adopted to identify and characterize the biosynthetic gene clusters encoding...... the production of such compounds. Since 2011, the 'antibiotics and secondary metabolite analysis shell-antiSMASH' has assisted researchers in efficiently performing this, both as a web server and a standalone tool. Here, we present the thoroughly updated antiSMASH version 4, which adds several novel features...
Reimegård, Johan; Kundu, Snehangshu; Pendle, Ali; Irish, Vivian F; Shaw, Peter; Nakayama, Naomi; Sundström, Jens F; Emanuelsson, Olof
Co-expression of physically linked genes occurs surprisingly frequently in eukaryotes. Such chromosomal clustering may confer a selective advantage as it enables coordinated gene regulation at the chromatin level. We studied the chromosomal organization of genes involved in male reproductive development in Arabidopsis thaliana. We developed an in-silico tool to identify physical clusters of co-regulated genes from gene expression data. We identified 17 clusters (96 genes) involved in stamen development and acting downstream of the transcriptional activator MS1 (MALE STERILITY 1), which contains a PHD domain associated with chromatin re-organization. The clusters exhibited little gene homology or promoter element similarity, and largely overlapped with reported repressive histone marks. Experiments on a subset of the clusters suggested a link between expression activation and chromatin conformation: qRT-PCR and mRNA in situ hybridization showed that the clustered genes were up-regulated within 48 h after MS1 induction; out of 14 chromatin-remodeling mutants studied, expression of clustered genes was consistently down-regulated only in hta9/hta11, previously associated with metabolic cluster activation; DNA fluorescence in situ hybridization confirmed that transcriptional activation of the clustered genes was correlated with open chromatin conformation. Stamen development thus appears to involve transcriptional activation of physically clustered genes through chromatin de-condensation. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Crnovčić, Ivana; Rückert, Christian; Semsary, Siamak; Lang, Manuel; Kalinowski, Jörn; Keller, Ullrich
Sequencing the actinomycin (acm) biosynthetic gene cluster of Streptomyces antibioticus IMRU 3720, which produces actinomycin X (Acm X), revealed 20 genes organized into a framework highly similar to that of the bi-armed acm C biosynthetic gene cluster of Streptomyces chrysomallus but without an attached extra arm of orthologues as in the latter. Curiously, the extra arm of the S. chrysomallus gene cluster turned out to perfectly match the single arm of the S. antibioticus gene cluster in the same order of orthologues, including the presence of two pseudogenes, scacmM and scacmN, encoding a cytochrome P450 and its ferredoxin, respectively. Orthologues of the latter genes were both missing in the principal arm of the S. chrysomallus acm C gene cluster. All orthologues of the extra arm showed G+C contents different from those of their counterparts in the principal arm. Moreover, the translation products from the extra arm were all more similar to the corresponding translation products of orthologous genes from the S. antibioticus acm X gene cluster than to those encoded by the principal arm of their own gene cluster. This suggests that the duplicated structure of the S. chrysomallus acm C biosynthetic gene cluster evolved from a previous fusion between two one-armed acm gene clusters, each from a different genetic background. However, while scacmM and scacmN in the extra arm of the S. chrysomallus acm C gene cluster are mutated and therefore non-functional, their orthologues saacmM and saacmN in the S. antibioticus acm C gene cluster show no defects, seemingly encoding active enzymes with functions specific for Acm X biosynthesis. Both acm biosynthetic gene clusters lack a kynurenine-3-monooxygenase gene necessary for biosynthesis of 3-hydroxy-4-methylanthranilic acid, the building block of the Acm chromophore, which suggests participation of a genome-encoded relevant monooxygenase during Acm biosynthesis in both S. chrysomallus and S
Ye, Meixia; Wang, Zhong; Wang, Yaqun; Wu, Rongling
Dynamic changes of gene expression reflect an intrinsic mechanism of how an organism responds to developmental and environmental signals. With the increasing availability of expression data across a time-space scale by RNA-seq, the classification of genes as per their biological function using RNA-seq data has become one of the most significant challenges in contemporary biology. Here we develop a clustering mixture model to discover distinct groups of genes expressed during a period of organ development. By integrating the density function of the multivariate Poisson distribution, the model accommodates the discrete property of read counts characteristic of RNA-seq data. The temporal dependence of gene expression is modeled by a first-order autoregressive process. The model is implemented with the Expectation-Maximization algorithm and model selection to determine the optimal number of gene clusters and obtain the estimates of Poisson parameters that describe the pattern of time-dependent expression of genes from each cluster. The model has been demonstrated by analyzing real data from an experiment aimed at linking the pattern of gene expression to catkin development in white poplar. The usefulness of the model has been validated through computer simulation. The model provides a valuable tool for clustering RNA-seq data, facilitating our global view of expression dynamics and understanding of gene regulation mechanisms. © The Author 2014. Published by Oxford University Press.
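The Expectation-Maximization idea in the abstract above can be sketched in a few lines. This is a deliberately simplified illustration, not the authors' model: it treats the counts at different time points as independent Poisson variables (so the first-order autoregressive dependence is omitted), it takes the number of clusters as given instead of selecting it, and the function names, the initialisation scheme and the toy counts are all hypothetical.

```python
import math


def poisson_logpmf(k, lam):
    """Log-probability of observing count k under a Poisson with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def fit_poisson_mixture(counts, n_clusters, n_iter=50):
    """EM for a mixture of independent Poissons (simplifying assumption).

    counts: list of per-gene read-count vectors, one entry per time point.
    Requires at least as many genes as clusters and n_iter >= 1.
    Returns (weights, rates, labels).
    """
    n_genes, n_times = len(counts), len(counts[0])

    # Initialise cluster means from genes ranked by total count, split into
    # equal chunks; the +0.5 keeps every rate strictly positive.
    order = sorted(range(n_genes), key=lambda i: sum(counts[i]))
    rates = []
    for k in range(n_clusters):
        chunk = order[k * n_genes // n_clusters:(k + 1) * n_genes // n_clusters]
        rates.append([sum(counts[i][t] for i in chunk) / len(chunk) + 0.5
                      for t in range(n_times)])
    weights = [1.0 / n_clusters] * n_clusters

    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each gene,
        # computed in log space and normalised to avoid underflow.
        resp = []
        for x in counts:
            logp = [math.log(weights[k])
                    + sum(poisson_logpmf(x[t], rates[k][t]) for t in range(n_times))
                    for k in range(n_clusters)]
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])

        # M-step: responsibility-weighted mixing proportions and Poisson means.
        for k in range(n_clusters):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = max(nk / n_genes, 1e-12)
            for t in range(n_times):
                weighted = sum(r[k] * x[t] for r, x in zip(resp, counts))
                rates[k][t] = max(weighted / nk, 1e-6)

    labels = [max(range(n_clusters), key=lambda k: r[k]) for r in resp]
    return weights, rates, labels


# Toy read counts for six genes at three time points (made-up numbers).
toy_counts = [[1, 2, 1], [0, 1, 2], [2, 1, 0],
              [50, 60, 55], [48, 62, 57], [52, 58, 50]]
weights, rates, labels = fit_poisson_mixture(toy_counts, n_clusters=2)
```

In a fuller treatment, the fit would be repeated for several values of `n_clusters` and compared with an information criterion, which is one common way to carry out the model selection step that the abstract mentions.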
Trichothecenes are sesquiterpenes that act as mycotoxins. Their biosynthesis has been mainly studied in the fungal genus Fusarium, where most of the biosynthetic genes (tri) are grouped in a cluster regulated by ambient conditions and regulatory genes. Unexpectedly, few studies are available abou...
E.R. Fearon; H.H.Jr. Kazazian; P.G. Waber (Pamela); J.I. Lee (Joseph); S.E. Antonarakis; S.H. Orkin (Stuart); E.F. Vanin; P.S. Henthorn; F.G. Grosveld (Frank); A.F. Scott; G.R. Buchanan
We have used restriction endonuclease mapping to study a deletion involving the beta-globin gene cluster in a Mexican-American family with gamma delta beta-thalassemia. Analysis of DNA polymorphisms demonstrated deletion of the beta-globin gene from the affected chromosome. Using a DNA
Full Text Available The regulation of gene expression is essential for normal functioning of biological systems in every form of life. Gene expression is primarily controlled at the level of transcription, especially at the phase of initiation. Non-coding RNAs are one of the major players at every level of genetic regulation, including the control of chromatin organisation, transcription, various post-transcriptional processes and translation. In this study, the Transcriptional Interference Network (TIN) hypothesis was put forward in an attempt to explain the global expression of antisense RNAs and the overall occurrence of tandem gene clusters in the genomes of various biological systems ranging from viruses to mammalian cells. The TIN hypothesis suggests the existence of a novel layer of genetic regulation, based on the interactions between the transcriptional machineries of neighbouring genes at their overlapping regions, which are assumed to play a fundamental role in coordinating gene expression within a cluster of functionally-linked genes. It is claimed that the transcriptional overlaps between adjacent genes are much more widespread in genomes than is thought today. The Waterfall model of the TIN hypothesis postulates a unidirectional effect of upstream genes on the transcription of downstream genes within a cluster of tandemly-arrayed genes, while the Seesaw model proposes a mutual interdependence of gene expression between the oppositely-oriented genes. The TIN represents an auto-regulatory system with an exquisitely timed and highly synchronised cascade of gene expression in functionally-linked genes located in close physical proximity to each other. In this study, we focused on herpesviruses. The reason for this lies in the compressed nature of viral genes, which allows a tight regulation and an easier investigation of the transcriptional interactions between genes. However, I believe that the same or similar principles can be applied to cellular
Gruben, Birgit S; Mäkelä, Miia R; Kowalczyk, Joanna E; Zhou, Miaomiao; Benoit-Gelber, Isabelle; De Vries, Ronald P
The Aspergillus niger genome contains a large repertoire of genes encoding carbohydrate active enzymes (CAZymes) that are targeted to plant polysaccharide degradation, enabling A. niger to grow on a wide range of plant biomass substrates. Which genes need to be activated in certain environmental conditions depends on the composition of the available substrate. Previous studies have demonstrated the involvement of a number of transcriptional regulators in plant biomass degradation and have identified sets of target genes for each regulator. In this study, a broad transcriptional analysis was performed of the A. niger genes encoding (putative) plant polysaccharide degrading enzymes. Microarray data focusing on the initial response of A. niger to the presence of plant biomass related carbon sources were analyzed for the wild-type strain N402, grown on a large range of carbon sources, and for the regulatory mutant strains ΔxlnR, ΔaraR, ΔamyR, ΔrhaR and ΔgalX, grown on their specific inducing compounds. The cluster analysis of the expression data revealed several groups of co-regulated genes that go beyond the traditionally described co-regulated gene sets. Additional putative target genes of the selected regulators were identified, based on their expression profile. Notably, in several cases the expression profile calls into question the function assigned to uncharacterized genes on the basis of homology searches, highlighting the need for more extensive biochemical studies into the substrate specificity of enzymes encoded by these non-characterized genes. The data also revealed sets of genes that were upregulated in the regulatory mutants, suggesting interaction between the regulatory systems and therefore an even more complex overall regulatory network than has been reported so far. Expression profiling on a large number of substrates provides better insight into the complex regulatory systems that drive the conversion of plant biomass by fungi. In
Morten Thrane Nielsen
Fungal natural products are a rich resource for bioactive molecules. To fully exploit this potential it is necessary to link genes to metabolites. Genetic information for numerous putative biosynthetic pathways has become available in recent years through genome sequencing. However, the lack of solid methodology for genetic manipulation of most species severely hampers pathway characterization. Here we present a simple PCR based approach for heterologous reconstitution of intact gene clusters. Specifically, the putative gene cluster responsible for geodin production from Aspergillus terreus was transferred in a two step procedure to an expression platform in A. nidulans. The individual cluster fragments were generated by PCR and assembled via efficient USER fusion prior to transformation and integration via re-iterative gene targeting. A total of 13 open reading frames contained in 25 kb of DNA were successfully transferred between the two species enabling geodin synthesis in A. nidulans. Subsequently, functions of three genes in the cluster were validated by genetic and chemical analyses. Specifically, ATEG_08451 (gedC) encodes a polyketide synthase, ATEG_08453 (gedR) encodes a transcription factor responsible for activation of the geodin gene cluster and ATEG_08460 (gedL) encodes a halogenase that catalyzes conversion of sulochrin to dihydrogeodin. We expect that our approach for transferring intact biosynthetic pathways to a fungus with a well developed genetic toolbox will be instrumental in characterizing the many exciting pathways for secondary metabolite production that are currently being uncovered by the fungal genome sequencing projects.
Wan, B; Yarbrough, J W; Schultz, T W
This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 microM for 24 hours and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. 1-Methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes (such as cell cycle, metabolism, and protein binding) and KEGG pathways being significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.
Hoff, Kevin G; Culler, Stephanie J; Nguyen, Peter Q; McGuire, Ryan M; Silberg, Jonathan J; Smolke, Christina D
A major challenge to studying Fe-S cluster biosynthesis in higher eukaryotes is the lack of simple tools for imaging metallocluster binding to proteins. We describe the first fluorescent approach for in vivo detection of 2Fe2S clusters that is based upon the complementation of Venus fluorescent protein fragments via human glutaredoxin 2 (GRX2) coordination of a 2Fe2S cluster. We show that Escherichia coli and mammalian cells expressing Venus fragments fused to GRX2 exhibit greater fluorescence than cells expressing fragments fused to a C37A mutant that cannot coordinate a metallocluster. In addition, we find that maximal fluorescence in the cytosol of mammalian cells requires the iron-sulfur cluster assembly proteins ISCU and NFS1. These findings provide evidence that glutaredoxins can dimerize within mammalian cells through coordination of a 2Fe2S cluster as observed with purified recombinant proteins. Copyright 2009 Elsevier Ltd. All rights reserved.
Nov 28, 2011 ... Targeting an exogenous gene into a favorable gene locus and for expression under endogenous regulators is ... case, the expression of human lysozyme could be regulated by the endogenous cis-element of αs1- casein gene in .... Mouse mammary epithelial C127 cells (Cell Bank, Chinese. Academy of ...
Transposable elements (TEs) are DNA sequences that can insert elsewhere in the genome and modify genome structure and gene regulation. The role of TEs in evolution is contentious. One hypothesis posits that TE activity generates genomic incompatibilities that can cause reproductive isolation between incipient species. This predicts that TEs will accumulate during speciation events. Here, I tested the prediction that extant lineages with a relatively high rate of speciation have a high number of TEs in their genomes. I sequenced and analysed the TE content of a marker genomic region (Hox clusters) in Anolis lizards, a classic case of an adaptive radiation. Unlike other vertebrates, including closely related lizards, Anolis lizards have high numbers of TEs in their Hox clusters, genomic regions that regulate development of the morphological adaptations that characterize habitat specialists in these lizards. Following a burst of TE activity in the lineage leading to extant Anolis, TEs have continued to accumulate during or after speciation events, resulting in a positive relationship between TE density and lineage speciation rate. These results are consistent with the prediction that TE activity contributes to adaptive radiation by promoting speciation. Although there was no evidence that TE density per se is associated with ecological morphology, the activity of TEs in Hox clusters could have been a rich source for phenotypic variation that may have facilitated the rapid parallel morphological adaptation to microhabitats seen in extant Anolis lizards. © 2016 The Author(s).
Cui, Peng; Zhong, Tingyan; Wang, Zhuo; Wang, Tao; Zhao, Hongyu; Liu, Chenglin; Lu, Hui
Circadian genes are expressed periodically with a period of approximately 24 h, and the identification and study of these genes can provide a deep understanding of circadian control, which plays significant roles in human health. Although many circadian gene identification algorithms have been developed, large numbers of false positives and low coverage are still major problems in this field. In this study we constructed a novel computational framework for circadian gene identification using deep neural networks (DNN) - a deep learning algorithm which can represent the raw form of data patterns without imposing assumptions on the expression distribution. Firstly, we transformed time-course gene expression data into categorical-state data to denote the changing trend of gene expression. Two distinct expression patterns emerged after clustering of the state data for circadian genes from our manually created learning dataset. DNN was then applied to discriminate the aperiodic genes and the two subtypes of periodic genes. In order to assess the performance of DNN, four commonly used machine learning methods including k-nearest neighbors, logistic regression, naïve Bayes, and support vector machines were used for comparison. The results show that the DNN model achieves the best balanced precision and recall. Next, we conducted large scale circadian gene detection using the trained DNN model for the remaining transcription profiles. Compared with JTK_CYCLE and a study performed by Möller-Levet et al. (doi: https://doi.org/10.1073/pnas.1217154110), we identified 1132 novel periodic genes. Through the functional analysis of these novel circadian genes, we found that the GTPase superfamily exhibits distinct circadian expression patterns and may provide a molecular switch of circadian control of the functioning of the immune system in human blood. Our study provides novel insights into both the circadian gene identification field and the study of complex circadian-driven biological
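The first step described above, turning a time course into categorical trend states, can be illustrated with a minimal sketch. The abstract does not say how the states were defined, so the three labels ("up", "down", "flat"), the tolerance parameter `tol`, and the function name `to_states` are all assumptions made for illustration, not the authors' encoding.

```python
from typing import List


def to_states(values: List[float], tol: float = 0.0) -> List[str]:
    """Encode each step of a time course as a trend state.

    Hypothetical encoding: a step is "up" if expression rises by more
    than `tol`, "down" if it falls by more than `tol`, otherwise "flat".
    """
    states = []
    for prev, cur in zip(values, values[1:]):
        delta = cur - prev
        if delta > tol:
            states.append("up")
        elif delta < -tol:
            states.append("down")
        else:
            states.append("flat")
    return states


# Illustrative (made-up) expression values at four time points.
profile = [1.0, 2.0, 2.0, 0.5]
trend = to_states(profile)
```

A series of n time points yields n - 1 states; sequences of this kind could then be clustered or passed to a classifier, as the abstract outlines.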
Even though gene therapy has made its way into the clinic to treat a number of human pathologies since the early years of experimental research, and despite the recent approval of the first gene-based product (Glybera) in Europe, the safe and effective use of gene transfer vectors remains a challenge in human gene therapy due to the existence of barriers in the host organism. While work to improve the gene transfer systems themselves is actively under way, controlled release approaches may offer alternative, convenient tools for vector delivery that achieve efficient gene transfer in vivo while overcoming the various physiological barriers that preclude its wide use in patients. This article provides an overview of the most significant contributions showing how the principles of controlled release strategies may be adapted for human gene therapy.
Background: During the colonization of the world, after dispersal out of Africa, modern humans encountered changeable environments, and substantial phenotypic variations involving diverse behaviors, lifestyles and cultures were generated among the different modern human populations. Results: Here, we study the level of population differentiation of human genes among different populations. Intriguingly, genes involved in osteoblast development were identified as being enriched with higher FST SNPs, a result consistent with the proposed role of the skeletal system in accounting for variation among human populations. Genes involved in the development of hair follicles, where hair is produced, were also found to have higher levels of population differentiation, consistent with hair morphology being a distinctive trait among human populations. Other genes that showed higher levels of population differentiation include those involved in pigmentation, spermatid, nervous system and organ development, and some metabolic pathways, but few involved with the immune system. Disease-related genes demonstrate an excess of SNPs with lower levels of population differentiation, probably due to purifying selection. Surprisingly, we find that Mendelian-disease genes appear to have a significant excess of SNPs with high levels of population differentiation, possibly because the incidence and susceptibility of these diseases show differences among populations. As expected, microRNA regulated genes show lower levels of population differentiation due to purifying selection. Conclusion: Our analysis demonstrates different levels of population differentiation among human populations for different gene groups.
Wu, Dong-Dong; Zhang, Ya-Ping
During the colonization of the world, after dispersal out of Africa, modern humans encountered changeable environments, and substantial phenotypic variations involving diverse behaviors, lifestyles and cultures were generated among the different modern human populations. Here, we study the level of population differentiation of human genes among different populations. Intriguingly, genes involved in osteoblast development were identified as being enriched with higher FST SNPs, a result consistent with the proposed role of the skeletal system in accounting for variation among human populations. Genes involved in the development of hair follicles, where hair is produced, were also found to have higher levels of population differentiation, consistent with hair morphology being a distinctive trait among human populations. Other genes that showed higher levels of population differentiation include those involved in pigmentation, spermatid, nervous system and organ development, and some metabolic pathways, but few involved with the immune system. Disease-related genes demonstrate an excess of SNPs with lower levels of population differentiation, probably due to purifying selection. Surprisingly, we find that Mendelian-disease genes appear to have a significant excess of SNPs with high levels of population differentiation, possibly because the incidence and susceptibility of these diseases show differences among populations. As expected, microRNA regulated genes show lower levels of population differentiation due to purifying selection. Our analysis demonstrates different levels of population differentiation among human populations for different gene groups.
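For readers unfamiliar with the FST statistic used above, the following is a minimal sketch of Wright's fixation index for a single biallelic SNP in two equally weighted populations, computed as FST = (HT - HS) / HT from expected heterozygosities. The abstract does not state which FST estimator was used, so this textbook form, the function name `fst_biallelic`, and the example allele frequencies are assumptions for illustration only.

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic SNP in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    # Mean expected heterozygosity within populations (HS).
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population (HT).
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # Allele fixed or absent everywhere: no variation to partition.
        return 0.0
    return (h_t - h_s) / h_t


# Made-up frequencies: strongly differentiated vs. identical populations.
high = fst_biallelic(0.2, 0.8)
none = fst_biallelic(0.5, 0.5)
```

Values near 0 indicate similar allele frequencies across populations, and larger values indicate stronger differentiation, which is the sense in which the abstract speaks of "higher FST SNPs".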
Microorganisms form diverse multispecies communities in various ecosystems. The high abundance of fungal and bacterial species in these consortia results in specific communication between the microorganisms. A key role in this communication is played by secondary metabolites (SMs), which are also called natural products. Recently, it was shown that interspecies ‘talk’ between microorganisms represents a physiological trigger to activate silent gene clusters leading to the formation of novel SMs by the involved species. This review focuses on mixed microbial cultivation, mainly between bacteria and fungi, with a special emphasis on the induced formation of fungal SMs in co-cultures. In addition, the role of chromatin remodeling in the induction is examined, and methodical perspectives for the analysis of natural products are presented. As an example of an intermicrobial interaction elucidated at the molecular level, we discuss the specific interaction between the filamentous fungi Aspergillus nidulans and Aspergillus fumigatus with the soil bacterium Streptomyces rapamycinicus, which provides an excellent model system to illuminate molecular concepts behind regulatory mechanisms and will pave the way to a novel avenue of drug discovery through targeted activation of silent SM gene clusters through co-cultivations of microorganisms.
Baumgart, Meike; Huber, Isabel; Abdollahzadeh, Iman; Gensch, Thomas; Frunzke, Julia
Compartmentalization represents a ubiquitous principle used by living organisms to optimize metabolic flux and to avoid detrimental interactions within the cytoplasm. Proteinaceous bacterial microcompartments (BMCs) have therefore created strong interest for the encapsulation of heterologous pathways in microbial model organisms. However, attempts were so far mostly restricted to Escherichia coli. Here, we introduced the carboxysomal gene cluster of Halothiobacillus neapolitanus into the biotechnological platform species Corynebacterium glutamicum. Transmission electron microscopy, fluorescence microscopy and single molecule localization microscopy suggested the formation of BMC-like structures in cells expressing the complete carboxysome operon or only the shell proteins. Purified carboxysomes consisted of the expected protein components as verified by mass spectrometry. Enzymatic assays revealed the functional production of RuBisCO in C. glutamicum both in the presence and absence of carboxysomal shell proteins. Furthermore, we could show that eYFP is targeted to the carboxysomes by fusion to the large RuBisCO subunit. Overall, this study represents the first transfer of an α-carboxysomal gene cluster into a Gram-positive model species supporting the modularity and orthogonality of these microcompartments, but also identified important challenges which need to be addressed on the way towards biotechnological application. Copyright © 2017 Elsevier B.V. All rights reserved.
Li, Yongxin; Li, Zhongrui; Yamanaka, Kazuya; Xu, Ying; Zhang, Weipeng; Vlamakis, Hera; Kolter, Roberto; Moore, Bradley S.; Qian, Pei-Yuan
Bacilli are ubiquitous low G+C environmental Gram-positive bacteria that produce a wide assortment of specialized small molecules. Although their natural product biosynthetic potential is high, robust molecular tools to support the heterologous expression of large biosynthetic gene clusters in Bacillus hosts are rare. Herein we adapt transformation-associated recombination (TAR) in yeast to design a single genomic capture and expression vector for antibiotic production in Bacillus subtilis. After validating this direct cloning "plug-and-play" approach with surfactin, we genetically interrogated the amicoumacin biosynthetic gene cluster from the marine isolate Bacillus subtilis 1779. Its heterologous expression allowed us to explore an unusual maturation process involving the N-acyl-asparagine pro-drug intermediates preamicoumacins, which are hydrolyzed by the asparagine-specific peptidase into the active component amicoumacin A. This work represents the first direct cloning based heterologous expression of natural products in the model organism B. subtilis and paves the way to the development of future genome mining efforts in this genus.
The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA-seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare, thus there should be greater appreciation of the importance of the de novo origination of genes.
Evans, B.A.; Yun, Z.X.; Close, J.A.
Glandular kallikreins are a family of proteases encoded by a variable number of genes in different mammalian species. In all species examined, however, one particular kallikrein is functionally conserved in its capacity to release the vasoactive peptide, Lys-bradykinin, from low molecular weight kininogen. This kallikrein is found in the kidney, pancreas, and salivary gland, showing a unique pattern of tissue-specific expression relative to other members of the family. The authors have isolated a genomic clone carrying the human renal kallikrein gene and compared the nucleotide sequence of its promoter region with those of the mouse renal kallikrein gene and another mouse kallikrein gene expressed in a distinct cell type. They find four sequence elements conserved between renal kallikrein genes from the two species. They have also shown that the human gene is localized to 19q13, a position analogous to that of the kallikrein gene family on mouse chromosome 7
Wu, Dong-Dong; Irwin, David M.; Zhang, Ya-Ping
The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA–seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare, thus there should be greater appreciation of the importance of the de novo origination of genes. PMID:22102831
Rubins, Kathleen H.; Hensley, Lisa E.; Relman, David A.; Brown, Patrick O.
Poxviruses use an arsenal of molecular weapons to evade detection and disarm host immune responses. We used DNA microarrays to investigate the gene expression responses to infection by monkeypox virus (MPV), an emerging human pathogen, and Vaccinia virus (VAC), a widely used model and vaccine organism, in primary human macrophages, primary human fibroblasts and HeLa cells. Even as the overwhelmingly infected cells approached their demise, with extensive cytopathic changes, their gene expression programs appeared almost oblivious to poxvirus infection. Although killed (gamma-irradiated) MPV potently induced a transcriptional program characteristic of the interferon response, no such response was observed during infection with either live MPV or VAC. Moreover, while the gene expression response of infected cells to stimulation with ionomycin plus phorbol 12-myristate 13-acetate (PMA), or poly (I-C) was largely unimpaired by infection with MPV, a cluster of pro-inflammatory genes were a notable exception. Poly(I-C) induction of genes involved in alerting the innate immune system to the infectious threat, including TNF-alpha, IL-1 alpha and beta, CCL5 and IL-6, were suppressed by infection with live MPV. Thus, MPV selectively inhibits expression of genes with critical roles in cell-signaling pathways that activate innate immune responses, as part of its strategy for stealthy infection. PMID:21267444
Kathleen H Rubins
Poxviruses use an arsenal of molecular weapons to evade detection and disarm host immune responses. We used DNA microarrays to investigate the gene expression responses to infection by monkeypox virus (MPV), an emerging human pathogen, and Vaccinia virus (VAC), a widely used model and vaccine organism, in primary human macrophages, primary human fibroblasts and HeLa cells. Even as the overwhelmingly infected cells approached their demise, with extensive cytopathic changes, their gene expression programs appeared almost oblivious to poxvirus infection. Although killed (gamma-irradiated) MPV potently induced a transcriptional program characteristic of the interferon response, no such response was observed during infection with either live MPV or VAC. Moreover, while the gene expression response of infected cells to stimulation with ionomycin plus phorbol 12-myristate 13-acetate (PMA), or poly(I-C) was largely unimpaired by infection with MPV, a cluster of pro-inflammatory genes were a notable exception. Poly(I-C) induction of genes involved in alerting the innate immune system to the infectious threat, including TNF-alpha, IL-1 alpha and beta, CCL5 and IL-6, were suppressed by infection with live MPV. Thus, MPV selectively inhibits expression of genes with critical roles in cell-signaling pathways that activate innate immune responses, as part of its strategy for stealthy infection.
Spicer, A.P.; McDonald, J.A. [Mayo Clinic Scottsdale, AZ (United States)]; Seldin, M.F. [Univ. of California Davis, CA (United States)] [and others]
We have recently identified a new vertebrate gene family encoding putative hyaluronan (HA) synthases. Three highly conserved related genes have been identified, designated HAS1, HAS2, and HAS3 in humans and Has1, Has2, and Has3 in the mouse. All three genes encode predicted plasma membrane proteins with multiple transmembrane domains and approximately 25% amino acid sequence identity to the Streptococcus pyogenes HA synthase, HasA. Furthermore, expression of any one HAS gene in transfected mammalian cells leads to high levels of HA biosynthesis. We now report the chromosomal localization of the three HAS genes in human and in mouse. The genes localized to three different positions within both the human and the mouse genomes. HAS1 was localized to the human chromosome 19q13.3-q13.4 boundary and Has1 to mouse Chr 17. HAS2 was localized to human chromosome 8q24.12 and Has2 to mouse Chr 15. HAS3 was localized to human chromosome 16q22.1 and Has3 to mouse Chr 8. The map position for HAS1 reinforces the recently reported relationship between a small region of human chromosome 19q and proximal mouse chromosome 17. HAS2 mapped outside the predicted critical region delineated for the Langer-Giedion syndrome and can thus be excluded as a candidate gene for this genetic syndrome. 33 refs., 2 figs.
Turner Renee J
Background: Gene expression studies require appropriate normalization methods. One such method uses stably expressed reference genes. Since suitable reference genes appear to be unique for each tissue, we have identified an optimal set of the most stably expressed genes in human blood that can be used for normalization. Methods: Whole-genome Affymetrix Human 2.0 Plus arrays were examined from 526 samples of males and females ages 2 to 78, including control subjects and patients with Tourette syndrome, stroke, migraine, muscular dystrophy, and autism. The top 100 most stably expressed genes with a broad range of expression levels were identified. To validate the best candidate genes, we performed quantitative RT-PCR on a subset of 10 genes (TRAP1, DECR1, FPGS, FARP1, MAPRE2, PEX16, GINS2, CRY2, CSNK1G2 and A4GALT), 4 commonly employed reference genes (GAPDH, ACTB, B2M and HMBS) and PPIB, previously reported to be stably expressed in blood. Expression stability and ranking analysis were performed using GeNorm and NormFinder algorithms. Results: Reference genes were ranked based on their expression stability. The minimum number of genes needed for normalization, as calculated using GeNorm, showed that the fewest, most stably expressed genes needed for accurate normalization in RNA expression studies of human whole blood are a combination of TRAP1, FPGS, DECR1 and PPIB. We confirmed the ranking of the best candidate control genes by using an alternative algorithm (NormFinder). Conclusion: The reference genes identified in this study are stably expressed in whole blood of humans of both genders with multiple disease conditions and ages 2 to 78. Importantly, they also have different functions within cells and thus should be expressed independently of each other. These genes should be useful as normalization genes for microarray and RT-PCR whole blood studies of human physiology, metabolism and disease.
York, Dan; Higgins, Robert J.; LeCouteur, Richard A.; Joshi, Nikhil; Bannasch, Danika
Spontaneous gliomas in dogs occur at a frequency similar to that in humans and may provide a translational model for therapeutic development and comparative biological investigations. Copy number alterations in 38 canine gliomas, including diffuse astrocytomas, glioblastomas, oligodendrogliomas, and mixed oligoastrocytomas, were defined using an Illumina 170K single nucleotide polymorphism array. Highly recurrent alterations were seen in up to 85% of some tumor types, most notably involving chromosomes 13, 22, and 38, and gliomas clustered into 2 major groups consisting of high-grade IV astrocytomas, or oligodendrogliomas and other tumors. Tumor types were characterized by specific broad and focal chromosomal events including focal loss of the INK4A/B locus in glioblastoma and loss of the RB1 gene and amplification of the PDGFRA gene in oligodendrogliomas. Genes associated with the 3 critical pathways in human high-grade gliomas (TP53, RB1, and RTK/RAS/PI3K) were frequently associated with canine aberrations. Analysis of oligodendrogliomas revealed regions of chromosomal losses syntenic to human 1p involving tumor suppressor genes, such as CDKN2C, as well as genes associated with apoptosis, autophagy, and response to chemotherapy and radiation. Analysis of high frequency chromosomal aberrations with respect to human orthologues may provide insight into both novel and common pathways in gliomagenesis and response to therapy. PMID:27251041
Furuya, Toshiki; Hirose, Satomi; Semba, Hisashi; Kino, Kuniki
The mimABCD gene cluster encodes the binuclear iron monooxygenase that oxidizes propane and phenol in Mycobacterium smegmatis strain MC2 155 and Mycobacterium goodii strain 12523. Interestingly, expression of the mimABCD gene cluster is induced by acetone. In this study, we investigated the regulator gene responsible for this acetone-responsive expression. In the genome sequence of M. smegmatis strain MC2 155, the mimABCD gene cluster is preceded by a gene designated mimR, which is divergently transcribed. Sequence analysis revealed that MimR exhibits amino acid similarity with the NtrC family of transcriptional activators, including AcxR and AcoR, which are involved in acetone and acetoin metabolism, respectively. Unexpectedly, many homologs of the mimR gene were also found in the sequenced genomes of actinomycetes. A plasmid carrying a transcriptional fusion of the intergenic region between the mimR and mimA genes with a promoterless green fluorescent protein (GFP) gene was constructed and introduced into M. smegmatis strain MC2 155. Using a GFP reporter system, we confirmed by deletion and complementation analyses that the mimR gene product is the positive regulator of the mimABCD gene cluster expression that is responsive to acetone. M. goodii strain 12523 also utilized the same regulatory system as M. smegmatis strain MC2 155. Although transcriptional activators of the NtrC family generally control transcription using the σ54 factor, a gene encoding the σ54 factor was absent from the genome sequence of M. smegmatis strain MC2 155. These results suggest the presence of a novel regulatory system in actinomycetes, including mycobacteria. PMID:21856847
Background: We describe a hierarchical clustering algorithm for using Single Nucleotide Polymorphism (SNP) genetic data to assign individuals to populations. The method does not assume Hardy-Weinberg equilibrium and linkage equilibrium among loci in sample population individuals. Results: We show that the algorithm can assign sample individuals highly accurately to their corresponding ethnic groups in our tests using HapMap SNP data and it is also robust to admixed populations when tested with Perlegen SNP data. Moreover, it can detect fine-scale population structure as subtle as that between Chinese and Japanese by using genome-wide high-diversity SNP loci. Conclusion: The algorithm provides an alternative approach to the popular STRUCTURE program, especially for fine-scale population structure detection in genome-wide association studies. This is the first successful separation of Chinese and Japanese samples using random SNP loci with high statistical support.
Tarazanova, Mariya; Beerthuyzen, Marke; Siezen, Roland; Fernandez-Gutierrez, Marcela M; de Jong, Anne; van der Meulen, Sjoerd; Kok, Jan; Bachmann, Herwig
Lactococcus lactis MG1363 is an important Gram-positive model organism. It is a plasmid-free and phage-cured derivative of strain NCDO712. Plasmid-cured strains facilitate studies on molecular biological aspects, but many properties which make L. lactis an important organism in the dairy industry are plasmid encoded. We sequenced the total DNA of strain NCDO712 and, contrary to earlier reports, revealed that the strain carries 6 rather than 5 plasmids. A new 50-kb plasmid, designated pNZ712, encodes functional nisin immunity (nisCIP) and copper resistance (lcoRSABC). The copper resistance could be used as a marker for the conjugation of pNZ712 to L. lactis MG1614. A genome comparison with the plasmid-cured daughter strain MG1363 showed that the number of single nucleotide polymorphisms that accumulated in the laboratory since the strains diverged more than 30 years ago is limited to 11, of which only 5 lead to amino acid changes. The 16-kb plasmid pSH74 was found to contain a novel 8-kb pilus gene cluster spaCB-spaA-srtC1-srtC2, which is predicted to encode a pilin tip protein SpaC, a pilus basal subunit SpaB, and a pilus backbone protein SpaA. The sortases SrtC1/SrtC2 are most likely involved in pilus polymerization while the chromosomally encoded SrtA could act to anchor the pilus to peptidoglycan in the cell wall. Overexpression of the pilus gene cluster from a multi-copy plasmid in L. lactis MG1363 resulted in cell chaining, aggregation, rapid sedimentation and increased conjugation efficiency of the cells. Electron microscopy showed that the overexpression of the pilus gene cluster leads to appendages on the cell surfaces. A deletion of the gene encoding the putative basal protein spaB, by truncating spaCB, led to more pilus-like structures on the cell surface, but cell aggregation and cell chaining were no longer observed. This is consistent with the prediction that spaB is involved in the anchoring of the pili to the cell.
Wang, Kai-Hung; Lin, Cuei-Jyuan; Liu, Chou-Jen; Liu, Dai-Wei; Huang, Rui-Lan; Ding, Dah-Ching; Weng, Ching-Feng; Chu, Tang-Yuan
Epigenetic remodeling of cell adhesion genes is a common phenomenon in cancer invasion. This study aims to investigate global methylation of cell adhesion genes in cervical carcinogenesis and to apply them in early detection of cancer from cervical scraping. Genome-wide methylation array was performed on an investigation cohort, including 16 cervical intraepithelial neoplasia 3 (CIN3) and 20 cervical cancers (CA) versus 12 each of normal, inflammation and CIN1 as controls. Twelve members of clustered proto-cadherin (PCDH) genes were collectively methylated and silenced, which were validated in cancer cells of the cervix, endometrium, liver, head and neck, breast, and lung. In an independent cohort including 107 controls, 66 CIN1, 85 CIN2/3, and 38 CA, methylated PCDHA4 and PCDHA13 were detected in 2.8%, 24.2%, 52.9%, and 84.2% (P < 10^−25), and 2.8%, 24.2%, 50.6%, and 94.7% (P < 10^−29), respectively. In diagnosis of CIN2 or more severe lesion of the cervix, a combination test of methylated PCDHA4 or PCDHA13 from cervical scraping had a sensitivity, specificity, positive predictive value, and negative predictive value of 74.8%, 80.3%, 73%, and 81.8%, respectively. Testing of this combination from cervical scraping is equally sensitive but more specific than the human papillomavirus (HPV) test in diagnosis of CIN2 or more severe lesions. The study disclosed a collective methylation of PCDH genes in cancer of cervix and other sites. At least two of them can be promising diagnostic markers for cervical cancer noninferior to HPV
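The four diagnostic figures quoted above follow from a 2×2 confusion table. The sketch below shows the arithmetic. The cohort sizes (85 + 38 lesions of CIN2 or worse, 107 + 66 controls and CIN1) come from the abstract, but the true-positive and true-negative counts (92 and 139) are assumptions back-calculated so that the percentages match; they are not reported in the abstract. The function name `diagnostic_metrics` is likewise illustrative.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),  # positives correctly detected
        "specificity": 100 * tn / (tn + fp),  # negatives correctly cleared
        "ppv": 100 * tp / (tp + fp),          # positive predictive value
        "npv": 100 * tn / (tn + fn),          # negative predictive value
    }

# Cohort sizes taken from the abstract.
diseased = 85 + 38        # CIN2/3 + CA
non_diseased = 107 + 66   # controls + CIN1

# ASSUMED counts, back-calculated from the reported percentages.
tp, tn = 92, 139

metrics = diagnostic_metrics(tp, diseased - tp, tn, non_diseased - tn)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

Under these assumed counts the computed values round to the reported 74.8%, 80.3%, 73.0%, and 81.8%, which suggests the published figures are internally consistent with the stated cohort sizes.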
Background Streptomyces are well known for their capability to produce many bioactive secondary metabolites with medical and industrial importance. Here we report a novel bioactive phenazine compound, 6-((2-hydroxy-4-methoxyphenoxy)carbonyl)phenazine-1-carboxylic acid (HCPCA), extracted from Streptomyces kebangsaanensis, an endophyte isolated from the ethnomedicinal Portulaca oleracea. Methods The HCPCA chemical structure was determined using nuclear magnetic resonance spectroscopy. We conducted whole genome sequencing for the identification of the gene cluster(s) believed to be responsible for phenazine biosynthesis in order to map its corresponding pathway, in addition to bioinformatics analysis to assess the potential of S. kebangsaanensis in producing other useful secondary metabolites. Results The S. kebangsaanensis genome comprises an 8,328,719 bp linear chromosome with high GC content (71.35%), consisting of 12 rRNA operons, 81 tRNA, and 7,558 protein coding genes. We identified 24 gene clusters involved in polyketide, nonribosomal peptide, terpene, bacteriocin, and siderophore biosynthesis, as well as a gene cluster predicted to be responsible for phenazine biosynthesis. Discussion The HCPCA phenazine structure was hypothesized to derive from the combination of two biosynthetic pathways, phenazine-1,6-dicarboxylic acid and 4-methoxybenzene-1,2-diol, originated from the shikimic acid pathway. The identification of a biosynthesis pathway gene cluster for phenazine antibiotics might facilitate future genetic engineering design of new synthetic phenazine antibiotics. Additionally, these findings confirm the potential of S. kebangsaanensis for producing various antibiotics and secondary metabolites.
The Nkrp1 (Klrb1)-Clr (Clec2) genes encode a receptor-ligand system utilized by NK cells as an MHC-independent immunosurveillance strategy for innate immune responses. The related Ly49 family of MHC-I receptors displays extreme allelic polymorphism and haplotype plasticity. In contrast, previous BAC-mapping and aCGH studies in the mouse suggest the neighboring and related Nkrp1-Clr cluster is evolutionarily stable. To definitively compare the relative evolutionary rate of Nkrp1-Clr vs. Ly49 gene clusters, the Nkrp1-Clr gene clusters from two Ly49 haplotype-disparate inbred mouse strains, BALB/c and 129S6, were sequenced. Both Nkrp1-Clr gene cluster sequences are highly similar to the C57BL/6 reference sequence, displaying the same gene numbers and order, complete pseudogenes, and gene fragments. The Nkrp1-Clr clusters contain a strikingly dissimilar proportion of repetitive elements compared to the Ly49 clusters, suggesting that certain elements may be partly responsible for the highly disparate Ly49 vs. Nkrp1 evolutionary rate. Focused allelic polymorphisms were found within the Nkrp1b/d (Klrb1b), Nkrp1c (Klrb1c), and Clr-c (Clec2f) genes, suggestive of possible immune selection. Cell-type specific transcription of Nkrp1-Clr genes in a large panel of tissues/organs was determined. Clr-b (Clec2d) and Clr-g (Clec2i) showed wide expression, while other Clr genes showed more tissue-specific expression patterns. In situ hybridization revealed specific expression of various members of the Clr family in leukocytes/hematopoietic cells of immune organs, various tissue-restricted epithelial cells (including intestinal, kidney tubular, lung, and corneal progenitor epithelial cells), as well as myocytes. In summary, the Nkrp1-Clr gene cluster appears to evolve more slowly relative to the related Ly49 cluster, and likely regulates innate immunosurveillance in a tissue-specific manner.
Gene targeting in human somatic cells is of importance because it can be used to either delineate the loss-of-function phenotype of a gene or correct a mutated gene back to wild-type. Both of these outcomes require a form of DNA double-strand break (DSB) repair known as homologous recombination (HR). The mechanism of HR leading to gene targeting, however, is not well understood in human cells. Here, we demonstrate that a two-end, ends-out HR intermediate is valid for human gene targeting. Furthermore, the resolution step of this intermediate occurs via the classic DSB repair model of HR while synthesis-dependent strand annealing and Holliday Junction dissolution are, at best, minor pathways. Moreover, and in contrast to other systems, the positions of Holliday Junction resolution are evenly distributed along the homology arms of the targeting vector. Most unexpectedly, we demonstrate that when a meganuclease is used to introduce a chromosomal DSB to augment gene targeting, the mechanism of gene targeting is inverted to an ends-in process. Finally, we demonstrate that the anti-recombination activity of mismatch repair is a significant impediment to gene targeting. These observations significantly advance our understanding of HR and gene targeting in human cells.
Radioactive probes of high specific activity have been used for human gene localisation on metaphase chromosome preparations. Human 5S ribosomal RNA was used as a model system, as a probe for the localisation of human 5S ribosomal genes. 125I-labelled mouse 5S ribosomal RNA was used to study the 5S ribosomal gene content and arrangement in families with translocations on the long arm of chromosome 1 close to or containing the 5S ribosomal RNA locus, by in situ hybridisation to human metaphase chromosomes from peripheral blood cultures. This confirmed the chromosomal assignment of 5S ribosomal genes to 1q42-43. In situ hybridisation probes were also prepared from recombinant plasmids containing Xenopus laevis oocyte 5S or 28S/18S gene sequences to give [3H]-labelled cRNA and [3H]-labelled nick-translated plasmid DNA. Studies on the kinetics of hybridisation of plasmid probes with and without ribosomal gene sequences questioned the role of plasmid DNA for amplification of signal during gene localisation. Gene localisation was obtained with nick-translated plasmid DNA containing the 28S/18S ribosomal DNA insert after short exposure times, but poor results were obtained using a [3H]-labelled cRNA probe transcribed from the plasmid with the 5S gene insert. (author)
Acidovorax avenae subsp. avenae is the causal agent of bacterial brown stripe disease in rice. In this study, we characterized a novel horizontal transfer of a gene cluster, including tetR, on the chromosome of A. avenae subsp. avenae RS-1 by genome-wide analysis. TetR acted as a repressor in this gene cluster and the oxidative stress resistance was enhanced in tetR-deletion mutant strain. Electrophoretic mobility shift assay (EMSA) demonstrated that TetR regulator bound directly to the promoter of this gene cluster. Consistently, the results of quantitative real-time PCR also showed alterations in expression of associated genes. Moreover, the proteins affected by TetR under oxidative stress were revealed by comparing proteomic profiles of wild-type and mutant strains via 1D SDS-PAGE and LC-MS/MS analyses. Taken together, our results demonstrated that tetR gene in this novel gene cluster contributed to cell survival under oxidative stress, and TetR protein played an important regulatory role in growth kinetics, biofilm-forming capability, SOD and catalase activity, and oxide detoxicating ability.
Liu, He; Yang, Chun-Lan; Ge, Meng-Yu; Ibrahim, Muhammad; Li, Bin; Zhao, Wen-Jun; Chen, Gong-You; Zhu, Bo; Xie, Guan-Lin
Acidovorax avenae subsp. avenae is the causal agent of bacterial brown stripe disease in rice. In this study, we characterized a novel horizontal transfer of a gene cluster, including tetR, on the chromosome of A. avenae subsp. avenae RS-1 by genome-wide analysis. TetR acted as a repressor in this gene cluster and the oxidative stress resistance was enhanced in tetR-deletion mutant strain. Electrophoretic mobility shift assay demonstrated that TetR regulator bound directly to the promoter of this gene cluster. Consistently, the results of quantitative real-time PCR also showed alterations in expression of associated genes. Moreover, the proteins affected by TetR under oxidative stress were revealed by comparing proteomic profiles of wild-type and mutant strains via 1D SDS-PAGE and LC-MS/MS analyses. Taken together, our results demonstrated that tetR gene in this novel gene cluster contributed to cell survival under oxidative stress, and TetR protein played an important regulatory role in growth kinetics, biofilm-forming capability, superoxide dismutase and catalase activity, and oxide detoxicating ability.
Sonna, L.A; Sawka, M. N; Lilly, C. M
Microarray analysis of gene expression at the level of RNA has generated new insights into the relationship between cellular responses to acute heat shock in vitro, exercise, and exertional heat illness...
Yang, Xing; Xie, Lu; Li, Yixue; Wei, Chaochun
Estimating the number of genes in the human genome has long been an important problem in computational biology. With the new conception of the human as a super-organism, it is also interesting to estimate the number of genes in this human super-organism. We present our estimate of gene numbers in the human gut bacterial community, the largest microbial community inside the human super-organism. We obtained 552,700 unique genes from 202 complete human gut bacteria genomes. Then, a novel gene counting model was built to estimate the total number of genes by combining culture-independent sequence data and those complete genomes. 16S rRNAs were used to construct a three-level tree and different counting methods were introduced for the three levels: strain-to-species, species-to-genus, and genus-and-up. The model estimates that the total number of genes is about 9,000,000 after those with an identity percentage of 97% or higher were merged. Although this number is huge, we believe it is an underestimate. This is an initial step toward tackling the gene counting problem for the human super-organism, which will remain an open problem in the near future. The list of genomes used in this paper can be found in the supplementary table.
Gene duplications within the conserved Hox cluster are rare in animal evolution, but in Lepidoptera an array of divergent Hox-related genes (Shx genes) has been reported between pb and zen. Here, we use genome sequencing of five lepidopteran species (Polygonia c-album, Pararge aegeria, Callimorpha dominula, Cameraria ohridella, Hepialus sylvina) plus a caddisfly outgroup (Glyphotaelius pellucidus) to trace the evolution of the lepidopteran Shx genes. We demonstrate that Shx genes originated by tandem duplication of zen early in the evolution of the large clade Ditrysia; Shx are not found in a caddisfly and a member of the basally diverging Hepialidae (swift moths). Four distinct Shx genes were generated early in ditrysian evolution, and were stably retained in all descendent Lepidoptera except the silkmoth, which has additional duplications. Despite extensive sequence divergence, molecular modelling indicates that all four Shx genes have the potential to encode stable homeodomains. The four Shx genes have distinct spatiotemporal expression patterns in early development of the Speckled Wood butterfly (Pararge aegeria), with ShxC demarcating the future sites of extraembryonic tissue formation via strikingly localised maternal RNA in the oocyte. All four genes are also expressed in presumptive serosal cells, prior to the onset of zen expression. Lepidopteran Shx genes represent an unusual example of Hox cluster expansion and integration of novel genes into ancient developmental regulatory networks.
Luz; Nayibe; Garzon; Matthew; Wohlgemuth; Blair
Common bean is an important but often disease-susceptible legume crop of temperate, subtropical and tropical regions worldwide. The crop is affected by bacterial, fungal and viral pathogens. The strategy of resistance-gene homologue (RGH) cloning has proven to be an efficient tool for identifying markers and R (resistance) genes associated with resistances to diseases. Microsatellite or SSR markers can be identified by physical association with RGH clones on large-insert DNA clones such as bacterial artificial chromosomes (BACs). Our objectives in this work were to identify RGH-SSR in a BAC library from the Andean genotype G19833 and to test and map any polymorphic markers to identify associations with known positions of disease resistance genes. We developed a set of specific probes designed for clades of common bean RGH genes and then identified positive BAC clones and developed microsatellites from BACs having SSR loci in their end sequences. A total of 629 new RGH-SSRs were identified and named BMr (bean microsatellite RGH-associated markers). A subset of these markers was screened for detecting polymorphism in the genetic mapping population DOR364 × G19833. A genetic map was constructed with a total of 264 markers, among which were 80 RGH loci anchored to single-copy RFLP and SSR markers. Clusters of RGH-SSRs were observed on most of the linkage groups of common bean and in positions associated with R-genes and QTL. The use of these new markers to select for disease resistance is discussed.
Subtyping Salmonella enterica serovar enteritidis isolates from different sources by using sequence typing based on virulence genes and clustered regularly interspaced short palindromic repeats (CRISPRs).
Liu, Fenyun; Kariyawasam, Subhashinie; Jayarao, Bhushan M; Barrangou, Rodolphe; Gerner-Smidt, Peter; Ribot, Efrain M; Knabel, Stephen J; Dudley, Edward G
Salmonella enterica subsp. enterica serovar Enteritidis is a major cause of food-borne salmonellosis in the United States. Two major food vehicles for S. Enteritidis are contaminated eggs and chicken meat. Improved subtyping methods are needed to accurately track specific strains of S. Enteritidis related to human salmonellosis throughout the chicken and egg food system. A sequence typing scheme based on virulence genes (fimH and sseL) and clustered regularly interspaced short palindromic repeats (CRISPRs), CRISPR-including multi-virulence-locus sequence typing (designated CRISPR-MVLST), was used to characterize 35 human clinical isolates, 46 chicken isolates, 24 egg isolates, and 63 hen house environment isolates of S. Enteritidis. A total of 27 sequence types (STs) were identified among the 167 isolates. CRISPR-MVLST identified three persistent and predominant STs circulating among U.S. human clinical isolates and chicken, egg, and hen house environmental isolates in Pennsylvania, and an ST that was found only in eggs and humans. It also identified a potential environment-specific sequence type. Moreover, cluster analysis based on fimH and sseL identified a number of clusters, of which several were found in more than one outbreak, as well as 11 singletons. Further research is needed to determine if CRISPR-MVLST might help identify the ecological origins of S. Enteritidis strains that contaminate chickens and eggs.
Mojarrad, Majid; Abdolazimi, Yassan; Hajati, Jamshid; Modarressi, Mohammad Hossein
Objective(s) Recombinant adenoviruses are currently used for a variety of purposes, including in vitro gene transfer, in vivo vaccination, and gene therapy. Ability to infect many cell types, high efficiency in gene transfer, entering both dividing and non dividing cells, and growing to high titers make this virus a good choice for using in various experiments. In the present experiment, a recombinant adenovirus containing human IL-4 coding sequence was made. IL-4 has several characteristics ...
Zhang, Xianglan; Cha, In-Ho; Kim, Ki-Yeol
In this study, we investigated the consensus gene modules in head and neck cancer (HNC) and cervical cancer (CC). We used a publicly available gene expression dataset, GSE6791, which included 42 HNC, 14 normal head and neck, 20 CC and 8 normal cervical tissue samples. To exclude bias because of different human papilloma virus (HPV) types, we analyzed HPV16-positive samples only. We identified 3824 genes common to HNC and CC samples. Among these, 977 genes showed high connectivity and were used to construct consensus modules. We demonstrated eight consensus gene modules for HNC and CC using the dissimilarity measure and average linkage hierarchical clustering methods. These consensus modules included genes with significant biological functions, including ATP binding and extracellular exosome. Eigengene network analysis revealed the consensus modules were highly preserved with high connectivity. These findings demonstrate that HPV16-positive head and neck and cervical cancers share highly preserved consensus gene modules with common potentially therapeutic targets.
Transcriptional target genes show functional enrichment of genes. However, how many and how significantly transcriptional target genes include functional enrichments are still unclear. To address these issues, I predicted human transcriptional target genes using open chromatin regions, ChIP-seq data and DNA binding sequences of transcription factors in databases, and examined functional enrichment and gene expression level of putative transcriptional target genes. Gene Ontology annotations showed four times larger numbers of functional enrichments in putative transcriptional target genes than gene expression information alone, independent of transcriptional target genes. To compare the number of functional enrichments of putative transcriptional target genes between cells or search conditions, I normalized the number of functional enrichment by calculating its ratios in the total number of transcriptional target genes. With this analysis, native putative transcriptional target genes showed the largest normalized number of functional enrichments, compared with target genes including 5-60% of randomly selected genes. The normalized number of functional enrichments was changed according to the criteria of enhancer-promoter interactions such as distance from transcriptional start sites and orientation of CTCF-binding sites. Forward-reverse orientation of CTCF-binding sites showed significantly higher normalized number of functional enrichments than the other orientations. Journal papers showed that the top five frequent functional enrichments were related to the cellular functions in the three cell types. The median expression level of transcriptional target genes changed according to the criteria of enhancer-promoter assignments (i.e. interactions) and was correlated with the changes of the normalized number of functional enrichments of transcriptional target genes. Human putative transcriptional target genes showed significant functional enrichments. Functional
Cannella, Anthony P; Nguyen, Bichchau M; Piggott, Caroline D; Lee, Robert A; Vinetz, Joseph M; Mehta, Sanjay R
Cutaneous leishmaniasis (CL) is rarely seen in the United States, and the social and geographic context of the infection can be a key to its diagnosis and management. Four Somali and one Ethiopian, in U.S. Border Patrol custody, came to the United States by the same human trafficking route: Djibouti to Dubai to Moscow to Havana to Quito, and then overland through Colombia/Panama to the United States-Mexico border, where they were detained. Although traveling at different times, all five patients simultaneously presented to our institution with chronic ulcerative skin lesions at different sites and stages of evolution. Culture of biopsy specimens grew Leishmania panamensis. Soon thereafter, three individuals from East Africa traveling the identical route presented with L. panamensis CL to physicians in Tacoma, WA. We document here the association of a human trafficking route and New World CL. Clinicians and public health officials should be aware of this emerging infectious disease risk.
Pantano, Lorena; Jodar, Meritxell; Bak, Mads
-specific genes. The most abundant class of small noncoding RNAs in sperm are PIWI-interacting RNAs (piRNAs). Surprisingly, we found that human sperm cells contain piRNAs processed from pseudogenes. Clusters of piRNAs from human testes contain pseudogenes transcribed in the antisense strand and processed...... into small RNAs. Several human protein-coding genes contain antisense predicted targets of pseudogene-derived piRNAs in the male germline and these piRNAs are still found in mature sperm. Our study provides the most extensive data set and annotation of human sperm small RNAs to date and is a resource...... for further functional studies on the roles of sperm small RNAs. In addition, we propose that some of the pseudogene-derived human piRNAs may regulate expression of their parent gene in the male germline....
Huang, W.; Li, S.; Xu, S.
How people move in cities and what they do in various locations at different times form human activity patterns. Human activity patterns play a key role in urban planning, traffic forecasting, public health and safety, emergency response, friend recommendation, and so on. Therefore, scholars from different fields, such as social science, geography, transportation, physics and computer science, have made great efforts in modelling and analysing human activity patterns or human mobility patterns. One of the essential tasks in such studies is to find the locations or places where individuals stay to perform some kind of activity before further activity pattern analysis. In the era of Big Data, the emergence of social media along with wearable devices enables human activity data to be collected more easily and efficiently. Furthermore, the dimension of the accessible human activity data has been extended from two or three (space or space-time) to four dimensions (space, time and semantics). More specifically, not only the location and time at which people stay are collected, but also what people "say" at a given location and time can be obtained. The characteristics of these datasets shed new light on the analysis of human mobility, and new methodologies should accordingly be developed to handle them. Traditional methods such as neural networks, statistics and clustering have been applied to study human activity patterns using geosocial media data. Among them, clustering methods have been widely used to analyse spatiotemporal patterns. However, to the best of our knowledge, few clustering algorithms are specifically developed for handling datasets that contain spatial, temporal and semantic aspects all together. In this work, we propose a three-step human activity clustering method based on space, time and semantics to fill this gap. One year of Twitter data, posted in Toronto, Canada, is used to test the clustering-based method. The results show that the
Geromy G Moore
Aflatoxins are produced by Aspergillus flavus and A. parasiticus in oil-rich seed and grain crops and are a serious problem in agriculture, with aflatoxin B₁ being the most carcinogenic natural compound known. Sexual reproduction in these species occurs between individuals belonging to different vegetative compatibility groups (VCGs). We examined natural genetic variation in 758 isolates of A. flavus, A. parasiticus and A. minisclerotigenes sampled from single peanut fields in the United States (Georgia), Africa (Benin), Argentina (Córdoba), Australia (Queensland) and India (Karnataka). Analysis of DNA sequence variation across multiple intergenic regions in the aflatoxin gene clusters of A. flavus, A. parasiticus and A. minisclerotigenes revealed significant linkage disequilibrium (LD) organized into distinct blocks that are conserved across different localities, suggesting that genetic recombination is nonrandom and a global occurrence. To assess the contributions of asexual and sexual reproduction to fixation and maintenance of toxin chemotype diversity in populations from each locality/species, we tested the null hypothesis of an equal number of MAT1-1 and MAT1-2 mating-type individuals, which is indicative of a sexually recombining population. All samples were clone-corrected using multi-locus sequence typing which associates closely with VCG. For both A. flavus and A. parasiticus, when the proportions of MAT1-1 and MAT1-2 were significantly different, there was more extensive LD in the aflatoxin cluster and populations were fixed for specific toxin chemotype classes, either the non-aflatoxigenic class in A. flavus or the B₁-dominant and G₁-dominant classes in A. parasiticus. A mating type ratio close to 1∶1 in A. flavus, A. parasiticus and A. minisclerotigenes was associated with higher recombination rates in the aflatoxin cluster and less pronounced chemotype differences in populations. This work shows that the reproductive nature of
Nielsen, Jens Christian; Grijseels, Sietske; Prigent, Sylvain
Filamentous fungi produce a wide range of bioactive compounds with important pharmaceutical applications, such as antibiotic penicillins and cholesterol-lowering statins. However, less attention has been paid to fungal secondary metabolites compared to those from bacteria. In this study, we...... sequenced the genomes of 9 Penicillium species and, together with 15 published genomes, we investigated the secondary metabolism of Penicillium and identified an immense, unexploited potential for producing secondary metabolites by this genus. A total of 1,317 putative biosynthetic gene clusters (BGCs) were......-referenced the predicted pathways with published data on the production of secondary metabolites and experimentally validated the production of antibiotic yanuthones in Penicillia and identified a previously undescribed compound from the yanuthone pathway. This study is the first genus-wide analysis of the genomic...
Gonzalez-Dominguez, Jorge; Martin, Maria J
In this work we present MPIGeneNet, a parallel tool that applies Pearson's correlation and Random Matrix Theory to construct gene co-expression networks. It is based on the state-of-the-art sequential tool RMTGeneNet, which provides networks with high robustness and sensitivity at the expense of relatively long runtimes for large-scale input datasets. MPIGeneNet returns the same results as RMTGeneNet but improves the memory management, reduces the I/O cost, and accelerates the two most computationally demanding steps of co-expression network construction by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on two different systems using three typical input datasets shows that MPIGeneNet is significantly faster than RMTGeneNet. As an example, our tool is up to 175.41 times faster on a cluster with eight nodes, each one containing two 12-core Intel Haswell processors. Source code of MPIGeneNet, as well as a reference manual, are available at https://sourceforge.net/projects/mpigenenet/.
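As a rough illustration of the correlation step that such tools parallelize, the following minimal sketch builds a co-expression edge list from pairwise Pearson correlations. It is not MPIGeneNet's or RMTGeneNet's code: the function names are hypothetical, the input is a toy dictionary, and the cutoff is a fixed, assumed value, whereas the tools described above derive it with Random Matrix Theory.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: correlation undefined, treat as no edge
    return cov / (sx * sy)

def coexpression_edges(expr, threshold):
    """Return (gene1, gene2, r) for every pair with |r| >= threshold.

    `threshold` is a fixed cutoff ASSUMED for this sketch; an RMT-based
    tool would choose it from the data instead.
    """
    edges = []
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges

# Toy expression profiles (hypothetical data, four samples per gene).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, -1.0, 1.0, -1.0],
}
edges = coexpression_edges(expr, threshold=0.9)
```

The all-pairs loop is quadratic in the number of genes, which is why this step dominates runtime on large datasets and is a natural target for distribution across cluster nodes.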
Actinorhodopsins (ActRs) are recently discovered proteorhodopsins present in Actinobacteria, enabling them to adapt to a wider spectrum of environmental conditions. Frequently, a large fraction of freshwater bacterioplankton belongs to the acI lineage of Actinobacteria and codes the LG1 type of ActRs. In this paper we studied the genotype variability of the LG1 ActRs. We have constructed two clone libraries originating from two environmentally different habitats located in Central Europe: the large alkaline lake Mondsee (Austria) and the small humic reservoir Jiřická (the Czech Republic). The 75 yielded clones were phylogenetically analyzed together with all ActR sequences currently available in public databases. Altogether 156 sequences were analyzed and 13 clusters of ActRs were distinguished. Newly obtained clones are distributed over all three LG1 subgroups: LG1-A, B and C. Eighty percent of the sequences belonged to the acI lineage (LG1-A ActR gene bearers), further divided into LG1-A1 and LG1-A2 subgroups. Interestingly, the two habitats markedly differed in genotype composition with no identical sequence found in both samples of clones. Moreover, Jiřická reservoir contained three so far not reported clusters, one of them LG1-C related, presenting thus completely new, so far undescribed, genotypes of Actinobacteria in freshwaters.
Martens, Geert A; Jiang, Lei; Hellemans, Karine H
The aim of this study was to establish a gene expression blueprint of pancreatic beta cells conserved from rodents to humans and to evaluate its applicability to assess shifts in the beta cell differentiated state. Genome-wide mRNA expression profiles of isolated beta cells were compared to those...... of a large panel of other tissue and cell types, and transcripts with beta cell-abundant and -selective expression were identified. Iteration of this analysis in mouse, rat and human tissues generated a panel of conserved beta cell biomarkers. This panel was then used to compare isolated versus laser capture...
Chang, Jinyuan; Zhou, Wen; Zhou, Wen-Xin; Wang, Lan
Comparing large covariance matrices has important applications in modern genomics, where scientists are often interested in understanding whether relationships (e.g., dependencies or co-regulations) among a large number of genes vary between different biological states. We propose a computationally fast procedure for testing the equality of two large covariance matrices when the dimensions of the covariance matrices are much larger than the sample sizes. A distinguishing feature of the new procedure is that it imposes no structural assumptions on the unknown covariance matrices. Hence, the test is robust with respect to various complex dependence structures that frequently arise in genomics. We prove that the proposed procedure is asymptotically valid under weak moment conditions. As an interesting application, we derive a new gene clustering algorithm which shares the same nice property of avoiding restrictive structural assumptions for high-dimensional genomics data. Using an asthma gene expression dataset, we illustrate how the new test helps compare the covariance matrices of the genes across different gene sets/pathways between the disease group and the control group, and how the gene clustering algorithm provides new insights on the way gene clustering patterns differ between the two groups. The proposed methods have been implemented in an R-package HDtest and are available on CRAN. © 2016, The International Biometric Society.
MARIA A. RADANOVA
C1q is the first component of the classical pathway of complement activation. The coding region for C1q is localized on chromosome 1p34.1–36.3. Mutations or single nucleotide polymorphisms (SNPs) in the C1q gene cluster can lead to the development of systemic lupus erythematosus (SLE) because of C1q deficiency or for other, unknown reasons. We selected five SNPs located in a 7.121 kbp region on chromosome 1 that were previously associated with SLE and/or low C1q level, but that do not cause C1q deficiency, and analyzed them in terms of allele frequencies and genotype distribution in comparison with Hispanic, Asian, African and other Caucasian cohorts. These SNPs were rs587585, rs292001, rs172378, rs294179 and rs631090. One hundred eighty-five healthy Bulgarian volunteers were genotyped for the selected five C1q SNPs by quantitative real-time PCR methods. The International HapMap Project was used for information about genotype distribution and allele frequencies of the five SNPs in Hispanic, Asian, African and other Caucasian cohorts. Bulgarian healthy volunteers and another pooled Caucasian cohort had similar frequencies of genotypes and alleles of the rs587585, rs292001, rs294179 and rs631090 SNPs. Nevertheless, genotype AA of rs172378 was significantly overrepresented in Bulgarians when compared to other healthy Caucasians from the USA and UK (60% vs 31%). The genotype distribution of rs172378 in Bulgarians was similar to that of Greek-Cypriot Caucasians. For all Caucasians the major allele of rs172378 was A. This is the first study analyzing the allele frequencies and genotype distribution of C1q gene cluster SNPs in a healthy Bulgarian population.
Roehrdanz, R; Heilmann, L; Senechal, P; Sears, S; Evenson, P
Histones are the major protein component of chromatin structure. The histone family is made up of a quintet of proteins, four core histones (H2A, H2B, H3 & H4) and the linker histones (H1). Spacers are found between the coding regions. Among insects this quintet of genes is usually clustered and the clusters are tandemly repeated. Ribosomal DNA contains a cluster of the rRNA sequences 18S, 5.8S and 28S. The rRNA genes are separated by the spacers ITS1, ITS2 and IGS. This cluster is also tandemly repeated. We found that the ribosomal RNA repeat unit of at least two species of Anthonomine weevils, Anthonomus grandis and Anthonomus texanus (Coleoptera: Curculionidae), is interspersed with a block containing the histone gene quintet. The histone genes are situated between the rRNA 18S and 28S genes in what is known as the intergenic spacer region (IGS). The complete reiterated Anthonomus grandis histone-ribosomal sequence is 16,248 bp.
Ivana Crnovčić,1 Christian Rückert,2 Siamak Semsary,1 Manuel Lang,1 Jörn Kalinowski,2 Ullrich Keller1 1Institut für Chemie, Technische Universität Berlin, Berlin-Charlottenburg, 2Technology Platform Genomics, Center for Biotechnology, Bielefeld University, Bielefeld, Germany Abstract: Sequencing the actinomycin (acm) biosynthetic gene cluster of Streptomyces antibioticus IMRU 3720, which produces actinomycin X (Acm X), revealed 20 genes organized into a framework highly similar to that of the bi-armed acm C biosynthetic gene cluster of Streptomyces chrysomallus, but without an attached extra arm of orthologues as in the latter. Curiously, the extra arm of the S. chrysomallus gene cluster turned out to perfectly match the single arm of the S. antibioticus gene cluster in the same order of orthologues, including the presence of two pseudogenes, scacmM and scacmN, encoding a cytochrome P450 and its ferredoxin, respectively. Orthologues of the latter genes were both missing in the principal arm of the S. chrysomallus acm C gene cluster. All orthologues of the extra arm showed a G+C content different from that of their counterparts in the principal arm. Moreover, the translation products from the extra arm were all more similar to the corresponding translation products of orthologous genes from the S. antibioticus acm X gene cluster than to those encoded by the principal arm of their own gene cluster. This suggests that the duplicated structure of the S. chrysomallus acm C biosynthetic gene cluster evolved from a previous fusion between two one-armed acm gene clusters, each from a different genetic background. However, while scacmM and scacmN in the extra arm of the S. chrysomallus acm C gene cluster are mutated and therefore non-functional, their orthologues saacmM and saacmN in the S. antibioticus acm C gene cluster show no defects, seemingly encoding active enzymes with functions specific for Acm X biosynthesis. Both acm
Kattoor, Jobin Jose; Saurabh, Sharad; Malik, Yashpal Singh; Sircar, Shubhankar; Dhama, Kuldeep; Ghosh, Souvik; Bányai, Krisztián; Kobayashi, Nobumichi; Singh, Raj Kumar
Rotavirus C (RVC), a known etiological agent of diarrheal outbreaks, mainly affects swine populations globally, with sporadic incidence in humans, cattle, ferrets, mink and dogs. The aim was to demonstrate the presence of RVC in the Indian swine population and to characterize its selected structural (VP6) and non-structural (NSP4 and NSP5) genes. A total of 108 diarrheic samples from different regions of India were used. Isolated RNA was loaded onto polyacrylamide gel to screen for the presence of RVs through the identification of the specific electrophoretic genomic migration pattern. To characterize the RVC strains, the VP6, NSP4 and NSP5 genes were amplified, sequenced and analyzed. Based on VP6 gene-specific diagnostic RT-PCR, the presence of RVC was confirmed in 12.0% (13/108) of piglet fecal specimens. The nucleotide sequence analysis of the VP6 gene, encoding the inner capsid protein, from selected porcine RVC (PoRVC) strains revealed more than 93% homology to human RVC (HuRVC) strains of Eurasian origin. These strains were distant from hitherto reported PoRVCs and clustered with HuRVCs of the I2 genotype. However, the two non-structural genes, i.e. NSP4 and NSP5, of these strains were found to be of swine type, signifying a re-assortment event that has occurred in the Indian swine population. The findings indicate the presence of human-like RVC in Indian pigs and division of the RVC clade with the I2 genotype into further sub-clades. To the best of our knowledge, this appears to be the first report of RVC in the Indian swine population. The incidence of the human-like RVC VP6 gene in swine supports its zoonotic potential.
Sandy A van Gool
We used human fetal bone marrow-derived mesenchymal stromal cells (hfMSCs) differentiating towards chondrocytes as an alternative model for the human growth plate (GP). Our aims were to study gene expression patterns associated with chondrogenic differentiation and to assess whether chondrocytes derived from hfMSCs are a suitable model for studying the development and maturation of the GP. hfMSCs efficiently formed hyaline cartilage in a pellet culture in the presence of TGFβ3 and BMP6. Microarray and principal component analysis were applied to study gene expression profiles during chondrogenic differentiation. A set of 232 genes was found to correlate with in vitro cartilage formation. Several identified genes are known to be involved in cartilage formation and validate the robustness of the differentiating hfMSC model. KEGG pathway analysis using the 232 genes revealed 9 significant signaling pathways correlated with cartilage formation. To determine the progression of growth plate cartilage formation, we compared the gene expression profile of differentiating hfMSCs with previously established expression profiles of epiphyseal GP cartilage. As differentiation towards chondrocytes proceeds, hfMSCs gradually obtain a gene expression profile resembling epiphyseal GP cartilage. We visualized the differences in gene expression profiles as protein interaction clusters and identified many protein clusters that are activated during the early chondrogenic differentiation of hfMSCs, showing the potential of this system to study GP development.
In this article the rapid advances made in the molecular genetics of inherited disorders of hypo- and hyperpigmentation during the past three years are reviewed. The main focus is on studies in mice as compared to homologues in humans. The main hypomelanotic diseases included are: piebaldism (white spotting due to mutations of c-KIT, PDGF and MGF genes); vitiligo (microphthalmia mice mutations of c-Kit and c-fms genes); Waardenburg syndrome (splotch locus mutations of mice PAX-3 or human Hup-2 genes); albinism (mutations of tyrosinase genes); Menkes disease (Mottled mouse); premature graying (mutations in light/brown locus/gp75/TRP-1); Griscelli disease (mutations in TRP-1 and steel); Prader-Willi and Angelman syndromes, tyrosinase-positive oculocutaneous albinism and hypomelanosis of Ito (mutations of pink-eyed dilution gene/mapping to human chromosome 15q11.2-q12); and human platelet storage pool deficiency diseases due to defects in pallidin, an erythrocyte membrane protein (pallid mouse/mapping to 4.2 pallidin gene). The genetic characterization of hypermelanosis includes neurofibromatosis 1 (café-au-lait spots) and McCune-Albright syndrome. Rapidly evolving knowledge about pigmentary genes will further increase the understanding of these hypo- and hyperpigmentary disorders.
Background: Translational selection is a ubiquitous and significant mechanism to regulate protein expression in prokaryotes and unicellular eukaryotes. Recent evidence has shown that translational selection is weakly operative in highly expressed genes in human and other vertebrates. However, it remains unclear whether translational selection acts differentially on human genes depending on their expression patterns. Results: Here we report that human housekeeping (HK) genes, strictly defined as genes that are expressed ubiquitously and consistently in most or all tissues, are under stronger translational selection. Conclusions: These observations clearly show that translational selection is also closely associated with expression pattern. Our results suggest that human HK genes are more efficiently and/or accurately translated into proteins, which will inevitably open up a new understanding of HK genes and the regulation of gene expression. Reviewers: This article was reviewed by Yuan Yuan, Baylor College of Medicine; Han Liang, University of Texas MD Anderson Cancer Center (nominated by Dr Laura Landweber); Eugene Koonin, NCBI, NLM, NIH, United States of America; Sandor Pongor, International Centre for Genetic Engineering and Biotechnology (ICGEB), Italy. © 2014 Ma et al.; licensee BioMed Central Ltd.
Thompson, L.H.; Brookman, K.W.; Weber, C.A.; Salazar, E.P.; Stewart, S.A.; Carrano, A.V.
The isolation of two additional human genes that give efficient restoration of the repair defects in other CHO mutant lines is reported. The gene designated ERCC2 (Excision Repair Complementing Chinese hamster) corrects mutant UV5 from complementation group 1. They recently cloned this gene by first constructing a secondary transformant in which the human gene was shown to have become physically linked to the bacterial gpt dominant-marker gene by cotransfer in calcium phosphate precipitates in the primary transfection. Transformants expressing both genes were recovered by selecting for resistance to both UV radiation and mycophenolic acid. Using similar methods, the human gene that corrects CHO mutant EM9 was isolated in cosmids and named XRCC1 (X-ray Repair Complementing Chinese hamster). In this case, transformants were recovered by selecting for resistance to CldUrd, which kills EM9 very efficiently. In both genomic and cosmid transformants, the XRCC1 gene restored resistance to the normal range. DNA repair was studied using the kinetics of strand-break rejoining, which was measured after exposure to 137Cs γ-rays.
Nepal, Keshav Kumar; Yoo, Jin Cheol; Sohng, Jae Kyung
KanP, a putative methyltransferase, is located in the kanamycin biosynthetic gene cluster of Streptomyces kanamyceticus ATCC12853. Amino acid sequence analysis of KanP revealed the presence of S-adenosyl-L-methionine binding motifs, which are present in other O-methyltransferases. The kanP gene was expressed in Escherichia coli BL21 (DE3) to generate the E. coli KANP recombinant strain. The conversion of external quercetin to methylated quercetin in the culture extract of E. coli KANP proved the function of kanP as S-adenosyl-L-methionine-dependent methyltransferase. This is the first report concerning the identification of an O-methyltransferase gene from the kanamycin gene cluster. The resistant activity assay and RT-PCR analysis demonstrated the leeway for obtaining methylated kanamycin derivatives from the wild-type strain of kanamycin producer. 2009 Elsevier GmbH. All rights reserved.
Hoeijmakers, J.H.J.; van Duin, M.; Westerveld, A.; Yasui, A.; Bootsma, D.
To identify human DNA repair genes we have transfected human genomic DNA ligated to a dominant marker into excision repair-deficient xeroderma pigmentosum (XP) and CHO cells. This resulted in the cloning of a human gene, ERCC-1, that complements the defect of the UV- and mitomycin-C-sensitive CHO mutant 43-3B. The ERCC-1 gene has a size of 15 kb, consists of 10 exons and is located in the region 19q13.2-q13.3. Its primary transcript is processed into two mRNAs by alternative splicing of an internal coding exon. One of these transcripts encodes a polypeptide of 297 amino acids. A putative DNA-binding protein domain and nuclear location signal could be identified. Significant AA-homology is found between ERCC-1 and the yeast excision repair gene RAD10. 58 references, 6 figures, 1 table
Background: Uncovering the cellular roles of a protein is a task of tremendous importance and complexity that requires dedicated experimental work as well as often sophisticated data mining and processing tools. Protein functions, often referred to as annotations, are believed to manifest themselves through the topology of the networks of inter-protein interactions. In particular, there is a growing body of evidence that proteins performing the same function are more likely to interact with each other than with proteins with other functions. However, since functional annotation and protein network topology are often studied separately, the direct relationship between them has not been comprehensively demonstrated. In addition to having general biological significance, such a demonstration would further validate the data extraction and processing methods used to compose protein annotation and protein-protein interaction datasets. Results: We developed a method for automatic extraction of protein functional annotation from scientific text based on Natural Language Processing (NLP) technology. For the protein annotation extracted from the entire PubMed, we evaluated the precision and recall rates, and compared the performance of the automatic extraction technology to that of the manual curation used in public Gene Ontology (GO) annotation. In the second part of our presentation, we reported a large-scale investigation into the correspondence between communities in the literature-based protein networks and GO annotation groups of functionally related proteins. We found a comprehensive two-way match: proteins within biological annotation groups form significantly denser linked network clusters than expected by chance and, conversely, densely linked network communities exhibit a pronounced non-random overlap with GO groups. We also expanded the publicly available GO biological process annotation using the relations extracted by our NLP technology
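One common way to quantify a "non-random overlap" between a network community and an annotation group, as discussed in the abstract above, is a hypergeometric tail probability. The sketch below is only an assumption about how such an overlap could be scored: the abstract does not state which test the authors used, and the function names and toy protein identifiers are invented for illustration.

```python
from math import comb


def hypergeom_tail(population, marked, drawn, observed):
    """P(X >= observed) for a hypergeometric draw.

    population: number of proteins in the network
    marked:     number of proteins in the annotation group
    drawn:      number of proteins in the network community
    observed:   number of proteins shared by community and group
    """
    upper = min(marked, drawn)
    total = comb(population, drawn)
    hits = sum(
        comb(marked, i) * comb(population - marked, drawn - i)
        for i in range(observed, upper + 1)
    )
    return hits / total


def overlap_pvalue(universe, community, group):
    """Score the overlap of two protein sets drawn from `universe`."""
    community = community & universe
    group = group & universe
    shared = len(community & group)
    return hypergeom_tail(len(universe), len(group), len(community), shared)


# Toy example (invented identifiers): 20 proteins, a 5-protein community
# and a 5-protein annotation group sharing 3 members.
universe = {f"P{i}" for i in range(20)}
community = {"P0", "P1", "P2", "P3", "P4"}
group = {"P2", "P3", "P4", "P5", "P6"}
p_value = overlap_pvalue(universe, community, group)
```

A small tail probability suggests the overlap is larger than expected by chance; when many community and group pairs are tested, a multiple-testing correction would also be needed.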
Georg K Gerber
An important research problem in computational biology is the identification of expression programs, sets of co-expressed genes orchestrating normal or pathological processes, and the characterization of the functional breadth of these programs. The use of human expression data compendia for discovery of such programs presents several challenges including cellular inhomogeneity within samples, genetic and environmental variation across samples, uncertainty in the numbers of programs and sample populations, and temporal behavior. We developed GeneProgram, a new unsupervised computational framework based on Hierarchical Dirichlet Processes that addresses each of the above challenges. GeneProgram uses expression data to simultaneously organize tissues into groups and genes into overlapping programs with consistent temporal behavior, to produce maps of expression programs, which are sorted by generality scores that exploit the automatically learned groupings. Using synthetic and real gene expression data, we showed that GeneProgram outperformed several popular expression analysis methods. We applied GeneProgram to a compendium of 62 short time-series gene expression datasets exploring the responses of human cells to infectious agents and immune-modulating molecules. GeneProgram produced a map of 104 expression programs, a substantial number of which were significantly enriched for genes involved in key signaling pathways and/or bound by NF-kappaB transcription factors in genome-wide experiments. Further, GeneProgram discovered expression programs that appear to implicate surprising signaling pathways or receptor types in the response to infection, including Wnt signaling and neurotransmitter receptors. We believe the discovered map of expression programs involved in the response to infection will be useful for guiding future biological experiments; genes from programs with low generality scores might serve as new drug targets that exhibit minimal
Battle, Alexis; Brown, Christopher D.; Engelhardt, Barbara E.; Montgomery, Stephen B.; Aguet, François; Ardlie, Kristin G.; Cummings, Beryl B.; Gelfand, Ellen T.; Getz, Gad; Hadley, Kane; Handsaker, Robert E.; Huang, Katherine H.; Kashin, Seva; Karczewski, Konrad J.; Lek, Monkol; Li, Xiao; MacArthur, Daniel G.; Nedzel, Jared L.; Nguyen, Duyen T.; Noble, Michael S.; Segrè, Ayellet V.; Trowbridge, Casandra A.; Tukiainen, Taru; Abell, Nathan S.; Balliu, Brunilda; Barshir, Ruth; Basha, Omer; Bogu, Gireesh K.; Brown, Andrew; Castel, Stephane E.; Chen, Lin S.; Chiang, Colby; Conrad, Donald F.; Cox, Nancy J.; Damani, Farhan N.; Davis, Joe R.; Delaneau, Olivier; Dermitzakis, Emmanouil T.; Eskin, Eleazar; Ferreira, Pedro G.; Frésard, Laure; Gamazon, Eric R.; Garrido-Martín, Diego; Gewirtz, Ariel D. H.; Gliner, Genna; Gloudemans, Michael J.; Guigo, Roderic; Hall, Ira M.; Han, Buhm; He, Yuan
Characterization of the molecular function of the human genome and its variation across individuals is essential for identifying the cellular mechanisms that underlie human genetic traits and diseases. The Genotype-Tissue Expression (GTEx) project aims to characterize variation in gene expression
N.O. Dillon (Niall); F.G. Grosveld (Frank)
Erythropoiesis during human development is characterized by switches in expression of beta-like globin genes during the transition from the embryonic through fetal to adult stages. Activation and high-level expression of the genes is directed by the locus control region (LCR), located 5'
Wolock, Samuel L.; Yates, Andrew; Petrill, Stephen A.; Bohland, Jason W.; Blair, Clancy; Li, Ning; Machiraju, Raghu; Huang, Kun; Bartlett, Christopher W.
Background: Numerous studies have examined gene × environment interactions (G × E) in cognitive and behavioral domains. However, these studies have been limited in that they have not been able to directly assess differential patterns of gene expression in the human brain. Here, we assessed G × E interactions using two publicly available datasets…
Luis F. Larrondo; Bernardo Gonzalez; Dan Cullen; Rafael Vicuna
A cluster of multicopper oxidase genes (mco1, mco2, mco3, mco4) from the lignin-degrading basidiomycete Phanerochaete chrysosporium is described. The four genes share the same transcriptional orientation within a 25 kb region. mco1, mco2 and mco3 are tightly grouped, with intergenic regions of 2.3 and 0.8 kb, respectively, whereas mco4 is located 11 kb upstream of mco1...
Unthan, Simon; Baumgart, Meike; Radek, Andreas; Herbst, Marius; Siebert, Daniel; Brühl, Natalie; Bartsch, Anna; Bott, Michael; Wiechert, Wolfgang; Marin, Kay; Hans, Stephan; Krämer, Reinhard; Seibold, Gerd; Frunzke, Julia; Kalinowski, Jörn; Rückert, Christian; Wendisch, Volker F; Noack, Stephan
For synthetic biology applications, a robust structural basis is required, which can be constructed either from scratch or in a top-down approach starting from any existing organism. In this study, we initiated the top-down construction of a chassis organism from Corynebacterium glutamicum ATCC 13032, aiming for the relevant gene set to maintain its fast growth on defined medium. We evaluated each native gene for its essentiality considering expression levels, phylogenetic conservation, and knockout data. Based on this classification, we determined 41 gene clusters ranging from 3.7 to 49.7 kbp as target sites for deletion. 36 deletions were successful, and 10 genome-reduced strains showed impaired growth rates, indicating that the deletions had hit genes relevant to maintaining biological fitness at wild-type level. In contrast, 26 deleted clusters were found to include exclusively genes irrelevant for growth on defined medium. A combinatory deletion of all irrelevant gene clusters would, in a prophage-free strain, decrease the size of the native genome by about 722 kbp (22%) to 2561 kbp. Finally, five combinatory deletions of irrelevant gene clusters were investigated. The study introduces the novel concept of relevant genes and demonstrates general strategies to construct a chassis suitable for biotechnological application. © 2014 The Authors. Biotechnology Journal published by Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. This is an open access article under the terms of the Creative Commons Attribution-Non-Commercial-NoDerivs Licence, which permits use and distribution in any medium, provided the original work is properly cited, the use is non-commercial and no modifications or adaptations are made.
Background: Gene and genome duplication is the principal creative force in evolution. Recently, protein subcellular relocalization, or neolocalization, was proposed as one of the mechanisms responsible for the retention of duplicated genes. This hypothesis received support from the analysis of yeast genomes, but has not been tested thoroughly on animal genomes. In order to evaluate the importance of subcellular relocalizations for retention of duplicated genes in animal genomes, we systematically analyzed nuclear-encoded mitochondrial proteins in the human genome by reconstructing phylogenies of mitochondrial multigene families. Results: The 456 human mitochondrial proteins selected for this study were clustered into 305 gene families, including 92 multigene families. Among the multigene families, 59 (64%) consisted of both mitochondrial and cytosolic (non-mitochondrial) proteins (mt-cy families), while the remaining 33 (36%) were composed of mitochondrial proteins (mt-mt families). Phylogenetic analyses of mt-cy families revealed three different scenarios of their neolocalization following gene duplication: 1) relocalization from mitochondria to cytosol, 2) from cytosol to mitochondria and 3) multiple subcellular relocalizations. The neolocalizations were most commonly enabled by the gain or loss of N-terminal mitochondrial targeting signals. The majority of detected subcellular relocalization events occurred early in animal evolution, preceding the evolution of tetrapods. Mt-mt protein families showed a somewhat different pattern, where gene duplication occurred more evenly in time. However, for both types of protein families, most duplication events appear to roughly coincide with two rounds of genome duplications early in vertebrate evolution. Finally, we evaluated the effects of inaccurate and incomplete annotation of mitochondrial proteins and found that our conclusion of the importance of subcellular relocalization after gene duplication on
Background: Long Interspersed Nuclear Elements (LINEs) are the most abundant retrotransposons in humans. About 79% of human genes are estimated to contain at least one segment of LINE per transcription unit. Recent studies have shown that LINE elements can affect protein sequences, splicing patterns and expression of human genes. Description: We have developed a database, LINE FUSION GENES, for elucidating LINE expression throughout the human gene database. We searched the 28,171 genes listed in the NCBI database for LINE elements and analyzed their structures and expression patterns. The results show that the mRNA sequences of 1,329 genes were affected by LINE expression. The LINE expression types were classified on the basis of LINEs in the 5' UTR, exon or 3' UTR sequences of the mRNAs. Our database provides further information, such as the tissue distribution and chromosomal location of the genes, and the domain structure that is changed by LINE integration. We have linked all the accession numbers to the NCBI data bank to provide mRNA sequences for subsequent users. Conclusion: We believe that our work will interest genome scientists and might help them to gain insight into the implications of LINE expression for human evolution and disease. Availability: http://www.primate.or.kr/line
Thompson, L.H.; Weber, C.A.; Brookman, K.W.; Salazar, E.P.; Stewart, S.A.; Mitchell, D.L.
The DNA repair systems of rodent and human cells appear to be at least as complex genetically as those in lower eukaryotes and bacteria. The use of mutant lines of rodent cells as a means of identifying human repair genes by functional complementation offers a new approach toward studying the role of repair in mutagenesis and carcinogenesis. In each of six cases examined using hybrid cells, specific human chromosomes have been identified that correct CHO cell mutations affecting repair of damage from UV or ionizing radiation. This finding suggests that both the repair genes and proteins may be virtually interchangeable between rodent and human cells. Using cosmid vectors, human repair genes that map to chromosome 19 have been cloned as functional sequences: ERCC2 and XRCC1. ERCC1 was found to have homology with the yeast excision repair gene RAD10. Transformants of repair-deficient cell lines carrying the corresponding human gene show efficient correction of repair capacity by all criteria examined. 39 refs., 1 fig., 1 tab
Cocozza, S; Garofalo, S; Robledo, R; Monticelli, A; Conti, A; Chiarotti, L; Frunzio, R; Bruni, C B; Varrone, S
The probe was a 500 bp cDNA containing exons 2-3 and 4 of the human IGF II gene. The clone was isolated by screening a human liver cDNA library with synthetic oligonucleotides. Eco RI digestion of genomic DNA and hybridization with the IGF II probe detects a two-allele polymorphism with allelic fragments of 13.5 kb and 10.5 kb. The frequency was studied in 38 unrelated Caucasians. The human IGF II gene was localized on the short arm of chromosome 11 (p15) by in situ hybridization. Codominant segregation was observed in 2 Caucasian families (10 individuals).
Zhang, Chan; Liang, Jian; Yang, Le; Chai, Shiyuan; Zhang, Chenxi; Sun, Baoguo; Wang, Chengtao
This study investigated the effects of glutamic acid on production of monacolin K and expression of the monacolin K biosynthetic gene cluster. When Monascus M1 was grown in glutamic medium instead of in the original medium, monacolin K production increased from 48.4 to 215.4 mg l⁻¹, an increase of about 3.5 times. Glutamic acid enhanced monacolin K production by upregulating the expression of mokB-mokI; on day 8, the expression level of mokA, measured by reverse transcription-polymerase chain reaction, tended to decrease. Our findings demonstrated that mokA was not a key gene responsible for the quantity of monacolin K production in the presence of glutamic acid. Observation of Monascus mycelium morphology using a scanning electron microscope showed that glutamic acid significantly increased the content of Monascus mycelium, altered the permeability of Monascus mycelium, enhanced secretion of monacolin K from the cell, and reduced the monacolin K content in Monascus mycelium, thereby enhancing monacolin K production.
Makarova, Kira; Wolf, Yuri; Koonin, Eugene
With the continuously accelerating genome sequencing from diverse groups of archaea and bacteria, accurate identification of gene orthology and availability of readily expandable clusters of orthologous genes are essential for the functional annotation of new genomes. We report an update of the collection of archaeal Clusters of Orthologous Genes (arCOGs) to cover, on average, 91% of the protein-coding genes in 168 archaeal genomes. The new arCOGs were constructed using refined algorithms for...
Arashida, Ryo; Kakizawa, Shigeyuki; Hoshi, Ayaka; Ishii, Yoshiko; Jung, Hee-Young; Kagiwada, Satoshi; Yamaji, Yasuyuki; Oshima, Kenro; Namba, Shigetou
Phytoplasmas are phloem-limited plant pathogens that are transmitted by insect vectors and are associated with diseases in hundreds of plant species. Despite their small sizes, phytoplasma genomes have repeat-rich sequences, which are due to several genes that are encoded as multiple copies. These multiple genes exist in a gene cluster, the potential mobile unit (PMU). PMUs are present at several distinct regions in the phytoplasma genome. The multicopy genes encoded by PMUs (herein named mobile unit genes [MUGs]) and similar genes elsewhere in the genome (herein named fundamental genes [FUGs]) are likely to have the same function based on their annotations. In this manuscript we show evidence that MUGs and FUGs do not cluster together within the same clade. Each MUG is in a cluster with a short branch length, suggesting that MUGs are recently diverged paralogs, whereas the origin of FUGs is different from that of MUGs. We also compared the genome structures around the lplA gene in two derivative lines of the 'Candidatus Phytoplasma asteris' OY strain, the severe-symptom line W (OY-W) and the mild-symptom line M (OY-M). The gene organizations of the nucleotide sequences upstream of the lplA genes of OY-W and OY-M were dramatically different. The tra5 insertion sequence, an element of PMUs, was found only in this region in OY-W. These results suggest that transposition of entire PMUs and PMU sections has occurred frequently in the OY phytoplasma genome. The difference in the pathogenicities of OY-W and OY-M might be caused by the duplication and transposition of PMUs, followed by genome rearrangement.
Gottelt, Marco; Kol, Stefan; Gomez-Escribano, Juan Pablo; Bibb, Mervyn; Takano, Eriko
Genome sequencing of Streptomyces coelicolor A3(2) revealed an uncharacterized type I polyketide synthase gene cluster (cpk). Here we describe the discovery of a novel antibacterial activity (abCPK) and a yellow-pigmented secondary metabolite (yCPK) after deleting a presumed pathway-specific
Lottrup, Grete; Belling, Kirstine González-Izarzugaza; Leffers, Henrik
the normally clustered and hyperplastic ALCs. WHAT IS KNOWN ALREADY: LCs are the primary androgen producing cells in males throughout development and appear in chronologically distinct populations; FLCs, neonatal LCs and ALCs. ALCs are responsible for progression through puberty and for maintenance...... of reproductive functions in adulthood. In patients with reproductive problems, such as infertility or testicular cancer, and especially in men with high gonadotrophin levels, LC function is often impaired, and LCs may cluster abnormally into hyperplastic micronodules (defined as clusters of > 15 LCs in a cross...... with reproductive disorders possibly reflect subtle changes in the expression of many genes rather than regulatory changes of single genes or pathways. The study provides new insights into the development and maturation of human LCs by the identification of a number of potential functional markers for FLC and ALC....
Wang Xi; Liu Qiang
Human gliomas are among the most aggressive brain tumors and grow infiltratively. Surgery is the mainstay of treatment, but because the tumor cannot be entirely resected, relapse is common. Radiotherapy plays an important role for patients with gliomas after surgery. The efficacy of radiotherapy is associated with the radiosensitivity of human gliomas. This paper summarizes the current status of, and progress in, research on radiosensitivity genes of human brain gliomas. (authors)
Choi, Hyunjung; Jin, Sun Hee; Han, Mi Hwa; Lee, Jinyoung; Ahn, Seyeon; Seong, Minjeong; Choi, Hyun; Han, Jiyeon; Cho, Eun-Gyung; Lee, Tae Ryong; Noh, Minsoo
The interactions between human epidermal melanocytes and their cellular microenvironment are important in the regulation of human melanocyte functions or in their malignant transformation into melanoma. Although the basement membrane extracellular matrix (BM-ECM) is one of the major melanocyte microenvironments, the effects of BM-ECM on human melanocyte functions are not fully explained at a molecular level. This study aimed to characterize the molecular and cellular interactions between normal human melanocytes (NHMs) and BM-ECM. We investigated cell culture models of normal human melanocytes or melanoma cells on three-dimensional (3D) Matrigel to understand the roles of the basement membrane microenvironment in human melanocyte functions. Melanogenesis and melanoblast biomarker expression in both primary human melanocytes and melanoma cells on 3D Matrigel were evaluated. We found that NHMs migrated and formed reversible paired box 3 (PAX3) expressing cell clusters on 3D Matrigel. Melanogenesis was significantly decreased in the PAX3 expressing cell cluster. The expression profile of PAX3, SOX10, and MITF in the melanocyte cluster on 3D Matrigel was similar to that of melanoblasts. Interestingly, PAX3 and SOX10 showed an inverse expression profile in NHMs, whereas the inverse expression pattern of PAX3 and SOX10 was disrupted in melanoma MNT1 and WM266-4 cells. The human melanocyte culture on 3D Matrigel provides an alternative model system to study functions of human melanoblasts. In addition, this system will contribute to the elucidation of PAX3-related tumorigenic mechanisms to understand human melanoma. Copyright © 2014 Japanese Society for Investigative Dermatology. Published by Elsevier Ireland Ltd. All rights reserved.
Tsujimoto, Y.; Bashir, M.M.; Givol, I.; Cossman, J.; Jaffe, E.; Croce, C.M.
In most human lymphomas, the chromosome translocation t(14;18) occurs within two breakpoint clustering regions on chromosome 18, the major one at the 3' untranslated region of the bcl-2 gene and the minor one at 3' of the gene. Analysis of a panel of follicular lymphoma DNAs using probes for the first exon of the bcl-2 gene indicates that DNA rearrangements may also occur 5' to the involved bcl-2 gene. In this case the IgH locus and the bcl-2 gene are found in an order suggesting that an inversion also occurred during the translocation process. The coding regions of the bcl-2 gene, however, are left intact in all cases of follicular lymphoma studied to date.
Yang, Shuang; Xi, Daoyi; Jing, Fuyi; Kong, Deju; Wu, Junli; Feng, Lu; Cao, Boyang; Wang, Lei
Capsular polysaccharides (CPSs), or K-antigens, are the major surface antigens of Escherichia coli. More than 80 serologically unique K-antigens are classified into 4 groups (Groups 1-4) of capsules. Groups 1 and 4 contain the Wzy-dependent polymerization pathway and the gene clusters are in the order galF to gnd; Groups 2 and 3 contain the ABC-transporter-dependent pathway and the gene clusters consist of 3 regions, regions 1, 2 and 3. Little is known about the variations among the gene clusters. In this study, 9 serotypes of K-antigen gene clusters (K2ab, K11, K20, K24, K38, K84, K92, K96, and K102) were sequenced and correlated with their CPS chemical structures. On the basis of sequence data, a K-antigen-specific suspension array that detects 10 distinct CPSs, including the above 9 CPSs plus K30, was developed. This is the first report to catalog the genetic features of E. coli K-antigen variations and to develop a suspension array for their molecular typing. The method has a number of advantages over traditional bacteriophage and serum agglutination methods and lays the foundation for straightforward identification and detection of additional K-antigens in the future.
Kruse, T.; Levisson, M.; Vos, de W.M.; Smidt, H.
The glycopeptide vancomycin was until recently considered a drug of last resort against Gram-positive bacteria. Increasing numbers of bacteria, however, are found to carry genes that confer resistance to this antibiotic. So far, 10 different vancomycin resistance clusters have been described. A
Félix, Christine; Pichon, Samuel; Braquart-Varnier, Christine
Wolbachia are maternally inherited alpha-proteobacteria that induce feminization of genetic males in most terrestrial crustacean isopods. Two clusters of vir genes for a type IV secretion machinery have been identified at two separate loci and characterized for the first time in a feminizing Wolb...
Histones are the major protein component of chromatin structure. The histone family is made up of a quintet of proteins, four core histones (H2A, H2B, H3 & H4) and the linker histones (H1). Spacers are found between the coding regions. Among insects this quintet of genes is usually clustered and ...
Vollberg, T.M.; Siegler, K.M.; Cool, B.L.; Sirover, M.A.
A series of anti-human placental uracil DNA glycosylase monoclonal antibodies was used to screen a human placental cDNA library in phage λgt11. Twenty-seven immunopositive plaques were detected and purified. One clone containing a 1.2-kilobase (kb) human cDNA insert was chosen for further study by insertion into pUC8. The resultant recombinant plasmid selected, by hybridization, a human placental mRNA that encoded a 37-kDa polypeptide. This protein was immunoprecipitated specifically by an anti-human placental uracil DNA glycosylase monoclonal antibody. RNA blot-hybridization (Northern) analysis using placental poly(A)+ RNA or total RNA from four different human fibroblast cell strains revealed a single 1.6-kb transcript. Genomic blots using DNA from each cell strain digested with either EcoRI or PstI revealed a complex pattern of cDNA-hybridizing restriction fragments. The genomic analysis for each enzyme was highly similar in all four human cell strains. In contrast, a single band was observed when genomic analysis was performed on the identical DNA digests with an actin gene probe. During cell proliferation there was an increase in the level of glycosylase mRNA that paralleled the increase in uracil DNA glycosylase enzyme activity. The isolation of the human uracil DNA glycosylase gene permits an examination of the structure, organization, and expression of a human DNA repair gene.
Oh, Chang Jae; Kim, Ho Bang; Kim, Jitae; Kim, Won Jin; Lee, Hyoungseok; An, Chung Sun
The nucleotide sequence of a 20.5-kb genomic region harboring nif genes was determined and analyzed. The fragment was obtained from Frankia sp. EuIK1 strain, an indigenous symbiont of Elaeagnus umbellata. A total of 20 ORFs including 12 nif genes were identified and subjected to comparative analysis with the genome sequences of 3 Frankia strains representing diverse host plant specificities. The nucleotide and deduced amino acid sequences showed highest levels of identity with orthologous genes from an Elaeagnus-infecting strain. The gene organization patterns around the nif gene clusters were well conserved among all 4 Frankia strains. However, characteristic features appeared in the location of the nifV gene for each Frankia strain, depending on the type of host plant. Sequence analysis was performed to determine the transcription units and suggested that there could be an independent operon starting from the nifW gene in the EuIK strain. Considering the organization patterns and their total extensions on the genome, we propose that the nif gene clusters remained stable despite genetic variations occurring in the Frankia genomes.
During fungal fruiting body development, hyphae aggregate to form multicellular structures that protect and disperse the sexual spores. Analysis of microarray data revealed a gene cluster strongly upregulated during fruiting body development in the ascomycete Sordaria macrospora. Real time PCR analysis showed that the genes from the orthologous cluster in Neurospora crassa are also upregulated during development. The cluster encodes putative polyketide biosynthesis enzymes, including a reducing polyketide synthase. Analysis of knockout strains of a predicted dehydrogenase gene from the cluster showed that mutants in N. crassa and S. macrospora are delayed in fruiting body formation. In addition to the upregulated cluster, the N. crassa genome comprises another cluster containing a polyketide synthase gene, and five additional reducing polyketide synthase (rpks) genes that are not part of clusters. To study the role of these genes in sexual development, expression of the predicted rpks genes in S. macrospora (five genes) and N. crassa (six genes) was analyzed; all but one are upregulated during sexual development. Analysis of knockout strains for the N. crassa rpks genes showed that one of them is essential for fruiting body formation. These data indicate that polyketides produced by RPKSs are involved in sexual development in filamentous ascomycetes.
Pennacchio, Len A.; Rubin, Edward M.
Apolipoprotein A5 (APOA5) is a newly described member of the apolipoprotein gene family whose initial discovery arose from comparative sequence analysis of the mammalian APOA1/C3/A4 gene cluster. Functional studies in mice indicated that alteration in the level of APOA5 significantly impacted plasma triglyceride concentrations. Mice over-expressing human APOA5 displayed significantly reduced triglycerides, while mice lacking apoA5 had a large increase in this lipid parameter. Studies in humans have also suggested an important role for APOA5 in determining plasma triglyceride concentrations. In these experiments, polymorphisms in the human gene were found to define several common haplotypes that were associated with significant changes in triglyceride concentrations in multiple populations. Several separate clinical studies have provided consistent and strong support for the effect with 24 percent of Caucasians, 35 percent of African-Americans and 53 percent of Hispanics carrying APOA5 haplotypes associated with increased plasma triglyceride levels. In summary, APOA5 represents a newly discovered gene involved in triglyceride metabolism in both humans and mice whose mechanism of action remains to be deciphered.
MENG Xu-li; DING Xiao-wen; XU Xiao-hong
Objective: To investigate the molecular etiology of breast cancer by way of studying the differential expression and initial function of the related genes in the occurrence and development of breast cancer. Methods: Two hundred and eighty-eight human tumor related genes were chosen for preparation of the oligochips probe. mRNA was extracted from 16 breast cancer tissues and the corresponding normal breast tissues, and cDNA probe was prepared through reverse-transcription and hybridized with the gene chip. A laser focused fluorescent scanner was used to scan the chip. The different gene expressions were thereafter automatically compared and analyzed between the two sample groups. Cy3/Cy5 > 3.5 meant significant up-regulation. Cy3/Cy5 < 0.25 meant significant down-regulation. Results: The comparison between the breast cancer tissues and their corresponding normal tissues showed that 84 genes had differential expression in the Chip. Among the differently expressed genes, there were 4 genes with significant down-regulation and 6 with significant up-regulation. Compared with normal breast tissues, differentially expressed genes did partially exist in the breast cancer tissues. Conclusion: Changes in multi-gene expression regulations take place during the occurrence and development of breast cancer; and the research on related genes can help understanding the mechanism of tumor occurrence.
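The two cut-offs quoted in the abstract above can be illustrated with a minimal Python sketch. The function name, the gene labels and the intensity values are hypothetical and invented for illustration only; the abstract does not state which channel carried the tumor sample, so the sketch only classifies the ratio itself.

```python
# Cut-offs quoted in the abstract: ratio > 3.5 is "up", ratio < 0.25 is "down".
UP_THRESHOLD = 3.5
DOWN_THRESHOLD = 0.25


def classify_ratio(cy3, cy5):
    """Classify one spot by its Cy3/Cy5 intensity ratio (hypothetical helper)."""
    if cy5 <= 0:
        raise ValueError("Cy5 intensity must be positive")
    ratio = cy3 / cy5
    if ratio > UP_THRESHOLD:
        return "up"
    if ratio < DOWN_THRESHOLD:
        return "down"
    return "not_significant"


# Made-up (Cy3, Cy5) intensities, not data from the study.
spots = {
    "gene_a": (720.0, 180.0),  # ratio 4.0
    "gene_b": (50.0, 400.0),   # ratio 0.125
    "gene_c": (300.0, 280.0),  # ratio of about 1.07
}
calls = {name: classify_ratio(cy3, cy5) for name, (cy3, cy5) in spots.items()}
```

Spots whose ratio falls between the two cut-offs are labeled `not_significant` here, which is a naming choice of this sketch and not a term used by the authors.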
Dunstan, G R
Moral analysis must begin with respect for the empirical features, the "facts of the case". Major advances in genetic knowledge and technology -- as in other sciences -- inevitably change mental attitudes. But they could not change human nature, a product of the distinctively human cerebral cortex. Human capacities like compassion and justice are our own and for us to guard. To ask (as some do) about a "right" to inherit a non-manipulated genome is to ask an unanswerable question: the language of rights is inappropriate in this context. Parents have a duty to safeguard and to serve the interests of their potential child. The medical duty is to help in that task in ways which they have limited freedom to choose. The role of churches is to be faithful to their deposit of faith and their theological principles, including that of freedom of conscience. Churches are too easily led in practice to over-rule conscience on grounds of authority, ecclesiastical or biblical, not sustained by convincing reason. This is most evident in some declarations concerning human reproduction. Better were it for them to help their faithful in moral reasoning, the ethics of choice; to keep consciences tender.
Baldi, Pierre; Brunak, Søren; Chauvin, Yves
We analyse the sequential structure of human genomic DNA by hidden Markov models. We apply models of widely different design: conventional left-right constructs and models with a built-in periodic architecture. The models are trained on segments of DNA sequences extracted such that they cover com...
Møller, Martin Nue; Kirkeby, Svend; Vikeså, Jonas
a1 sodium-bicarbonate transporter, SLC9a2 sodium-hydrogen transporter, SLC12a3 thiazide-sensitive Na-Cl transporter, and SLC34a2 sodium-phosphate transporter. CONCLUSIONS: Several important ion transporters of the SLC family are expressed in the human endolymphatic sac, including Pendrin...
Caracausi, Maria; Piovesan, Allison; Antonaros, Francesca; Strippoli, Pierluigi; Vitale, Lorenza; Pelleri, Maria Chiara
The ideal reference, or control, gene for the study of gene expression in a given organism should be expressed at a medium‑high level for easy detection, should be expressed at a constant/stable level throughout different cell types and within the same cell type undergoing different treatments, and should maintain these features across as many different tissues of the organism as possible. From a biological point of view, these theoretical requirements of an ideal reference gene appear to be best suited to housekeeping (HK) genes. Recent advancements in the quality and completeness of human expression microarray data and in their statistical analysis may provide new clues toward the quantitative standardization of human gene expression studies in biology and medicine, both cross‑ and within‑tissue. The systematic approach used by the present study is based on the Transcriptome Mapper tool and exploits the automated reassignment of probes to corresponding genes, intra‑ and inter‑sample normalization, elaboration and representation of gene expression values in linear form within an indexed and searchable database with a graphical interface recording quantitative levels of expression, expression variability and cross‑tissue width of expression for more than 31,000 transcripts. The present study conducted a meta‑analysis of a pool of 646 expression profile data sets from 54 different human tissues and identified actin γ 1 as the HK gene that best fits the combination of all the traditional criteria to be used as a reference gene for general use; two ribosomal protein genes, RPS18 and RPS27, and one aquaporin gene, POM121 transmembrane nucleoporin C, were also identified. The present study provided a list of tissue‑ and organ‑specific genes that may be most suited for the following individual tissues/organs: Adipose tissue, bone marrow, brain, heart, kidney, liver, lung, ovary, skeletal muscle and testis; and also provides in these cases a representative
Litman Gary W
Background: Novel immune-type receptor (NITR) genes are members of diversified multigene families that are found in bony fish and encode type I transmembrane proteins containing one or two extracellular immunoglobulin (Ig) domains. The majority of NITRs can be classified as inhibitory receptors that possess cytoplasmic immunoreceptor tyrosine-based inhibition motifs (ITIMs). A much smaller number of NITRs can be classified as activating receptors by the lack of cytoplasmic ITIMs and presence of a positively charged residue within their transmembrane domain, which permits partnering with an activating adaptor protein. Results: Forty-four NITR genes in medaka (Oryzias latipes) are located in three gene clusters on chromosomes 10, 18 and 21 and can be organized into 24 families including inhibitory and activating forms. The particularly large dataset acquired in medaka makes direct comparison possible to another complete dataset acquired in zebrafish in which NITRs are localized in two clusters on different chromosomes. The two largest medaka NITR gene clusters share conserved synteny with the two zebrafish NITR gene clusters. Shared synteny between NITRs and CD8A/CD8B is limited but consistent with a potential common ancestry. Conclusion: Comprehensive phylogenetic analyses between the complete datasets of NITRs from medaka and zebrafish indicate multiple species-specific expansions of different families of NITRs. The patterns of sequence variation among gene family members are consistent with recent birth-and-death events. Similar effects have been observed with mammalian immunoglobulin (Ig), T cell antigen receptor (TCR) and killer cell immunoglobulin-like receptor (KIR) genes. NITRs likely diverged along an independent pathway from that of the somatically rearranging antigen binding receptors but have undergone parallel evolution of V family diversity.
Yamauchi-Takihara, K.; Sole, M.J.; Liew, J.; Ing, D.; Liew, C.C.
The authors have isolated and analyzed the structure of the genes coding for the α and β forms of the human cardiac myosin heavy chain (MYHC). Detailed analysis of four overlapping MYHC genomic clones shows that the α-MYHC and β-MYHC genes constitute a total length of 51 kilobases and are tandemly linked. The β-MYHC-encoding gene, predominantly expressed in the normal human ventricle and also in slow-twitch skeletal muscle, is located 4.5 kilobases upstream of the α-MYHC-encoding gene, which is predominantly expressed in normal human atrium. The authors have determined the nucleotide sequences of the β form of the MYHC gene, which is 100% homologous to the cardiac MYHC cDNA clone (pHMC3). It is unlikely that the divergence of a few nucleotide sequences from the cardiac β-MYHC cDNA clone (pHMC3) reported in a MYHC cDNA clone (PSMHCZ) from skeletal muscle is due to a splicing mechanism. This finding suggests that the same β form of the cardiac MYHC gene is expressed in both ventricular and slow-twitch skeletal muscle. The promoter regions of both α- and β-MYHC genes, as well as the first four coding regions in the respective genes, have also been sequenced. The sequences in the 5'-flanking region of the α- and β-MYHC-encoding genes diverge extensively from one another, suggesting that expression of the α- and β-MYHC genes is independently regulated.
Galligan James J
Enzyme-mediated disulfide bond formation is a highly conserved process affecting over one-third of all eukaryotic proteins. The enzymes primarily responsible for facilitating thiol-disulfide exchange are members of an expanding family of proteins known as protein disulfide isomerases (PDIs). These proteins are part of a larger superfamily of proteins known as the thioredoxin protein family (TRX). As members of the PDI family of proteins, all proteins contain a TRX-like structural domain and are predominantly expressed in the endoplasmic reticulum. Subcellular localization and the presence of a TRX domain, however, comprise the short list of distinguishing features required for gene family classification. To date, the PDI gene family contains 21 members, varying in domain composition, molecular weight, tissue expression, and cellular processing. Given their vital role in protein-folding, loss of PDI activity has been associated with the pathogenesis of numerous disease states, most commonly related to the unfolded protein response (UPR). Over the past decade, UPR has become a very attractive therapeutic target for multiple pathologies including Alzheimer disease, Parkinson disease, alcoholic and non-alcoholic liver disease, and type-2 diabetes. Understanding the mechanisms of protein-folding, specifically thiol-disulfide exchange, may lead to development of a novel class of therapeutics that would help alleviate a wide range of diseases by targeting the UPR.
Oortveld, Merel A. W.; Keerthikumar, Shivakumar; Oti, Martin; Nijhof, Bonnie; Fernandes, Ana Clara; Kochinke, Korinna; Castells-Nobau, Anna; van Engelen, Eva; Ellenkamp, Thijs; Eshuis, Lilian; Galy, Anne; van Bokhoven, Hans; Habermann, Bianca; Brunner, Han G.; Zweier, Christiane; Verstreken, Patrik; Huynen, Martijn A.; Schenck, Annette
Intellectual Disability (ID) disorders, defined by an IQ below 70, are genetically and phenotypically highly heterogeneous. Identification of common molecular pathways underlying these disorders is crucial for understanding the molecular basis of cognition and for the development of therapeutic intervention strategies. To systematically establish their functional connectivity, we used transgenic RNAi to target 270 ID gene orthologs in the Drosophila eye. Assessment of neuronal function in behavioral and electrophysiological assays and multiparametric morphological analysis identified phenotypes associated with knockdown of 180 ID gene orthologs. Most of these genotype-phenotype associations were novel. For example, we uncovered 16 genes that are required for basal neurotransmission and have not previously been implicated in this process in any system or organism. ID gene orthologs with morphological eye phenotypes, in contrast to genes without phenotypes, are relatively highly expressed in the human nervous system and are enriched for neuronal functions, suggesting that eye phenotyping can distinguish different classes of ID genes. Indeed, grouping genes by Drosophila phenotype uncovered 26 connected functional modules. Novel links between ID genes successfully predicted that MYCN, PIGV and UPF3B regulate synapse development. Drosophila phenotype groups show, in addition to ID, significant phenotypic similarity also in humans, indicating that functional modules are conserved. The combined data indicate that ID disorders, despite their extreme genetic diversity, are caused by disruption of a limited number of highly connected functional modules. PMID:24204314
Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.
Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both [3'-32P]-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these λ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.
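The anticodon assignments in the abstract above can be checked with a short Python sketch: an anticodon written 5' to 3' pairs with the reverse complement of itself, so AAG, AGG and UGU should read leucine, proline and threonine codons. The function names are hypothetical, only standard Watson-Crick pairing is modeled (wobble pairing and base modifications are ignored), and the flank sequences in the usage lines are invented.

```python
import re

# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Only the three standard-genetic-code entries needed for this example.
CODON_TO_AMINO_ACID = {"CUU": "Leu", "CCU": "Pro", "ACA": "Thr"}


def codon_for_anticodon(anticodon):
    """Return the complementary codon; both sequences are written 5'->3'."""
    rna = anticodon.upper().replace("T", "U")
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def has_pol3_terminator(flank_dna):
    """True if the DNA flank contains a run of at least four consecutive T residues."""
    return re.search(r"T{4,}", flank_dna.upper()) is not None


codons = {ac: codon_for_anticodon(ac) for ac in ("AAG", "AGG", "UGU")}
amino_acids = {ac: CODON_TO_AMINO_ACID[codon] for ac, codon in codons.items()}

# Invented flank sequences, for illustration only.
terminator_found = has_pol3_terminator("GCATTTTTGC")
terminator_missing = has_pol3_terminator("GCATTTGC")
```

The resulting mapping (AAG to CUU, AGG to CCU, UGU to ACA) agrees with the leucine, proline and threonine identities reported for the three sequenced genes.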
Ruiz-Orera, Jorge; Hernandez-Rodriguez, Jessica; Chiva, Cristina; Sabidó, Eduard; Kondova, Ivanela; Bontrop, Ronald; Marqués-Bonet, Tomàs; Albà, M Mar
The birth of new genes is an important motor of evolutionary innovation. Whereas many new genes arise by gene duplication, others originate at genomic regions that did not contain any genes or gene copies. Some of these newly expressed genes may acquire coding or non-coding functions and be preserved by natural selection. However, the prevalence and underlying mechanisms of de novo gene emergence are still unclear. In order to obtain a comprehensive view of this process, we have performed in-depth sequencing of the transcriptomes of four mammalian species (human, chimpanzee, macaque, and mouse) and subsequently compared the assembled transcripts and the corresponding syntenic genomic regions. This has resulted in the identification of over five thousand new multiexonic transcriptional events in human and/or chimpanzee that are not observed in the other species. Using comparative genomics, we show that the expression of these transcripts is associated with the gain of regulatory motifs upstream of the transcription start site (TSS) and of U1 snRNP sites downstream of the TSS. In general, these transcripts show little evidence of purifying selection, suggesting that many of them are not functional. However, we find signatures of selection in a subset of de novo genes which have evidence of protein translation. Taken together, the data support a model in which frequently-occurring new transcriptional events in the genome provide the raw material for the evolution of new proteins.
Guerreiro, J F; Figueiredo, M S; Zago, M A
We have determined the beta-globin cluster haplotypes for 80 Indians from four Brazilian Amazon tribes: Kayapó, Wayampí, Wayana-Apalaí, and Arára. The results are analyzed together with 20 Yanomámi previously studied. From 2 to 4 different haplotypes were identified for each tribe, and 7 of the possible 32 haplotypes were found in a sample of 172 chromosomes for which the beta haplotypes were directly determined or derived from family studies. The haplotype distribution does not differ significantly among the five populations. The two most common haplotypes in all tribes were haplotypes 2 and 6, with average frequencies of 0.843 and 0.122, respectively. The genetic affinities between Brazilian Indians and other human populations were evaluated by estimates of genetic distance based on haplotype data. The lowest values were observed in relation to Asians, especially Chinese, Polynesians, and Micronesians.
Human gene regulatory networks (GRN) can be difficult to interpret due to a tangle of edges interconnecting thousands of genes. We constructed a general human GRN from extensive transcription factor and microRNA target data obtained from public databases. In a subnetwork of this GRN that is active during estrogen stimulation of MCF-7 breast cancer cells, we benchmarked automated algorithms for identifying core regulatory genes (transcription factors and microRNAs). Among these algorithms, we identified K-core decomposition, PageRank and betweenness centrality algorithms as the most effective for discovering core regulatory genes in the network, evaluated based on previously known roles of these genes in MCF-7 biology as well as on their ability to explain the up or down expression status of up to 70% of the remaining genes. Finally, we validated the use of the K-core algorithm for organizing the GRN in an easier to interpret layered hierarchy where more influential regulatory genes percolate towards the inner layers. The integrated human gene and miRNA network and software used in this study are provided as supplementary materials (S1 Data) accompanying this manuscript.
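As a rough illustration of the layered hierarchy that K-core decomposition produces, the following Python sketch computes core numbers by repeatedly peeling off the lowest-degree node. It is a generic textbook version and not the software supplied with the study; it treats the network as undirected (a simplifying assumption, since regulatory edges are directed), and the node names are hypothetical.

```python
def core_numbers(adjacency):
    """Return {node: core number} for an undirected graph.

    adjacency maps each node to the set of its neighbours. Nodes are removed
    in order of current degree; a node's core number is the largest minimum
    degree seen up to the moment it is removed.
    """
    degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.get)  # lowest-degree node still present
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for neighbour in adjacency[node]:
            if neighbour in remaining:
                degree[neighbour] -= 1
    return core


# Toy network with hypothetical names: four mutually connected regulators
# plus a two-node chain hanging off TF1.
toy_network = {
    "TF1": {"TF2", "TF3", "miR1", "geneX"},
    "TF2": {"TF1", "TF3", "miR1"},
    "TF3": {"TF1", "TF2", "miR1"},
    "miR1": {"TF1", "TF2", "TF3"},
    "geneX": {"TF1", "geneY"},
    "geneY": {"geneX"},
}
cores = core_numbers(toy_network)
inner_layer = sorted(node for node, k in cores.items() if k == max(cores.values()))
```

In this toy example the four densely interconnected regulators end up in the innermost layer, while the peripheral chain receives a low core number, which mirrors the idea of influential regulators percolating toward the inner layers.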
Miyata, A; Yokoyama, C; Ihara, H; Bandoh, S; Takeda, O; Takahashi, E; Tanabe, T
The gene encoding human thromboxane synthase (TBXAS1) was isolated from a human EMBL3 genomic library using human platelet thromboxane synthase cDNA as a probe. Nucleotide sequencing revealed that the human thromboxane synthase gene spans more than 75 kb and consists of 13 exons and 12 introns, of which the splice donor and acceptor sites conform to the GT/AG rule. The exon-intron boundaries of the thromboxane synthase gene were similar to those of the human cytochrome P450 nifedipine oxidase gene (CYP3A4) except for introns 9 and 10, although the primary sequences of these enzymes exhibited only 35.8% identity to each other. The 1.2 kb of 5'-flanking region sequence contained potential binding sites for several transcription factors (AP-1, AP-2, GATA-1, CCAAT box, xenobiotic-response element, PEA-3, LF-A1, myb, basic transcription element and cAMP-response element). Primer-extension analysis indicated multiple transcription-start sites, and the major start site was identified as an adenine residue located 142 bases upstream of the translation-initiation site. However, neither a typical TATA box nor a typical CAAT box is found within the 100 b upstream of the translation-initiation site. Southern-blot analysis revealed the presence of one copy of the thromboxane synthase gene per haploid genome. Furthermore, a fluorescence in situ hybridization study revealed that the human gene for thromboxane synthase is localized to band q33-q34 of the long arm of chromosome 7. A tissue-distribution study demonstrated that thromboxane synthase mRNA is widely expressed in human tissues and is particularly abundant in peripheral blood leukocytes, spleen, lung and liver. Low but significant levels of mRNA were observed in kidney, placenta and thymus.
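The GT/AG rule mentioned in the abstract above states that an intron begins with the dinucleotide GT at its 5' (donor) end and finishes with AG at its 3' (acceptor) end. A minimal Python sketch of such a check follows; the function name is hypothetical and the two sequences are invented, not taken from TBXAS1.

```python
def conforms_to_gt_ag(intron):
    """True if the intron sequence starts with GT and ends with AG.

    Accepts DNA or RNA letters in either case; U is treated as T.
    """
    seq = intron.upper().replace("U", "T")
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Invented example sequences, for illustration only.
canonical = conforms_to_gt_ag("GTAAGTcccttttcAG")
non_canonical = conforms_to_gt_ag("GCAAGTcccttttcAG")
```

A real analysis would also need the exon coordinates to extract each intron from the genomic sequence; that step is outside the scope of this sketch.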
Méjean, Annick; Mazmouz, Rabia; Mann, Stéphane; Calteau, Alexandra; Médigue, Claudine; Ploux, Olivier
We report a draft sequence of the genome of Oscillatoria sp. PCC 6506, a cyanobacterium that produces anatoxin-a and homoanatoxin-a, two neurotoxins, and cylindrospermopsin, a cytotoxin. Beside the clusters of genes responsible for the biosynthesis of these toxins, we have found other clusters of genes likely involved in the biosynthesis of not-yet-identified secondary metabolites. PMID:20675499
The gene cluster responsible for the biosynthesis of the red polyketidic pigment bikaverin has only been characterized in Fusarium spp. so far. Recently, a highly homologous but incomplete and nonfunctional bikaverin cluster has been found in the genome of the unrelated phytopathogenic fungus Botrytis cinerea. In this study, we provided evidence that rare B. cinerea strains such as 1750 have a complete and functional cluster comprising the six genes orthologous to Fusarium fujikuroi ffbik1-ffbik6 and do produce bikaverin. Phylogenetic analysis confirmed that the whole cluster was acquired from Fusarium through a horizontal gene transfer (HGT). In the bikaverin-nonproducing strain B05.10, the genes encoding bikaverin biosynthesis enzymes are nonfunctional due to deleterious mutations (bcbik2-3) or missing (bcbik1), but interestingly, the genes encoding the regulatory proteins BcBIK4 and BcBIK5 do not harbor deleterious mutations, which suggests that they may still be functional. Heterologous complementation of the F. fujikuroi Δffbik4 mutant confirmed that bcbik4 of strain B05.10 is indeed fully functional. Deletion of bcvel1 in the pink strain 1750 resulted in loss of bikaverin and overproduction of melanin, indicating that the VELVET protein BcVEL1 regulates the biosynthesis of the two pigments in an opposite manner. Although strain 1750 itself expresses a truncated BcVEL1 protein (100 instead of 575 aa) that is nonfunctional with regard to sclerotia formation, virulence and oxalic acid formation, it is sufficient to regulate pigment biosynthesis (bikaverin and melanin) and fenhexamid HydR2 type of resistance. Finally, a genetic cross between strain 1750 and a bikaverin-nonproducing strain sensitive to fenhexamid revealed that the functional bikaverin cluster is genetically linked to the HydR2 locus.
Chao, Moses V.; Bothwell, Mark A.; Ross, Alonzo H.; Koprowski, Hilary; Lanahan, Anthony A.; Buck, C. Randall; Sehgal, Amita
Nerve growth factor (NGF) and its receptor are important in the development of cells derived from the neural crest. Mouse L cell transformants have been generated that stably express the human NGF receptor after gene transfer with total human DNA. Affinity cross-linking, metabolic labeling and immunoprecipitation, and equilibrium binding with 125I-labeled NGF revealed that this NGF receptor had the same size and binding characteristics as the receptor from human melanoma cells and rat PC12 cells. The sequences encoding the NGF receptor were molecularly cloned using the human Alu repetitive sequence as a probe. A cosmid clone that contained the human NGF receptor gene allowed efficient transfection and expression of the receptor.
Males and females have a variety of sexually dimorphic traits, most of which result from hormonal differences. However, differences between male and female embryos initiate very early in development, before hormonal influence begins, suggesting the presence of genetically driven sexual dimorphisms. By comparing the gene expression profiles of male and X-inactivated female human pluripotent stem cells, we detected Y-chromosome-driven effects. We discovered that the sex-determining gene SRY is expressed in human male pluripotent stem cells and is induced by reprogramming. In addition, we detected more than 200 differentially expressed autosomal genes in male and female embryonic stem cells. Some of these genes are involved in steroid metabolism pathways and lead to sex-dependent differentiation in response to the estrogen precursor estrone. Thus, we propose that the presence of the Y chromosome and specifically SRY may drive sex-specific differences in the growth and differentiation of pluripotent stem cells.
Peter, D.; Finn, P.; Liu, Y.; Roghani, A.; Edwards, R.H.; Klisak, I.; Kojis, T.; Heinzmann, C.; Sparkes, R.S. (UCLA School of Medicine, Los Angeles, CA (United States))
The physiologic and behavioral effects of pharmacologic agents that interfere with the transport of monoamine neurotransmitters into vesicles suggest that vesicular amine transport may contribute to human neuropsychiatric disease. To determine whether an alteration in the genes that encode vesicular amine transport contributes to the inherited component of these disorders, the authors have isolated a human cDNA for the brain transporter and localized the human vesicular amine transporter genes. The human brain synaptic vesicle amine transporter (SVAT) shows unexpected conservation with rat SVAT in the regions that diverge extensively between rat SVAT and the rat adrenal chromaffin granule amine transporter (CGAT). Using the cloned sequences with a panel of mouse-human hybrids and in situ hybridization for regional localization, the adrenal CGAT gene (or VAT1) maps to human chromosome 8p21.3 and the brain SVAT gene (or VAT2) maps to chromosome 10q25. Both of these sites occur very close to, if not within, previously described deletions that produce severe but viable phenotypes. 26 refs., 3 figs., 1 tab.
Hussein, Shaimaa; Michael, Paul; Brabant, Danielle; Omri, Abdelwahab; Narain, Ravin; Passi, Kalpdrum; Ramana, Chilakamarti V.; Parrillo, Joseph E.; Kumar, Anand; Parissenti, Amadeo; Kumar, Aseem
To gain a better understanding of the gene expression changes that occur during sepsis, we have performed a cDNA microarray study utilizing a tissue culture model that mimics human sepsis. This study utilized an in vitro model of cultured human fetal cardiac myocytes treated with 10% sera from septic patients or 10% sera from healthy volunteers. A 1700 cDNA expression microarray was used to compare the transcription profile from human cardiac myocytes treated with septic sera vs normal sera. Septic sera treatment of myocytes resulted in the down-regulation of 178 genes and the up-regulation of 4 genes. Our data indicate that septic sera induced cell cycle, metabolic, transcription factor and apoptotic gene expression changes in human myocytes. Identification and characterization of gene expression changes that occur during sepsis may lead to the development of novel therapeutics and diagnostics. PMID:19684886
Pajusalu, Sander; Reimand, Tiia; Uibo, Oivi; Vasar, Maire; Talvik, Inga; Zilina, Olga; Tammur, Pille; Õunap, Katrin
We report a female patient with a complex phenotype consisting of failure to thrive, developmental delay, congenital bronchiectasis, gastroesophageal reflux and bilateral inguinal hernias. Chromosomal microarray analysis revealed a 230 kilobase deletion in chromosomal region 17q21.32 (arr[hg19] 17q21.32(46 550 362-46 784 039)×1) encompassing only 9 genes - HOXB1 to HOXB9. The deletion was not found in her mother or father. This is the first report of a patient with a HOXB gene cluster deletion involving only the HOXB1 to HOXB9 genes. By comparing our case to five previously reported patients with larger chromosomal aberrations involving the HOXB gene cluster, we can suppose that HOXB gene cluster deletions are responsible for growth retardation, developmental delay, and specific facial dysmorphic features. Also, we suppose that bilateral inguinal hernias, tracheo-esophageal abnormalities, and lung malformations represent features with incomplete penetrance. Interestingly, previously published knock-out mice with a targeted heterozygous deletion comparable to that of our patient did not show phenotypic alterations.
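As a quick check on the coordinates quoted in the array nomenclature above, the span of the deletion can be computed directly. The sketch below assumes the two hg19 positions are inclusive endpoints; the function name is illustrative and does not come from the report.

```python
def deletion_span_bp(start: int, end: int) -> int:
    """Length in base pairs of an interval whose endpoints are both included."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the abstract: 17q21.32(46 550 362-46 784 039)x1
START, END = 46_550_362, 46_784_039
span = deletion_span_bp(START, END)
print(f"{span} bp (~{span / 1000:.0f} kb)")
```

Under that assumption the interval covers 233,678 bp, which is consistent with the roughly 230 kilobase size stated in the report.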
Enkh-Amgalan, Jigjiddorj; Kawasaki, Hiroko; Seki, Tatsuji
A major nif cluster was detected in the strictly anaerobic, Gram-positive phototrophic bacterium Heliobacterium chlorum. The cluster consisted of 11 genes arranged within a 10 kb region in the order nifI1, nifI2, nifH, nifD, nifK, nifE, nifN, nifX, fdx, nifB and nifV. The phylogenetic position of Hbt. chlorum was the same in the NifH, NifD, NifK, NifE and NifN trees; Hbt. chlorum formed a cluster with Desulfitobacterium hafniense, the closest neighbour of heliobacteria based on the 16S rRNA phylogeny, and two species of the genus Geobacter belonging to the Deltaproteobacteria. Two nifI genes, known to occur in the nif clusters of methanogenic archaea between nifH and nifD, were found upstream of the nifH gene of Hbt. chlorum. The organization of the nif operon and the phylogeny of individual and concatenated gene products showed that the Hbt. chlorum nif operon carrying nifI genes upstream of the nifH gene was an intermediate between the nif operon with nifI downstream of nifH (group II and III of the nitrogenase classification) and the nif operon lacking nifI (group I). Thus, the phylogenetic position of Hbt. chlorum nitrogenase may reflect an evolutionary stage of a divergence of the two nitrogenase groups, with group I consisting of the aerobic diazotrophs and group II consisting of strictly anaerobic prokaryotes.
Liao, Y.-T.; Li, W.-F.; Chen, C.-J.; Prineas, Ronald J.; Chen, Wei J.; Zhang Zhuming; Sun, C.-W.; Wang, S.-L.
Arsenic has been linked to increased prevalence of cancer and cardiovascular disease (CVD), but the long-term impact of arsenic exposure remains unclear. Human paraoxonase (PON1) is a high-density lipoprotein-associated antioxidant enzyme which hydrolyzes oxidized lipids and is thought to be protective against atherosclerosis, but evidence remains limited to case-control studies. Only recently have genes encoding enzymes responsible for arsenic metabolism, such as AS3MT and GSTO, been cloned and characterized. This study was designed to evaluate the synergistic interaction of genetic factors and arsenic exposure on electrocardiogram abnormality. A total of 216 residents from three tap water implemented villages of previous arseniasis-hyperendemic regions in Taiwan were prospectively followed for an average of 8 years. For each resident, a 12-lead conventional electrocardiogram (ECG) was recorded and coded by Minnesota Code standard criteria. Eight functional polymorphisms of PON1, PON2, AS3MT, GSTO1, and GSTO2 were examined for genetic susceptibility to ECG abnormality. Among 42 incident cases with ECG deterioration identified among 121 baseline-normal subjects, arsenic exposure was significantly correlated with incidence of ECG abnormality. In addition, polymorphisms in two paraoxonase genes were also found associated with the incidence of ECG abnormality. A haplotype R-C-S constituted by polymorphisms of PON1 Q192R, -108C/T and PON2 C311S was linked to the increased risk. Subjects exposed to high levels of As (cumulative As exposure > 14.7 ppm-year or drinking artesian well water > 21 years) and carrying the R-C-S haplotype had significantly increased risks for ECG abnormality over those with only one risk factor. Results of this study showed a long-term arsenic effect on ECG abnormality and significant gene-gene and gene-environment interactions linked to the incidence of CVD. This finding might have important implications for a novel and potentially useful
Weber, Jakob; Valiante, Vito; Nødvig, Christina S; Mattern, Derek J; Slotkowski, Rebecca A; Mortensen, Uffe H; Brakhage, Axel A
Filamentous fungi produce varieties of natural products even in a strain dependent manner. However, the genetic basis of chemical speciation between strains is still widely unknown. One example is trypacidin, a natural product of the opportunistic human pathogen Aspergillus fumigatus, which is not produced among different isolates. Combining computational analysis with targeted gene editing, we could link a single nucleotide insertion in the polyketide synthase of the trypacidin biosynthetic pathway and reconstitute its production in a nonproducing strain. Thus, we present a CRISPR/Cas9-based tool for advanced molecular genetic studies in filamentous fungi, exploiting selectable markers separated from the edited locus.
Assou, Said; Anahory, Tal; Pantesco, Véronique; Le Carrour, Tanguy; Pellestor, Franck; Klein, Bernard; Reyftmann, Lionel; Dechaud, Hervé; De Vos, John; Hamamah, Samir
BACKGROUND The understanding of the mechanisms regulating human oocyte maturation is still rudimentary. We have identified transcripts differentially expressed between immature and mature oocytes, and cumulus cells. METHODS Using oligonucleotide microarrays, genome-wide gene expression was studied in pooled immature and mature oocytes or cumulus cells from patients who underwent IVF. RESULTS In addition to known genes such as DAZL, BMP15 or GDF9, oocytes upregulated 1514 genes. We show that PTTG3 and AURKC are respectively the securin and the Aurora kinase preferentially expressed during oocyte meiosis. Strikingly, oocytes overexpressed previously unreported growth factors such as TNFSF13/APRIL, FGF9, FGF14, and IL4, and transcription factors including OTX2, SOX15 and SOX30. Conversely, cumulus cells, in addition to known genes such as LHCGR or BMPR2, overexpressed cell-to-cell signaling genes including TNFSF11/RANKL, numerous complement components, semaphorins (SEMA3A, SEMA6A, SEMA6D) and CD genes such as CD200. We also identified 52 genes progressively increasing during oocyte maturation, comprising CDC25A and SOCS7. CONCLUSION The identification of genes up- and down-regulated during oocyte maturation greatly improves our understanding of oocyte biology and will provide new markers that signal viable and competent oocytes. Furthermore, genes found expressed in cumulus cells are potential markers of granulosa cell tumors. PMID:16571642
Bertamini, Marco; Guest, Martin; Vallortigara, Giorgio; Rugani, Rosa; Regolin, Lucia
Animals can perceive the numerosity of sets of visual elements. Qualitative and quantitative similarities in different species suggest the existence of a shared system (approximate number system). Biases associated with sensory properties are informative about the underlying mechanisms. In humans, regular spacing increases perceived numerosity (regular-random numerosity illusion). This has led to a model that predicts numerosity based on occupancy (a measure that decreases when elements are close together). We used a procedure in which observers selected one of two stimuli and were given feedback with respect to whether the choice was correct. One configuration had 20 elements and the other 40, randomly placed inside a circular region. Participants had to discover the rule based on feedback. Because density and clustering covaried with numerosity, different dimensions could be used. After reaching a criterion, test trials presented two types of configurations with 30 elements. One type had a larger interelement distance than the other (high or low clustering). If observers had adopted a numerosity strategy, they would choose low clustering (if reinforced with 40) and high clustering (if reinforced with 20). A clustering or density strategy predicts the opposite. Human adults used a numerosity strategy. Chicks were tested using a similar procedure. There were two behavioral measures: first approach response and final circumnavigation (walking behind the screen). The prediction based on numerosity was confirmed by the first approach data. For chicks, one clear pattern from both responses was a preference for the configurations with higher clustering.
Zhang, Yihua; Shen, Wenzheng; Hua, Jinlian; Lei, Anmin; Lv, Changrong; Wang, Huayan; Yang, Chunrong; Gao, Zhimin; Dou, Zhongying
Bone marrow mesenchymal stem cells (BMSCs) have been reported to possess low immunogenicity and cause immunosuppression of recipients when allografted. They can differentiate into insulin-producing cells and may be a valuable source for islet formation. However, the extremely low differentiating rate of adult BMSCs toward insulin-producing cells and the insufficient insulin secretion of the differentiated BMSCs in vitro prevent their clinical use in diabetes treatment. Little is known about the potential of cell replacement therapy with human BMSCs. Previously, we isolated and identified human first-trimester fetal BMSCs (hfBMSCs). Under a novel four-step induction procedure established in this study, the hfBMSCs effectively differentiated into functional pancreatic islet-like cell clusters that contained 62 ± 14% insulin-producing cells, expressed a broad gene profile related to pancreatic islet β-cell development, and released high levels of insulin (2.245 ± 0.222 pmol/100 clusters per 30 min) and C-peptide (2.200 ± 0.468 pmol/100 clusters per 30 min) in response to 25 mmol/L glucose stimulus in vitro. The pancreatic islet-like cell clusters normalized the blood glucose level of diabetic model mice for at least 9 weeks when xenografted; blood glucose levels in these mice rose abnormally again when the grafts were removed. Examination of the grafts indicated that the transplanted cells survived in recipients and produced human insulin and C-peptide in situ. These results demonstrate that hfBMSCs derived from a human first-trimester abortus can differentiate into pancreatic islet-like cell clusters following an established four-step induction. The insulin-producing clusters present advantages in cell replacement therapy of type 1 diabetic model mice.
Rasmussen, Henrik B.; Madsen, Majbritt B.; Bjerre, Ditte
The carboxylesterase 1 gene (CES1) in humans encodes a hydrolase, which is implicated in the metabolism of several commonly used drugs 1. This gene is located on chromosome 16 with a highly homologous pseudogene, CES1P1, in its proximity. A duplicated segment of CES1 replaces most of CES1P1 in some...... appears to be low 8,13. The formation of hybrids consisting of a gene and a related pseudogene has been reported for other genes than CES1. This includes the hybrids of the gene encoding cytochrome P450 2D6 (CYP2D6) and pseudogene CYP2D7, that is, the so-called CYP2D7/D6 hybrids 14......,15. These are categorized as CYP2D6 variants and not as variants of pseudogene CYP2D716....
Mohan, Adith; Mather, Karen A; Thalamuthu, Anbupalam; Baune, Bernhard T; Sachdev, Perminder S
The review aims to provide a summary of recent developments in the study of gene expression in the aging human brain. Profiling differentially expressed genes or 'transcripts' in the human brain over the course of normal aging has provided valuable insights into the biological pathways that appear activated or suppressed in late life. Genes mediating neuroinflammation and immune system activation in particular, show significant age-related upregulation creating a state of vulnerability to neurodegenerative and neuropsychiatric disease in the aging brain. Cellular ionic dyshomeostasis and age-related decline in a host of molecular influences on synaptic efficacy may underlie neurocognitive decline in later life. Critically, these investigations have also shed light on the mobilization of protective genetic responses within the aging human brain that help determine health and disease trajectories in older age. There is growing interest in the study of pre and posttranscriptional regulators of gene expression, and the role of noncoding RNAs in particular, as mediators of the phenotypic diversity that characterizes human brain aging. Gene expression studies in healthy brain aging offer an opportunity to unravel the intricately regulated cellular underpinnings of neurocognitive aging as well as disease risk and resiliency in late life. In doing so, new avenues for early intervention in age-related neurodegenerative disease could be investigated with potentially significant implications for the development of disease-modifying therapies.
Liedert, Astrid; Kassem, Moustapha; Claes, Lutz
Mechanical loading is essential for maintaining bone mass in the adult skeleton. However, the underlying process of the transfer of the physical stimulus into a biochemical response, which is termed mechanotransduction is poorly understood. Mechanotransduction results in the modulation of gene...... cells. Analysis of the human HB-GAM gene upstream regulatory region with luciferase reporter gene assays revealed that the upregulation of HB-GAM expression occurred at the transcriptional level and was mainly dependent on the HB-GAM promoter region most upstream containing three potential AP-1 binding...
Verhulst, N.O.; Beijleveld, H.; Qiu, Y.T.; Maliepaard, C.A.; Verduyn, W.; Haasnoot, G.W.; Claas, F.H.J.; Mumm, R.; Bouwmeester, H.J.; Takken, W.; Loon, van J.J.A.; Smallegange, R.C.
Chemical cues are considered to be the most important cues for mosquitoes to find their hosts and humans can be ranked for attractiveness to mosquitoes based on the chemical cues they emit. Human leukocyte antigen (HLA) genes are considered to be involved in the regulation of human body odor and may
Martens, Geert A; Jiang, Lei; Hellemans, Karine H
The aim of this study was to establish a gene expression blueprint of pancreatic beta cells conserved from rodents to humans and to evaluate its applicability to assess shifts in the beta cell differentiated state. Genome-wide mRNA expression profiles of isolated beta cells were compared to those...... of a large panel of other tissue and cell types, and transcripts with beta cell-abundant and -selective expression were identified. Iteration of this analysis in mouse, rat and human tissues generated a panel of conserved beta cell biomarkers. This panel was then used to compare isolated versus laser capture...... microdissected beta cells, monitor adaptations of the beta cell phenotype to fasting, and retrieve possible conserved transcriptional regulators....
Males and females differ genetically by their respective sex chromosome composition, that is, XY in males and XX in females. Although both X and Y chromosomes evolved from the same ancestral pair of autosomes, the Y chromosome harbors male-specific genes, which play pivotal roles in male sex determination, germ cell differentiation, and masculinization of various tissues. Deletions or translocation of the sex-determining gene, SRY, from the Y chromosome causes disorders of sex development (previously termed an intersex condition) with dysgenic gonads. Failure of gonadal development results not only in infertility, but also in increased risks of germ cell tumor (GCT), such as gonadoblastoma and various types of testicular GCT. Recent studies demonstrate that either loss of the Y chromosome or ectopic expression of Y chromosome genes is closely associated with various male-biased diseases, including selected somatic cancers. These observations suggest that the Y-linked genes are involved in male health and diseases more frequently than expected. Although only a small number of protein-coding genes are present in the male-specific region of the Y chromosome, the impacts of Y chromosome genes on human diseases are still largely unknown, due to lack of in vivo models and differences between the Y chromosomes of human and rodents. In this review, we highlight the involvement of selected Y chromosome genes in cancer development in men.
Li, Zuofeng; Liu, Xingnan; Wen, Jingran; Xu, Ye; Zhao, Xin; Li, Xuan; Liu, Lei; Zhang, Xiaoyan
With the completion of the human genome project and the development of new methods for gene variant detection, the integration of mutation data and its phenotypic consequences has become more important than ever. Among all available resources, locus-specific databases (LSDBs) curate one or more specific genes' mutation data along with high-quality phenotypes. Although some genotype-phenotype data from LSDBs have been integrated into central databases, little effort has been made to integrate all these data by a search engine approach. In this work, we have developed the disease related unique gene mutation search engine (DRUMS), a search engine for human disease related unique gene mutations, as a convenient tool for biologists or physicians to retrieve gene variant and related phenotype information. Gene variant and phenotype information were stored in a gene-centred relational database. Moreover, the relationships between mutations and diseases were indexed by the uniform resource identifier from an LSDB or another central database. By querying DRUMS, users can access the most popular mutation databases under one interface. DRUMS could be treated as a domain-specific search engine. By using web crawling, indexing, and searching technologies, it provides a competitively efficient interface for searching and retrieving mutation data and their relationships to diseases. The present system is freely accessible at http://www.scbit.org/glif/new/drums/index.html.
Stanton, L.W.; Schwab, M.; Bishop, J.M.
Human neuroblastomas frequently display amplification and augmented expression of a gene known as N-myc because of its similarity to the protooncogene c-myc. It has therefore been proposed that N-myc is itself a protooncogene, and subsequent tests have shown that N-myc and c-myc have similar biological activities in cell culture. The authors have now detailed the kinship between N-myc and c-myc by determining the nucleotide sequence of human N-myc and deducing the amino acid sequence of the protein encoded by the gene. The topography of N-myc is strikingly similar to that of c-myc: both genes contain three exons of similar lengths; the coding elements of both genes are located in the second and third exons; and both genes have unusually long 5' untranslated regions in their mRNAs, with features that raise the possibility that expression of the genes may be subject to similar controls of translation. The resemblance between the proteins encoded by N-myc and c-myc sustains previous suspicions that the genes encode related functions
Liu, Xuewu; Wang, Yuanyuan; Liang, Jiao; Wang, Luojun; Qin, Na; Zhao, Ya; Zhao, Gang
Plasmodium falciparum is the most virulent malaria parasite capable of parasitizing human erythrocytes. The identification of genes related to this capability can enhance our understanding of the molecular mechanisms underlying human malaria and lead to the development of new therapeutic strategies for malaria control. With the availability of several malaria parasite genome sequences, performing computational analysis is now a practical strategy to identify genes contributing to this disease. Here, we developed and used a virtual genome method to assign 33,314 genes from three human malaria parasites, namely, P. falciparum, P. knowlesi and P. vivax, and three rodent malaria parasites, namely, P. berghei, P. chabaudi and P. yoelii, to 4605 clusters. Each cluster consisted of genes whose protein sequences were significantly similar and was considered as a virtual gene. Comparing the enriched values of all clusters in human malaria parasites with those in rodent malaria parasites revealed 115 P. falciparum genes putatively responsible for parasitizing human erythrocytes. These genes are mainly located in the chromosome internal regions and participate in many biological processes, including membrane protein trafficking and thiamine biosynthesis. Meanwhile, 289 P. berghei genes were included in the rodent parasite-enriched clusters. Most are located in subtelomeric regions and encode erythrocyte surface proteins. Comparing cluster values in P. falciparum with those in P. vivax and P. knowlesi revealed 493 candidate genes linked to virulence. Some of them encode proteins present on the erythrocyte surface and participate in cytoadhesion, virulence factor trafficking, or erythrocyte invasion, but many genes with unknown function were also identified. Cerebral malaria is characterized by accumulation of infected erythrocytes at trophozoite stage in brain microvascular. To discover cerebral malaria-related genes, fast Fourier transformation (FFT) was introduced to extract
Hulse, Amanda M.; Cai, James J.
Expression quantitative trait loci (eQTL) studies have established convincing relationships between genetic variants and gene expression. Most of these studies focused on the mean of gene expression level, but not the variance of gene expression level (i.e., gene expression variability). In the present study, we systematically explore genome-wide association between genetic variants and gene expression variability in humans. We adapt the double generalized linear model (dglm) to simultaneously fit the means and the variances of gene expression among the three possible genotypes of a biallelic SNP. The genomic loci showing significant association between the variances of gene expression and the genotypes are termed expression variability QTL (evQTL). Using a data set of gene expression in lymphoblastoid cell lines (LCLs) derived from 210 HapMap individuals, we identify cis-acting evQTL involving 218 distinct genes, among which 8 genes, ADCY1, CTNNA2, DAAM2, FERMT2, IL6, PLOD2, SNX7, and TNFRSF11B, are cross-validated using an extra expression data set of the same LCLs. We also identify ∼300 trans-acting evQTL between >13,000 common SNPs and 500 randomly selected representative genes. We employ two distinct scenarios, emphasizing single-SNP and multiple-SNP effects on expression variability, to explain the formation of evQTL. We argue that detecting evQTL may represent a novel method for effectively screening for genetic interactions, especially when the multiple-SNP influence on expression variability is implied. The implication of our results for revealing genetic mechanisms of gene expression variability is discussed. PMID:23150607
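The idea of testing whether expression variance, rather than mean expression, differs among the three genotypes of a biallelic SNP can be illustrated with a small sketch. The authors fit a double generalized linear model; the code below swaps in a simpler, different technique, the Brown-Forsythe statistic (a one-way ANOVA F on absolute deviations from each group's median), purely as an illustration. The function name and the toy expression values are assumptions, not data or code from the study, and no p-value is computed.

```python
from statistics import mean, median


def brown_forsythe_f(groups):
    """Brown-Forsythe F statistic for equality of spread across groups.

    groups: list of lists of expression values, one list per genotype class.
    Larger values indicate stronger evidence that the spreads differ.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) < 2 for g in groups):
        raise ValueError("need at least two groups with two or more values each")
    # Absolute deviation of every value from its own group's median.
    z = [[abs(y - median(g)) for y in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    if within == 0:
        raise ValueError("deviations show no within-group variation")
    return (n_total - k) / (k - 1) * between / within


# Hypothetical expression values grouped by genotype (AA, AB, BB).
# All three groups share the same median but differ in spread.
toy_groups = [
    [4.9, 5.0, 5.1, 5.0],
    [4.0, 5.0, 6.0, 5.0],
    [2.0, 5.0, 8.0, 5.0],
]
f_stat = brown_forsythe_f(toy_groups)
print(f"Brown-Forsythe F = {f_stat:.3f}")
```

A mean-based eQTL test would see nothing in the toy data because the group centers coincide; a variance-oriented statistic is what separates the genotypes, which is the distinction the abstract draws between eQTL and evQTL.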
Ruano, G.; Ruddle, F.H.; Kidd, K.K. (Yale Univ., New Haven, CT (United States)); Gray, M.R. (Tufts Univ., Boston, MA (United States)); Miki, Tetsuro (Osaka Univ. (Japan)); Ferguson-Smith, A.C. (Inst. of Animal Physiology and Genetics Research, Cambridge (United Kingdom))
The human homeo box cluster 2 (HOX2) contains genes coding for DNA binding proteins involved in developmental control and is highly conserved between mouse and man. The authors have applied in concert the Polymerase Chain Reaction (PCR) and Denaturing Gradient Electrophoresis (DGE) to amplify defined primate HOX2 segments and to detect sequence differences among them. They have sequenced a PstI fragment 4 kb upstream from HOX 2.2 and synthesized primers delimiting both halves of a 630 bp segment within it. PCR on various unrelated humans and SC-PCR on chimpanzee, gorilla, orangutan and gibbon yielded products of the same length for each primer pair.
Lisa Michelle Ogawa
Many psychiatric diseases observed in humans have tenuous or absent analogs in other species. Most notable among these are schizophrenia and autism. One hypothesis has posited that these diseases have arisen as a consequence of human brain evolution, for example, that the same processes that led to advances in cognition, language, and executive function also resulted in novel diseases in humans when dysfunctional. Here, the molecular evolution of genes associated with these and other psychiatric disorders is compared among species. Genes associated with psychiatric disorders are drawn from the literature and orthologous sequences are collected from eleven primate species (human, chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, baboon, marmoset, squirrel monkey, and galago) and thirty-one non-primate mammalian species. Evolutionary parameters, including dN/dS, are calculated for each gene and compared between disease classes and among species, focusing on humans and primates compared to other mammals and on large-brained taxa (cetaceans, rhinoceros, walrus, bear, and elephant) compared to their small-brained sister species. Evidence of differential selection in primates supports the hypothesis that schizophrenia and autism are a cost of higher brain function. Through this work a better understanding of the molecular evolution of the human brain, the pathophysiology of disease, and the genetic basis of human psychiatric disease is gained.
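The dN/dS comparison described above can be sketched in a few lines. dN/dS is the ratio of the nonsynonymous to the synonymous substitution rate; values below 1 are conventionally read as purifying selection, values near 1 as neutral evolution, and values above 1 as positive selection. The sketch assumes per-gene dN and dS estimates are already available from some upstream tool; the function names, class labels and numbers are made up for illustration and are not data from the study.

```python
from statistics import median


def omega(dn: float, ds: float) -> float:
    """Return the dN/dS ratio; it is undefined when dS is not positive."""
    if ds <= 0:
        raise ValueError("dS must be positive for dN/dS to be defined")
    return dn / ds


def median_omega_by_class(records):
    """Summarize dN/dS per disease class.

    records: iterable of (disease_class, dN, dS) tuples, one per gene.
    Returns a dict mapping each class to the median dN/dS of its genes.
    """
    by_class = {}
    for disease_class, dn, ds in records:
        by_class.setdefault(disease_class, []).append(omega(dn, ds))
    return {name: median(values) for name, values in by_class.items()}


# Made-up values for illustration only; not results from the study.
toy_records = [
    ("class_A", 0.010, 0.100),
    ("class_A", 0.030, 0.100),
    ("class_B", 0.050, 0.100),
]
print(median_omega_by_class(toy_records))
```

Using the median rather than the mean is a design choice for this sketch: dN/dS estimates can be heavily skewed by genes with very small dS, and the median is less sensitive to such outliers.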
Ladero, Victor; Rattray, Fergal P.; Mayo, Baltasar; Martín, María Cruz; Fernández, María; Alvarez, Miguel A.
Lactococcus lactis is a prokaryotic microorganism with great importance as a culture starter and has become the model species among the lactic acid bacteria. The long and safe history of use of L. lactis in dairy fermentations has resulted in the classification of this species as GRAS (Generally Recognized As Safe) or QPS (Qualified Presumption of Safety). However, our group has identified several strains of L. lactis subsp. lactis and L. lactis subsp. cremoris that are able to produce putrescine from agmatine via the agmatine deiminase (AGDI) pathway. Putrescine is a biogenic amine that confers undesirable flavor characteristics and may even have toxic effects. The AGDI cluster of L. lactis is composed of a putative regulatory gene, aguR, followed by the genes (aguB, aguD, aguA, and aguC) encoding the catabolic enzymes. These genes are transcribed as an operon that is induced in the presence of agmatine. In some strains, an insertion sequence (IS) element interrupts the transcription of the cluster, which results in a non-putrescine-producing phenotype. Based on this knowledge, a PCR-based test was developed in order to differentiate nonproducing L. lactis strains from those with a functional AGDI cluster. The analysis of the AGDI cluster and its flanking regions revealed that the capacity to produce putrescine via the AGDI pathway could be a specific characteristic that was lost during the adaptation to the milk environment by a process of reductive genome evolution. PMID:21803900
Othoum, Ghofran K
Background The increasing spectrum of multidrug-resistant bacteria is a major global public health concern, necessitating discovery of novel antimicrobial agents. Here, members of the genus Bacillus are investigated as a potentially attractive source of novel antibiotics due to their broad spectrum of antimicrobial activities. We specifically focus on a computational analysis of the distinctive biosynthetic potential of Bacillus paralicheniformis strains isolated from the Red Sea, an ecosystem exposed to adverse, highly saline and hot conditions. Results We report the complete circular and annotated genomes of two Red Sea strains, B. paralicheniformis Bac48 isolated from mangrove mud and B. paralicheniformis Bac84 isolated from microbial mat collected from Rabigh Harbor Lagoon in Saudi Arabia. Comparing the genomes of B. paralicheniformis Bac48 and B. paralicheniformis Bac84 with nine publicly available complete genomes of B. licheniformis and three genomes of B. paralicheniformis revealed that all of the B. paralicheniformis strains in this study are more enriched in nonribosomal peptides (NRPs). We further report the first computationally identified trans-acyltransferase (trans-AT) nonribosomal peptide synthetase/polyketide synthase (PKS/NRPS) cluster in strains of this species. Conclusions B. paralicheniformis species have more genes associated with biosynthesis of antimicrobial bioactive compounds than other previously characterized species of B. licheniformis, which suggests that these species are better potential sources for novel antibiotics. Moreover, the genome of the Red Sea strain B. paralicheniformis Bac48 is more enriched in modular PKS genes compared to B. licheniformis strains and other B. paralicheniformis strains. This may be linked to adaptations that strains surviving in the Red Sea underwent to survive in the relatively hot and saline ecosystems.
Background: Streptomyces species are a major source of antibiotics. They usually grow slowly at their optimal temperature, and fermentation of industrial strains on a large scale often takes a long time, consuming more energy and materials than some other bacterial industrial strains (e.g., E. coli and Bacillus). Most thermophilic Streptomyces species grow fast, but no gene cloning systems have been developed in such strains. Results: We report here the isolation of 41 fast-growing (about twice the rate of S. coelicolor), moderately thermophilic (growing at both 30°C and 50°C) Streptomyces strains, detection of one linear and three circular plasmids in them, and sequencing of a 6996-bp plasmid, pTSC1, from one of them. pTSC1-derived pCWH1 could replicate in both thermophilic and mesophilic Streptomyces strains. On the other hand, several Streptomyces replicons function in thermophilic Streptomyces species. By examining ten well-sporulating strains, we found two promising cloning hosts, 2C and 4F. A gene cloning system was established by using the two strains. The actinorhodin and anthramycin biosynthetic gene clusters from mesophilic S. coelicolor A3(2) and thermophilic S. refuineus were heterologously expressed in one of the hosts. Conclusions: We have developed a gene cloning and expression system in a fast-growing and moderately thermophilic Streptomyces species. Although just a few plasmids and one antibiotic biosynthetic gene cluster from mesophilic Streptomyces were successfully expressed in thermophilic Streptomyces species, we expect that by utilizing thermophilic Streptomyces-specific promoters, more genes and especially antibiotic gene clusters of mesophilic Streptomyces should be heterologously expressed. PMID:22032628
Dinesh, Raghavan; Srinivasan, Veeraraghavan; T E, Sheeja; Anandaraj, Muthuswamy; Srambikkal, Hamza
Endophytic actinobacteria, which reside in the inner tissues of host plants, are gaining serious attention due to their capacity to produce a plethora of secondary metabolites (e.g. antibiotics) possessing a wide variety of biological activity with diverse functions. This review encompasses the recent reports on endophytic actinobacterial species diversity, in planta habitats and mechanisms underlying their mode of entry into plants. Besides, their metabolic potential, novel bioactive compounds they produce and mechanisms to unravel their hidden metabolic repertoire by activation of cryptic or silent biosynthetic gene clusters (BGCs) for eliciting novel secondary metabolite production are discussed. The study also reviews the classical conservative techniques (chemical/biological/physical elicitation, co-culturing) as well as modern microbiology tools (e.g. next generation sequencing) that are being gainfully employed to uncover the vast hidden scaffolds for novel secondary metabolites produced by these endophytes, which would subsequently herald a revolution in drug engineering. The potential role of these endophytes in the agro-environment as promising biological candidates for inhibition of phytopathogens and the way forward to thoroughly exploit this unique microbial community by inducing expression of cryptic BGCs for encoding unseen products with novel therapeutic properties are also discussed.
Bang, Bo; Gniadecki, Robert; Larsen, Jørgen K
In vitro studies with human cell lines have demonstrated that the death receptor Fas plays a role in ultraviolet (UV)-induced apoptosis. The purpose of the present study was to investigate the relation between Fas expression and apoptosis as well as clustering of Fas in human epidermis after...... a single dose of UVB irradiation. Normal healthy individuals were irradiated with three minimal erythema doses (MED) of UVB on forearm or buttock skin. Suction blisters from unirradiated and irradiated skin were raised, and Fas, FasL, and apoptosis of epidermal cells quantified by flow cytometry....... Clustering of Fas was from skin biopsied. Soluble FasL in suction blister fluid was quantified by ELISA. Flow cytometric analysis demonstrated increased expression intensity of Fas after irradiation, with 1.6-,2.2- and 2.7-fold increased median expression at 24, 48 and 72 h after irradiation, respectively (n...
Floyd H. Chilton
The “modern western” diet (MWD) has increased the onset and progression of chronic human diseases as qualitatively and quantitatively maladaptive dietary components give rise to obesity and destructive gene-diet interactions. There has been a three-fold increase in dietary levels of the omega-6 (n-6) 18-carbon (C18) polyunsaturated fatty acid (PUFA) linoleic acid (LA; 18:2n-6), with the addition of cooking oils and processed foods to the MWD. Intense debate has emerged regarding the impact of this increase on human health. Recent studies have uncovered population-related genetic variation in the LCPUFA biosynthetic pathway (especially within the fatty acid desaturase gene (FADS) cluster) that is associated with levels of circulating and tissue PUFAs and several biomarkers and clinical endpoints of cardiovascular disease (CVD). Importantly, populations of African descent have higher frequencies of variants associated with elevated levels of arachidonic acid (ARA), CVD biomarkers and disease endpoints. Additionally, nutrigenomic interactions between dietary n-6 PUFAs and variants in genes that encode for enzymes that mobilize and metabolize ARA to eicosanoids have been identified. These observations raise important questions of whether gene-PUFA interactions are differentially driving the risk of cardiovascular and other diseases in diverse populations, and contributing to health disparities, especially in African American populations.
Lukežič, Tadeja; Lešnik, Urška; Podgoršek, Ajda; Horvat, Jaka; Polak, Tomaž; Šala, Martin; Jenko, Branko; Raspor, Peter; Herron, Paul R; Hunter, Iain S; Petković, Hrvoje
Tetracyclines (TCs) are medically important antibiotics from the polyketide family of natural products. Chelocardin (CHD), produced by Amycolatopsis sulphurea, is a broad-spectrum tetracyclic antibiotic with potent bacteriolytic activity against a number of Gram-positive and Gram-negative multi-resistant pathogens. CHD has an unknown mode of action that is different from TCs. It has some structural features that define it as 'atypical' and, notably, is active against tetracycline-resistant pathogens. Identification and characterization of the chelocardin biosynthetic gene cluster from A. sulphurea revealed 18 putative open reading frames including a type II polyketide synthase. Compared to typical TCs, the chd cluster contains a number of features that relate to its classification as 'atypical': an additional gene for a putative two-component cyclase/aromatase that may be responsible for the different aromatization pattern, a gene for a putative aminotransferase for C-4 with the opposite stereochemistry to TCs and a gene for a putative C-9 methylase that is a unique feature of this biosynthetic cluster within the TCs. Collectively, these enzymes deliver a molecule with different aromatization of ring C that results in an unusual planar structure of the TC backbone. This is a likely contributor to its different mode of action. In addition CHD biosynthesis is primed with acetate, unlike the TCs, which are primed with malonamate, and offers a biosynthetic engineering platform that represents a unique opportunity for efficient generation of novel tetracyclic backbones using combinatorial biosynthesis.
Richard, S; Zingg, H H
Gonadal steroids affect brain function primarily by altering the expression of specific genes, yet the specific mechanisms by which neuronal target genes undergo such regulation are unknown. Recent evidence suggests that the expression of the neuropeptide gene for oxytocin (OT) is modulated by estrogens. We therefore examined the possibility that this regulation occurred via a direct interaction of the estrogen-receptor complex with cis-acting elements flanking the OT gene. DNA-mediated gene transfer experiments were performed using Neuro-2a neuroblastoma cells and chimeric plasmids containing portions of the human OT gene 5'-flanking region linked to the chloramphenicol acetyltransferase gene. We identified a 19-base pair region located at -164 to -146 upstream of the transcription start site which is capable of conferring estrogen responsiveness to the homologous as well as to a heterologous promoter. The hormonal response is strictly dependent on the presence of intracellular estrogen receptors, since estrogen-induced stimulation occurred only in Neuro-2a cells co-transfected with an expression vector for the human estrogen receptor. The identified region contains a novel imperfect palindrome (GGTGACCTTGACC) with sequence similarity to other estrogen response elements (EREs). To define cis-acting elements that function in synergism with the ERE, sequences 3' to the ERE were deleted, including the CCAAT box, two additional motifs corresponding to the right half of the ERE palindrome (TGACC), as well as a CTGCTAA heptamer similar to the "elegans box" found in Caenorhabditis elegans. Interestingly, optimal function of the identified ERE was fully independent of these elements and only required a short promoter region (-49 to +36). Our studies define a molecular mechanism by which estrogens can directly modulate OT gene expression. However, only a subset of OT neurons are capable of binding estrogens; therefore, direct action of estrogens on the OT gene may be
Deutsch Eric W
Background: Expression levels of mRNA and protein by cell types exhibit a range of correlations for different genes. In this study, we compared levels of mRNA abundance for several cluster designation (CD) genes determined by gene arrays using magnetic sorted and laser-capture microdissected human prostate cells with levels of expression of the respective CD proteins determined by immunohistochemical staining in the major cell types of the prostate – basal epithelial, luminal epithelial, stromal fibromuscular, and endothelial – and for prostate precursor/stem cells and prostate carcinoma cells. Immunohistochemical stains of prostate tissues from more than 50 patients were scored for informative CD antigen expression and compared with cell-type specific transcriptomes. Results: Concordance between gene and protein expression findings based on 'present' vs. 'absent' calls ranged from 46 to 68%. Correlation of expression levels was poor to moderate (Pearson correlations ranged from 0 to 0.63). Divergence between the two data types was most frequently seen for genes whose array signals exceeded background (> 50) but lacked immunoreactivity by immunostaining. This could be due to multiple factors, e.g. low levels of protein expression, technological sensitivities, sample processing, probe set definition or anatomical origin of tissue and actual biological differences between transcript and protein abundance. Conclusion: Agreement between these two very different methodologies has great implications for their respective use in both molecular studies and clinical trials employing molecular biomarkers.
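The two agreement measures named in this abstract, concordance of 'present' vs. 'absent' calls and the Pearson correlation of expression levels, can be illustrated with a minimal sketch. The gene values below are hypothetical and are not data from the study; the scoring convention for immunostaining (0 meaning no staining) is also an assumption made for illustration. Only the background threshold of 50 is taken from the abstract.

```python
from math import sqrt

def concordance(calls_a, calls_b):
    """Fraction of genes whose present/absent calls agree between two assays."""
    agree = sum(a == b for a, b in zip(calls_a, calls_b))
    return agree / len(calls_a)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical values for five genes (illustration only, not study data).
array_signal = [120.0, 30.0, 75.0, 400.0, 10.0]  # gene-array signal
ihc_score = [2, 0, 0, 3, 0]                      # assumed staining score, 0 = none
BACKGROUND = 50.0                                # array background quoted in the abstract

mrna_present = [s > BACKGROUND for s in array_signal]
protein_present = [s > 0 for s in ihc_score]

agreement = concordance(mrna_present, protein_present)
r = pearson(array_signal, ihc_score)
print(f"concordance = {agreement:.2f}, Pearson r = {r:.2f}")
```

The third hypothetical gene shows the kind of divergence the abstract describes: its array signal exceeds background, yet it has no immunostaining, so it counts against concordance.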
BACKGROUND: The intra- and inter-species genetic diversity of bacteria and the absence of 'reference', or the most representative, sequences of individual species present a significant challenge for sequence-based identification. The aims of this study were to determine the utility, and compare the performance of several clustering and classification algorithms to identify the species of 364 sequences of 16S rRNA gene with a defined species in GenBank, and 110 sequences of 16S rRNA gene with no defined species, all within the genus Nocardia. METHODS: A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm, the linear mapping (LM) of the distance matrix. Principal Components Analysis was used for the dimensionality reduction and visualization. RESULTS: The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species have been identified as 'centroids' in respective clusters from which the distances to all other sequences were minimized; 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578. CONCLUSION: The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra
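Two ideas in this abstract lend themselves to a small sketch: picking the 'centroid' of a cluster as the sequence with minimal total distance to the other members, and classifying an unlabeled sequence by k-nearest neighbours (kNN) on precomputed distances. The code below is only an illustration under stated assumptions: the distance matrix and the labels `species_A` and `species_B` are hypothetical, and the functions `medoid` and `knn_label` are illustrative helpers, not the LM algorithm or the authors' implementation.

```python
from collections import Counter

def medoid(members, dist):
    """Return the member whose summed distance to all cluster members is smallest."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))

def knn_label(query_dists, labels, k=3):
    """Majority label among the k labelled sequences nearest to the query."""
    nearest = sorted(range(len(labels)), key=lambda i: query_dists[i])[:k]
    return Counter(labels[i] for i in nearest).most_common(1)[0][0]

# Hypothetical pairwise distances between four labelled sequences.
dist = [
    [0.00, 0.01, 0.02, 0.10],
    [0.01, 0.00, 0.01, 0.11],
    [0.02, 0.01, 0.00, 0.12],
    [0.10, 0.11, 0.12, 0.00],
]
labels = ["species_A", "species_A", "species_A", "species_B"]

# Representative ("centroid") sequence of the species_A cluster.
representative = medoid([0, 1, 2], dist)

# Hypothetical distances from one genus-level-only sequence to the four above.
query = [0.02, 0.01, 0.03, 0.09]
predicted = knn_label(query, labels, k=3)
print(representative, predicted)
```

Working from a distance matrix rather than raw sequences keeps the sketch independent of any particular alignment or distance measure, which the abstract does not specify.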
Francis Richard W
Background: Pre-clinical models that effectively recapitulate human disease are critical for expanding our knowledge of cancer biology and drug resistance mechanisms. For haematological malignancies, the non-obese diabetic/severe combined immunodeficient (NOD/SCID) mouse is one of the most successful models to study paediatric acute lymphoblastic leukaemia (ALL). However, for this model to be effective for studying engraftment and therapy responses at the whole genome level, careful molecular characterisation is essential. Results: Here, we sought to validate species-specific gene expression profiling in the high engraftment continuous ALL NOD/SCID xenograft. Using the human Affymetrix whole transcript platform we analysed transcriptional profiles from engrafted tissues without prior cell separation of mouse cells and found it to return highly reproducible profiles in xenografts from individual mice. The model was further tested with experimental mixtures of human and mouse cells, demonstrating that the presence of mouse cells does not significantly skew expression profiles when xenografts contain 90% or more human cells. In addition, we present a novel in silico and experimental masking approach to identify probes and transcript clusters susceptible to cross-species hybridisation. Conclusions: We demonstrate species-specific transcriptional profiles can be obtained from xenografts when high levels of engraftment are achieved or with the application of transcript cluster masks. Importantly, this masking approach can be applied and adapted to other xenograft models where human tissue infiltration is lower. This model provides a powerful platform for identifying genes and pathways associated with ALL disease progression and response to therapy in vivo.
Understanding complex networks that modulate development in humans is hampered by genetic and phenotypic heterogeneity within and between populations. Here we present a method that exploits natural variation in highly diverse mouse genetic reference panels in which genetic and environmental factors can be tightly controlled. The aim of our study is to test a cross-species genetic mapping strategy, which compares data of gene mapping in human patients with functional data obtained by QTL mapping in recombinant inbred mouse strains in order to prioritize human disease candidate genes. We exploit evolutionary conservation of developmental phenotypes to discover gene variants that influence brain development in humans. We studied corpus callosum volume in a recombinant inbred mouse panel (C57BL/6J×DBA/2J, BXD strains) using high-field strength MRI technology. We aligned mouse mapping results for this neuro-anatomical phenotype with genetic data from patients with abnormal corpus callosum (ACC) development. From the 61 syndromes which involve an ACC, 51 human candidate genes have been identified. Through interval mapping, we identified a single significant QTL on mouse chromosome 7 for corpus callosum volume with a QTL peak located between 25.5 and 26.7 Mb. Comparing the genes in this mouse QTL region with those associated with human syndromes (involving ACC) and those covered by copy number variations (CNV) yielded a single overlap, namely HNRPU in humans and Hnrpul1 in mice. Further analysis of corpus callosum volume in BXD strains revealed that the corpus callosum was significantly larger in BXD mice with a B genotype at the Hnrpul1 locus than in BXD mice with a D genotype at Hnrpul1 (F = 22.48, p < 9.87 × 10^-5). This approach that exploits highly diverse mouse strains provides an efficient and effective translational bridge to study the etiology of human developmental disorders, such as autism and schizophrenia.
Correlations of genetic variation in DNA with functional brain activity have already provided a starting point for delving into human cognitive mechanisms. However, these analyses do not provide the specific genes driving the associations, which are complicated by intergenic localization as well as tissue-specific epigenetics and expression. The use of brain-derived expression datasets could build upon the foundation of these initial genetic insights and yield genes and molecular pathways for testing new hypotheses regarding the molecular bases of human brain development, cognition, and disease. Thus, coupling these human brain gene expression data with measurements of brain activity may provide genes with critical roles in brain function. However, these brain gene expression datasets have their own set of caveats, most notably a reliance on postmortem tissue. In this perspective, I summarize and examine the progress that has been made in this realm to date, and discuss the various frontiers remaining, such as the inclusion of cell-type-specific information, additional physiological measurements, and genomic data from patient cohorts.
Fialho, Arsenio M; Chakrabarty, Ananda M
Mutations, Single Nucleotide Polymorphisms (SNPs), deletions and genetic rearrangements in specific genes in the human genome account for not only our physical characteristics and behavior, but can lead to many in-born and acquired diseases. Such changes in the genome can also predispose people to cancers, as well as significantly affect the metabolism and efficacy of many drugs, resulting in some cases in acute toxicity to the drug. The testing of the presence of such genetic mutations and rearrangements is of great practical and commercial value, leading many of these genes and their mutations/deletions and genetic rearrangements to be patented. A recent decision by a judge in the Federal District Court in the Southern District of New York, has created major uncertainties, based on the revocation of BRCA1 and BRCA2 gene patents, in the eligibility of all human and presumably other gene patents. This article argues that while patents on BRCA1 and BRCA2 genes could be challenged based on a lack of utility, the patenting of the mutations and genetic rearrangements is of great importance to further development and commercialization of genetic tests that can save human lives and prevent suffering, and should be allowed.
Kuo Ching Chao
BACKGROUND: There is a widespread interest in developing renewable sources of islet-replacement tissue for type I diabetes mellitus. Human mesenchymal cells isolated from the Wharton's jelly of the umbilical cord (HUMSCs), which can be easily obtained and processed compared with embryonic and bone marrow stem cells, possess stem cell properties. HUMSCs may be a valuable source for the generation of islets. METHODOLOGY AND PRINCIPAL FINDINGS: HUMSCs were induced to transform into islet-like cell clusters in vitro through stepwise culturing in neuron-conditioned medium. To assess the functional stability of the islet-like cell clusters in vivo, these cell clusters were transplanted into the liver of streptozotocin-induced diabetic rats via laparotomy. Glucose tolerance was measured at week 12 after transplantation, accompanied by immunohistochemistry and electron microscopy analysis. These islet-like cell clusters were shown to contain human C-peptide and release human insulin in response to physiological glucose levels. Real-time RT-PCR detected the expression of insulin and other pancreatic beta-cell-related genes (Pdx1, Hlxb9, Nkx2.2, Nkx6.1, and Glut-2) in these islet-like cell clusters. The hyperglycemia and glucose intolerance in streptozotocin-induced diabetic rats were significantly alleviated after xenotransplantation of islet-like cell clusters, without the use of immunosuppressants. In addition to the existence of islet-like cell clusters in the liver, some special fused liver cells were also found, which were characterized by human insulin and nuclei-positive staining and possessed secretory granules. CONCLUSIONS AND SIGNIFICANCE: In this study, we successfully differentiated HUMSCs into mature islet-like cell clusters, and these islet-like cell clusters possess insulin-producing ability in vitro and in vivo. HUMSCs in Wharton's Jelly of the umbilical cord seem to be the preferential source of stem cells to convert into insulin
Babbitt, Courtney C; Haygood, Ralph; Nielsen, William J; Wray, Gregory A
Despite evidence for adaptive changes in both gene expression and non-protein-coding, putatively regulatory regions of the genome during human evolution, the relationship between gene expression and adaptive changes in cis-regulatory regions remains unclear. Here we present new measurements of gene expression in five tissues of humans and chimpanzees, and use them to assess this relationship. We then compare our results with previous studies of adaptive noncoding changes, analyzing correlations at the level of gene ontology groups, in order to gain statistical power to detect correlations. Consistent with previous studies, we find little correlation between gene expression and adaptive noncoding changes at the level of individual genes; however, we do find significant correlations at the level of biological function ontology groups. The types of function include processes regulated by specific transcription factors, responses to genetic or chemical perturbations, and differentiation of cell types within the immune system. Among functional categories co-enriched with both differential expression and noncoding adaptation, prominent themes include cancer, particularly epithelial cancers, and neural development and function.
Bustamam, A.; Aldila, D.; Fatimah, Arimbi, M. D.
The Self-Organizing Map (SOM) is one of the most widely used clustering methods because of its robustness. This paper discusses the application of the SOM method to the DNA of human papillomavirus (HPV), the main cause of cervical cancer, the most dangerous cancer in developing countries. We use 18 HPV types, based on the newest complete genomes. Using the open-source program R, the clustering process separates the 18 HPV types into two clusters: two types fall in the first cluster and the other 16 in the second. The 18 HPV types are then analyzed according to the malignancy of the virus (how difficult it is to cure). The two HPV types in the first cluster can be classified as tame HPV, while the 16 in the second cluster are classified as vicious HPV.
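For readers unfamiliar with SOM clustering, a minimal sketch of the training loop may help. It is written in Python rather than the R program used in the paper, and it is not the authors' code: the function names, the two-unit one-dimensional map, the linear decay schedules, and the two-dimensional feature vectors are all assumptions made for illustration. Real genome data would first need to be turned into numeric features, a step the abstract does not describe.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda u: sq_dist(weights[u], x))

def train_som(data, n_units=2, epochs=50, lr0=0.5, sigma0=1.0):
    """Train a 1-D self-organizing map and return its unit weight vectors."""
    # Deterministic start: spread the units between the first and last sample.
    step = (len(data) - 1) // max(n_units - 1, 1)
    weights = [list(data[u * step]) for u in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 1e-3   # neighbourhood radius shrinks
        for x in data:
            bmu = best_unit(weights, x)        # best-matching unit for this sample
            for u in range(n_units):
                # Gaussian neighbourhood: units near the BMU move more.
                h = math.exp(-((u - bmu) ** 2) / (2.0 * sigma ** 2))
                weights[u] = [w + lr * h * (xi - w) for w, xi in zip(weights[u], x)]
    return weights

# Hypothetical feature vectors: two samples of one kind, three of another.
samples = [(0.10, 0.20), (0.15, 0.25), (0.80, 0.90), (0.85, 0.80), (0.90, 0.95)]
som = train_som(samples)
clusters = [best_unit(som, s) for s in samples]
print(clusters)
```

After training, each sample is assigned to the unit nearest to it, so a two-unit map yields a two-cluster partition of the kind the abstract reports.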
Cornelia M Hooper
Medulloblastoma is the most common form of malignant paediatric brain tumour and is the leading cause of childhood cancer related mortality. The four molecular subgroups of medulloblastoma that have been identified - WNT, SHH, Group 3 and Group 4 - have molecular and topographical characteristics suggestive of different cells of origin. Definitive identification of the cell(s) of origin of the medulloblastoma subgroups, particularly the poorer prognosis Group 3 and Group 4 medulloblastoma, is critical to understand the pathogenesis of the disease, and ultimately for the development of more effective treatment options. To address this issue, the gene expression profiles of normal human neural tissues and cell types representing a broad neuro-developmental continuum, were compared to those of two independent cohorts of primary human medulloblastoma specimens. Clustering, co-expression network, and gene expression analyses revealed that WNT and SHH medulloblastoma may be derived from distinct neural stem cell populations during early embryonic development, while the transcriptional profiles of Group 3 and Group 4 medulloblastoma resemble cerebellar granule neuron precursors at weeks 10-15 and 20-30 of embryogenesis, respectively. Our data indicate that Group 3 medulloblastoma may arise through abnormal neuronal differentiation, whereas deregulation of synaptic pruning-associated apoptosis may be driving Group 4 tumorigenesis. Overall, these data provide significant new insight into the spatio-temporal relationships and molecular pathogenesis of the human medulloblastoma subgroups, and provide an important framework for the development of more refined model systems, and ultimately improved therapeutic strategies.
Costanzo, F; Colombo, M; Staempfli, S; Santoro, C; Marone, M; Frank, K; Delius, H; Cortese, R
Ferritin is composed of two subunits, H and L. cDNAs coding for these proteins from human liver, lymphocytes and from the monocyte-like cell line U937 have been cloned and sequenced. Southern blot analysis on total human DNA reveals that there are many DNA segments hybridizing to the apoferritin H and L cDNA probes. In view of the tissue heterogeneity of ferritin molecules, it appeared possible that apoferritin molecules could be coded by a family of genes differentially expressed in various tissues. In this paper, the authors describe the cloning and sequencing of the gene coding for human apoferritin H. This gene has three introns; the exon sequence is identical to that of cDNAs isolated from human liver, lymphocytes, HeLa cells and endothelial cells. In addition they show that at least 15 intronless pseudogenes exist, with features suggesting that they originated by reverse transcription and insertion. On the basis of these results they conclude that only one gene is responsible for the synthesis of the majority of apoferritin H mRNA in the various tissues examined, and that probably all the other DNA segments hybridizing with apoferritin cDNA are pseudogenes.
Gene expression profiles in adenosine-treated human mast cells. ... SW Kang, JE Jeong, CH Kim, SH Choi, SH Chae, SA Jun, HJ Cha, JH Kim, YM Lee, YS ... beta 4, ring finger protein, high-mobility group, calmodulin 2, RAN binding protein, ...
Transgenic banana has been developed to prevent hepatitis B through vaccination. Its production seems to be an ideal alternative for cheaper vaccines. The objective of this paper is to assess the ethical perception of transgenic banana which involved the transfer of human albumin gene, and to compare their ethical ...
Hudjashov, Georgi; Villems, Richard; Kivisild, Toomas
Global variation in skin pigmentation is one of the most striking examples of environmental adaptation in humans. More than two hundred loci have been identified as candidate genes in model organisms and a few tens of these have been found to be significantly associated with human skin pigmentation in genome-wide association studies. However, the evolutionary history of different pigmentation genes is rather complex: some loci have been subjected to strong positive selection, while others evolved under the relaxation of functional constraints in low UV environment. Here we report the results of a global study of the human tyrosinase gene, which is one of the key enzymes in melanin production, to assess the role of its variation in the evolution of skin pigmentation differences among human populations. We observe a higher rate of non-synonymous polymorphisms in the European sample consistent with the relaxation of selective constraints. A similar pattern was previously observed in the MC1R gene and concurs with UV radiation-driven model of skin color evolution by which mutations leading to lower melanin levels and decreased photoprotection are subject to purifying selection at low latitudes while being tolerated or even favored at higher latitudes because they facilitate UV-dependent vitamin D production. Our coalescent date estimates suggest that the non-synonymous variants, which are frequent in Europe and North Africa, are recent and have emerged after the separation of East and West Eurasian populations.
Summarizes the views of a sample of primary and high school teachers on the application of gene technology to human medicine. In general, high school teachers are more positive about these developments than primary teachers, and both groups of teachers are more positive than interested lay publics. Highlights ways in which this topic can be…
dos Santos, Marcelo Bertalan Quintanilha; Sicheritz-Pontén, Thomas; Nielsen, Henrik Bjørn
To understand the impact of gut microbes on human health and well-being it is crucial to assess their genetic potential. Here we describe the Illumina-based metagenomic sequencing, assembly and characterization of 3.3 million non-redundant microbial genes, derived from 576.7 gigabases of sequence...
t'Hart, B. A.; Vervoordeldonk, M.; Heeney, J. L.; Tak, P. P.
Before autoimmune diseases in humans can be treated with gene therapy, the safety and efficacy of the used vectors must be tested in valid experimental models. Monkeys, such as the rhesus macaque or the common marmoset, provide such models. This publication reviews the state of the art in monkey
Mutational landscape of the human Y chromosome-linked genes and loci in patients with hypogonadism. Deepali Pathak, Sandeep Kumar Yadav, Leena Rawal and Sher Ali. J. Genet. 94, 677–687. Table 1. Details showing age, sex, karyotype, clinical features and diagnosis results of the patients with H. Hormone profile.
Gopinath, Chitra; Nathar, Trupti Job; Ghosh, Arkasubhra; Hickstein, Dennis Durand; Nelson, Everette Jacob Remington
Over the past three decades, gene therapy has been making considerable progress as an alternative strategy in the treatment of many diseases. Since 2009, several studies have been reported in humans on the successful treatment of various diseases. Animal models mimicking human disease conditions are very essential at the preclinical stage before embarking on a clinical trial. In gene therapy, for instance, they are useful in the assessment of variables related to the use of viral vectors such as safety, efficacy, dosage and localization of transgene expression. However, choosing a suitable disease-specific model is of paramount importance for successful clinical translation. This review focuses on the animal models that are most commonly used in gene therapy studies, such as murine, canine, non-human primates, rabbits, porcine, and a more recently developed humanized mice. Though small and large animals both have their own pros and cons as disease-specific models, the choice is made largely based on the type and length of study performed. While small animals with a shorter life span could be well-suited for degenerative/aging studies, large animals with longer life span could suit longitudinal studies and also help with dosage adjustments to maximize therapeutic benefit. Recently, humanized mice or mouse-human chimaeras have gained interest in the study of human tissues or cells, thereby providing a more reliable understanding of therapeutic interventions. Thus, animal models are of great importance with regard to testing new vector technologies in vivo for assessing safety and efficacy prior to a gene therapy clinical trial.
To understand whether any human-specific new genes may be associated with human brain functions, we computationally screened the genetic vulnerable factors identified through Genome-Wide Association Studies and linkage analyses of nicotine addiction and found one human-specific de novo protein-coding gene, FLJ33706 (alternative gene symbol C20orf203). Cross-species analysis revealed interesting evolutionary paths of how this gene had originated from noncoding DNA sequences: insertion of repeat elements especially Alu contributed to the formation of the first coding exon and six standard splice junctions on the branch leading to humans and chimpanzees, and two subsequent substitutions in the human lineage escaped two stop codons and created an open reading frame of 194 amino acids. We experimentally verified FLJ33706's mRNA and protein expression in the brain. Real-Time PCR in multiple tissues demonstrated that FLJ33706 was most abundantly expressed in brain. Human polymorphism data suggested that FLJ33706 encodes a protein under purifying selection. A specifically designed antibody detected its protein expression across human cortex, cerebellum and midbrain. Immunohistochemistry study in normal human brain cortex revealed the localization of FLJ33706 protein in neurons. Elevated expressions of FLJ33706 were detected in Alzheimer's brain samples, suggesting the role of this novel gene in human-specific pathogenesis of Alzheimer's disease. FLJ33706 provided the strongest evidence so far that human-specific de novo genes can have protein-coding potential and differential protein expression, and be involved in human brain functions.
De Benedictis, G; Tan, Q; Jeune, B
This paper reviews the recent literature on genes and longevity. The influence of genes on human life span has been confirmed in studies of life span correlation between related individuals based on family and twin data. Results from major twin studies indicate that approximately 25% of the variation in life span is genetically determined. Taking advantage of recent developments in molecular biology, researchers are now searching for candidate genes that might have an influence on life span. The data on unrelated individuals emerging from an ever-increasing number of centenarian studies makes this possible. This paper summarizes the rich literature dealing with the various aspects of the influence of genes on individual survival. Common phenomena affecting the development of disease and longevity are discussed. The major methodological difficulty one is confronted with when studying the epidemiology...
Chen, Jianshun; Chen, Fan; Cheng, Changyong; Fang, Weihuan
Arginine deiminase and agmatine deiminase systems are involved in acid tolerance, and their encoding genes form the cluster lmo0036-0043 in Listeria monocytogenes. While lmo0042 and lmo0043 were conserved in all L. monocytogenes strains, the lmo0036-0041 region of this cluster was identified in all lineage I and II strains and the majority of lineage IV strains (83.3%), but was absent in all lineage III strains and a small fraction of lineage IV strains (16.7%), suggesting that the presence of the complete lmo0036-0043 cluster is lineage-dependent. lmo0036-0043-complete and -deficient lineage IV strains exhibit specific ascB-dapE profiles, which might represent two subpopulations with distinct genetic characteristics.
Jensen, Philip J; Fazio, Gennaro; Altman, Naomi; Praul, Craig; McNellis, Timothy W
Apple tree breeding is slow and difficult due to long generation times, self-incompatibility, and complex genetics. The identification of molecular markers linked to traits of interest is a way to expedite the breeding process. In the present study, we aimed to identify genes whose steady-state transcript abundance was associated with inheritance of specific traits segregating in an apple (Malus × domestica) rootstock F1 breeding population, including resistance to powdery mildew (Podosphaera leucotricha) disease and woolly apple aphid (Eriosoma lanigerum). Transcription profiling was performed for 48 individual F1 apple trees from a cross of two highly heterozygous parents, using RNA isolated from healthy, actively-growing shoot tips and a custom apple DNA oligonucleotide microarray representing 26,000 unique transcripts. Genome-wide expression profiles were not clear indicators of powdery mildew or woolly apple aphid resistance phenotype. However, standard differential gene expression analysis between phenotypic groups of trees revealed relatively small sets of genes with trait-associated expression levels. For example, thirty genes were identified that were differentially expressed between trees resistant and susceptible to powdery mildew. Interestingly, the genes encoding twenty-four of these transcripts were physically clustered on chromosome 12. Similarly, seven genes were identified that were differentially expressed between trees resistant and susceptible to woolly apple aphid, and the genes encoding five of these transcripts were also clustered, this time on chromosome 17. In each case, the gene clusters were in the vicinity of previously identified major quantitative trait loci for the corresponding trait. Similar results were obtained for a series of molecular traits. Several of the differentially expressed genes were used to develop DNA polymorphism markers linked to powdery mildew disease and woolly apple aphid resistance. Gene expression profiling
Jacobs, Andreas H.; Winkler, Alexandra; Castro, Maria G.; Lowenstein, Pedro
Molecular imaging aims to assess non-invasively disease-specific biological and molecular processes in animal models and humans in vivo. Apart from precise anatomical localisation an