WorldWideScience

Sample records for niche-specific gene set

  1. A Phyletically Rare Gene Promotes the Niche-specific Fitness of an E. coli Pathogen during Bacteremia

    Science.gov (United States)

    Wiles, Travis J.; Lewis, Adam J.; Mobley, Harry L. T.; Casjens, Sherwood R.; Mulvey, Matthew A.

    2013-01-01

    In bacteria, laterally acquired genes are often concentrated within chromosomal regions known as genomic islands. Using a recently developed zebrafish infection model, we set out to identify unique factors encoded within genomic islands that contribute to the fitness and virulence of a reference urosepsis isolate—extraintestinal pathogenic Escherichia coli strain CFT073. By screening a series of deletion mutants, we discovered a previously uncharacterized gene, neaT, that is conditionally required by the pathogen during systemic infections. In vitro assays indicate that neaT can limit bacterial interactions with host phagocytes and alter the aggregative properties of CFT073. The neaT gene is localized within an integrated P2-like bacteriophage in CFT073, but was rarely found within other proteobacterial genomes. Sequence-based analyses revealed that neaT homologues are present, but discordantly conserved, within a phyletically diverse set of bacterial species. In CFT073, neaT appears to be unameliorated, having an exceptionally A+T-rich composition along with a notably altered codon bias. These data suggest that neaT was recently brought into the proteobacterial pan-genome from an extra-phyletic source. Interestingly, even in G+C-poor genomes, as found within the Firmicutes lineage, neaT-like genes are often unameliorated. Sequence-level features of neaT homologues challenge the common supposition that the A+T-rich nature of many recently acquired genes reflects the nucleotide composition of their genomes of origin. In total, these findings highlight the complexity of the evolutionary forces that can affect the acquisition, utilization, and assimilation of rare genes that promote the niche-dependent fitness and virulence of a bacterial pathogen. PMID:23459509
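
    The compositional argument in this abstract (a gene that is unusually A+T-rich relative to its host genome) can be illustrated with a minimal sketch. The helper name `at_fraction` and both sequences are assumptions made up for illustration; they are not neaT or CFT073 data.

```python
def at_fraction(seq: str) -> float:
    """Return the fraction of A and T among the unambiguous bases (A, C, G, T)."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "AT" for base in counted) / len(counted)


# Placeholder sequences for illustration only (hypothetical, not real data).
candidate_gene = "ATGAAATTTAATATTAAAGATTAA"
host_background = "ATGGCGCTGACCGGCAAAGCGTAA"

print(f"candidate gene A+T fraction: {at_fraction(candidate_gene):.2f}")
print(f"host background A+T fraction: {at_fraction(host_background):.2f}")
```

    A large gap between the two fractions is the kind of signal the abstract describes as "unameliorated" composition; a real analysis would also compare codon usage.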

  2. Niche-specific cognitive strategies

    DEFF Research Database (Denmark)

    Hulgard, K.; Ratcliffe, J. M.

    2014-01-01

    Related species with different diets are predicted to rely on different cognitive strategies: those best suited for locating available and appropriate foods. Here we tested two predictions of the niche-specific cognitive strategies hypothesis in bats, which suggests that predatory species should...... the niche-specific cognitive strategies hypothesis and suggest that for gleaning and clutter-resistant aerial hawking bats, learning to associate shape with food interferes with subsequent spatial memory learning....

  3. Bacterial niche-specific genome expansion is coupled with highly frequent gene disruptions in deep-sea sediments

    KAUST Repository

    Wang, Yong; Yang, Jiang Ke; Lee, On On; Li, Tie Gang; Al-Suwailem, Abdulaziz M.; Danchin, Antoine; Qian, Pei-Yuan

    2011-01-01

    The complexity and dynamics of microbial metagenomes may be evaluated by genome size, gene duplication and the disruption rate between lineages. In this study, we pyrosequenced the metagenomes of microbes obtained from the brine and sediment of a deep-sea brine pool in the Red Sea to explore the possible genomic adaptations of the microbes in response to environmental changes. The microbes from the brine and sediments (both surface and deep layers) of the Atlantis II Deep brine pool had similar communities whereas the effective genome size varied from 7.4 Mb in the brine to more than 9 Mb in the sediment. This genome expansion in the sediment samples was due to gene duplication as evidenced by enrichment of the homologs. The duplicated genes were highly disrupted, on average by 47.6% and 70% for the surface and deep layers of the Atlantis II Deep sediment samples, respectively. The disruptive effects appeared to be mainly due to point mutations and frameshifts. In contrast, the homologs from the Atlantis II Deep brine sample were highly conserved and they maintained relatively small copy numbers. Likely, the adaptation of the microbes in the sediments was coupled with pseudogenizations and possibly functional diversifications of the paralogs in the expanded genomes. The maintenance of the pseudogenes in the large genomes is discussed. © 2011 Wang et al.

  4. Bacterial niche-specific genome expansion is coupled with highly frequent gene disruptions in deep-sea sediments

    KAUST Repository

    Wang, Yong

    2011-12-21

    The complexity and dynamics of microbial metagenomes may be evaluated by genome size, gene duplication and the disruption rate between lineages. In this study, we pyrosequenced the metagenomes of microbes obtained from the brine and sediment of a deep-sea brine pool in the Red Sea to explore the possible genomic adaptations of the microbes in response to environmental changes. The microbes from the brine and sediments (both surface and deep layers) of the Atlantis II Deep brine pool had similar communities whereas the effective genome size varied from 7.4 Mb in the brine to more than 9 Mb in the sediment. This genome expansion in the sediment samples was due to gene duplication as evidenced by enrichment of the homologs. The duplicated genes were highly disrupted, on average by 47.6% and 70% for the surface and deep layers of the Atlantis II Deep sediment samples, respectively. The disruptive effects appeared to be mainly due to point mutations and frameshifts. In contrast, the homologs from the Atlantis II Deep brine sample were highly conserved and they maintained relatively small copy numbers. Likely, the adaptation of the microbes in the sediments was coupled with pseudogenizations and possibly functional diversifications of the paralogs in the expanded genomes. The maintenance of the pseudogenes in the large genomes is discussed. © 2011 Wang et al.

  5. Bacterial niche-specific genome expansion is coupled with highly frequent gene disruptions in deep-sea sediments.

    Directory of Open Access Journals (Sweden)

    Yong Wang

    The complexity and dynamics of microbial metagenomes may be evaluated by genome size, gene duplication and the disruption rate between lineages. In this study, we pyrosequenced the metagenomes of microbes obtained from the brine and sediment of a deep-sea brine pool in the Red Sea to explore the possible genomic adaptations of the microbes in response to environmental changes. The microbes from the brine and sediments (both surface and deep layers) of the Atlantis II Deep brine pool had similar communities whereas the effective genome size varied from 7.4 Mb in the brine to more than 9 Mb in the sediment. This genome expansion in the sediment samples was due to gene duplication as evidenced by enrichment of the homologs. The duplicated genes were highly disrupted, on average by 47.6% and 70% for the surface and deep layers of the Atlantis II Deep sediment samples, respectively. The disruptive effects appeared to be mainly due to point mutations and frameshifts. In contrast, the homologs from the Atlantis II Deep brine sample were highly conserved and they maintained relatively small copy numbers. Likely, the adaptation of the microbes in the sediments was coupled with pseudogenizations and possibly functional diversifications of the paralogs in the expanded genomes. The maintenance of the pseudogenes in the large genomes is discussed.

  6. Analysis of 16S rRNA and mxaF genes revealing insights into Methylobacterium niche-specific plant association

    Science.gov (United States)

    Dourado, Manuella Nóbrega; Andreote, Fernando Dini; Dini-Andreote, Francisco; Conti, Raphael; Araújo, Janete Magali; Araújo, Welington Luiz

    2012-01-01

    The genus Methylobacterium comprises pink-pigmented facultative methylotrophic (PPFM) bacteria, known to be an important plant-associated bacterial group. Species of this group, described as plant-nodulating, have the dual capacity of producing cytokinin and enzymes, such as pectinase and cellulase, involved in systemic resistance induction and nitrogen fixation under specific plant environmental conditions. The aim hereby was to evaluate the phylogenetic distribution of Methylobacterium spp. isolates from different host plants. Thus, a comparative analysis between sequences from structural (16S rRNA) and functional mxaF (which codifies for a subunit of the enzyme methanol dehydrogenase) ubiquitous genes, was undertaken. Notably, some Methylobacterium spp. isolates are generalists through colonizing more than one host plant, whereas others are exclusively found in certain specific plant-species. Congruency between phylogeny and specific host inhabitance was higher in the mxaF gene than in the 16S rRNA, a possible indication of function-based selection in this niche. Therefore, in a first stage, plant colonization by Methylobacterium spp. could represent generalist behavior, possibly related to microbial competition and adaptation to a plant environment. Otherwise, niche-specific colonization is apparently impelled by the host plant. PMID:22481887

  7. Analysis of 16S rRNA and mxaF genes revealing insights into Methylobacterium niche-specific plant association

    Directory of Open Access Journals (Sweden)

    Manuella Nóbrega Dourado

    2012-01-01

    The genus Methylobacterium comprises pink-pigmented facultative methylotrophic (PPFM) bacteria, known to be an important plant-associated bacterial group. Species of this group, described as plant-nodulating, have the dual capacity of producing cytokinin and enzymes, such as pectinase and cellulase, involved in systemic resistance induction and nitrogen fixation under specific plant environmental conditions. The aim hereby was to evaluate the phylogenetic distribution of Methylobacterium spp. isolates from different host plants. Thus, a comparative analysis between sequences from structural (16S rRNA) and functional mxaF (which codifies for a subunit of the enzyme methanol dehydrogenase) ubiquitous genes, was undertaken. Notably, some Methylobacterium spp. isolates are generalists through colonizing more than one host plant, whereas others are exclusively found in certain specific plant-species. Congruency between phylogeny and specific host inhabitance was higher in the mxaF gene than in the 16S rRNA, a possible indication of function-based selection in this niche. Therefore, in a first stage, plant colonization by Methylobacterium spp. could represent generalist behavior, possibly related to microbial competition and adaptation to a plant environment. Otherwise, niche-specific colonization is apparently impelled by the host plant.

  8. Analysis of 16S rRNA and mxaF genes revealing insights into Methylobacterium niche-specific plant association.

    Science.gov (United States)

    Dourado, Manuella Nóbrega; Andreote, Fernando Dini; Dini-Andreote, Francisco; Conti, Raphael; Araújo, Janete Magali; Araújo, Welington Luiz

    2012-01-01

    The genus Methylobacterium comprises pink-pigmented facultative methylotrophic (PPFM) bacteria, known to be an important plant-associated bacterial group. Species of this group, described as plant-nodulating, have the dual capacity of producing cytokinin and enzymes, such as pectinase and cellulase, involved in systemic resistance induction and nitrogen fixation under specific plant environmental conditions. The aim hereby was to evaluate the phylogenetic distribution of Methylobacterium spp. isolates from different host plants. Thus, a comparative analysis between sequences from structural (16S rRNA) and functional mxaF (which codifies for a subunit of the enzyme methanol dehydrogenase) ubiquitous genes, was undertaken. Notably, some Methylobacterium spp. isolates are generalists through colonizing more than one host plant, whereas others are exclusively found in certain specific plant-species. Congruency between phylogeny and specific host inhabitance was higher in the mxaF gene than in the 16S rRNA, a possible indication of function-based selection in this niche. Therefore, in a first stage, plant colonization by Methylobacterium spp. could represent generalist behavior, possibly related to microbial competition and adaptation to a plant environment. Otherwise, niche-specific colonization is apparently impelled by the host plant.

  9. Gene set analysis for GWAS

    DEFF Research Database (Denmark)

    Debrabant, Birgit; Soerensen, Mette

    2014-01-01

    We discuss the use of modified Kolmogorov-Smirnov (KS) statistics in the context of gene set analysis and review corresponding null and alternative hypotheses. Especially, we show that, when enhancing the impact of highly significant genes in the calculation of the test statistic, the co...
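
    To make the idea of a modified KS statistic concrete, here is a minimal sketch of a GSEA-style weighted running-sum score. It is an assumption that this form resembles the statistics the authors discuss; the function name `weighted_ks_score`, the exponent `p`, and the toy genes are hypothetical. With `p=0` every set member gets equal weight (the classical KS-like form); with `p>0` highly ranked genes contribute more.

```python
def weighted_ks_score(ranked_genes, scores, gene_set, p=1.0):
    """Return the signed maximum deviation of a weighted running sum.

    ranked_genes: gene ids sorted from most to least significant.
    scores: dict mapping gene id to its per-gene statistic.
    gene_set: collection of gene ids forming the set under test.
    p: exponent controlling how strongly large scores are emphasized.
    """
    members = set(gene_set)
    in_set = [gene in members for gene in ranked_genes]
    n_hit = sum(in_set)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    weights = [abs(scores[g]) ** p if hit else 0.0
               for g, hit in zip(ranked_genes, in_set)]
    total = sum(weights)
    if total == 0:
        raise ValueError("all member weights are zero")

    running, best = 0.0, 0.0
    for weight, hit in zip(weights, in_set):
        # Step up at set members (weighted), step down at non-members.
        running += weight / total if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy example (hypothetical genes and scores).
scores = {"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}
ranked = ["a", "b", "c", "d"]
print(weighted_ks_score(ranked, scores, {"a", "d"}, p=0.0))
print(weighted_ks_score(ranked, scores, {"a", "d"}, p=1.0))
```

    In this toy case the weighted score is larger than the unweighted one because the top-ranked member dominates the running sum, which is the effect of "enhancing the impact of highly significant genes".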

  10. Gene set analysis using variance component tests.

    Science.gov (United States)

    Huang, Yen-Tsung; Lin, Xihong

    2013-06-28

    Gene set analyses have become increasingly important in genomic research, as many complex diseases are contributed jointly by alterations of numerous genes. Genes often coordinate together as a functional repertoire, e.g., a biological pathway/network and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to tackle this important feature of a gene set to improve statistical power in gene set analyses. We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects by assuming a common distribution for regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and power is improved as the working covariance approaches the true covariance. The global test is a special case of TEGS when correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). We develop a gene set analyses method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and global test in both simulation and a diabetes microarray data.
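
    The abstract mentions computing p-values by permutation. The sketch below shows only that general step, using a deliberately simple statistic (sum of squared differences in group means) that ignores correlation among genes. It is not TEGS; the function names and the toy data are assumptions for illustration.

```python
import random


def group_means(rows, indices):
    """Per-gene mean expression over the samples listed in indices."""
    n_genes = len(rows[0])
    return [sum(rows[i][g] for i in indices) / len(indices) for g in range(n_genes)]


def set_statistic(rows, labels):
    """Sum over genes of the squared difference in group means (labels are 0/1)."""
    exposed = [i for i, lab in enumerate(labels) if lab == 1]
    control = [i for i, lab in enumerate(labels) if lab == 0]
    m1, m0 = group_means(rows, exposed), group_means(rows, control)
    return sum((a - b) ** 2 for a, b in zip(m1, m0))


def permutation_p_value(rows, labels, n_perm=999, seed=0):
    """Estimate a p-value by shuffling sample labels (group sizes are preserved)."""
    rng = random.Random(seed)
    observed = set_statistic(rows, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if set_statistic(rows, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


# Toy data: 10 samples x 2 genes, first five samples "exposed" (hypothetical).
rows = [[10, 20], [11, 21], [12, 22], [13, 23], [14, 24],
        [0, 5], [1, 6], [2, 7], [3, 8], [4, 9]]
labels = [1] * 5 + [0] * 5
print(permutation_p_value(rows, labels))
```

    A method such as the one described in the abstract would replace `set_statistic` with a statistic that models the covariance among genes; the permutation loop would stay the same in spirit.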

  11. Gene set analysis of the EADGENE chicken data-set

    DEFF Research Database (Denmark)

    Skarman, Axel; Jiang, Li; Hornshøj, Henrik

    2009-01-01

    Background: Gene set analysis is considered to be a way of improving our biological interpretation of the observed expression patterns. This paper describes different methods applied to analyse expression data from a chicken DNA microarray dataset. Results: Applying different gene set...... analyses to the chicken expression data led to different ranking of the Gene Ontology terms tested. A method for prediction of possible annotations was applied. Conclusion: Biological interpretation based on gene set analyses depends on the statistical method used. Methods for predicting the possible...

  12. Synchronized dynamics of bacterial niche-specific functions during biofilm development in a cold seep brine pool

    KAUST Repository

    Zhang, Weipeng

    2015-07-14

    The biology of biofilm in deep-sea environments is barely being explored. Here, biofilms were developed at the brine pool (characterized by limited carbon sources) and the normal bottom water adjacent to Thuwal cold seeps. Comparative metagenomics based on 50 Gb datasets identified polysaccharide degradation, nitrate reduction, and proteolysis as enriched functional categories for brine biofilms. The genomes of two dominant species: a novel deltaproteobacterium and a novel epsilonproteobacterium in the brine biofilms were reconstructed. Despite rather small genome sizes, the deltaproteobacterium possessed enhanced polysaccharide fermentation pathways, whereas the epsilonproteobacterium was a versatile nitrogen reactor possessing nar, nap and nif gene clusters. These metabolic functions, together with specific regulatory and hypersaline-tolerant genes, made the two bacteria unique compared with their close relatives including those from hydrothermal vents. Moreover, these functions were regulated by biofilm development, as both the abundance and the expression level of key functional genes were higher in later-stage biofilms, and co-occurrences between the two dominant bacteria were demonstrated. Collectively, unique mechanisms were revealed: i) polysaccharides fermentation, proteolysis interacted with nitrogen cycling to form a complex chain for energy generation; ii) remarkably, exploiting and organizing niche-specific functions would be an important strategy for biofilm-dependent adaptation to the extreme conditions.

  13. Novel gene sets improve set-level classification of prokaryotic gene expression data.

    Science.gov (United States)

    Holec, Matěj; Kuželka, Ondřej; Železný, Filip

    2015-10-28

    Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets defined typically on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will enable more accurate classifiers to be learned. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.
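
    The conversion from gene-level to set-level features described above can be sketched as follows. Averaging member genes is an assumption chosen for simplicity (the abstract does not say which aggregation the authors use), and the function name, gene ids and set names are hypothetical.

```python
from statistics import mean


def to_set_level(sample, gene_sets):
    """Map one sample's gene-level expression to set-level features.

    sample: dict mapping gene id to expression value.
    gene_sets: dict mapping set name to a list of gene ids.
    Sets with no measured member gene are skipped.
    """
    features = {}
    for name, genes in gene_sets.items():
        values = [sample[g] for g in genes if g in sample]
        if values:
            features[name] = mean(values)
    return features


# Toy input (hypothetical genes and sets).
sample = {"g1": 1.0, "g2": 3.0, "g3": 5.0}
gene_sets = {"regulon_A": ["g1", "g2"], "regulon_B": ["g3", "g9"], "regulon_C": ["g8"]}
print(to_set_level(sample, gene_sets))
```

    The resulting lower-dimensional vectors would then be passed to any standard classifier in place of the raw gene-level vectors.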

  14. Gene set analysis for interpreting genetic studies

    DEFF Research Database (Denmark)

    Pers, Tune H

    2016-01-01

    Interpretation of genome-wide association study (GWAS) results is lagging behind the discovery of new genetic associations. Consequently, there is an urgent need for data-driven methods for interpreting genetic association studies. Gene set analysis (GSA) can identify aetiologic pathways...

  15. Delimiting Coalescence Genes (C-Genes) in Phylogenomic Data Sets.

    Science.gov (United States)

    Springer, Mark S; Gatesy, John

    2018-02-26

    Summary coalescence methods have emerged as a popular alternative for inferring species trees with large genomic datasets, because these methods explicitly account for incomplete lineage sorting. However, statistical consistency of summary coalescence methods is not guaranteed unless several model assumptions are true, including the critical assumption that recombination occurs freely among but not within coalescence genes (c-genes), which are the fundamental units of analysis for these methods. Each c-gene has a single branching history, and large sets of these independent gene histories should be the input for genome-scale coalescence estimates of phylogeny. By contrast, numerous studies have reported the results of coalescence analyses in which complete protein-coding sequences are treated as c-genes even though exons for these loci can span more than a megabase of DNA. Empirical estimates of recombination breakpoints suggest that c-genes may be much shorter, especially when large clades with many species are the focus of analysis. Although this idea has been challenged recently in the literature, the inverse relationship between c-gene size and increased taxon sampling in a dataset (the 'recombination ratchet') is a fundamental property of c-genes. For taxonomic groups characterized by genes with long intron sequences, complete protein-coding sequences are likely not valid c-genes and are inappropriate units of analysis for summary coalescence methods unless they occur in recombination deserts that are devoid of incomplete lineage sorting (ILS). Finally, it has been argued that coalescence methods are robust when the no-recombination within loci assumption is violated, but recombination must matter at some scale because ILS, a by-product of recombination, is the raison d'etre for coalescence methods. That is, extensive recombination is required to yield the large number of independently segregating c-genes used to infer a species tree. If coalescent methods are powerful

  16. Gene set analysis of purine and pyrimidine antimetabolites cancer therapies.

    Science.gov (United States)

    Fridley, Brooke L; Batzler, Anthony; Li, Liang; Li, Fang; Matimba, Alice; Jenkins, Gregory D; Ji, Yuan; Wang, Liewei; Weinshilboum, Richard M

    2011-11-01

    Responses to therapies, either with regard to toxicities or efficacy, are expected to involve complex relationships of gene products within the same molecular pathway or functional gene set. Therefore, pathways or gene sets, as opposed to single genes, may better reflect the true underlying biology and may be more appropriate units for analysis of pharmacogenomic studies. Application of such methods to pharmacogenomic studies may enable the detection of more subtle effects of multiple genes in the same pathway that may be missed by assessing each gene individually. A gene set analysis of 3821 gene sets is presented assessing the association between basal messenger RNA expression and drug cytotoxicity using ethnically defined human lymphoblastoid cell lines for two classes of drugs: pyrimidines [gemcitabine (dFdC) and arabinoside] and purines [6-thioguanine and 6-mercaptopurine]. The gene set nucleoside-diphosphatase activity was found to be significantly associated with both dFdC and arabinoside, whereas gene set γ-aminobutyric acid catabolic process was associated with dFdC and 6-thioguanine. These gene sets were significantly associated with the phenotype even after adjusting for multiple testing. In addition, five associated gene sets were found in common between the pyrimidines and two gene sets for the purines (3',5'-cyclic-AMP phosphodiesterase activity and γ-aminobutyric acid catabolic process) with a P value of less than 0.0001. Functional validation was attempted with four genes each in gene sets for thiopurine and pyrimidine antimetabolites. All four genes selected from the pyrimidine gene sets (PSME3, CANT1, ENTPD6, ADRM1) were validated, but only one (PDE4D) was validated for the thiopurine gene sets. In summary, results from the gene set analysis of pyrimidine and purine therapies, used often in the treatment of various cancers, provide novel insight into the relationship between genomic variation and drug response.

  17. MAGMA: generalized gene-set analysis of GWAS data.

    NARCIS (Netherlands)

    de Leeuw, C.A.; Mooij, J.M.; Heskes, T.; Posthuma, D.

    2015-01-01

    By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical

  18. MAGMA: Generalized Gene-Set Analysis of GWAS Data

    NARCIS (Netherlands)

    de Leeuw, C.A.; Mooij, J.M.; Heskes, T.; Posthuma, D.

    2015-01-01

    By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical

  19. Studying the Complex Expression Dependences between Sets of Coexpressed Genes

    Directory of Open Access Journals (Sweden)

    Mario Huerta

    2014-01-01

    Organisms simplify the orchestration of gene expression by coregulating genes whose products function together in the cell. The use of clustering methods to obtain sets of coexpressed genes from expression arrays is very common; nevertheless there are no appropriate tools to study the expression networks among these sets of coexpressed genes. The aim of the developed tools is to allow studying the complex expression dependences that exist between sets of coexpressed genes. For this purpose, we start detecting the nonlinear expression relationships between pairs of genes, plus the coexpressed genes. Next, we form networks among sets of coexpressed genes that maintain nonlinear expression dependences between all of them. The expression relationship between the sets of coexpressed genes is defined by the expression relationship between the skeletons of these sets, where this skeleton represents the coexpressed genes with a well-defined nonlinear expression relationship with the skeleton of the other sets. As a result, we can study the nonlinear expression relationships between a target gene and other sets of coexpressed genes, or start the study from the skeleton of the sets, to study the complex relationships of activation and deactivation between the sets of coexpressed genes that carry out the different cellular processes present in the expression experiments.

  20. MAGMA: generalized gene-set analysis of GWAS data.

    Science.gov (United States)

    de Leeuw, Christiaan A; Mooij, Joris M; Heskes, Tom; Posthuma, Danielle

    2015-04-01

    By aggregating data for complex traits in a biologically meaningful way, gene and gene-set analysis constitute a valuable addition to single-marker analysis. However, although various methods for gene and gene-set analysis currently exist, they generally suffer from a number of issues. Statistical power for most methods is strongly affected by linkage disequilibrium between markers, multi-marker associations are often hard to detect, and the reliance on permutation to compute p-values tends to make the analysis computationally very expensive. To address these issues we have developed MAGMA, a novel tool for gene and gene-set analysis. The gene analysis is based on a multiple regression model, to provide better statistical performance. The gene-set analysis is built as a separate layer around the gene analysis for additional flexibility. This gene-set analysis also uses a regression structure to allow generalization to analysis of continuous properties of genes and simultaneous analysis of multiple gene sets and other gene properties. Simulations and an analysis of Crohn's Disease data are used to evaluate the performance of MAGMA and to compare it to a number of other gene and gene-set analysis tools. The results show that MAGMA has significantly more power than other tools for both the gene and the gene-set analysis, identifying more genes and gene sets associated with Crohn's Disease while maintaining a correct type 1 error rate. Moreover, the MAGMA analysis of the Crohn's Disease data was found to be considerably faster as well.
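
    The "regression structure" for gene-set analysis mentioned above can be illustrated with a minimal sketch: regress per-gene association Z-scores on a 0/1 set-membership indicator and look at the slope. This is a simplified illustration that assumes independent genes and no covariates; the published tool handles further details that are ignored here, and the function name and toy values are hypothetical.

```python
from math import sqrt


def membership_regression(z_scores, in_set):
    """Ordinary least squares of gene Z-scores on a 0/1 set-membership indicator.

    Returns (slope, t_statistic). The slope equals the mean Z-score of set
    members minus the mean Z-score of all other genes.
    """
    n = len(z_scores)
    if n < 3:
        raise ValueError("need at least three genes")
    x = [1.0 if member else 0.0 for member in in_set]
    mean_x = sum(x) / n
    mean_z = sum(z_scores) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("set must contain some, but not all, genes")
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z_scores))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    rss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z_scores))
    if rss == 0:
        raise ValueError("residual variance is zero; t-statistic undefined")
    standard_error = sqrt(rss / (n - 2) / sxx)
    return slope, slope / standard_error


# Toy input: six genes, the first two belong to the set (hypothetical values).
z = [2.0, 3.0, 0.0, 1.0, -1.0, 1.0]
membership = [True, True, False, False, False, False]
slope, t_stat = membership_regression(z, membership)
print(f"slope={slope:.2f}, t={t_stat:.2f}")
```

    Framing the test as a regression is what allows extra columns (continuous gene properties, or several gene sets at once) to be added to the same model, as the abstract notes.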

  1. Using the gene ontology to scan multilevel gene sets for associations in genome wide association studies.

    Science.gov (United States)

    Schaid, Daniel J; Sinnwell, Jason P; Jenkins, Gregory D; McDonnell, Shannon K; Ingle, James N; Kubo, Michiaki; Goss, Paul E; Costantino, Joseph P; Wickerham, D Lawrence; Weinshilboum, Richard M

    2012-01-01

    Gene-set analyses have been widely used in gene expression studies, and some of the developed methods have been extended to genome wide association studies (GWAS). Yet, complications due to linkage disequilibrium (LD) among single nucleotide polymorphisms (SNPs), and variable numbers of SNPs per gene and genes per gene-set, have plagued current approaches, often leading to ad hoc "fixes." To overcome some of the current limitations, we developed a general approach to scan GWAS SNP data for both gene-level and gene-set analyses, building on score statistics for generalized linear models, and taking advantage of the directed acyclic graph structure of the gene ontology when creating gene-sets. However, other types of gene-set structures can be used, such as the popular Kyoto Encyclopedia of Genes and Genomes (KEGG). Our approach combines SNPs into genes, and genes into gene-sets, but assures that positive and negative effects of genes on a trait do not cancel. To control for multiple testing of many gene-sets, we use an efficient computational strategy that accounts for LD and provides accurate step-down adjusted P-values for each gene-set. Application of our methods to two different GWAS provide guidance on the potential strengths and weaknesses of our proposed gene-set analyses. © 2011 Wiley Periodicals, Inc.
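
    The requirement that positive and negative gene effects "do not cancel" can be shown with a small sketch. The abstract's method builds on score statistics for generalized linear models; the quadratic sum below is only an assumed stand-in to show why a squared form avoids cancellation, and the Z-scores are made up.

```python
def signed_sum(gene_z):
    """Aggregate by summing signed gene-level statistics (effects can cancel)."""
    return sum(gene_z)


def squared_sum(gene_z):
    """Aggregate by summing squared gene-level statistics (direction is ignored)."""
    return sum(z * z for z in gene_z)


# Hypothetical gene-level Z-scores: two strong effects in opposite directions.
gene_z = [3.0, -3.0, 0.5]
print("signed sum:", signed_sum(gene_z))
print("squared sum:", squared_sum(gene_z))
```

    The signed sum is near zero even though two genes show strong effects, whereas the squared sum retains both.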

  2. Principles for the organization of gene-sets.

    Science.gov (United States)

    Li, Wentian; Freudenberg, Jan; Oswald, Michaela

    2015-12-01

    A gene-set, an important concept in microarray expression analysis and systems biology, is a collection of genes and/or their products (i.e. proteins) that have some features in common. There are many different ways to construct gene-sets, but a systematic organization of these ways is lacking. Gene-sets are mainly organized ad hoc in current public-domain databases, with group header names often determined by practical reasons (such as the types of technology in obtaining the gene-sets or a balanced number of gene-sets under a header). Here we aim at providing a gene-set organization principle according to the level at which genes are connected: homology, physical map proximity, chemical interaction, biological, and phenotypic-medical levels. We also distinguish two types of connections between genes: actual connection versus sharing of a label. Actual connections denote direct biological interactions, whereas shared label connection denotes shared membership in a group. Some extensions of the framework are also addressed such as overlapping of gene-sets, modules, and the incorporation of other non-protein-coding entities such as microRNAs. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. Comparing the Healthy Nose and Nasopharynx Microbiota Reveals Continuity As Well As Niche-Specificity

    Directory of Open Access Journals (Sweden)

    Ilke De Boeck

    2017-11-01

    To improve our understanding of upper respiratory tract (URT) diseases and the underlying microbial pathogenesis, a better characterization of the healthy URT microbiome is crucial. In this first large-scale study, we obtained more insight in the URT microbiome of healthy adults. Hereto, we collected paired nasal and nasopharyngeal swabs from 100 healthy participants in a citizen-science project. High-throughput 16S rRNA gene V4 amplicon sequencing was performed and samples were processed using the Divisive Amplicon Denoising Algorithm 2 (DADA2) algorithm. This allowed us to identify the bacterial richness and diversity of the samples in terms of amplicon sequence variants (ASVs), with special attention to intragenus variation. We found both niches to have a low overall species richness and uneven distribution. Moreover, based on hierarchical clustering, nasopharyngeal samples could be grouped into some bacterial community types at genus level, of which four were supported to some extent by prediction strength evaluation: one intermixed type with a higher bacterial diversity where Staphylococcus, Corynebacterium, and Dolosigranulum appeared main bacterial members in different relative abundances, and three types dominated by either Moraxella, Streptococcus, or Fusobacterium. Some of these bacterial community types such as Streptococcus and Fusobacterium were nasopharynx-specific and never occurred in the nose. No clear association between the nasopharyngeal bacterial profiles at genus level and the variables age, gender, blood type, season of sampling, or common respiratory allergies was found in this study population, except for smoking showing a positive association with Corynebacterium and Staphylococcus. Based on the fine-scale resolution of the ASVs, both known commensal and potential pathogenic bacteria were found within several genera – particularly in Streptococcus and Moraxella – in our healthy study population. Of interest, the
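
    Richness and diversity of a sample, as referred to in this abstract, are commonly summarized from a vector of per-ASV read counts. The sketch below computes observed richness and the Shannon index; choosing the Shannon index is an assumption (the abstract does not name the diversity measure used), and the counts are made up.

```python
from math import log


def richness(counts):
    """Number of ASVs observed at least once in the sample."""
    return sum(1 for c in counts if c > 0)


def shannon_index(counts):
    """Shannon diversity (natural log) computed from per-ASV read counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


# Hypothetical read counts per ASV for two samples.
even_sample = [25, 25, 25, 25]
dominated_sample = [97, 1, 1, 1]
for name, counts in [("even", even_sample), ("dominated", dominated_sample)]:
    print(name, richness(counts), round(shannon_index(counts), 3))
```

    Both toy samples have the same richness, but the sample dominated by one ASV has a much lower Shannon index, which is what "uneven distribution" refers to.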

  4. A Bayesian variable selection procedure for ranking overlapping gene sets

    DEFF Research Database (Denmark)

    Skarman, Axel; Mahdi Shariati, Mohammad; Janss, Luc

    2012-01-01

    Background Genome-wide expression profiling using microarrays or sequence-based technologies allows us to identify genes and genetic pathways whose expression patterns influence complex traits. Different methods to prioritize gene sets, such as the genes in a given molecular pathway, have been de...

  5. IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering.

    Science.gov (United States)

    Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing

    2016-01-01

    Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of the traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and to provide expression profiles of gene sets for individuals. We developed an application called IGSA that combines gene set enrichment with sample clustering. IGSA calculates gene set expression scores for each sample and uses an accumulating clustering strategy that lets the samples gather into sets according to the progress of disease from mild to severe. We focus on gastric, pancreatic and ovarian cancer data sets to assess the performance of IGSA. We also compared the results of IGSA in KEGG pathway enrichment with DAVID, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering under different similarity measurement methods. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate related changes in pathways with the severity of disease. In addition, IGSA provides a significant gene set profile for each sample.

  6. Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data

    Directory of Open Access Journals (Sweden)

    Tintle Nathan L

    2012-08-01

    Full Text Available Abstract Background Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. Results We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Conclusions Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.

  7. A hybrid approach of gene sets and single genes for the prediction of survival risks with gene expression data.

    Science.gov (United States)

    Seok, Junhee; Davis, Ronald W; Xiao, Wenzhong

    2015-01-01

    Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analyses of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets hasn't been well studied. In this work, we propose a hybrid method that uses both single gene and gene set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes help to supplement incomplete information of gene sets due to our imperfect biomedical knowledge. Through the tests over multiple data sets of cancer and trauma injury, the proposed method showed robust and improved performance compared with the conventional approaches with only single genes or gene sets solely. Additionally, we examined the prediction result in the trauma injury data, and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable in biology. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge.

  8. Zebrafish Expression Ontology of Gene Sets (ZEOGS): a tool to analyze enrichment of zebrafish anatomical terms in large gene sets.

    Science.gov (United States)

    Prykhozhij, Sergey V; Marsico, Annalisa; Meijsing, Sebastiaan H

    2013-09-01

    The zebrafish (Danio rerio) is an established model organism for developmental and biomedical research. It is frequently used for high-throughput functional genomics experiments, such as genome-wide gene expression measurements, to systematically analyze molecular mechanisms. However, the use of whole embryos or larvae in such experiments leads to a loss of the spatial information. To address this problem, we have developed a tool called Zebrafish Expression Ontology of Gene Sets (ZEOGS) to assess the enrichment of anatomical terms in large gene sets. ZEOGS uses gene expression pattern data from several sources: first, in situ hybridization experiments from the Zebrafish Model Organism Database (ZFIN); second, it uses the Zebrafish Anatomical Ontology, a controlled vocabulary that describes connected anatomical structures; and third, the available connections between expression patterns and anatomical terms contained in ZFIN. Upon input of a gene set, ZEOGS determines which anatomical structures are overrepresented in the input gene set. ZEOGS allows one for the first time to look at groups of genes and to describe them in terms of shared anatomical structures. To establish ZEOGS, we first tested it on random gene selections and on two public microarray datasets with known tissue-specific gene expression changes. These tests showed that ZEOGS could reliably identify the tissues affected, whereas only very few enriched terms to none were found in the random gene sets. Next we applied ZEOGS to microarray datasets of 24 and 72 h postfertilization zebrafish embryos treated with beclomethasone, a potent glucocorticoid. This analysis resulted in the identification of several anatomical terms related to glucocorticoid-responsive tissues, some of which were stage-specific. Our studies highlight the ability of ZEOGS to extract spatial information from datasets derived from whole embryos, indicating that ZEOGS could be a useful tool to automatically analyze gene expression

  9. Zebrafish Expression Ontology of Gene Sets (ZEOGS): A Tool to Analyze Enrichment of Zebrafish Anatomical Terms in Large Gene Sets

    Science.gov (United States)

    Marsico, Annalisa

    2013-01-01

    Abstract The zebrafish (Danio rerio) is an established model organism for developmental and biomedical research. It is frequently used for high-throughput functional genomics experiments, such as genome-wide gene expression measurements, to systematically analyze molecular mechanisms. However, the use of whole embryos or larvae in such experiments leads to a loss of the spatial information. To address this problem, we have developed a tool called Zebrafish Expression Ontology of Gene Sets (ZEOGS) to assess the enrichment of anatomical terms in large gene sets. ZEOGS uses gene expression pattern data from several sources: first, in situ hybridization experiments from the Zebrafish Model Organism Database (ZFIN); second, it uses the Zebrafish Anatomical Ontology, a controlled vocabulary that describes connected anatomical structures; and third, the available connections between expression patterns and anatomical terms contained in ZFIN. Upon input of a gene set, ZEOGS determines which anatomical structures are overrepresented in the input gene set. ZEOGS allows one for the first time to look at groups of genes and to describe them in terms of shared anatomical structures. To establish ZEOGS, we first tested it on random gene selections and on two public microarray datasets with known tissue-specific gene expression changes. These tests showed that ZEOGS could reliably identify the tissues affected, whereas only very few enriched terms to none were found in the random gene sets. Next we applied ZEOGS to microarray datasets of 24 and 72 h postfertilization zebrafish embryos treated with beclomethasone, a potent glucocorticoid. This analysis resulted in the identification of several anatomical terms related to glucocorticoid-responsive tissues, some of which were stage-specific. Our studies highlight the ability of ZEOGS to extract spatial information from datasets derived from whole embryos, indicating that ZEOGS could be a useful tool to automatically analyze gene

  10. Discovery of cancer common and specific driver gene sets

    Science.gov (United States)

    2017-01-01

    Abstract Cancer is known as a disease mainly caused by gene alterations. Discovery of mutated driver pathways or gene sets is becoming an important step to understand molecular mechanisms of carcinogenesis. However, systematically investigating commonalities and specificities of driver gene sets among multiple cancer types is still a great challenge, but this investigation will undoubtedly benefit deciphering cancers and will be helpful for personalized therapy and precision medicine in cancer treatment. In this study, we propose two optimization models to de novo discover common driver gene sets among multiple cancer types (ComMDP) and specific driver gene sets of one certain or multiple cancer types to other cancers (SpeMDP), respectively. We first apply ComMDP and SpeMDP to simulated data to validate their efficiency. Then, we further apply these methods to 12 cancer types from The Cancer Genome Atlas (TCGA) and obtain several biologically meaningful driver pathways. As examples, we construct a common cancer pathway model for BRCA and OV, infer a complex driver pathway model for BRCA carcinogenesis based on common driver gene sets of BRCA with eight cancer types, and investigate specific driver pathways of the liquid cancer lymphoblastic acute myeloid leukemia (LAML) versus other solid cancer types. In these processes more candidate cancer genes are also found. PMID:28168295

  11. Time-Course Gene Set Analysis for Longitudinal Gene Expression Data.

    Directory of Open Access Journals (Sweden)

    Boris P Hejblum

    2015-06-01

    Full Text Available Gene set analysis methods, which consider predefined groups of genes in the analysis of genomic data, have been successfully applied for analyzing gene expression data in cross-sectional studies. The time-course gene set analysis (TcGSA) introduced here is an extension of gene set analysis to longitudinal data. The proposed method relies on random effects modeling with maximum likelihood estimates. It allows the use of all available repeated measurements while dealing with unbalanced data due to missing at random (MAR) measurements. TcGSA is a hypothesis driven method that identifies a priori defined gene sets with significant expression variations over time, taking into account the potential heterogeneity of expression within gene sets. When biological conditions are compared, the method indicates if the time patterns of gene sets significantly differ according to these conditions. The interest of the method is illustrated by its application to two real life datasets: an HIV therapeutic vaccine trial (DALIA-1 trial), and data from a recent study on influenza and pneumococcal vaccines. In the DALIA-1 trial TcGSA revealed a significant change in gene expression over time within 69 gene sets during vaccination, while a standard univariate individual gene analysis corrected for multiple testing as well as a standard Gene Set Enrichment Analysis (GSEA) for time series both failed to detect any significant pattern change over time. When applied to the second illustrative data set, TcGSA allowed the identification of 4 gene sets finally found to be linked with the influenza vaccine too, although they were found to be associated with the pneumococcal vaccine only in previous analyses. In our simulation study TcGSA exhibits good statistical properties, and an increased power compared to other approaches for analyzing time-course expression patterns of gene sets. The method is made available for the community through an R package.

  12. Effect of the absolute statistic on gene-sampling gene-set analysis methods.

    Science.gov (United States)

    Nam, Dougu

    2017-06-01

    Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.
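
The gene-sampling idea described in this abstract can be sketched in a few lines. The following is a minimal illustration only, assuming a mean-based set score and a simple random-resampling null; the function name `gene_sampling_pvalue`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import random


def gene_sampling_pvalue(gene_stats, gene_set, n_samples=2000,
                         use_absolute=True, seed=0):
    """Gene-sampling test: compare a set's mean statistic with random sets.

    gene_stats maps gene name -> per-gene statistic (e.g. a t-like score).
    With use_absolute=True the statistic is replaced by its absolute value,
    so up- and down-regulated genes in the same set no longer cancel out.
    """
    rng = random.Random(seed)
    scores = {g: (abs(s) if use_absolute else s) for g, s in gene_stats.items()}
    members = [g for g in gene_set if g in scores]
    if not members:
        raise ValueError("gene set has no genes with a statistic")

    observed = sum(scores[g] for g in members) / len(members)
    all_genes = list(scores)
    hits = 0
    for _ in range(n_samples):
        # Null model: a random gene set of the same size.
        sample = rng.sample(all_genes, len(members))
        if sum(scores[g] for g in sample) / len(sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_samples + 1)


# Toy data (invented): four strongly but oppositely changed genes among 100.
toy_stats = {f"g{i}": ((i % 7) - 3) * 0.1 for i in range(100)}
toy_stats.update({"g0": 3.0, "g1": -3.0, "g2": 3.0, "g3": -3.0})
toy_set = ["g0", "g1", "g2", "g3"]

obs_abs, p_abs = gene_sampling_pvalue(toy_stats, toy_set, use_absolute=True)
obs_signed, p_signed = gene_sampling_pvalue(toy_stats, toy_set, use_absolute=False)
```

In this toy case the signed scores cancel to a set mean of zero, while the absolute scores keep the bidirectional change visible. The example shows only that bidirectional capture, not the reduction in false-positive rate reported in the abstract.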

  13. A pursuit of lineage-specific and niche-specific proteome features in the world of archaea.

    Science.gov (United States)

    Roy Chowdhury, Anindya; Dutta, Chitra

    2012-06-12

    Archaea evoke interest among researchers for two enigmatic characteristics: a combination of bacterial and eukaryotic components in their molecular architectures and an enormous diversity in their life-style and metabolic capabilities. Despite considerable research efforts, lineage-specific/niche-specific molecular features of the whole archaeal world are yet to be fully unveiled. The study offers the first large-scale in silico proteome analysis of all archaeal species of known genome sequences with a special emphasis on methanogenic and sulphur-metabolising archaea. Overall amino acid usage in archaea is dominated by GC-bias. But environmental factors like oxygen requirement or thermal adaptation seem to play important roles in selection of residues with no GC-bias at the codon level. All methanogens, irrespective of their thermal/salt adaptation, show higher usage of Cys and have relatively acidic proteomes, while the proteomes of sulphur-metabolisers have higher aromaticity and more positive charges. Despite exhibiting a thermophilic life-style, korarchaeota possess an acidic proteome. Among the distinct trends prevailing in COGs (Cluster of Orthologous Groups of proteins) distribution profiles, crenarchaeal organisms display higher intra-order variations in COGs repertoire, especially in the metabolic ones, as compared to euryarchaea. All methanogens are characterised by the presence of 22 exclusive COGs. Divergences in amino acid usage, aromaticity/charge profiles and COG repertoire among methanogens and sulphur-metabolisers, aerobic and anaerobic archaea or korarchaeota and nanoarchaeota, as elucidated in the present study, point towards the presence of distinct molecular strategies for niche specialization in the archaeal world.

  14. A pursuit of lineage-specific and niche-specific proteome features in the world of archaea

    Directory of Open Access Journals (Sweden)

    Roy Chowdhury Anindya

    2012-06-01

    Full Text Available Abstract Background Archaea evoke interest among researchers for two enigmatic characteristics: a combination of bacterial and eukaryotic components in their molecular architectures and an enormous diversity in their life-style and metabolic capabilities. Despite considerable research efforts, lineage-specific/niche-specific molecular features of the whole archaeal world are yet to be fully unveiled. The study offers the first large-scale in silico proteome analysis of all archaeal species of known genome sequences with a special emphasis on methanogenic and sulphur-metabolising archaea. Results Overall amino acid usage in archaea is dominated by GC-bias. But environmental factors like oxygen requirement or thermal adaptation seem to play important roles in selection of residues with no GC-bias at the codon level. All methanogens, irrespective of their thermal/salt adaptation, show higher usage of Cys and have relatively acidic proteomes, while the proteomes of sulphur-metabolisers have higher aromaticity and more positive charges. Despite exhibiting a thermophilic life-style, korarchaeota possess an acidic proteome. Among the distinct trends prevailing in COGs (Cluster of Orthologous Groups of proteins) distribution profiles, crenarchaeal organisms display higher intra-order variations in COGs repertoire, especially in the metabolic ones, as compared to euryarchaea. All methanogens are characterised by the presence of 22 exclusive COGs. Conclusions Divergences in amino acid usage, aromaticity/charge profiles and COG repertoire among methanogens and sulphur-metabolisers, aerobic and anaerobic archaea or korarchaeota and nanoarchaeota, as elucidated in the present study, point towards the presence of distinct molecular strategies for niche specialization in the archaeal world.

  15. APPRIS 2017: principal isoforms for multiple gene sets

    Science.gov (United States)

    Rodriguez-Rivas, Juan; Di Domenico, Tomás; Vázquez, Jesús; Valencia, Alfonso

    2018-01-01

    Abstract The APPRIS database (http://appris-tools.org) uses protein structural and functional features and information from cross-species conservation to annotate splice isoforms in protein-coding genes. APPRIS selects a single protein isoform, the ‘principal’ isoform, as the reference for each gene based on these annotations. A single main splice isoform reflects the biological reality for most protein coding genes and APPRIS principal isoforms are the best predictors of these main protein isoforms. Here, we present the updates to the database, new developments that include the addition of three new species (chimpanzee, Drosophila melanogaster and Caenorhabditis elegans), the expansion of APPRIS to cover the RefSeq gene set and the UniProtKB proteome for six species and refinements in the core methods that make up the annotation pipeline. In addition APPRIS now provides a measure of reliability for individual principal isoforms and updates with each release of the GENCODE/Ensembl and RefSeq reference sets. The individual GENCODE/Ensembl, RefSeq and UniProtKB reference gene sets for six organisms have been merged to produce common sets of splice variants. PMID:29069475

  16. Ranking metrics in gene set enrichment analysis: do they matter?

    Science.gov (United States)

    Zyla, Joanna; Marczyk, Michal; Weiner, January; Polanska, Joanna

    2017-05-12

    There exist many methods for describing the complex relation between changes of gene expression in molecular pathways or gene ontologies under different experimental conditions. Among them, Gene Set Enrichment Analysis seems to be one of the most commonly used (over 10,000 citations). An important parameter, which could affect the final result, is the choice of a metric for the ranking of genes. Applying a default ranking metric may lead to poor results. In this work 28 benchmark data sets were used to evaluate the sensitivity and false positive rate of gene set analysis for 16 different ranking metrics including new proposals. Furthermore, the robustness of the chosen methods to sample size was tested. Using k-means clustering algorithm a group of four metrics with the highest performance in terms of overall sensitivity, overall false positive rate and computational load was established i.e. absolute value of Moderated Welch Test statistic, Minimum Significant Difference, absolute value of Signal-To-Noise ratio and Baumgartner-Weiss-Schindler test statistic. In case of false positive rate estimation, all selected ranking metrics were robust with respect to sample size. In case of sensitivity, the absolute value of Moderated Welch Test statistic and absolute value of Signal-To-Noise ratio gave stable results, while Baumgartner-Weiss-Schindler and Minimum Significant Difference showed better results for larger sample size. Finally, the Gene Set Enrichment Analysis method with all tested ranking metrics was parallelised and implemented in MATLAB, and is available at https://github.com/ZAEDPolSl/MrGSEA . Choosing a ranking metric in Gene Set Enrichment Analysis has critical impact on results of pathway enrichment analysis. The absolute value of Moderated Welch Test has the best overall sensitivity and Minimum Significant Difference has the best overall specificity of gene set analysis. When the number of non-normally distributed genes is high, using Baumgartner

  17. Model-based gene set analysis for Bioconductor.

    Science.gov (United States)

    Bauer, Sebastian; Robinson, Peter N; Gagneur, Julien

    2011-07-01

    Gene Ontology and other forms of gene-category analysis play a major role in the evaluation of high-throughput experiments in molecular biology. Single-category enrichment analysis procedures such as Fisher's exact test tend to flag large numbers of redundant categories as significant, which can complicate interpretation. We have recently developed an approach called model-based gene set analysis (MGSA), that substantially reduces the number of redundant categories returned by the gene-category analysis. In this work, we present the Bioconductor package mgsa, which makes the MGSA algorithm available to users of the R language. Our package provides a simple and flexible application programming interface for applying the approach. The mgsa package has been made available as part of Bioconductor 2.8. It is released under the conditions of the Artistic license 2.0. peter.robinson@charite.de; julien.gagneur@embl.de.
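
For readers unfamiliar with the single-category test this abstract contrasts with MGSA, the following is a minimal sketch of a one-sided Fisher's exact (hypergeometric) enrichment p-value. It is not the MGSA algorithm and does not use the mgsa package API; the function name and argument names are hypothetical.

```python
from math import comb


def fisher_enrichment_pvalue(n_population, n_category, n_study, n_overlap):
    """One-sided Fisher's exact test for a single gene category.

    Returns P(X >= n_overlap), where X is the number of category genes seen
    when n_study genes are drawn without replacement from n_population
    genes, of which n_category are annotated to the category.
    """
    max_overlap = min(n_category, n_study)
    total = comb(n_population, n_study)
    tail = sum(
        comb(n_category, k) * comb(n_population - n_category, n_study - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Tiny worked case: 10 genes, 5 in the category, 5 in the study set,
# and all 5 study genes fall in the category.
p_all_hit = fisher_enrichment_pvalue(10, 5, 5, 5)   # 1 / C(10, 5) = 1 / 252
p_no_hit = fisher_enrichment_pvalue(10, 5, 5, 0)    # whole distribution = 1.0
```

Because each category is tested in isolation, overlapping or nested categories that share the same study genes can all come out as significant, which is the redundancy the abstract refers to.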

  18. GeneTopics - interpretation of gene sets via literature-driven topic models

    Science.gov (United States)

    2013-01-01

    Background Annotation of a set of genes is often accomplished through comparison to a library of labelled gene sets such as biological processes or canonical pathways. However, this approach might fail if the employed libraries are not up to date with the latest research, don't capture relevant biological themes or are curated at a different level of granularity than is required to appropriately analyze the input gene set. At the same time, the vast biomedical literature offers an unstructured repository of the latest research findings that can be tapped to provide thematic sub-groupings for any input gene set. Methods Our proposed method relies on a gene-specific text corpus and extracts commonalities between documents in an unsupervised manner using a topic model approach. We automatically determine the number of topics summarizing the corpus and calculate a gene relevancy score for each topic allowing us to eliminate non-specific topics. As a result we obtain a set of literature topics in which each topic is associated with a subset of the input genes providing directly interpretable keywords and corresponding documents for literature research. Results We validate our method based on labelled gene sets from the KEGG metabolic pathway collection and the genetic association database (GAD) and show that the approach is able to detect topics consistent with the labelled annotation. Furthermore, we discuss the results on three different types of experimentally derived gene sets, (1) differentially expressed genes from a cardiac hypertrophy experiment in mice, (2) altered transcript abundance in human pancreatic beta cells, and (3) genes implicated by GWA studies to be associated with metabolite levels in a healthy population. In all three cases, we are able to replicate findings from the original papers in a quick and semi-automated manner. Conclusions Our approach provides a novel way of automatically generating meaningful annotations for gene sets that are directly

  19. GSMA: Gene Set Matrix Analysis, An Automated Method for Rapid Hypothesis Testing of Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Chris Cheadle

    2007-01-01

    Full Text Available Background: Microarray technology has become highly valuable for identifying complex global changes in gene expression patterns. The assignment of functional information to these complex patterns remains a challenging task in effectively interpreting data and correlating results from across experiments, projects and laboratories. Methods which allow the rapid and robust evaluation of multiple functional hypotheses increase the power of individual researchers to data mine gene expression data more efficiently. Results: We have developed gene set matrix analysis (GSMA) as a useful method for the rapid testing of group-wise up- or downregulation of gene expression simultaneously for multiple lists of genes (gene sets) against entire distributions of gene expression changes (datasets) for single or multiple experiments. The utility of GSMA lies in its flexibility to rapidly poll gene sets related by known biological function or as designated solely by the end-user against large numbers of datasets simultaneously. Conclusions: GSMA provides a simple and straightforward method for hypothesis testing in which genes are tested by groups across multiple datasets for patterns of expression enrichment.

  20. Synchronized dynamics of bacterial niche-specific functions during biofilm development in a cold seep brine pool

    KAUST Repository

    Zhang, Weipeng; Wang, Yong; Bougouffa, Salim; Tian, Renmao; Cao, Huiluo; Li, Yongxin; Cai, Lin; Wong, Yue Him; Zhang, Gen; Zhou, Guowei; Zhang, Xixiang; Bajic, Vladimir B.; Al-Suwailem, Abdulaziz M.; Qian, Pei-Yuan

    2015-01-01

    in the brine biofilms were reconstructed. Despite rather small genome sizes, the deltaproteobacterium possessed enhanced polysaccharide fermentation pathways, whereas the epsilonproteobacterium was a versatile nitrogen reactor possessing nar, nap and nif gene

  1. Integrative analysis of survival-associated gene sets in breast cancer.

    Science.gov (United States)

    Varn, Frederick S; Ung, Matthew H; Lou, Shao Ke; Cheng, Chao

    2015-03-12

    Patient gene expression information has recently become a clinical feature used to evaluate breast cancer prognosis. The emergence of prognostic gene sets that take advantage of these data has led to a rich library of information that can be used to characterize the molecular nature of a patient's cancer. Identifying robust gene sets that are consistently predictive of a patient's clinical outcome has become one of the main challenges in the field. We supplied our previously established BASE algorithm with patient gene expression data and gene sets from MSigDB to develop the gene set activity score (GSAS), a metric that quantitatively assesses a gene set's activity level in a given patient. We utilized this metric, along with patient time-to-event data, to perform survival analyses to identify the gene sets that were significantly correlated with patient survival. We then performed cross-dataset analyses to identify robust prognostic gene sets and to classify patients by metastasis status. Additionally, we created a gene set network based on component gene overlap to explore the relationship between gene sets derived from MSigDB. We developed a novel gene set based on this network's topology and applied the GSAS metric to characterize its role in patient survival. Using the GSAS metric, we identified 120 gene sets that were significantly associated with patient survival in all datasets tested. The gene overlap network analysis yielded a novel gene set enriched in genes shared by the robustly predictive gene sets. This gene set was highly correlated to patient survival when used alone. Most interestingly, removal of the genes in this gene set from the gene pool on MSigDB resulted in a large reduction in the number of predictive gene sets, suggesting a prominent role for these genes in breast cancer progression. The GSAS metric provided a useful medium by which we systematically investigated how gene sets from MSigDB relate to breast cancer patient survival. We used

  2. Identification of a robust gene signature that predicts breast cancer outcome in independent data sets

    International Nuclear Information System (INIS)

    Korkola, James E; Waldman, Frederic M; Blaveri, Ekaterina; DeVries, Sandy; Moore, Dan H II; Hwang, E Shelley; Chen, Yunn-Yi; Estep, Anne LH; Chew, Karen L; Jensen, Ronald H

    2007-01-01

    Breast cancer is a heterogeneous disease, presenting with a wide range of histologic, clinical, and genetic features. Microarray technology has shown promise in predicting outcome in these patients. We profiled 162 breast tumors using expression microarrays to stratify tumors based on gene expression. A subset of 55 tumors with extensive follow-up was used to identify gene sets that predicted outcome. The predictive gene set was further tested in previously published data sets. We used different statistical methods to identify three gene sets associated with disease free survival. A fourth gene set, consisting of 21 genes in common to all three sets, also had the ability to predict patient outcome. To validate the predictive utility of this derived gene set, it was tested in two published data sets from other groups. This gene set resulted in significant separation of patients on the basis of survival in these data sets, correctly predicting outcome in 62–65% of patients. By comparing outcome prediction within subgroups based on ER status, grade, and nodal status, we found that our gene set was most effective in predicting outcome in ER positive and node negative tumors. This robust gene selection with extensive validation has identified a predictive gene set that may have clinical utility for outcome prediction in breast cancer patients

  3. Three gene expression vector sets for concurrently expressing multiple genes in Saccharomyces cerevisiae.

    Science.gov (United States)

    Ishii, Jun; Kondo, Takashi; Makino, Harumi; Ogura, Akira; Matsuda, Fumio; Kondo, Akihiko

    2014-05-01

    Yeast has the potential to be used in bulk-scale fermentative production of fuels and chemicals due to its tolerance for low pH and robustness for autolysis. However, expression of multiple external genes in one host yeast strain is considerably labor-intensive due to the lack of polycistronic transcription. To promote the metabolic engineering of yeast, we generated systematic and convenient genetic engineering tools to express multiple genes in Saccharomyces cerevisiae. We constructed a series of multi-copy and integration vector sets for concurrently expressing two or three genes in S. cerevisiae by embedding three classical promoters. The comparative expression capabilities of the constructed vectors were monitored with green fluorescent protein, and the concurrent expression of genes was monitored with three different fluorescent proteins. Our multiple gene expression tool will be helpful to the advanced construction of genetically engineered yeast strains in a variety of research fields other than metabolic engineering. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  4. Tracking difference in gene expression in a time-course experiment using gene set enrichment analysis.

    Directory of Open Access Journals (Sweden)

    Pui Shan Wong

    Fistulifera sp. strain JPCC DA0580 is a newly sequenced pennate diatom that is capable of simultaneously growing and accumulating lipids. This is a unique trait, not found in other related microalgae so far. It is able to accumulate between 40 and 60% of its cell weight in lipids, making it a strong candidate for the production of biofuel. To investigate this characteristic, we used RNA-Seq data gathered at four different times while Fistulifera sp. strain JPCC DA0580 was grown in oil accumulating and non-oil accumulating conditions. We then adapted gene set enrichment analysis (GSEA) to investigate the relationship between the difference in gene expression of 7,822 genes and metabolic functions in our data. We utilized information in the KEGG pathway database to create the gene sets and changed GSEA to use re-sampling so that data from the different time points could be included in the analysis. Our GSEA method identified photosynthesis, lipid synthesis and amino acid synthesis related pathways as processes that play a significant role in oil production and growth in Fistulifera sp. strain JPCC DA0580. In addition to GSEA, we visualized the results by creating a network of compounds and reactions, and plotted the expression data on top of the network. This made existing graph algorithms available to us which we then used to calculate a path that metabolizes glucose into triacylglycerol (TAG) in the smallest number of steps. By visualizing the data this way, we observed a separate up-regulation of genes at different times instead of a concerted response. We also identified two metabolic paths that used fewer reactions than the one shown in KEGG and showed that the reactions were up-regulated during the experiment. The combination of analysis and visualization methods successfully analyzed time-course data, identified important metabolic pathways and provided new hypotheses for further research.
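
    The "path in the smallest number of steps" mentioned in this abstract is a standard unweighted shortest-path problem. A minimal sketch follows, assuming the compound/reaction network is reduced to a directed adjacency mapping; the function name `shortest_path` and the placeholder nodes `m1`, `m2`, `m3` are hypothetical and are not taken from the paper or from KEGG.

```python
from collections import deque


def shortest_path(graph, source, target):
    """Return a path with the fewest edges from source to target, or None.

    graph maps each node to an iterable of its direct successors.
    Breadth-first search visits nodes in order of increasing distance,
    so the first time the target is dequeued the path is a shortest one.
    """
    parents = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for successor in graph.get(node, ()):
            if successor not in parents:
                parents[successor] = node
                queue.append(successor)
    return None


# Toy network with placeholder intermediates (hypothetical, not real biochemistry).
toy_network = {
    "glucose": ["m1", "m2"],
    "m1": ["m3"],
    "m2": ["TAG"],
    "m3": ["TAG"],
}
route = shortest_path(toy_network, "glucose", "TAG")
```

    In the toy network the two-step route through `m2` is preferred over the three-step route through `m1` and `m3`. A real analysis would also need to decide how reactions with several substrates or products are represented as edges.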

  5. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

    Science.gov (United States)

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-11

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA analysis. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify relevant gene sets to autism that could not be found by Fisher.
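
    For orientation, the Fisher-type baseline that this abstract contrasts LEGO with can be sketched as a one-sided hypergeometric tail probability. This is only the unweighted baseline, not LEGO's network-weighted statistic; the function and parameter names are assumptions chosen for illustration.

```python
from math import comb


def ora_p_value(n_universe, n_in_set, n_selected, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    X is the number of gene-set members among n_selected genes drawn
    without replacement from a universe of n_universe genes, of which
    n_in_set belong to the gene set (hypergeometric distribution).
    """
    max_overlap = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Universe of 10 genes, a 5-gene set, 5 selected genes, all 5 in the set.
p_all_in_set = ora_p_value(10, 5, 5, 5)
```

    Every gene counts equally in this calculation, which is the simplification the abstract criticises: a pathway is reduced to a plain list, with no weights for gene roles or gene-gene interactions.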

  6. Beyond main effects of gene-sets: harsh parenting moderates the association between a dopamine gene-set and child externalizing behavior.

    Science.gov (United States)

    Windhorst, Dafna A; Mileva-Seitz, Viara R; Rippe, Ralph C A; Tiemeier, Henning; Jaddoe, Vincent W V; Verhulst, Frank C; van IJzendoorn, Marinus H; Bakermans-Kranenburg, Marian J

    2016-08-01

    In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene-based and gene-set approaches in tests of Gene by Environment (G × E) effects on complex behavior. This approach can offer an important alternative or complement to candidate gene and genome-wide environmental interaction (GWEI) studies in the search for genetic variation underlying individual differences in behavior. Genetic variants in 12 autosomal dopaminergic genes were available in an ethnically homogenous part of a population-based cohort. Harsh parenting was assessed with maternal (n = 1881) and paternal (n = 1710) reports at age 3. Externalizing behavior was assessed with the Child Behavior Checklist (CBCL) at age 5 (71 ± 3.7 months). We conducted gene-set analyses of the association between variation in dopaminergic genes and externalizing behavior, stratified for harsh parenting. The association was statistically significant or approached significance for children without harsh parenting experiences, but was absent in the group with harsh parenting. Similarly, significant associations between single genes and externalizing behavior were only found in the group without harsh parenting. Effect sizes in the groups with and without harsh parenting did not differ significantly. Gene-environment interaction tests were conducted for individual genetic variants, resulting in two significant interaction effects (rs1497023 and rs4922132) after correction for multiple testing. Our findings are suggestive of G × E interplay, with associations between dopamine genes and externalizing behavior present in children without harsh parenting, but not in children with harsh parenting experiences. Harsh parenting may overrule the role of genetic factors in externalizing behavior. Gene-based and gene-set

  7. Uniform approximation is more appropriate for Wilcoxon Rank-Sum Test in gene set analysis.

    Directory of Open Access Journals (Sweden)

    Zhide Fang

    Gene set analysis is widely used to facilitate biological interpretations in the analyses of differential expression from high throughput profiling data. Wilcoxon Rank-Sum (WRS) test is one of the commonly used methods in gene set enrichment analysis. It compares the ranks of genes in a gene set against those of genes outside the gene set. This method is easy to implement and it eliminates the dichotomization of genes into significant and non-significant in a competitive hypothesis testing. Due to the large number of genes being examined, it is impractical to calculate the exact null distribution for the WRS test. Therefore, the normal distribution is commonly used as an approximation. However, as we demonstrate in this paper, the normal approximation is problematic when a gene set with relatively small number of genes is tested against the large number of genes in the complementary set. In this situation, a uniform approximation is substantially more powerful, more accurate, and less intensive in computation. We demonstrate the advantage of the uniform approximations in Gene Ontology (GO) term analysis using simulations and real data sets.
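
    The rank-sum statistic and the conventional normal approximation that this abstract argues against can be sketched as below. The uniform approximation proposed in the paper is not reproduced here, because its details are not given in this record; the function name and the omission of a tie correction in the variance are simplifying assumptions.

```python
from math import erfc, sqrt


def rank_sum_normal_approx(scores_in_set, scores_outside):
    """Wilcoxon rank-sum statistic with a normal-approximation p-value.

    Returns (w, z, p): the sum of ranks of the in-set scores, its
    standardised value, and a two-sided p-value. Tied scores receive
    average ranks; the variance is not corrected for ties.
    """
    pooled = sorted(list(scores_in_set) + list(scores_outside))
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Positions i..j-1 hold equal values; their 1-based ranks are i+1..j.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    m, n = len(scores_in_set), len(scores_outside)
    w = sum(rank_of[s] for s in scores_in_set)
    mean = m * (m + n + 1) / 2
    variance = m * n * (m + n + 1) / 12
    z = (w - mean) / sqrt(variance)
    p = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return w, z, p


w_high, z_high, p_high = rank_sum_normal_approx([5, 6], [1, 2, 3, 4])
w_low, z_low, p_low = rank_sum_normal_approx([1, 2], [3, 4, 5, 6])
```

    The setting the abstract describes is the one where `m` is small and `n` is very large, which is typical when a single GO term is tested against the rest of the genome.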

  8. Beyond main effects of gene-sets: harsh parenting moderates the association between a dopamine gene-set and child externalizing behavior

    NARCIS (Netherlands)

    J. Windhorst (Judith); V. Mileva-Seitz (Viara); R.C.A. Rippe (Ralph C.A.); H.W. Tiemeier (Henning); V.W.V. Jaddoe (Vincent); F.C. Verhulst (Frank); M.H. van IJzendoorn (Rien); M.J. Bakermans-Kranenburg (Marian)

    2016-01-01

    Background: In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene-based and

  9. Constellation Map: Downstream visualization and interpretation of gene set enrichment results [version 1; referees: 2 approved]

    Directory of Open Access Journals (Sweden)

    Yan Tan

    2015-06-01

    Summary: Gene set enrichment analysis (GSEA) approaches are widely used to identify coordinately regulated genes associated with phenotypes of interest. Here, we present Constellation Map, a tool to visualize and interpret the results when enrichment analyses yield a long list of significantly enriched gene sets. Constellation Map identifies commonalities that explain the enrichment of multiple top-scoring gene sets and maps the relationships between them. Constellation Map can help investigators take full advantage of GSEA and facilitates the biological interpretation of enrichment results. Availability: Constellation Map is freely available as a GenePattern module at http://www.genepattern.org.

  10. Gene set of nuclear-encoded mitochondrial regulators is enriched for common inherited variation in obesity.

    Directory of Open Access Journals (Sweden)

    Nadja Knoll

    There are hints of an altered mitochondrial function in obesity. Nuclear-encoded genes are relevant for mitochondrial function (3 gene sets of known relevant pathways: (1) 16 nuclear regulators of mitochondrial genes, (2) 91 genes for oxidative phosphorylation and (3) 966 nuclear-encoded mitochondrial genes). Gene set enrichment analysis (GSEA) showed no association with type 2 diabetes mellitus in these gene sets. Here we performed a GSEA for the same gene sets for obesity. Genome wide association study (GWAS) data from a case-control approach on 453 extremely obese children and adolescents and 435 lean adult controls were used for GSEA. For independent confirmation, we analyzed 705 obesity GWAS trios (extremely obese child and both biological parents) and a population-based GWAS sample (KORA F4, n = 1,743). A meta-analysis was performed on all three samples. In each sample, the distribution of significance levels between the respective gene set and those of all genes was compared using the leading-edge-fraction-comparison test (cut-offs between the 50th and 95th percentile of the set of all gene-wise corrected p-values) as implemented in the MAGENTA software. In the case-control sample, significant enrichment of associations with obesity was observed above the 50th percentile for the set of the 16 nuclear regulators of mitochondrial genes (p(GSEA,50) = 0.0103). This finding was not confirmed in the trios (p(GSEA,50) = 0.5991), but in KORA (p(GSEA,50) = 0.0398). The meta-analysis again indicated a trend for enrichment (p(MAGENTA,50) = 0.1052, p(MAGENTA,75) = 0.0251). The GSEA revealed that weak association signals for obesity might be enriched in the gene set of 16 nuclear regulators of mitochondrial genes.

  11. Candidate genes for COPD in two large data sets.

    Science.gov (United States)

    Bakke, P S; Zhu, G; Gulsvik, A; Kong, X; Agusti, A G N; Calverley, P M A; Donner, C F; Levy, R D; Make, B J; Paré, P D; Rennard, S I; Vestbo, J; Wouters, E F M; Anderson, W; Lomas, D A; Silverman, E K; Pillai, S G

    2011-02-01

    Lack of reproducibility of findings has been a criticism of genetic association studies on complex diseases, such as chronic obstructive pulmonary disease (COPD). We selected 257 polymorphisms of 16 genes with reported or potential relationships to COPD and genotyped these variants in a case-control study that included 953 COPD cases and 956 control subjects. We explored the association of these polymorphisms to three COPD phenotypes: a COPD binary phenotype and two quantitative traits (post-bronchodilator forced expiratory volume in 1 s (FEV₁) % predicted and FEV₁/forced vital capacity (FVC)). The polymorphisms significantly associated to these phenotypes in this first study were tested in a second, family-based study that included 635 pedigrees with 1,910 individuals. Significant associations to the binary COPD phenotype in both populations were seen for STAT1 (rs13010343) and NFKBIB/SIRT2 (rs2241704) (p<0.05). Single-nucleotide polymorphisms rs17467825 and rs1155563 of the GC gene were significantly associated with FEV₁ % predicted and FEV₁/FVC, respectively, in both populations (p<0.05). This study has replicated associations to COPD phenotypes in the STAT1, NFKBIB/SIRT2 and GC genes in two independent populations, the associations of the former two genes representing novel findings.

  12. Comparative study on gene set and pathway topology-based enrichment methods.

    Science.gov (United States)

    Bayerlová, Michaela; Jung, Klaus; Kramer, Frank; Klemm, Florian; Bleckmann, Annalen; Beißbarth, Tim

    2015-10-22

    Enrichment analysis is a popular approach to identify pathways or sets of genes which are significantly enriched in the context of differentially expressed genes. The traditional gene set enrichment approach considers a pathway as a simple gene list disregarding any knowledge of gene or protein interactions. In contrast, the new group of so called pathway topology-based methods integrates the topological structure of a pathway into the analysis. We comparatively investigated gene set and pathway topology-based enrichment approaches, considering three gene set and four topological methods. These methods were compared in two extensive simulation studies and on a benchmark of 36 real datasets, providing the same pathway input data for all methods. In the benchmark data analysis both types of methods showed a comparable ability to detect enriched pathways. The first simulation study was conducted with KEGG pathways, which showed considerable gene overlaps between each other. In this study with original KEGG pathways, none of the topology-based methods outperformed the gene set approach. Therefore, a second simulation study was performed on non-overlapping pathways created by unique gene IDs. Here, methods accounting for pathway topology reached higher accuracy than the gene set methods, however their sensitivity was lower. We conducted one of the first comprehensive comparative works on evaluating gene set against pathway topology-based enrichment methods. The topological methods showed better performance in the simulation scenarios with non-overlapping pathways, however, they were not conclusively better in the other scenarios. This suggests that simple gene set approach might be sufficient to detect an enriched pathway under realistic circumstances. Nevertheless, more extensive studies and further benchmark data are needed to systematically evaluate these methods and to assess what gain and cost pathway topology information introduces into enrichment analysis. 

  13. Phylogenetics and evolution of Trx SET genes in fully sequenced land plants.

    Science.gov (United States)

    Zhu, Xinyu; Chen, Caoyi; Wang, Baohua

    2012-04-01

    Plant Trx SET proteins are involved in H3K4 methylation and play a key role in plant floral development. Genes encoding Trx SET proteins constitute a multigene family in which the copy number varies among plant species and functional divergence appears to have occurred repeatedly. To investigate the evolutionary history of the Trx SET gene family, we made a comprehensive evolutionary analysis on this gene family from 13 major representatives of green plants. A novel clustering (here named as cpTrx clade), which included the III-1, III-2, and III-4 orthologous groups, previously resolved was identified. Our analysis showed that plant Trx proteins possessed a variety of domain organizations and gene structures among paralogs. Additional domains such as PHD, PWWP, and FYR were early integrated into primordial SET-PostSET domain organization of cpTrx clade. We suggested that the PostSET domain was lost in some members of III-4 orthologous group during the evolution of land plants. At least four classes of gene structures had been formed at the early evolutionary stage of land plants. Three intronless orphan Trx SET genes from the Physcomitrella patens (moss) were identified, and supposedly, their parental genes have been eliminated from the genome. The structural differences among evolutionary groups of plant Trx SET genes with different functions were described, contributing to the design of further experimental studies.

  14. An Independent Filter for Gene Set Testing Based on Spectral Enrichment

    NARCIS (Netherlands)

    Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H

    2015-01-01

    Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in

  15. Principal Angle Enrichment Analysis (PAEA): Dimensionally Reduced Multivariate Gene Set Enrichment Analysis Tool.

    Science.gov (United States)

    Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi

    2015-11-01

    Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.

  16. Annotating gene sets by mining large literature collections with protein networks.

    Science.gov (United States)

    Wang, Sheng; Ma, Jianzhu; Yu, Michael Ku; Zheng, Fan; Huang, Edward W; Han, Jiawei; Peng, Jian; Ideker, Trey

    2018-01-01

    Analysis of patient genomes and transcriptomes routinely recognizes new gene sets associated with human disease. Here we present an integrative natural language processing system which infers common functions for a gene set through automatic mining of the scientific literature with biological networks. This system links genes with associated literature phrases and combines these links with protein interactions in a single heterogeneous network. Multiscale functional annotations are inferred based on network distances between phrases and genes and then visualized as an ontology of biological concepts. To evaluate this system, we predict functions for gene sets representing known pathways and find that our approach achieves substantial improvement over the conventional text-mining baseline method. Moreover, our system discovers novel annotations for gene sets or pathways without previously known functions. Two case studies demonstrate how the system is used in discovery of new cancer-related pathways with ontological annotations.

  17. The Molecular Signatures Database (MSigDB) hallmark gene set collection.

    Science.gov (United States)

    Liberzon, Arthur; Birger, Chet; Thorvaldsdóttir, Helga; Ghandi, Mahmoud; Mesirov, Jill P; Tamayo, Pablo

    2015-12-23

    The Molecular Signatures Database (MSigDB) is one of the most widely used and comprehensive databases of gene sets for performing gene set enrichment analysis. Since its creation, MSigDB has grown beyond its roots in metabolic disease and cancer to include >10,000 gene sets. These better represent a wider range of biological processes and diseases, but the utility of the database is reduced by increased redundancy across, and heterogeneity within, gene sets. To address this challenge, here we use a combination of automated approaches and expert curation to develop a collection of "hallmark" gene sets as part of MSigDB. Each hallmark in this collection consists of a "refined" gene set, derived from multiple "founder" sets, that conveys a specific biological state or process and displays coherent expression. The hallmarks effectively summarize most of the relevant information of the original founder sets and, by reducing both variation and redundancy, provide more refined and concise inputs for gene set enrichment analysis.

  18. GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data.

    Science.gov (United States)

    Ben-Ari Fuchs, Shani; Lieder, Iris; Stelzer, Gil; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit

    2016-03-01

    Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ (geneanalytics.genecards.org), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including the GeneCards®--the human gene database; the MalaCards--the human diseases database; and the PathCards--the biological pathways database. Expression-based analysis in GeneAnalytics relies on the LifeMap Discovery®--the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomics

  19. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    NARCIS (Netherlands)

    Hettne, K.M.; Boorsma, A.; Dartel, D.A. van; Goeman, J.J.; Jong, E. de; Piersma, A.H.; Stierum, R.H.; Kleinjans, J.C.; Kors, J.A.

    2013-01-01

    BACKGROUND: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set

  20. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    NARCIS (Netherlands)

    Hettne, K.M.; Boorsma, A.; Dartel, van D.A.M.; Goeman, J.J.; Jong, de E.; Piersma, A.H.; Stierum, R.H.; Kleinjans, J.C.; Kors, J.A.

    2013-01-01

    Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set

  1. The null hypothesis of GSEA, and a novel statistical model for competitive gene set analysis

    DEFF Research Database (Denmark)

    Debrabant, Birgit

    2017-01-01

    MOTIVATION: Competitive gene set analysis intends to assess whether a specific set of genes is more associated with a trait than the remaining genes. However, the statistical models assumed to date to underlie these methods do not enable a clear-cut formulation of the competitive null hypothesis....... This is a major handicap to the interpretation of results obtained from a gene set analysis. RESULTS: This work presents a hierarchical statistical model based on the notion of dependence measures, which overcomes this problem. The two levels of the model naturally reflect the modular structure of many gene set...... analysis methods. We apply the model to show that the popular GSEA method, which recently has been claimed to test the self-contained null hypothesis, actually tests the competitive null if the weight parameter is zero. However, for this result to hold strictly, the choice of the dependence measures...

  2. FunGeneNet: a web tool to estimate enrichment of functional interactions in experimental gene sets.

    Science.gov (United States)

    Tiys, Evgeny S; Ivanisenko, Timofey V; Demenkov, Pavel S; Ivanisenko, Vladimir A

    2018-02-09

    Estimation of functional connectivity in gene sets derived from genome-wide or other biological experiments is one of the essential tasks of bioinformatics. A promising approach for solving this problem is to compare gene networks built using experimental gene sets with random networks. One of the resources that make such an analysis possible is CrossTalkZ, which uses the FunCoup database. However, existing methods, including CrossTalkZ, do not take into account individual types of interactions, such as protein/protein interactions, expression regulation, transport regulation, catalytic reactions, etc., but rather work with generalized types characterizing the existence of any connection between network members. We developed the online tool FunGeneNet, which utilizes the ANDSystem and STRING to reconstruct gene networks using experimental gene sets and to estimate their difference from random networks. To compare the reconstructed networks with random ones, the node permutation algorithm implemented in CrossTalkZ was taken as a basis. To study the FunGeneNet applicability, the functional connectivity analysis of networks constructed for gene sets involved in the Gene Ontology biological processes was conducted. We showed that the method sensitivity exceeds 0.8 at a specificity of 0.95. We found that the significance level of the difference between gene networks of biological processes and random networks is determined by the type of connections considered between objects. At the same time, the highest reliability is achieved for the generalized form of connections that takes into account all the individual types of connections. By taking examples of the thyroid cancer networks and the apoptosis network, it is demonstrated that key participants in these processes are involved in the interactions of those types by which these networks differ from random ones. FunGeneNet is a web tool aimed at proving the functionality of networks in a wide range of sizes of

  3. Application of biclustering of gene expression data and gene set enrichment analysis methods to identify potentially disease causing nanomaterials

    Directory of Open Access Journals (Sweden)

    Andrew Williams

    2015-12-01

    Background: The presence of diverse types of nanomaterials (NMs) in commerce is growing at an exponential pace. As a result, human exposure to these materials in the environment is inevitable, necessitating the need for rapid and reliable toxicity testing methods to accurately assess the potential hazards associated with NMs. In this study, we applied biclustering and gene set enrichment analysis methods to derive essential features of altered lung transcriptome following exposure to NMs that are associated with lung-specific diseases. Several datasets from public microarray repositories describing pulmonary diseases in mouse models following exposure to a variety of substances were examined and functionally related biclusters of genes showing similar expression profiles were identified. The identified biclusters were then used to conduct a gene set enrichment analysis on pulmonary gene expression profiles derived from mice exposed to nano-titanium dioxide (nano-TiO2), carbon black (CB) or carbon nanotubes (CNTs) to determine the disease significance of these data-driven gene sets. Results: Biclusters representing inflammation (chemokine activity), DNA binding, cell cycle, apoptosis, reactive oxygen species (ROS) and fibrosis processes were identified. All of the NM studies were significant with respect to the bicluster related to chemokine activity (DAVID; FDR p-value = 0.032). The bicluster related to pulmonary fibrosis was enriched in studies where toxicity induced by CNT and CB studies was investigated, suggesting the potential for these materials to induce lung fibrosis. The pro-fibrogenic potential of CNTs is well established. Although CB has not been shown to induce fibrosis, it induces stronger inflammatory, oxidative stress and DNA damage responses than nano-TiO2 particles. Conclusion: The results of the analysis correctly identified all NMs to be inflammogenic and only CB and CNTs as potentially fibrogenic. In addition to identifying several

  4. Genome-wide survey and developmental expression mapping of zebrafish SET domain-containing genes.

    Directory of Open Access Journals (Sweden)

    Xiao-Jian Sun

    SET domain-containing proteins represent an evolutionarily conserved family of epigenetic regulators, which are responsible for most histone lysine methylation. Since some of these genes have been revealed to be essential for embryonic development, we propose that the zebrafish, a vertebrate model organism possessing many advantages for developmental studies, can be utilized to study the biological functions of these genes and the related epigenetic mechanisms during early development. To this end, we have performed a genome-wide survey of zebrafish SET domain genes. 58 genes total have been identified. Although gene duplication events give rise to several lineage-specific paralogs, clear reciprocal orthologous relationship reveals high conservation between zebrafish and human SET domain genes. These data were further subject to an evolutionary analysis ranging from yeast to human, leading to the identification of putative clusters of orthologous groups (COGs) of this gene family. By means of whole-mount mRNA in situ hybridization strategy, we have also carried out a developmental expression mapping of these genes. A group of maternal SET domain genes, which are implicated in the programming of histone modification states in early development, have been identified and predicted to be responsible for all known sites of SET domain-mediated histone methylation. Furthermore, some genes show specific expression patterns in certain tissues at certain stages, suggesting the involvement of epigenetic mechanisms in the development of these systems. These results provide a global view of zebrafish SET domain histone methyltransferases in evolutionary and developmental dimensions and pave the way for using zebrafish to systematically study the roles of these genes during development.

  5. Gene set analysis: limitations in popular existing methods and proposed improvements.

    Science.gov (United States)

    Mishra, Pashupati; Törönen, Petri; Leino, Yrjö; Holm, Liisa

    2014-10-01

    Gene set analysis is the analysis of a set of genes that collectively contribute to a biological process. Most popular gene set analysis methods are based on empirical P-values that require a large number of permutations. Despite numerous gene set analysis methods developed in the past decade, the most popular methods still suffer from serious limitations. We present a gene set analysis method (mGSZ) based on Gene Set Z-scoring function (GSZ) and asymptotic P-values. Asymptotic P-value calculation requires fewer permutations, and thus speeds up the gene set analysis process. We compare the GSZ-scoring function with seven popular gene set scoring functions and show that GSZ stands out as the best scoring function. In addition, we show improved performance of the GSA method when the max-mean statistic is replaced by the GSZ scoring function. We demonstrate the importance of both gene and sample permutations by showing the consequences in the absence of one or the other. A comparison of asymptotic and empirical methods of P-value estimation demonstrates a clear advantage of asymptotic P-value over empirical P-value. We show that mGSZ outperforms the state-of-the-art methods based on two different evaluations. We compared mGSZ results with permutation and rotation tests and show that rotation does not improve our asymptotic P-values. We also propose well-known asymptotic distribution models for three of the compared methods. mGSZ is available as an R package from cran.r-project.org. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
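
    The contrast between empirical and asymptotic P-values described in this abstract can be illustrated with a minimal sketch. It is not the mGSZ implementation: the gene set score below is a plain mean of per-gene group differences (not the GSZ score), and the normal approximation fitted to the permutation scores is an assumption chosen for brevity. The function names and the toy data layout are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def gene_stat(values, labels):
    """Difference in mean expression between group 1 and group 0 for one gene."""
    group1 = [v for v, lab in zip(values, labels) if lab == 1]
    group0 = [v for v, lab in zip(values, labels) if lab == 0]
    return mean(group1) - mean(group0)


def set_score(expr, labels, gene_set):
    """Toy gene set score: mean of the gene-level statistics (NOT the GSZ score)."""
    return mean(gene_stat(expr[g], labels) for g in gene_set)


def permutation_pvalues(expr, labels, gene_set, n_perm, seed=0):
    """Return (observed score, empirical P-value, normal-approximation P-value)."""
    rng = random.Random(seed)
    observed = set_score(expr, labels, gene_set)

    # Null distribution from sample-label permutations.
    null = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        null.append(set_score(expr, shuffled, gene_set))

    # Empirical P-value: its resolution is limited to 1 / (n_perm + 1).
    exceed = sum(1 for s in null if s >= observed)
    empirical = (exceed + 1) / (n_perm + 1)

    # Asymptotic P-value: fit a normal distribution to the null scores
    # (an assumed model) and read off the upper-tail probability.
    z = (observed - mean(null)) / stdev(null)
    asymptotic = 0.5 * math.erfc(z / math.sqrt(2))
    return observed, empirical, asymptotic
```

    The empirical estimate can never fall below 1 / (n_perm + 1), which is why it needs many permutations to resolve small P-values, whereas a fitted distribution can extrapolate into the tail from far fewer permutation scores.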

  6. CAsubtype: An R Package to Identify Gene Sets Predictive of Cancer Subtypes and Clinical Outcomes.

    Science.gov (United States)

    Kong, Hualei; Tong, Pan; Zhao, Xiaodong; Sun, Jielin; Li, Hua

    2018-03-01

    In the past decade, molecular classification of cancer has gained high popularity owing to its high predictive power on clinical outcomes as compared with traditional methods commonly used in clinical practice. In particular, using gene expression profiles, recent studies have successfully identified a number of gene sets for the delineation of cancer subtypes that are associated with distinct prognosis. However, identification of such gene sets remains a laborious task due to the lack of tools with flexibility, integration and ease of use. To reduce the burden, we have developed an R package, CAsubtype, to efficiently identify gene sets predictive of cancer subtypes and clinical outcomes. By integrating more than 13,000 annotated gene sets, CAsubtype provides a comprehensive repertoire of candidates for new cancer subtype identification. For easy data access, CAsubtype further includes the gene expression and clinical data of more than 2000 cancer patients from TCGA. CAsubtype first employs principal component analysis to identify gene sets (from user-provided or package-integrated ones) with robust principal components representing significantly large variation between cancer samples. Based on these principal components, CAsubtype visualizes the sample distribution in low-dimensional space for better understanding of the distinction between samples and classifies samples into subgroups with prevalent clustering algorithms. Finally, CAsubtype performs survival analysis to compare the clinical outcomes between the identified subgroups, assessing their clinical value as potentially novel cancer subtypes. In conclusion, CAsubtype is a flexible and well-integrated tool in the R environment to identify gene sets for cancer subtype identification and clinical outcome prediction. Its simple R commands and comprehensive data sets enable efficient examination of the clinical value of any given gene set, thus facilitating hypothesis generation and testing in biological and

  7. Identification of a conserved set of upregulated genes in mouse skeletal muscle hypertrophy and regrowth.

    Science.gov (United States)

    Chaillou, Thomas; Jackson, Janna R; England, Jonathan H; Kirby, Tyler J; Richards-White, Jena; Esser, Karyn A; Dupont-Versteegden, Esther E; McCarthy, John J

    2015-01-01

    The purpose of this study was to compare the gene expression profile of mouse skeletal muscle undergoing two forms of growth (hypertrophy and regrowth) with the goal of identifying a conserved set of differentially expressed genes. Expression profiling by microarray was performed on the plantaris muscle subjected to 1, 3, 5, 7, 10, and 14 days of hypertrophy or regrowth following 2 wk of hind-limb suspension. We identified 97 differentially expressed genes (≥2-fold increase or ≥50% decrease compared with control muscle) that were conserved during the two forms of muscle growth. The vast majority (∼90%) of the differentially expressed genes was upregulated and occurred at a single time point (64 out of 86 genes), which most often was on the first day of the time course. Microarray analysis from the conserved upregulated genes showed a set of genes related to contractile apparatus and stress response at day 1, including three genes involved in mechanotransduction and four genes encoding heat shock proteins. Our analysis further identified three cell cycle-related genes at day and several genes associated with extracellular matrix (ECM) at both days 3 and 10. In conclusion, we have identified a core set of genes commonly upregulated in two forms of muscle growth that could play a role in the maintenance of sarcomere stability, ECM remodeling, cell proliferation, fast-to-slow fiber type transition, and the regulation of skeletal muscle growth. These findings suggest conserved regulatory mechanisms involved in the adaptation of skeletal muscle to increased mechanical loading. Copyright © 2015 the American Physiological Society.

  8. Mechanism-based biomarker gene sets for glutathione depletion-related hepatotoxicity in rats

    International Nuclear Information System (INIS)

    Gao Weihua; Mizukawa, Yumiko; Nakatsu, Noriyuki; Minowa, Yosuke; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro

    2010-01-01

    Chemical-induced glutathione depletion is thought to be caused by two types of toxicological mechanisms: PHO-type glutathione depletion [glutathione conjugated with chemicals such as phorone (PHO) or diethyl maleate (DEM)], and BSO-type glutathione depletion [i.e., glutathione synthesis inhibited by chemicals such as L-buthionine-sulfoximine (BSO)]. In order to identify mechanism-based biomarker gene sets for glutathione depletion in rat liver, male SD rats were treated with various chemicals including PHO (40, 120 and 400 mg/kg), DEM (80, 240 and 800 mg/kg), BSO (150, 450 and 1500 mg/kg), and bromobenzene (BBZ, 10, 100 and 300 mg/kg). Liver samples were taken 3, 6, 9 and 24 h after administration and examined for hepatic glutathione content, physiological and pathological changes, and gene expression changes using Affymetrix GeneChip Arrays. To identify differentially expressed probe sets in response to glutathione depletion, we focused on the following two courses of events for the two types of mechanisms of glutathione depletion: a) gene expression changes occurring simultaneously in response to glutathione depletion, and b) gene expression changes after glutathione was depleted. The gene expression profiles of the identified probe sets for the two types of glutathione depletion differed markedly at times during and after glutathione depletion, whereas Srxn1 was markedly increased for both types as glutathione was depleted, suggesting that Srxn1 is a key molecule in oxidative stress related to glutathione. The extracted probe sets were refined and verified using various compounds including 13 additional positive or negative compounds, and they established two useful marker sets. One contained three probe sets (Akr7a3, Trib3 and Gstp1) that could detect conjugation-type glutathione depletors any time within 24 h after dosing, and the other contained 14 probe sets that could detect glutathione depletors by any mechanism. These two sets, with appropriate scoring

  9. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    Science.gov (United States)

    2013-01-01

    Background Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. Methods We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate -corrected p-values sets and the CTD-derived gene sets in gene expression (GE) data sets of five chemicals (from experimental models). We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared to results from the Connectivity Map. We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals. Results Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 21 of these toxicants had a similar toxicity pattern as the

  10. Gene-Based Analysis of Regionally Enriched Cortical Genes in GWAS Data Sets of Cognitive Traits and Psychiatric Disorders

    DEFF Research Database (Denmark)

    Ersland, Kari M; Christoforou, Andrea; Stansberg, Christine

    2012-01-01

    the regionally enriched cortical genes to mine a genome-wide association study (GWAS) of the Norwegian Cognitive NeuroGenetics (NCNG) sample of healthy adults for association to nine psychometric tests measures. In addition, we explored GWAS data sets for the serious psychiatric disorders schizophrenia (SCZ) (n...

  11. Accurate Gene Expression-Based Biodosimetry Using a Minimal Set of Human Gene Transcripts

    Energy Technology Data Exchange (ETDEWEB)

    Tucker, James D., E-mail: jtucker@biology.biosci.wayne.edu [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Joiner, Michael C. [Department of Radiation Oncology, Wayne State University, Detroit, Michigan (United States); Thomas, Robert A.; Grever, William E.; Bakhmutsky, Marina V. [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Chinkhota, Chantelle N.; Smolinski, Joseph M. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States); Divine, George W. [Department of Public Health Sciences, Henry Ford Hospital, Detroit, Michigan (United States); Auner, Gregory W. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States)

    2014-03-15

    Purpose: Rapid and reliable methods for conducting biological dosimetry are a necessity in the event of a large-scale nuclear event. Conventional biodosimetry methods lack the speed, portability, ease of use, and low cost required for triaging numerous victims. Here we address this need by showing that polymerase chain reaction (PCR) on a small number of gene transcripts can provide accurate and rapid dosimetry. The low cost and relative ease of PCR compared with existing dosimetry methods suggest that this approach may be useful in mass-casualty triage situations. Methods and Materials: Human peripheral blood from 60 adult donors was acutely exposed to cobalt-60 gamma rays at doses of 0 (control) to 10 Gy. mRNA expression levels of 121 selected genes were obtained 0.5, 1, and 2 days after exposure by reverse-transcriptase real-time PCR. Optimal dosimetry at each time point was obtained by stepwise regression of dose received against individual gene transcript expression levels. Results: Only 3 to 4 different gene transcripts, ASTN2, CDKN1A, GDF15, and ATM, are needed to explain ≥0.87 of the variance (R²). Receiver-operator characteristics, a measure of sensitivity and specificity, of 0.98 for these statistical models were achieved at each time point. Conclusions: The actual and predicted radiation doses agree very closely up to 6 Gy. Dosimetry at 8 and 10 Gy shows some effect of saturation, thereby slightly diminishing the ability to quantify higher exposures. Analyses of these gene transcripts may be advantageous for use in a field-portable device designed to assess exposures in mass casualty situations or in clinical radiation emergencies.

  12. Investigating the effect of paralogs on microarray gene-set analysis

    LENUS (Irish Health Repository)

    Faure, Andre J

    2011-01-24

    Background: In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group genes into sets and then aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. We suspect that the presence of paralogs affects the ability of GSA methods to accurately identify the most important sets of genes for subsequent research. Results: We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. We investigate this correlation as a potential confounding factor common to current GSA methods using Indygene (http://www.cbio.uct.ac.za/indygene), a web tool that reduces a supplied list of genes so that it includes no pairwise paralogy relationships above a specified sequence similarity threshold. We use the tool to reanalyse previously published microarray datasets and determine the potential utility of accounting for the presence of paralogs. Conclusions: The Indygene tool efficiently removes paralogy relationships from a given dataset and we found that such a reduction, performed prior to GSA, has the ability to generate significantly different results that often represent novel and plausible biological hypotheses. This was demonstrated for three different GSA approaches when applied to the reanalysis of previously published microarray datasets and suggests that the redundancy and non-independence of paralogs is an important consideration when dealing with GSA methodologies.
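
    The list reduction described in this abstract (no pairwise paralogy relationship above a similarity threshold) can be sketched as a simple greedy filter. This is only an illustration of the idea, not Indygene's actual algorithm: the function name, the similarity lookup keyed by gene pairs, and the rule of keeping the first gene encountered are all assumptions.

```python
def remove_paralogs(genes, similarity, threshold):
    """Greedily keep genes so that no kept pair has similarity above `threshold`.

    `similarity` maps frozenset({gene_a, gene_b}) to a sequence similarity in
    [0, 1]; pairs that are absent are treated as non-paralogous (similarity 0).
    Genes earlier in the list take priority over later ones (an assumed rule).
    """
    kept = []
    for gene in genes:
        conflicts = any(
            similarity.get(frozenset((gene, other)), 0.0) > threshold
            for other in kept
        )
        if not conflicts:
            kept.append(gene)
    return kept


# Hypothetical gene names and similarity values, for illustration only.
example_similarity = {
    frozenset(("geneA", "geneB")): 0.95,
    frozenset(("geneA", "geneC")): 0.40,
    frozenset(("geneB", "geneC")): 0.42,
}
reduced = remove_paralogs(
    ["geneA", "geneB", "geneC", "geneD"], example_similarity, threshold=0.8
)
```

    Because the filter is order dependent, ranking the input list (for example by expression evidence) before filtering would decide which member of each paralog group survives.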

  13. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    Directory of Open Access Journals (Sweden)

    Hettne Kristina M

    2013-01-01

    Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. Methods: We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate -corrected p-values Results: Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 21 of these toxicants had a similar toxicity pattern as the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals. Conclusions: Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect.

  14. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets.

    Science.gov (United States)

    Khan, Aziz; Mathelier, Anthony

    2017-05-31

    A common task for scientists relies on comparing lists of genes or genomic regions derived from high-throughput sequencing experiments. While several tools exist to intersect and visualize sets of genes, similar tools dedicated to the visualization of genomic region sets are currently limited. To address this gap, we have developed the Intervene tool, which provides an easy and automated interface for the effective intersection and visualization of genomic region or list sets, thus facilitating their analysis and interpretation. Intervene contains three modules: venn to generate Venn diagrams of up to six sets, upset to generate UpSet plots of multiple sets, and pairwise to compute and visualize intersections of multiple sets as clustered heat maps. Intervene, and its interactive web ShinyApp companion, generate publication-quality figures for the interpretation of genomic region and list sets. Intervene and its web application companion provide an easy command line and an interactive web interface to compute intersections of multiple genomic and list sets. They have the capacity to plot intersections using easy-to-interpret visual approaches. Intervene is developed and designed to meet the needs of both computer scientists and biologists. The source code is freely available at https://bitbucket.org/CBGR/intervene , with the web application available at https://asntech.shinyapps.io/intervene .
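
    The two kinds of summary mentioned in this abstract, pairwise intersections and UpSet-style intersections, can be illustrated with plain Python sets. This sketch does not use or reproduce Intervene's command-line interface or internals; the function names and the example lists are hypothetical.

```python
def pairwise_intersections(sets):
    """Size of the intersection for every ordered pair of named sets."""
    return {
        (a, b): len(sets[a] & sets[b])
        for a in sets
        for b in sets
    }


def exclusive_intersections(sets):
    """Group items by the exact combination of sets that contain them.

    This is the quantity an UpSet plot displays: each item is counted once,
    under the full list of sets it belongs to.
    """
    membership = {}
    for name, items in sets.items():
        for item in items:
            membership.setdefault(item, set()).add(name)

    groups = {}
    for item, names in membership.items():
        groups.setdefault(frozenset(names), set()).add(item)
    return groups


# Hypothetical gene lists, for illustration only.
gene_lists = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g3", "g5"},
}
pairs = pairwise_intersections(gene_lists)
groups = exclusive_intersections(gene_lists)
```

    The pairwise table is what a clustered heat map would be drawn from, while the exclusive groups avoid double counting items shared by three or more lists.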

  15. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    NARCIS (Netherlands)

    K.M. Hettne (Kristina); J. Boorsma (Jeffrey); D.A.M. van Dartel (Dorien A M); J.J. Goeman (Jelle); E.C. de Jong (Esther); A.H. Piersma (Aldert); R.H. Stierum (Rob); J. Kleinjans (Jos); J.A. Kors (Jan)

    2013-01-01

    Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with

  16. Optimal structural inference of signaling pathways from unordered and overlapping gene sets.

    Science.gov (United States)

    Acharya, Lipi R; Judeh, Thair; Wang, Guangdi; Zhu, Dongxiao

    2012-02-15

    A plethora of bioinformatics analysis has led to the discovery of numerous gene sets, which can be interpreted as discrete measurements emitted from latent signaling pathways. Their potential to infer signaling pathway structures, however, has not been sufficiently exploited. Existing methods accommodating discrete data do not explicitly consider signal cascading mechanisms that characterize a signaling pathway. Novel computational methods are thus needed to fully utilize gene sets and broaden the scope from focusing only on pairwise interactions to the more general cascading events in the inference of signaling pathway structures. We propose a gene set based simulated annealing (SA) algorithm for the reconstruction of signaling pathway structures. A signaling pathway structure is a directed graph containing up to a few hundred nodes and many overlapping signal cascades, where each cascade represents a chain of molecular interactions from the cell surface to the nucleus. Gene sets in our context refer to discrete sets of genes participating in signal cascades, the basic building blocks of a signaling pathway, with no prior information about gene orderings in the cascades. From a compendium of gene sets related to a pathway, SA aims to search for signal cascades that characterize the optimal signaling pathway structure. In the search process, the extent of overlap among signal cascades is used to measure the optimality of a structure. Throughout, we treat gene sets as random samples from a first-order Markov chain model. We evaluated the performance of SA in three case studies. In the first study conducted on 83 KEGG pathways, SA demonstrated a significantly better performance than Bayesian network methods. Since both SA and Bayesian network methods accommodate discrete data, use a 'search and score' network learning strategy and output a directed network, they can be compared in terms of performance and computational time. In the second study, we compared SA and

  17. Meta-analysis of differentiating mouse embryonic stem cell gene expression kinetics reveals early change of a small gene set.

    Directory of Open Access Journals (Sweden)

    Clive H Glover

    2006-11-01

    Stem cell differentiation involves critical changes in gene expression. Identification of these should provide endpoints useful for optimizing stem cell propagation as well as potential clues about mechanisms governing stem cell maintenance. Here we describe the results of a new meta-analysis methodology applied to multiple gene expression datasets from three mouse embryonic stem cell (ESC) lines obtained at specific time points during the course of their differentiation into various lineages. We developed methods to identify genes with expression changes that correlated with the altered frequency of functionally defined, undifferentiated ESC in culture. In each dataset, we computed a novel statistical confidence measure for every gene which captured the certainty that a particular gene exhibited an expression pattern of interest within that dataset. This permitted a joint analysis of the datasets, despite the different experimental designs. Using a ranking scheme that favored genes exhibiting patterns of interest, we focused on the top 88 genes whose expression was consistently changed when ESC were induced to differentiate. Seven of these (103728_at, 8430410A17Rik, Klf2, Nr0b1, Sox2, Tcl1, and Zfp42) showed a rapid decrease in expression concurrent with a decrease in frequency of undifferentiated cells and remained predictive when evaluated in additional maintenance and differentiating protocols. Through a novel meta-analysis, this study identifies a small set of genes whose expression is useful for identifying changes in stem cell frequencies in cultures of mouse ESC. The methods and findings have broader applicability to understanding the regulation of self-renewal of other stem cell types.

  18. Identification of a set of genes showing regionally enriched expression in the mouse brain

    Directory of Open Access Journals (Sweden)

    Marra Marco A

    2008-07-01

    Background: The Pleiades Promoter Project aims to improve gene therapy by designing human mini-promoters ( Results: We have utilized LongSAGE to identify regionally enriched transcripts in the adult mouse brain. As supplemental strategies, we also performed a meta-analysis of published literature and inspected the Allen Brain Atlas in situ hybridization data. From a set of approximately 30,000 mouse genes, 237 were identified as showing specific or enriched expression in 30 target regions of the mouse brain. GO term over-representation among these genes revealed co-involvement in various aspects of central nervous system development and physiology. Conclusion: Using a multi-faceted expression validation approach, we have identified mouse genes whose human orthologs are good candidates for design of mini-promoters. These mouse genes represent molecular markers in several discrete brain regions/cell-types, which could potentially provide a mechanistic explanation of unique functions performed by each region. This set of markers may also serve as a resource for further studies of gene regulatory elements influencing brain expression.

  19. Glutamatergic and GABAergic gene sets in attention-deficit/hyperactivity disorder

    DEFF Research Database (Denmark)

    Naaijen, Jill; Bralten, Janita; Poelmans, Geert

    2017-01-01

    Attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorders (ASD) often co-occur. Both are highly heritable; however, it has been difficult to discover genetic risk variants. Glutamate and GABA are main excitatory and inhibitory neurotransmitters in the brain; their balance...... within glutamatergic and GABAergic genes were investigated using the MAGMA software in an ADHD case-only sample (n=931), in which we assessed ASD symptoms and response inhibition on a Stop task. Gene set analysis for ADHD symptom severity, divided into inattention and hyperactivity/impulsivity symptoms...... is essential for proper brain development and functioning. In this study we investigated the role of glutamate and GABA genetics in ADHD severity, autism symptom severity and inhibitory performance, based on gene set analysis, an approach to investigate multiple genetic variants simultaneously. Common variants...

  20. Selection and validation of a set of reliable reference genes for quantitative sod gene expression analysis in C. elegans

    Directory of Open Access Journals (Sweden)

    Vandesompele Jo

    2008-01-01

    Background: In the nematode Caenorhabditis elegans the conserved Ins/IGF-1 signaling pathway regulates many biological processes including life span, stress response, dauer diapause and metabolism. Detection of differentially expressed genes may contribute to a better understanding of the mechanism by which the Ins/IGF-1 signaling pathway regulates these processes. Appropriate normalization is an essential prerequisite for obtaining accurate and reproducible quantification of gene expression levels. The aim of this study was to establish a reliable set of reference genes for gene expression analysis in C. elegans. Results: Real-time quantitative PCR was used to evaluate the expression stability of 12 candidate reference genes (act-1, ama-1, cdc-42, csq-1, eif-3.C, mdh-1, gpd-2, pmp-3, tba-1, Y45F10D.4, rgs-6 and unc-16) in wild-type, three Ins/IGF-1 pathway mutants, dauers and L3 stage larvae. After geNorm analysis, cdc-42, pmp-3 and Y45F10D.4 showed the most stable expression pattern and were used to normalize 5 sod expression levels. Significant differences in mRNA levels were observed for sod-1 and sod-3 in daf-2 relative to wild-type animals, whereas in dauers sod-1, sod-3, sod-4 and sod-5 are differentially expressed relative to third stage larvae. Conclusion: Our findings emphasize the importance of accurate normalization using stably expressed reference genes. The methodology used in this study is generally applicable to reliably quantify gene expression levels in the nematode C. elegans using quantitative PCR.
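
    Normalizing a target gene against several reference genes, as done here with cdc-42, pmp-3 and Y45F10D.4, is commonly carried out by dividing the target's relative quantity by the geometric mean of the reference genes' relative quantities. The sketch below illustrates that calculation under stated assumptions: an amplification efficiency of 2 (perfect doubling per cycle), hypothetical function names, and made-up Cq values. It does not reproduce the geNorm stability ranking itself.

```python
from statistics import geometric_mean


def relative_quantity(cq, cq_calibrator, efficiency=2.0):
    """Convert a quantification cycle (Cq) into a quantity relative to a calibrator.

    Assumes a constant amplification efficiency (2.0 means perfect doubling).
    """
    return efficiency ** (cq_calibrator - cq)


def normalize(target_rq, reference_rqs):
    """Divide the target quantity by the geometric mean of the reference quantities."""
    return target_rq / geometric_mean(reference_rqs)


# Made-up Cq values for one sample and one calibrator sample, illustration only.
target = relative_quantity(cq=21.0, cq_calibrator=24.0)
references = [
    relative_quantity(cq=20.0, cq_calibrator=20.0),
    relative_quantity(cq=19.0, cq_calibrator=20.0),
    relative_quantity(cq=18.0, cq_calibrator=20.0),
]
normalized_expression = normalize(target, references)
```

    The geometric mean is preferred over the arithmetic mean here because relative quantities are ratios, so a single highly abundant reference gene does not dominate the normalization factor.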

  1. Evidence for intron length conservation in a set of mammalian genes associated with embryonic development

    LENUS (Irish Health Repository)

    2011-10-05

    Background: We carried out an analysis of intron length conservation across a diverse group of nineteen mammalian species. Motivated by recent research suggesting a role for time delays associated with intron transcription in gene expression oscillations required for early embryonic patterning, we searched for examples of genes that showed the most extreme conservation of total intron content in mammals. Results: Gene sets annotated as being involved in pattern specification in the early embryo or containing the homeobox DNA-binding domain, were significantly enriched among genes with highly conserved intron content. We used ancestral sequences reconstructed with probabilistic models that account for insertion and deletion mutations to distinguish insertion and deletion events on lineages leading to human and mouse from their last common ancestor. Using a randomization procedure, we show that genes containing the homeobox domain show less change in intron content than expected, given the number of insertion and deletion events within their introns. Conclusions: Our results suggest selection for gene expression precision or the existence of additional development-associated genes for which transcriptional delay is functionally significant.

  2. Mining tissue specificity, gene connectivity and disease association to reveal a set of genes that modify the action of disease causing genes

    Directory of Open Access Journals (Sweden)

    Reverter Antonio

    2008-09-01

    Background: The tissue specificity of gene expression has been linked to a number of significant outcomes including level of expression, and differential rates of polymorphism, evolution and disease association. Recent studies have also shown the importance of exploring differential gene connectivity and sequence conservation in the identification of disease-associated genes. However, no study relates gene interactions with tissue specificity and disease association. Methods: We adopted an a priori approach making as few assumptions as possible to analyse the interplay among gene-gene interactions with tissue specificity and its subsequent likelihood of association with disease. We mined three large datasets comprising expression data drawn from massively parallel signature sequencing across 32 tissues, describing a set of 55,606 true positive interactions for 7,197 genes, and microarray expression results generated during the profiling of systemic inflammation, from which 126,543 interactions among 7,090 genes were reported. Results: Amongst the myriad of complex relationships identified between expression, disease, connectivity and tissue specificity, some interesting patterns emerged. These include elevated rates of expression and network connectivity in housekeeping and disease-associated tissue-specific genes. We found that disease-associated genes are more likely to show tissue specific expression and most frequently interact with other disease genes. Using the thresholds defined in these observations, we develop a guilt-by-association algorithm and discover a group of 112 non-disease annotated genes that predominantly interact with disease-associated genes, impacting on disease outcomes. Conclusion: We conclude that parameters such as tissue specificity and network connectivity can be used in combination to identify a group of genes, not previously confirmed as disease causing, that are involved in interactions with disease causing
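
    The guilt-by-association step in this abstract (finding non-disease genes that predominantly interact with disease-associated genes) can be sketched as a neighbour-fraction filter over an interaction list. The thresholds, the function name, and the toy network below are hypothetical; the abstract does not state the thresholds the authors derived, so this is only one plausible reading of the idea.

```python
from collections import defaultdict


def guilt_by_association(interactions, disease_genes, min_fraction=0.5, min_partners=2):
    """Return non-disease genes whose partners are mostly disease genes.

    `interactions` is an iterable of (gene_a, gene_b) pairs. A gene qualifies
    when it has at least `min_partners` partners and more than `min_fraction`
    of them are in `disease_genes` (both cut-offs are assumed values).
    """
    partners = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-interactions
            partners[a].add(b)
            partners[b].add(a)

    candidates = {}
    for gene, neighbours in partners.items():
        if gene in disease_genes or len(neighbours) < min_partners:
            continue
        fraction = len(neighbours & disease_genes) / len(neighbours)
        if fraction > min_fraction:
            candidates[gene] = fraction
    return candidates


# Toy network with placeholder gene names, for illustration only.
toy_interactions = [
    ("x", "d1"), ("x", "d2"), ("x", "n1"),
    ("y", "d1"), ("y", "n1"),
    ("n1", "n2"),
]
toy_candidates = guilt_by_association(toy_interactions, disease_genes={"d1", "d2"})
```

    The minimum-partner cut-off guards against genes with a single recorded interaction, whose disease-partner fraction would otherwise be trivially 0 or 1.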

  3. Can survival prediction be improved by merging gene expression data sets?

    Directory of Open Access Journals (Sweden)

    Haleh Yasrebi

    Full Text Available BACKGROUND: High-throughput gene expression profiling technologies, which generate a wealth of data, are increasingly used to characterize tumor biopsies in clinical trials. By applying machine learning algorithms to such clinically documented data sets, one hopes to improve tumor diagnosis, prognosis, as well as prediction of treatment response. However, the limited number of patients enrolled in a single trial study limits the power of machine learning approaches due to over-fitting. One could partially overcome this limitation by merging data from different studies. Nevertheless, such data sets differ from each other with regard to technical biases, patient selection criteria and follow-up treatment. It is therefore not clear at all whether the advantage of increased sample size outweighs the disadvantage of higher heterogeneity of merged data sets. Here, we present a systematic study to answer this question specifically for breast cancer data sets. We use survival prediction based on Cox regression as an assay to measure the added value of merged data sets. RESULTS: Using time-dependent Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) and hazard ratio as performance measures, we see overall no significant improvement or deterioration of survival prediction with merged data sets as compared to individual data sets. This apparently was due to the fact that a few genes with strong prognostic power were not available on all microarray platforms and thus were not retained in the merged data sets. Surprisingly, we found that the overall best performance was achieved with a single-gene predictor consisting of CYB5D1. CONCLUSIONS: Merging did not deteriorate performance on average despite (a) the diversity of microarray platforms used, (b) the heterogeneity of patient cohorts, (c) the heterogeneity of breast cancer disease, (d) substantial variation of time to death or relapse, (e) the reduced number of genes in the merged data

  4. Identification of self-consistent modulons from bacterial microarray expression data with the help of structured regulon gene sets

    KAUST Repository

    Permina, Elizaveta A.; Medvedeva, Yulia; Baeck, Pia M.; Hegde, Shubhada R.; Mande, Shekhar C.; Makeev, Vsevolod J.

    2013-01-01

    interactions helps to evaluate parameters for regulatory subnetwork inference. We suggest a procedure for modulon construction where a seed regulon is iteratively updated with genes having expression patterns similar to those for regulon member genes. A set

  5. Using OWL reasoning to support the generation of novel gene sets for enrichment analysis.

    Science.gov (United States)

    Osumi-Sutherland, David J; Ponta, Enrico; Courtot, Melanie; Parkinson, Helen; Badi, Laura

    2018-02-14

    The Gene Ontology (GO) consists of over 40,000 terms for biological processes, cell components and gene product activities linked into a graph structure by over 90,000 relationships. It has been used to annotate the functions and cellular locations of several million gene products. The graph structure is used by a variety of tools to group annotated genes into sets whose products share function or location. These gene sets are widely used to interpret the results of genomics experiments by assessing which sets are significantly over- or under-represented in results lists. F Hoffmann-La Roche Ltd. has developed a bespoke, manually maintained controlled vocabulary (RCV) for use in over-representation analysis. Many terms in this vocabulary group GO terms in novel ways that cannot easily be derived using the graph structure of the GO. For example, some RCV terms group GO terms by the cell, chemical or tissue type they refer to. Recent improvements in the content and formal structure of the GO make it possible to use logical queries in Web Ontology Language (OWL) to automatically map these cross-cutting classifications to sets of GO terms. We used this approach to automate mapping between RCV and GO, largely replacing the increasingly unsustainable manual mapping process. We then tested the utility of the resulting groupings for over-representation analysis. We successfully mapped 85% of RCV terms to logical OWL definitions and showed that these could be used to recapitulate and extend manual mappings between RCV terms and the sets of GO terms subsumed by them. We also show that gene sets derived from the resulting GO term sets can be used to detect the signatures of cell and tissue types in whole genome expression data. The rich formal structure of the GO makes it possible to use reasoning to dynamically generate novel, biologically relevant groupings of GO terms. GO term groupings generated with this approach can be used in over-representation analysis to detect
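
Over-representation analysis, as mentioned in the record above, is commonly carried out with a one-sided hypergeometric test. The sketch below shows that standard test only; it is not taken from the paper, and the function and parameter names are illustrative assumptions.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, hits_size, overlap):
    """Probability of seeing at least `overlap` gene-set members in a results list.

    universe_size: number of genes that could have appeared in the results
    set_size:      number of those genes that belong to the gene set
    hits_size:     length of the results list
    overlap:       gene-set members actually found in the results list
    """
    upper = min(set_size, hits_size)
    total = comb(universe_size, hits_size)
    # Sum the upper tail of the hypergeometric distribution.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: all 3 members of a 3-gene set appear in a 3-gene results list
# drawn from a 10-gene universe.
p_toy = hypergeom_enrichment_p(universe_size=10, set_size=3, hits_size=3, overlap=3)
```

Testing many gene sets at once would normally call for a multiple-testing correction, which this sketch leaves out.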

  6. Shrinkage covariance matrix approach based on robust trimmed mean in gene sets detection

    Science.gov (United States)

    Karjanto, Suryaefiza; Ramli, Norazan Mohamed; Ghani, Nor Azura Md; Aripin, Rasimah; Yusop, Noorezatty Mohd

    2015-02-01

    Microarray technology involves placing an orderly arrangement of thousands of gene sequences in a grid on a suitable surface. The technology has led to novel discoveries since its development and has attracted increasing attention among researchers. The widespread use of microarray technology is largely due to its ability to analyse thousands of genes simultaneously, in a massively parallel manner, in one experiment. Hence, it provides valuable knowledge on gene interaction and function. A microarray data set typically consists of tens of thousands of genes (variables) from just dozens of samples due to various constraints. Therefore, the sample covariance matrix in Hotelling's T² statistic is not positive definite; it becomes singular and thus cannot be inverted. In this research, Hotelling's T² statistic is combined with a shrinkage approach as an alternative estimator of the covariance matrix to detect significant gene sets. The use of a shrinkage covariance matrix overcomes the singularity problem by converting an unbiased estimator into an improved biased estimator of the covariance matrix. A robust trimmed mean is integrated into the shrinkage matrix to reduce the influence of outliers and consequently increase its efficiency. The performance of the proposed method is measured using several simulation designs. The results are expected to outperform existing techniques in many tested conditions.
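
The singularity problem described above, and the way shrinkage removes it, can be seen in a two-gene toy case. This is a sketch under stated assumptions: the shrinkage target (the diagonal of the sample covariance) and the fixed intensity `lam=0.2` are chosen for illustration, and the robust trimmed mean used by the authors is not reproduced here.

```python
from statistics import mean


def sample_cov(rows):
    """Unbiased sample covariance matrix; rows are samples, columns are genes."""
    n, p = len(rows), len(rows[0])
    mu = [mean(r[j] for r in rows) for j in range(p)]
    return [
        [sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def shrink_to_diagonal(s, lam):
    """Shrinkage estimate lam * diag(S) + (1 - lam) * S (assumed target)."""
    p = len(s)
    return [
        [s[i][j] if i == j else (1.0 - lam) * s[i][j] for j in range(p)]
        for i in range(p)
    ]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Two samples, two genes: too few samples for a full-rank covariance matrix.
samples = [[1.0, 2.0], [3.0, 6.0]]
s = sample_cov(samples)                    # [[2, 4], [4, 8]], determinant 0
s_shrunk = shrink_to_diagonal(s, lam=0.2)  # off-diagonals scaled by 0.8
```

The unshrunk matrix has determinant zero and cannot be inverted, while the shrunk one has a positive determinant, so a T² statistic built on it is computable. In practice the intensity would be estimated from the data rather than fixed by hand.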

  7. Niche specificity of ammonia-oxidizing archaeal and bacterial communities in a freshwater wetland receiving municipal wastewater in Daqing, Northeast China.

    Science.gov (United States)

    Lee, Kwok-Ho; Wang, Yong-Feng; Li, Hui; Gu, Ji-Dong

    2014-12-01

    Ecophysiological differences between ammonia-oxidizing bacteria (AOB) and ammonia-oxidizing archaea (AOA) enable them to adapt to different niches in complex freshwater wetland ecosystems. The community characters of AOA and AOB in the different niches in a freshwater wetland receiving municipal wastewater, as well as the physicochemical parameters of sediment/soil samples, were investigated in this study. AOA community structures varied and separated from each other among four different niches. Wetland vegetation including aquatic macrophytes and terrestrial plants affected the AOA community composition but less for AOB, whereas sediment depths might contribute to the AOB community shift. The diversity of AOA communities was higher than that of AOB across all four niches. Archaeal and bacterial amoA genes (encoding for the alpha-subunit of ammonia monooxygenases) were most diverse in the dry-land niche, indicating O2 availability might favor ammonia oxidation. The majority of AOA amoA sequences belonged to the Soil/sediment Cluster B in the freshwater wetland ecosystems, while the dominant AOB amoA sequences were affiliated with Nitrosospira-like cluster. In the Nitrosospira-like cluster, AOB amoA gene sequences affiliated with the uncultured ammonia-oxidizing beta-proteobacteria constituted the largest portion (99%). Moreover, independent methods for phylogenetic tree analysis supported high parsimony bootstrap values. As a consequence, it is proposed that Nitrosospira-like amoA gene sequences recovered in this study represent a potentially novel cluster, grouping with the sequences from Gulf of Mexico deposited in the public databases.

  8. A reference gene set for sex pheromone biosynthesis and degradation genes from the diamondback moth, Plutella xylostella, based on genome and transcriptome digital gene expression analyses.

    Science.gov (United States)

    He, Peng; Zhang, Yun-Fei; Hong, Duan-Yang; Wang, Jun; Wang, Xing-Liang; Zuo, Ling-Hua; Tang, Xian-Fu; Xu, Wei-Ming; He, Ming

    2017-03-01

    comprehensive gene data set of sex pheromone biosynthesis and degradation enzyme related genes in DBM created by genome- and transcriptome-wide identification, characterization and expression profiling. Our findings provide a basis to better understand the function of genes with tissue enriched expression. The results also provide information on the genes involved in sex pheromone biosynthesis and degradation, and may be useful to identify potential gene targets for pest control strategies by disrupting the insect-insect communication using pheromone-based behavioral antagonists.

  9. Coverage and characteristics of the Affymetrix GeneChip Human Mapping 100K SNP set.

    Directory of Open Access Journals (Sweden)

    2006-05-01

    Full Text Available Improvements in technology have made it possible to conduct genome-wide association mapping at costs within reach of academic investigators, and experiments are currently being conducted with a variety of high-throughput platforms. To provide an appropriate context for interpreting results of such studies, we summarize here results of an investigation of one of the first of these technologies to be publicly available, the Affymetrix GeneChip Human Mapping 100K set of single nucleotide polymorphisms (SNPs). In a systematic analysis of the pattern and distribution of SNPs in the Mapping 100K set, we find that SNPs in this set are undersampled from coding regions (both nonsynonymous and synonymous) and oversampled from regions outside genes, relative to SNPs in the overall HapMap database. In addition, we utilize a novel multilocus linkage disequilibrium (LD) coefficient based on information content (analogous to the information content scores commonly used for linkage mapping) that is equivalent to the familiar measure r² in the special case of two loci. Using this approach, we are able to summarize for any subset of markers, such as the Affymetrix Mapping 100K set, the information available for association mapping in that subset, relative to the information available in the full set of markers included in the HapMap, and highlight circumstances in which this multilocus measure of LD provides substantial additional insight about the haplotype structure in a region over pairwise measures of LD.
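
For reference, the familiar two-locus measure r² that the record above mentions can be computed from haplotype and allele frequencies. The sketch shows only this standard pairwise case; the paper's multilocus coefficient is not reproduced, and the function and argument names are illustrative assumptions.

```python
def pairwise_r2(p_ab, p_a, p_b):
    """Pairwise LD r² from the AB haplotype frequency and the two allele frequencies.

    D = p_ab - p_a * p_b, and r² = D² / (p_a (1 - p_a) p_b (1 - p_b)).
    Both allele frequencies must lie strictly between 0 and 1.
    """
    d = p_ab - p_a * p_b
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Allele A always travels with allele B: complete LD.
r2_complete = pairwise_r2(p_ab=0.3, p_a=0.3, p_b=0.3)
# Haplotype frequency equals the product of allele frequencies: no LD.
r2_none = pairwise_r2(p_ab=0.15, p_a=0.3, p_b=0.5)
```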

  10. ADAGE signature analysis: differential expression analysis with data-defined gene sets.

    Science.gov (United States)

    Tan, Jie; Huyck, Matthew; Hu, Dongbo; Zelaya, René A; Hogan, Deborah A; Greene, Casey S

    2017-11-22

    Gene set enrichment analysis and overrepresentation analyses are commonly used methods to determine the biological processes affected by a differential expression experiment. This approach requires biologically relevant gene sets, which are currently curated manually, limiting their availability and accuracy in many organisms without extensively curated resources. New feature learning approaches can now be paired with existing data collections to directly extract functional gene sets from big data. Here we introduce a method to identify perturbed processes. In contrast with methods that use curated gene sets, this approach uses signatures extracted from public expression data. We first extract expression signatures from public data using ADAGE, a neural network-based feature extraction approach. We next identify signatures that are differentially active under a given treatment. Our results demonstrate that these signatures represent biological processes that are perturbed by the experiment. Because these signatures are directly learned from data without supervision, they can identify uncurated or novel biological processes. We implemented ADAGE signature analysis for the bacterial pathogen Pseudomonas aeruginosa. For the convenience of different user groups, we implemented both an R package (ADAGEpath) and a web server ( http://adage.greenelab.com ) to run these analyses. Both are open-source to allow easy expansion to other organisms or signature generation methods. We applied ADAGE signature analysis to an example dataset in which wild-type and ∆anr mutant cells were grown as biofilms on the Cystic Fibrosis genotype bronchial epithelial cells. We mapped active signatures in the dataset to KEGG pathways and compared with pathways identified using GSEA. The two approaches generally return consistent results; however, ADAGE signature analysis also identified a signature that revealed the molecularly supported link between the MexT regulon and Anr. We designed

  11. A cross-study gene set enrichment analysis identifies critical pathways in endometriosis

    Directory of Open Access Journals (Sweden)

    Bai Chunyan

    2009-09-01

    Full Text Available Abstract Background Endometriosis is an enigmatic disease. Gene expression profiling of endometriosis has been used in several studies, but few studies went further to classify subtypes of endometriosis based on expression patterns and to identify possible pathways involved in endometriosis. Some of the observed pathways are more inconsistent between the studies, and these candidate pathways presumably only represent a fraction of the pathways involved in endometriosis. Methods We applied a standardised microarray preprocessing and gene set enrichment analysis to six independent studies, and demonstrated increased concordance between these gene datasets. Results We find 16 up-regulated and 19 down-regulated pathways common in ovarian endometriosis data sets, 22 up-regulated and one down-regulated pathway common in peritoneal endometriosis data sets. Among them, 12 up-regulated and 1 down-regulated were found consistent between ovarian and peritoneal endometriosis. The main canonical pathways identified are related to immunological and inflammatory disease. Early secretory phase has the most over-represented pathways in the three uterine cycle phases. There are no overlapping significant pathways between the dataset from human endometrial endothelial cells and the datasets from ovarian endometriosis which used whole tissues. Conclusion The study of complex diseases through pathway analysis is able to highlight genes weakly connected to the phenotype which may be difficult to detect by using classical univariate statistics. By standardised microarray preprocessing and GSEA, we have increased the concordance in identifying many biological mechanisms involved in endometriosis. The identified gene pathways will shed light on the understanding of endometriosis and promote the development of novel therapies.

  12. Gene set-based module discovery in the breast cancer transcriptome

    Directory of Open Access Journals (Sweden)

    Zhang Michael Q

    2009-02-01

    Full Text Available Abstract Background Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated to the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression modules in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.

  13. Reduced Set of Virulence Genes Allows High Accuracy Prediction of Bacterial Pathogenicity in Humans

    Science.gov (United States)

    Iraola, Gregorio; Vazquez, Gustavo; Spangenberg, Lucía; Naya, Hugo

    2012-01-01

    Although there have been great advances in understanding bacterial pathogenesis, there is still a lack of integrative information about what makes a bacterium a human pathogen. The advent of high-throughput sequencing technologies has dramatically increased the amount of completed bacterial genomes, for both known human pathogenic and non-pathogenic strains; this information is now available to investigate genetic features that determine pathogenic phenotypes in bacteria. In this work we determined presence/absence patterns of different virulence-related genes among more than finished bacterial genomes from both human pathogenic and non-pathogenic strains, belonging to different taxonomic groups (i.e: Actinobacteria, Gammaproteobacteria, Firmicutes, etc.). An accuracy of 95% using a cross-fold validation scheme with in-fold feature selection is obtained when classifying human pathogens and non-pathogens. A reduced subset of highly informative genes () is presented and applied to an external validation set. The statistical model was implemented in the BacFier v1.0 software (freely available at ), that displays not only the prediction (pathogen/non-pathogen) and an associated probability for pathogenicity, but also the presence/absence vector for the analyzed genes, so it is possible to decipher the subset of virulence genes responsible for the classification on the analyzed genome. Furthermore, we discuss the biological relevance for bacterial pathogenesis of the core set of genes, corresponding to eight functional categories, all with evident and documented association with the phenotypes of interest. Also, we analyze which functional categories of virulence genes were more distinctive for pathogenicity in each taxonomic group, which seems to be a completely new kind of information and could lead to important evolutionary conclusions. PMID:22916122

  14. Identification of a core set of rhizobial infection genes using data from single cell-types

    Directory of Open Access Journals (Sweden)

    Da-Song eChen

    2015-07-01

    Full Text Available Genome-wide expression studies on nodulation have varied in their scale from entire root systems to dissected nodules or root sections containing nodule primordia. More recently efforts have focused on developing methods for isolation of root hairs from infected plants and the application of laser-capture microdissection technology to nodules. Here we analyze two published data sets to identify a core set of infection genes that are expressed in the nodule and in root hairs during infection. Among the genes identified were those encoding phenylpropanoid biosynthesis enzymes including Chalcone-O-Methyltransferase which is required for the production of the potent Nod gene inducer 4’,4-dihydroxy-2-methoxychalcone. A promoter-GUS analysis in transgenic hairy roots for two genes encoding Chalcone-O-Methyltransferase isoforms revealed their expression in rhizobially infected root hairs and the nodule infection zone but not in the nitrogen fixation zone. We also describe a group of Rhizobially Induced Peroxidases whose expression overlaps with the production of superoxide in rhizobially infected root hairs and in nodules and roots. Finally, we identify a cohort of co-regulated transcription factors as candidate regulators of these processes.

  15. Expression map of a complete set of gustatory receptor genes in chemosensory organs of Bombyx mori.

    Science.gov (United States)

    Guo, Huizhen; Cheng, Tingcai; Chen, Zhiwei; Jiang, Liang; Guo, Youbing; Liu, Jianqiu; Li, Shenglong; Taniai, Kiyoko; Asaoka, Kiyoshi; Kadono-Okuda, Keiko; Arunkumar, Kallare P; Wu, Jiaqi; Kishino, Hirohisa; Zhang, Huijie; Seth, Rakesh K; Gopinathan, Karumathil P; Montagné, Nicolas; Jacquin-Joly, Emmanuelle; Goldsmith, Marian R; Xia, Qingyou; Mita, Kazuei

    2017-03-01

    Most lepidopteran species are herbivores, and interaction with host plants affects their gene expression and behavior as well as their genome evolution. Gustatory receptors (Grs) are expected to mediate host plant selection, feeding, oviposition and courtship behavior. However, due to their high diversity, sequence divergence and extremely low level of expression it has been difficult to identify precisely a complete set of Grs in Lepidoptera. By manual annotation and BAC sequencing, we improved annotation of 43 gene sequences compared with previously reported Grs in the most studied lepidopteran model, the silkworm, Bombyx mori, and identified 7 new tandem copies of BmGr30 on chromosome 7, bringing the total number of BmGrs to 76. Among these, we mapped 68 genes to chromosomes in a newly constructed chromosome distribution map and 8 genes to scaffolds; we also found new evidence for large clusters of BmGrs, especially from the bitter receptor family. RNA-seq analysis of diverse BmGr expression patterns in chemosensory organs of larvae and adults enabled us to draw a precise organ specific map of BmGr expression. Interestingly, most of the clustered genes were expressed in the same tissues and more than half of the genes were expressed in larval maxillae, larval thoracic legs and adult legs. For example, BmGr63 showed high expression levels in all organs in both larval and adult stages. By contrast, some genes showed expression limited to specific developmental stages or organs and tissues. BmGr19 was highly expressed in larval chemosensory organs (especially antennae and thoracic legs), the single exon genes BmGr53 and BmGr67 were expressed exclusively in larval tissues, the BmGr27-BmGr31 gene cluster on chr7 displayed a high expression level limited to adult legs and the candidate CO2 receptor BmGr2 was highly expressed in adult antennae, where few other Grs were expressed. Transcriptional analysis of the Grs in B. mori provides a valuable new reference for

  16. Identification of the Core Set of Carbon-Associated Genes in a Bioenergy Grassland Soil.

    Directory of Open Access Journals (Sweden)

    Adina Howe

    Full Text Available Despite the central role of soil microbial communities in global carbon (C) cycling, little is known about soil microbial community structure and even less about their metabolic pathways. Efforts to characterize soil communities often focus on identifying differences in gene content across environmental gradients, but an alternative question is what genes are similar in soils. These genes may indicate critical species or potential functions that are required in all soils. Here we identified the "core" set of C cycling sequences widely present in multiple soil metagenomes from a fertilized prairie (FP). Of 226,887 sequences associated with known enzymes involved in the synthesis, metabolism, and transport of carbohydrates, 843 were identified to be consistently prevalent across four replicate soil metagenomes. This core metagenome was functionally and taxonomically diverse, representing five enzyme classes and 99 enzyme families within the CAZy database. Though it only comprised 0.4% of all CAZy-associated genes identified in FP metagenomes, the core was found to be comprised of functions similar to those within cumulative soils. The FP CAZy-associated core sequences were present in multiple publicly available soil metagenomes and most similar to soils sharing geographic proximity. In soil ecosystems, where high diversity remains a key challenge for metagenomic investigations, these core genes represent a subset of critical functions necessary for carbohydrate metabolism, which can be targeted to evaluate important C fluxes in these and other similar soils.
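
In its simplest form, a "core" set like the one described above is the intersection of the sequences detected in every replicate. The sketch below assumes plain presence/absence; the study's actual prevalence criterion is not given in the abstract, and the sequence identifiers are made up.

```python
from functools import reduce


def core_set(replicates):
    """Return the items present in every replicate (needs at least one replicate)."""
    return reduce(set.intersection, (set(r) for r in replicates))


# Four toy replicates with hypothetical sequence identifiers.
toy_replicates = [
    ["seq1", "seq2", "seq3"],
    ["seq2", "seq3", "seq4"],
    ["seq3", "seq2"],
    ["seq2", "seq3", "seq9"],
]
toy_core = core_set(toy_replicates)
```

The same intersection logic applies to any "common to all samples" selection, such as genes differentially expressed in every clone of an experiment.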

  17. Genome-Wide Temporal Expression Profiling in Caenorhabditis elegans Identifies a Core Gene Set Related to Long-Term Memory.

    Science.gov (United States)

    Freytag, Virginie; Probst, Sabine; Hadziselimovic, Nils; Boglari, Csaba; Hauser, Yannick; Peter, Fabian; Gabor Fenyves, Bank; Milnik, Annette; Demougin, Philippe; Vukojevic, Vanja; de Quervain, Dominique J-F; Papassotiropoulos, Andreas; Stetak, Attila

    2017-07-12

    The identification of genes related to encoding, storage, and retrieval of memories is a major interest in neuroscience. In the current study, we analyzed the temporal gene expression changes in a neuronal mRNA pool during an olfactory long-term associative memory (LTAM) in Caenorhabditis elegans hermaphrodites. Here, we identified a core set of 712 (538 upregulated and 174 downregulated) genes that follows three distinct temporal peaks demonstrating multiple gene regulation waves in LTAM. Compared with the previously published positive LTAM gene set (Lakhina et al., 2015), 50% of the identified upregulated genes here overlap with the previous dataset, possibly representing stimulus-independent memory-related genes. On the other hand, the remaining genes were not previously identified in positive associative memory and may specifically regulate aversive LTAM. Our results suggest a multistep gene activation process during the formation and retrieval of long-term memory and define general memory-implicated genes as well as conditioning-type-dependent gene sets. SIGNIFICANCE STATEMENT The identification of genes regulating different steps of memory is of major interest in neuroscience. Identification of common memory genes across different learning paradigms and the temporal activation of the genes are poorly studied. Here, we investigated the temporal aspects of Caenorhabditis elegans gene expression changes using aversive olfactory associative long-term memory (LTAM) and identified three major gene activation waves. Like in previous studies, aversive LTAM is also CREB dependent, and CREB activity is necessary immediately after training. Finally, we define a list of memory paradigm-independent core gene sets as well as conditioning-dependent genes. Copyright © 2017 the authors 0270-6474/17/376661-12$15.00/0.

  18. Microarray analysis identifies a common set of cellular genes modulated by different HCV replicon clones

    Directory of Open Access Journals (Sweden)

    Gerosolimo Germano

    2008-06-01

    Full Text Available Abstract Background Hepatitis C virus (HCV) RNA synthesis and protein expression affect cell homeostasis by modulation of gene expression. The impact of HCV replication on global cell transcription has not been fully evaluated. Thus, we analysed the expression profiles of different clones of human hepatoma-derived Huh-7 cells carrying a self-replicating HCV RNA which express all viral proteins (HCV replicon system). Results First, we compared the expression profile of HCV replicon clone 21-5 with both the Huh-7 parental cells and the 21-5 cured (21-5c) cells. In the latter, the HCV RNA has been eliminated by IFN-α treatment. To confirm data, we also analyzed microarray results from both the 21-5 and two other HCV replicon clones, 22-6 and 21-7, compared to the Huh-7 cells. The study was carried out by using the Applied Biosystems (AB) Human Genome Survey Microarray v1.0 which provides 31,700 probes that correspond to 27,868 human genes. Microarray analysis revealed a specific transcriptional program induced by HCV in replicon cells with respect to both IFN-α-cured and Huh-7 cells. From the original datasets of differentially expressed genes, we selected by Venn diagrams a final list of 38 genes modulated by HCV in all clones. Most of the 38 genes have never been described before and showed high fold-change associated with significant p-value, strongly supporting data reliability. Classification of the 38 genes by Panther System identified functional categories that were significantly enriched in this gene set, such as histones and ribosomal proteins as well as extracellular matrix and intracellular protein traffic. The dataset also included new genes involved in lipid metabolism, extracellular matrix and cytoskeletal network, which may be critical for HCV replication and pathogenesis. Conclusion Our data provide a comprehensive analysis of alterations in gene expression induced by HCV replication and reveal modulation of new genes potentially useful

  19. Systems-based biological concordance and predictive reproducibility of gene set discovery methods in cardiovascular disease.

    Science.gov (United States)

    Azuaje, Francisco; Zheng, Huiru; Camargo, Anyela; Wang, Haiying

    2011-08-01

    The discovery of novel disease biomarkers is a crucial challenge for translational bioinformatics. Demonstration of both their classification power and reproducibility across independent datasets are essential requirements to assess their potential clinical relevance. Small datasets and multiplicity of putative biomarker sets may explain lack of predictive reproducibility. Studies based on pathway-driven discovery approaches have suggested that, despite such discrepancies, the resulting putative biomarkers tend to be implicated in common biological processes. Investigations of this problem have been mainly focused on datasets derived from cancer research. We investigated the predictive and functional concordance of five methods for discovering putative biomarkers in four independently-generated datasets from the cardiovascular disease domain. A diversity of biosignatures was identified by the different methods. However, we found strong biological process concordance between them, especially in the case of methods based on gene set analysis. With a few exceptions, we observed lack of classification reproducibility using independent datasets. Partial overlaps between our putative sets of biomarkers and the primary studies exist. Despite the observed limitations, pathway-driven or gene set analysis can predict potentially novel biomarkers and can jointly point to biomedically-relevant underlying molecular mechanisms. Copyright © 2011 Elsevier Inc. All rights reserved.

  20. Integrating genome-wide association study and expression quantitative trait loci data identifies multiple genes and gene set associated with neuroticism.

    Science.gov (United States)

    Fan, Qianrui; Wang, Wenyu; Hao, Jingcan; He, Awen; Wen, Yan; Guo, Xiong; Wu, Cuiyan; Ning, Yujie; Wang, Xi; Wang, Sen; Zhang, Feng

    2017-08-01

    Neuroticism is a fundamental personality trait with a significant genetic determinant. To identify novel susceptibility genes for neuroticism, we conducted an integrative analysis of genomic and transcriptomic data from a genome-wide association study (GWAS) and an expression quantitative trait locus (eQTL) study. GWAS summary data were derived from published studies of neuroticism, involving a total of 170,906 subjects. An eQTL dataset containing 927,753 eQTLs was obtained from an eQTL meta-analysis of 5311 samples. Integrative analysis of GWAS and eQTL data was conducted by summary data-based Mendelian randomization (SMR) analysis software. To identify neuroticism-associated gene sets, the SMR analysis results were further subjected to gene set enrichment analysis (GSEA). The gene set annotation dataset (containing 13,311 annotated gene sets) of the GSEA Molecular Signatures Database was used. SMR single gene analysis identified 6 significant genes for neuroticism, including MSRA (p value=2.27×10⁻¹⁰), MGC57346 (p value=6.92×10⁻⁷), BLK (p value=1.01×10⁻⁶), XKR6 (p value=1.11×10⁻⁶), C17ORF69 (p value=1.12×10⁻⁶) and KIAA1267 (p value=4.00×10⁻⁶). Gene set enrichment analysis observed significant association for the Chr8p23 gene set (false discovery rate=0.033). Our results provide novel clues for the genetic mechanism studies of neuroticism. Copyright © 2017. Published by Elsevier Inc.

  1. Genome-Wide Gene Set Analysis for Identification of Pathways Associated with Alcohol Dependence

    Science.gov (United States)

    Biernacka, Joanna M.; Geske, Jennifer; Jenkins, Gregory D.; Colby, Colin; Rider, David N.; Karpyak, Victor M.; Choi, Doo-Sup; Fridley, Brooke L.

    2013-01-01

    It is believed that multiple genetic variants with small individual effects contribute to the risk of alcohol dependence. Such polygenic effects are difficult to detect in genome-wide association studies that test for association of the phenotype with each single nucleotide polymorphism (SNP) individually. To overcome this challenge, gene set analysis (GSA) methods that jointly test for the effects of pre-defined groups of genes have been proposed. Rather than testing for association between the phenotype and individual SNPs, these analyses evaluate the global evidence of association with a set of related genes enabling the identification of cellular or molecular pathways or biological processes that play a role in development of the disease. It is hoped that by aggregating the evidence of association for all available SNPs in a group of related genes, these approaches will have enhanced power to detect genetic associations with complex traits. We performed GSA using data from a genome-wide study of 1165 alcohol dependent cases and 1379 controls from the Study of Addiction: Genetics and Environment (SAGE), for all 200 pathways listed in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Results demonstrated a potential role of the “Synthesis and Degradation of Ketone Bodies” pathway. Our results also support the potential involvement of the “Neuroactive Ligand Receptor Interaction” pathway, which has previously been implicated in addictive disorders. These findings demonstrate the utility of GSA in the study of complex disease, and suggest specific directions for further research into the genetic architecture of alcohol dependence. PMID:22717047

  2. In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer

    International Nuclear Information System (INIS)

    Pandi, Narayanan Sathiya; Suganya, Sivagurunathan; Rajendran, Suriliyandi

    2013-01-01

    Highlights: •Identified stomach lineage specific gene set (SLSGS) was found to be under expressed in gastric tumors. •Elevated expression of SLSGS in gastric tumor is a molecular predictor of metabolic type gastric cancer. •In silico pathway scanning identified estrogen-α signaling is a putative regulator of SLSGS in gastric cancer. •Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. -- Abstract: Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with other organ tissue. The obtained SLSGS was found to be under expressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the under expression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and prognostic factor in GC

  3. In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer

    Energy Technology Data Exchange (ETDEWEB)

    Pandi, Narayanan Sathiya; Suganya, Sivagurunathan; Rajendran, Suriliyandi

    2013-10-04

    Highlights: •Identified stomach lineage specific gene set (SLSGS) was found to be under expressed in gastric tumors. •Elevated expression of SLSGS in gastric tumor is a molecular predictor of metabolic type gastric cancer. •In silico pathway scanning identified estrogen-α signaling is a putative regulator of SLSGS in gastric cancer. •Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. -- Abstract: Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with other organ tissue. The obtained SLSGS was found to be under expressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the under expression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and prognostic factor in GC.

  4. Prediction potential of candidate biomarker sets identified and validated on gene expression data from multiple datasets

    Directory of Open Access Journals (Sweden)

    Karacali Bilge

    2007-10-01

    Background: Independently derived expression profiles of the same biological condition often have few genes in common. In this study, we created populations of expression profiles from publicly available microarray datasets of cancer (breast, lymphoma and renal) samples linked to clinical information with an iterative machine learning algorithm. ROC curves were used to assess the prediction error of each profile for classification. We compared the prediction error of profiles correlated with molecular phenotype against profiles correlated with relapse-free status. Prediction error of profiles identified with supervised univariate feature selection algorithms was compared to that of profiles selected randomly from (a) all genes on the microarray platform and (b) a list of known disease-related genes (a priori selection). We also determined the relevance of expression profiles on test arrays from independent datasets, measured on either the same or different microarray platforms. Results: Highly discriminative expression profiles were produced on both simulated gene expression data and expression data from breast cancer and lymphoma datasets on the basis of ER and BCL-6 expression, respectively. Use of relapse-free status to identify profiles for prognosis prediction resulted in poorly discriminative decision rules. Supervised feature selection resulted in more accurate classifications than random or a priori selection; however, the difference in prediction error decreased as the number of features increased. These results held when decision rules were applied across datasets to samples profiled on the same microarray platform. Conclusion: Our results show that many gene sets predict molecular phenotypes accurately. Given this, expression profiles identified using different training datasets should be expected to show little agreement. In addition, we demonstrate the difficulty in predicting relapse directly from microarray data using supervised machine

  5. MADS goes genomic in conifers: towards determining the ancestral set of MADS-box genes in seed plants.

    Science.gov (United States)

    Gramzow, Lydia; Weilandt, Lisa; Theißen, Günter

    2014-11-01

    MADS-box genes comprise a gene family coding for transcription factors. This gene family expanded greatly during land plant evolution such that the number of MADS-box genes ranges from one or two in green algae to around 100 in angiosperms. Given the crucial functions of MADS-box genes for nearly all aspects of plant development, the expansion of this gene family probably contributed to the increasing complexity of plants. However, the expansion of MADS-box genes during one important step of land plant evolution, namely the origin of seed plants, remains poorly understood due to the previous lack of whole-genome data for gymnosperms. The newly available genome sequences of Picea abies, Picea glauca and Pinus taeda were used to identify the complete set of MADS-box genes in these conifers. In addition, MADS-box genes were identified in the growing number of transcriptomes available for gymnosperms. With these datasets, phylogenies were constructed to determine the ancestral set of MADS-box genes of seed plants and to infer the ancestral functions of these genes. Type I MADS-box genes are under-represented in gymnosperms and only a minimum of two Type I MADS-box genes have been present in the most recent common ancestor (MRCA) of seed plants. In contrast, a large number of Type II MADS-box genes were found in gymnosperms. The MRCA of extant seed plants probably possessed at least 11-14 Type II MADS-box genes. In gymnosperms two duplications of Type II MADS-box genes were found, such that the MRCA of extant gymnosperms had at least 14-16 Type II MADS-box genes. The implied ancestral set of MADS-box genes for seed plants shows simplicity for Type I MADS-box genes and remarkable complexity for Type II MADS-box genes in terms of phylogeny and putative functions. The analysis of transcriptome data reveals that gymnosperm MADS-box genes are expressed in a great variety of tissues, indicating diverse roles of MADS-box genes for the development of gymnosperms. 
This study is

  6. Evaluation of endogenous control genes for gene expression studies across multiple tissues and in the specific sets of fat- and muscle-type samples of the pig.

    Science.gov (United States)

    Gu, Y R; Li, M Z; Zhang, K; Chen, L; Jiang, A A; Wang, J Y; Li, X W

    2011-08-01

    To normalize a set of quantitative real-time PCR (q-PCR) data, it is essential to determine an optimal number/set of housekeeping genes, as the abundance of housekeeping genes can vary across tissues or cells during different developmental stages, or even under certain environmental conditions. In this study, of the 20 commonly used endogenous control genes, 13, 18 and 17 genes exhibited credible stability in 56 different tissues, 10 types of adipose tissue and five types of muscle tissue, respectively. Our analysis clearly showed that three optimal housekeeping genes are adequate for an accurate normalization, which correlated well with the theoretical optimal number (r ≥ 0.94). In terms of economical and experimental feasibility, we recommend the use of the three most stable housekeeping genes for calculating the normalization factor. Based on our results, the three most stable housekeeping genes in all analysed samples (TOP2B, HSPCB and YWHAZ) are recommended for accurate normalization of q-PCR data. We also suggest that two different sets of housekeeping genes are appropriate for 10 types of adipose tissue (the HSPCB, ALDOA and GAPDH genes) and five types of muscle tissue (the TOP2B, HSPCB and YWHAZ genes), respectively. Our report will serve as a valuable reference for other studies aimed at measuring tissue-specific mRNA abundance in porcine samples. © 2011 Blackwell Verlag GmbH.
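    The normalization factor mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common geNorm-style convention in which the factor is the geometric mean of the relative quantities of the reference genes; the abstract does not state the exact formula, and the sample values below are hypothetical.

```python
from math import prod


def normalization_factor(reference_quantities):
    """Geometric mean of the reference genes' relative quantities.

    Assumption: geNorm-style factor; the source abstract does not spell
    out how its normalization factor is computed.
    """
    values = list(reference_quantities)
    if not values or any(v <= 0 for v in values):
        raise ValueError("relative quantities must be non-empty and positive")
    return prod(values) ** (1.0 / len(values))


def normalize(target_quantity, reference_quantities):
    """Divide a target gene's relative quantity by the normalization factor."""
    return target_quantity / normalization_factor(reference_quantities)


# Hypothetical relative quantities for the three recommended genes in one sample.
sample_refs = {"TOP2B": 0.8, "HSPCB": 1.25, "YWHAZ": 1.0}
nf = normalization_factor(sample_refs.values())
normalized_target = normalize(2.0, sample_refs.values())
```

    A geometric mean is used in this sketch because it is less sensitive than an arithmetic mean to a single reference gene with an outlying abundance.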

  7. In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer.

    Science.gov (United States)

    Pandi, Narayanan Sathiya; Suganya, Sivagurunathan; Rajendran, Suriliyandi

    2013-10-04

    Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with other organ tissue. The obtained SLSGS was found to be under expressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the under expression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and prognostic factor in GC. Copyright © 2013 Elsevier Inc. All rights reserved.

  8. Glutamatergic and GABAergic gene sets in attention-deficit/hyperactivity disorder: association to overlapping traits in ADHD and autism.

    Science.gov (United States)

    Naaijen, J; Bralten, J; Poelmans, G; Glennon, J C; Franke, B; Buitelaar, J K

    2017-01-10

    Attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorders (ASD) often co-occur. Both are highly heritable; however, it has been difficult to discover genetic risk variants. Glutamate and GABA are main excitatory and inhibitory neurotransmitters in the brain; their balance is essential for proper brain development and functioning. In this study we investigated the role of glutamate and GABA genetics in ADHD severity, autism symptom severity and inhibitory performance, based on gene set analysis, an approach to investigate multiple genetic variants simultaneously. Common variants within glutamatergic and GABAergic genes were investigated using the MAGMA software in an ADHD case-only sample (n=931), in which we assessed ASD symptoms and response inhibition on a Stop task. Gene set analysis for ADHD symptom severity, divided into inattention and hyperactivity/impulsivity symptoms, autism symptom severity and inhibition were performed using principal component regression analyses. Subsequently, gene-wide association analyses were performed. The glutamate gene set showed an association with severity of hyperactivity/impulsivity (P=0.009), which was robust to correcting for genome-wide association levels. The GABA gene set showed nominally significant association with inhibition (P=0.04), but this did not survive correction for multiple comparisons. None of single gene or single variant associations was significant on their own. By analyzing multiple genetic variants within candidate gene sets together, we were able to find genetic associations supporting the involvement of excitatory and inhibitory neurotransmitter systems in ADHD and ASD symptom severity in ADHD.

  9. Meta-analysis of Drosophila circadian microarray studies identifies a novel set of rhythmically expressed genes.

    Directory of Open Access Journals (Sweden)

    Kevin P Keegan

    2007-11-01

    Five independent groups have reported microarray studies that identify dozens of rhythmically expressed genes in the fruit fly Drosophila melanogaster. Limited overlap among the lists of discovered genes makes it difficult to determine which, if any, exhibit truly rhythmic patterns of expression. We reanalyzed data from all five reports and found two sources for the observed discrepancies: the use of different expression pattern detection algorithms and underlying variation among the datasets. To improve upon the methods originally employed, we developed a new analysis that involves compilation of all existing data, application of identical transformation and standardization procedures followed by ANOVA-based statistical prescreening, and three separate classes of post hoc analysis: cross-correlation to various cycling waveforms, autocorrelation, and a previously described fast Fourier transform-based technique. Permutation-based statistical tests were used to derive significance measures for all post hoc tests. We find that application of our method, most notably the ANOVA prescreening procedure, significantly reduces the false discovery rate relative to that observed among the results of the original five reports while maintaining desirable statistical power. We identify a set of 81 cycling transcripts previously found in one or more of the original reports as well as a novel set of 133 transcripts not found in any of the original studies. We introduce a novel analysis method that compensates for variability observed among the original five Drosophila circadian array reports. Based on the statistical fidelity of our meta-analysis results, and the results of our initial validation experiments (quantitative RT-PCR), we predict many of our newly found genes to be bona fide cyclers, and suggest that they may lead to new insights into the pathways through which clock mechanisms regulate behavioral rhythms.
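    One of the post hoc analyses named above, cross-correlation to a cycling waveform with a permutation-based significance measure, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the waveform is a single 24-hour cosine, the test statistic is the best Pearson correlation over phase shifts, and all function names and the example series are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def max_cosine_correlation(values, times, period=24.0, n_phases=24):
    """Best correlation between the series and phase-shifted cosine waves."""
    best = -1.0
    for p in range(n_phases):
        phase = period * p / n_phases
        wave = [math.cos(2 * math.pi * (t - phase) / period) for t in times]
        best = max(best, pearson(values, wave))
    return best


def permutation_pvalue(values, times, n_perm=200, seed=0):
    """Share of time-shuffled series that correlate at least as well."""
    rng = random.Random(seed)
    observed = max_cosine_correlation(values, times)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_cosine_correlation(shuffled, times) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical transcript sampled every 4 h over two days, peaking at hour 6.
times = list(range(0, 48, 4))
values = [math.cos(2 * math.pi * (t - 6) / 24.0) for t in times]
observed, p_value = permutation_pvalue(values, times)
```

    Shuffling the time labels destroys any rhythm while keeping the distribution of expression values, which is why the shuffled series serve as the null reference here.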

  10. An ancient dental gene set governs development and continuous regeneration of teeth in sharks.

    Science.gov (United States)

    Rasch, Liam J; Martin, Kyle J; Cooper, Rory L; Metscher, Brian D; Underwood, Charlie J; Fraser, Gareth J

    2016-07-15

    The evolution of oral teeth is considered a major contributor to the overall success of jawed vertebrates. This is especially apparent in cartilaginous fishes including sharks and rays, which develop elaborate arrays of highly specialized teeth, organized in rows and retain the capacity for life-long regeneration. Perpetual regeneration of oral teeth has been either lost or highly reduced in many other lineages including important developmental model species, so cartilaginous fishes are uniquely suited for deep comparative analyses of tooth development and regeneration. Additionally, sharks and rays can offer crucial insights into the characters of the dentition in the ancestor of all jawed vertebrates. Despite this, tooth development and regeneration in chondrichthyans is poorly understood and remains virtually uncharacterized from a developmental genetic standpoint. Using the emerging chondrichthyan model, the catshark (Scyliorhinus spp.), we characterized the expression of genes homologous to those known to be expressed during stages of early dental competence, tooth initiation, morphogenesis, and regeneration in bony vertebrates. We have found that expression patterns of several genes from Hh, Wnt/β-catenin, Bmp and Fgf signalling pathways indicate deep conservation over ~450 million years of tooth development and regeneration. We describe how these genes participate in the initial emergence of the shark dentition and how they are redeployed during regeneration of successive tooth generations. We suggest that at the dawn of the vertebrate lineage, teeth (i) were most likely continuously regenerative structures, and (ii) utilised a core set of genes from members of key developmental signalling pathways that were instrumental in creating a dental legacy redeployed throughout vertebrate evolution. These data lay the foundation for further experimental investigations utilizing the unique regenerative capacity of chondrichthyan models to answer evolutionary

  11. Development of a set of SNP markers present in expressed genes of the apple.

    Science.gov (United States)

    Chagné, David; Gasic, Ksenija; Crowhurst, Ross N; Han, Yuepeng; Bassett, Heather C; Bowatte, Deepa R; Lawrence, Timothy J; Rikkerink, Erik H A; Gardiner, Susan E; Korban, Schuyler S

    2008-11-01

    Molecular markers associated with gene coding regions are useful tools for bridging functional and structural genomics. Due to their high abundance in plant genomes, single nucleotide polymorphisms (SNPs) are present within virtually all genomic regions, including most coding sequences. The objective of this study was to develop a set of SNPs for the apple by taking advantage of the wealth of genomics resources available for the apple, including a large collection of expressed sequence tags (ESTs). Using bioinformatics tools, a search for SNPs within an EST database of approximately 350,000 sequences developed from a variety of apple accessions was conducted. This resulted in the identification of a total of 71,482 putative SNPs. As the apple genome is reported to be an ancient polyploid, attempts were made to verify whether those SNPs detected in silico were attributable to allelic polymorphisms or to gene duplication, paralogous or homeologous sequence variations. To this end, a set of 464 PCR primer pairs was designed, PCR amplification was performed using two subsets of plants, and the PCR products were sequenced. The SNPs retrieved from these sequences were then mapped onto apple genetic maps, including a newly constructed map of a Royal Gala x A689-24 cross and a Malling 9 x Robusta 5 map, using a bin mapping strategy. The SNP genotyping was performed using the high-resolution melting (HRM) technique. A total of 93 new markers containing 210 coding SNPs were successfully mapped. This new set of SNP markers for the apple offers new opportunities for understanding the genetic control of important horticultural traits using quantitative trait loci (QTL) or linkage disequilibrium analysis. These also serve as useful markers for aligning physical and genetic maps, and as potential transferable markers across the Rosaceae family.

  12. DNMT1 is associated with cell cycle and DNA replication gene sets in diffuse large B-cell lymphoma.

    Science.gov (United States)

    Loo, Suet Kee; Ab Hamid, Suzina Sheikh; Musa, Mustaffa; Wong, Kah Keng

    2018-01-01

    Dysregulation of DNA (cytosine-5)-methyltransferase 1 (DNMT1) is associated with the pathogenesis of various types of cancer. It has been previously shown that DNMT1 is frequently expressed in diffuse large B-cell lymphoma (DLBCL), however its functions remain to be elucidated in the disease. In this study, we gene expression profiled (GEP) shRNA targeting DNMT1(shDNMT1)-treated germinal center B-cell-like DLBCL (GCB-DLBCL)-derived cell line (i.e. HT) compared with non-silencing shRNA (control shRNA)-treated HT cells. Independent gene set enrichment analysis (GSEA) performed using GEPs of shRNA-treated HT cells and primary GCB-DLBCL cases derived from two publicly-available datasets (i.e. GSE10846 and GSE31312) produced three separate lists of enriched gene sets for each gene sets collection from Molecular Signatures Database (MSigDB). Subsequent Venn analysis identified 268, 145 and six consensus gene sets from analyzing gene sets in C2 collection (curated gene sets), C5 sub-collection [gene sets from gene ontology (GO) biological process ontology] and Hallmark collection, respectively to be enriched in positive correlation with DNMT1 expression profiles in shRNA-treated HT cells, GSE10846 and GSE31312 datasets [false discovery rate (FDR) 0.8) with DNMT1 expression and significantly downregulated (log fold-change <-1.35; p<0.05) following DNMT1 silencing in HT cells. These results suggest the involvement of DNMT1 in the activation of cell cycle and DNA replication in DLBCL cells. Copyright © 2017 Elsevier GmbH. All rights reserved.

  13. Identification of sparsely distributed clusters of cis-regulatory elements in sets of co-expressed genes

    OpenAIRE

    Kreiman, Gabriel

    2004-01-01

    Sequence information and high‐throughput methods to measure gene expression levels open the door to explore transcriptional regulation using computational tools. Combinatorial regulation and sparseness of regulatory elements throughout the genome allow organisms to control the spatial and temporal patterns of gene expression. Here we study the organization of cis‐regulatory elements in sets of co‐regulated genes. We build an algorithm to search for combinations of transcription factor binding...

  14. A novel CpG island set identifies tissue-specific methylation at developmental gene loci.

    Directory of Open Access Journals (Sweden)

    Robert Illingworth

    2008-01-01

    CpG islands (CGIs) are dense clusters of CpG sequences that punctuate the CpG-deficient human genome and associate with many gene promoters. As CGIs also differ from bulk chromosomal DNA by their frequent lack of cytosine methylation, we devised a CGI enrichment method based on nonmethylated CpG affinity chromatography. The resulting library was sequenced to define a novel human blood CGI set that includes many that are not detected by current algorithms. Approximately half of CGIs were associated with annotated gene transcription start sites, the remainder being intra- or intergenic. Using an array representing over 17,000 CGIs, we established that 6%-8% of CGIs are methylated in genomic DNA of human blood, brain, muscle, and spleen. Inter- and intragenic CGIs are preferentially susceptible to methylation. CGIs showing tissue-specific methylation were overrepresented at numerous genetic loci that are essential for development, including HOX and PAX family members. The findings enable a comprehensive analysis of the roles played by CGI methylation in normal and diseased human tissues.

  15. Transcriptome-wide selection of a reliable set of reference genes for gene expression studies in potato cyst nematodes (Globodera spp.).

    Science.gov (United States)

    Sabeh, Michael; Duceppe, Marc-Olivier; St-Arnaud, Marc; Mimee, Benjamin

    2018-01-01

    Relative gene expression analyses by qRT-PCR (quantitative reverse transcription PCR) require an internal control to normalize the expression data of genes of interest and eliminate the unwanted variation introduced by sample preparation. A perfect reference gene should have a constant expression level under all the experimental conditions. However, the same few housekeeping genes selected from the literature or successfully used in previous unrelated experiments are often routinely used in new conditions without proper validation of their stability across treatments. The advent of RNA-Seq and the availability of public datasets for numerous organisms are opening the way to finding better reference genes for expression studies. Globodera rostochiensis is a plant-parasitic nematode that is particularly yield-limiting for potato. The aim of our study was to identify a reliable set of reference genes to study G. rostochiensis gene expression. Gene expression levels from an RNA-Seq database were used to identify putative reference genes and were validated with qRT-PCR analysis. Three genes, GR, PMP-3, and aaRS, were found to be very stable within the experimental conditions of this study and are proposed as reference genes for future work.

  16. A Meta-Analysis of Multiple Matched Copy Number and Transcriptomics Data Sets for Inferring Gene Regulatory Relationships

    Science.gov (United States)

    Newton, Richard; Wernisch, Lorenz

    2014-01-01

    Inferring gene regulatory relationships from observational data is challenging. Manipulation and intervention is often required to unravel causal relationships unambiguously. However, gene copy number changes, as they frequently occur in cancer cells, might be considered natural manipulation experiments on gene expression. An increasing number of data sets on matched array comparative genomic hybridisation and transcriptomics experiments from a variety of cancer pathologies are becoming publicly available. Here we explore the potential of a meta-analysis of thirty such data sets. The aim of our analysis was to assess the potential of in silico inference of trans-acting gene regulatory relationships from this type of data. We found sufficient correlation signal in the data to infer gene regulatory relationships, with interesting similarities between data sets. A number of genes had highly correlated copy number and expression changes in many of the data sets and we present predicted potential trans-acted regulatory relationships for each of these genes. The study also investigates to what extent heterogeneity between cell types and between pathologies determines the number of statistically significant predictions available from a meta-analysis of experiments. PMID:25148247

  17. Classification of Non-Small Cell Lung Cancer Using Significance Analysis of Microarray-Gene Set Reduction Algorithm

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    2016-01-01

    Among non-small cell lung cancer (NSCLC), adenocarcinoma (AC) and squamous cell carcinoma (SCC) are two major histology subtypes, accounting for roughly 40% and 30% of all lung cancer cases, respectively. Since AC and SCC differ in their cell of origin, location within the lung, and growth pattern, they are considered distinct diseases. Gene expression signatures have been demonstrated to be an effective tool for distinguishing AC and SCC. Gene set analysis is regarded as irrelevant to the identification of gene expression signatures. Nevertheless, we found that one specific gene set analysis method, significance analysis of microarray-gene set reduction (SAMGSR), can be adopted directly to select relevant features and to construct gene expression signatures. In this study, we applied SAMGSR to an NSCLC gene expression dataset. When compared with several novel feature selection algorithms, for example, LASSO, SAMGSR has equivalent or better performance in terms of predictive ability and model parsimony. SAMGSR is therefore indeed a feature selection algorithm. Additionally, we applied SAMGSR to AC and SCC subtypes separately to discriminate their respective stages, that is, stage II versus stage I. The small overlap between the two resulting gene signatures illustrates that AC and SCC are technically distinct diseases. Therefore, stratified analyses on subtypes are recommended when diagnostic or prognostic signatures of these two NSCLC subtypes are constructed.

  18. Gene Set Analyses of Genome-Wide Association Studies on 49 Quantitative Traits Measured in a Single Genetic Epidemiology Dataset

    Directory of Open Access Journals (Sweden)

    Jihye Kim

    2013-09-01

    Gene set analysis is a powerful tool for interpreting a genome-wide association study result and is gaining popularity these days. Comparison of the gene sets obtained for a variety of traits measured from a single genetic epidemiology dataset may give insights into the biological mechanisms underlying these traits. Based on the previously published single nucleotide polymorphism (SNP) genotype data on 8,842 individuals enrolled in the Korea Association Resource project, we performed a series of systematic genome-wide association analyses for 49 quantitative traits of basic epidemiological, anthropometric, or blood chemistry parameters. Each analysis result was subjected to subsequent gene set analyses based on Gene Ontology (GO) terms using the gene set analysis software GSA-SNP, identifying a set of GO terms significantly associated with each trait (pcorr < 0.05). Pairwise comparison of the traits in terms of the semantic similarity of their GO sets revealed surprising cases where phenotypically uncorrelated traits showed high similarity in terms of biological pathways. For example, the pH level was related to 7 other traits that showed low phenotypic correlations with it. A literature survey implies that these traits may be regulated partly by common pathways that involve neuronal or nerve systems.

  19. A dual origin of the Xist gene from a protein-coding gene and a set of transposable elements.

    Directory of Open Access Journals (Sweden)

    Eugeny A Elisaphenko

    2008-06-01

    Full Text Available X-chromosome inactivation, which occurs in female eutherian mammals, is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons, including those with simple tandem repeats detectable in their structure, have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA.

  20. GSHR, a Web-Based Platform Provides Gene Set-Level Analyses of Hormone Responses in Arabidopsis

    Directory of Open Access Journals (Sweden)

    Xiaojuan Ran

    2018-01-01

    Full Text Available Phytohormones regulate diverse aspects of plant growth and environmental responses. Recent high-throughput technologies have promoted a more comprehensive profiling of genes regulated by different hormones. However, these omics data generally result in large gene lists that make it challenging to interpret the data and extract insights into biological significance. With the rapid accumulation of these large-scale experiments, especially the transcriptomic data available in public databases, a means of using this information to explore the transcriptional networks is needed. Different platforms have different architectures and designs, and even similar studies using the same platform may obtain data with large variances because of the highly dynamic and flexible effects of plant hormones; this makes it difficult to make comparisons across different studies and platforms. Here, we present a web server providing gene set-level analyses of Arabidopsis thaliana hormone responses. GSHR collected 333 RNA-seq and 1,205 microarray datasets from the Gene Expression Omnibus, characterizing transcriptomic changes in Arabidopsis in response to phytohormones including abscisic acid, auxin, brassinosteroids, cytokinins, ethylene, gibberellins, jasmonic acid, salicylic acid, and strigolactones. These data were further processed and organized into 1,368 gene sets regulated by different hormones or hormone-related factors. By comparing input gene lists to these gene sets, GSHR helped to identify gene sets from the input gene list regulated by different phytohormones or related factors. Together, GSHR links prior information regarding transcriptomic changes induced by hormones and related factors to newly generated data and facilitates cross-study and cross-platform comparisons; this helps the mining of biologically significant information from large-scale datasets. The GSHR is freely available at http://bioinfo.sibs.ac.cn/GSHR/.

  1. A random set scoring model for prioritization of disease candidate genes using protein complexes and data-mining of GeneRIF, OMIM and PubMed records.

    Science.gov (United States)

    Jiang, Li; Edwards, Stefan M; Thomsen, Bo; Workman, Christopher T; Guldbrandtsen, Bernt; Sørensen, Peter

    2014-09-24

    Prioritizing genetic variants is a challenge because disease susceptibility loci are often located in genes of unknown function or the relationship with the corresponding phenotype is unclear. A global data-mining exercise on the biomedical literature can establish the phenotypic profile of genes with respect to their connection to disease phenotypes. The importance of protein-protein interaction networks in the genetic heterogeneity of common diseases or complex traits is becoming increasingly recognized. Thus, the development of a network-based approach combined with phenotypic profiling would be useful for disease gene prioritization. We developed a random-set scoring model and implemented it to quantify phenotype relevance in a network-based disease gene-prioritization approach. We validated our approach based on different gene phenotypic profiles, which were generated from PubMed abstracts, OMIM, and GeneRIF records. We also investigated the validity of several vocabulary filters and different likelihood thresholds for predicted protein-protein interactions in terms of their effect on the network-based gene-prioritization approach, which relies on text-mining of the phenotype data. Our method demonstrated good precision and sensitivity compared with those of two alternative complex-based prioritization approaches. We then conducted a global ranking of all human genes according to their relevance to a range of human diseases. The resulting accurate ranking of known causal genes supported the reliability of our approach. Moreover, these data suggest many promising novel candidate genes for human disorders that have a complex mode of inheritance. We have implemented and validated a network-based approach to prioritize genes for human diseases based on their phenotypic profile. We have devised a powerful and transparent tool to identify and rank candidate genes. Our global gene prioritization provides a unique resource for the biological interpretation of data

  2. Performance of single and concatenated sets of mitochondrial genes at inferring metazoan relationships relative to full mitogenome data.

    Directory of Open Access Journals (Sweden)

    Justin C Havird

    Full Text Available Mitochondrial (mt) genes are some of the most popular and widely-utilized genetic loci in phylogenetic studies of metazoan taxa. However, their linked nature has raised questions on whether using the entire mitogenome for phylogenetics is overkill (at best) or pseudoreplication (at worst). Moreover, no studies have addressed the comparative phylogenetic utility of mitochondrial genes across individual lineages within the entire Metazoa. To comment on the phylogenetic utility of individual mt genes as well as concatenated subsets of genes, we analyzed mitogenomic data from 1865 metazoan taxa in 372 separate lineages spanning genera to subphyla. Specifically, phylogenies inferred from these datasets were statistically compared to ones generated from all 13 mt protein-coding (PC) genes (i.e., the "supergene" set) to determine which single genes performed "best" at, and the minimum number of genes required to, recover the "supergene" topology. Surprisingly, the popular marker COX1 performed poorest, while ND5, ND4, and ND2 were most likely to reproduce the "supergene" topology. Averaged across all lineages, the longest ∼2 mt PC genes were sufficient to recreate the "supergene" topology, although this average increased to ∼5 genes for datasets with 40 or more taxa. Furthermore, concatenation of the three "best" performing mt PC genes outperformed that of the three longest mt PC genes (i.e., ND5, COX1, and ND4). Taken together, while not all mt PC genes are equally interchangeable in phylogenetic studies of the metazoans, some subset can serve as a proxy for the 13 mt PC genes. However, the exact number and identity of these genes is specific to the lineage in question and cannot be applied indiscriminately across the Metazoa.

  3. Comprehensive Analysis of MILE Gene Expression Data Set Advances Discovery of Leukaemia Type and Subtype Biomarkers.

    Science.gov (United States)

    Labaj, Wojciech; Papiez, Anna; Polanski, Andrzej; Polanska, Joanna

    2017-03-01

    Large collections of data in studies on cancer such as leukaemia provoke the necessity of applying tailored analysis algorithms to ensure supreme information extraction. In this work, a custom-fit pipeline is demonstrated for thorough investigation of the voluminous MILE gene expression data set. Three analyses are accomplished, each for gaining a deeper understanding of the processes underlying leukaemia types and subtypes. First, the main disease groups are tested for differential expression against the healthy control as in a standard case-control study. Here, the basic knowledge on molecular mechanisms is confirmed quantitatively and by literature references. Second, pairwise comparison testing is performed for juxtaposing the main leukaemia types among each other. In this case by means of the Dice coefficient similarity measure the general relations are pointed out. Moreover, lists of candidate main leukaemia group biomarkers are proposed. Finally, with this approach being successful, the third analysis provides insight into all of the studied subtypes, followed by the emergence of four leukaemia subtype biomarkers. In addition, the class enhanced DEG signature obtained on the basis of novel pipeline processing leads to significantly better classification power of multi-class data classifiers. The developed methodology consisting of batch effect adjustment, adaptive noise and feature filtration coupled with adequate statistical testing and biomarker definition proves to be an effective approach towards knowledge discovery in high-throughput molecular biology experiments.
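
    The pairwise comparison step above relies on the Dice coefficient, which for two sets A and B is 2|A ∩ B| / (|A| + |B|). A minimal Python sketch of that measure applied to two gene lists follows; the gene names are hypothetical placeholders, and returning 0.0 for two empty lists is a convention chosen here, not something taken from the abstract.

```python
def dice_coefficient(genes_a, genes_b):
    """Dice similarity between two gene lists: 2|A n B| / (|A| + |B|)."""
    a, b = set(genes_a), set(genes_b)
    if not a and not b:
        # Convention chosen for this sketch: two empty lists share nothing.
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical differentially expressed gene lists for two leukaemia types.
type_one_degs = ["GENE_A", "GENE_B", "GENE_C"]
type_two_degs = ["GENE_B", "GENE_C", "GENE_D"]
similarity = dice_coefficient(type_one_degs, type_two_degs)
```

    A value of 1 means the two lists are identical and 0 means they share no genes.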

  4. Quantitative modeling of gene networks of biological systems using fuzzy Petri nets and fuzzy sets

    Directory of Open Access Journals (Sweden)

    Raed I. Hamed

    2018-01-01

    Full Text Available Quantitative modeling of biological systems has become an essential computational approach in the design of novel biological systems and the analysis of existing ones. However, the kinetic data that describe the system's dynamics must be known in order to obtain relevant results with conventional modeling techniques. These data are often hard or even impossible to obtain. Here, we present a quantitative fuzzy logic modeling approach that can cope with unknown kinetic data and thus produce relevant results even when the kinetic data are incomplete or only vaguely defined. Moreover, the approach can be used in combination with existing state-of-the-art quantitative modeling techniques in only those parts of the system where the data are missing. The case study of the approach proposed in this paper is performed on a model of nine genes. We propose a type of fuzzy Petri net (FPN) model based on fuzzy sets to handle the quantitative modeling of biological systems. Tests show that the model is feasible and quite effective for knowledge representation and reasoning in fuzzy expert systems.

  5. The SET1 Complex Selects Actively Transcribed Target Genes via Multivalent Interaction with CpG Island Chromatin.

    Science.gov (United States)

    Brown, David A; Di Cerbo, Vincenzo; Feldmann, Angelika; Ahn, Jaewoo; Ito, Shinsuke; Blackledge, Neil P; Nakayama, Manabu; McClellan, Michael; Dimitrova, Emilia; Turberfield, Anne H; Long, Hannah K; King, Hamish W; Kriaucionis, Skirmantas; Schermelleh, Lothar; Kutateladze, Tatiana G; Koseki, Haruhiko; Klose, Robert J

    2017-09-05

    Chromatin modifications and the promoter-associated epigenome are important for the regulation of gene expression. However, the mechanisms by which chromatin-modifying complexes are targeted to the appropriate gene promoters in vertebrates and how they influence gene expression have remained poorly defined. Here, using a combination of live-cell imaging and functional genomics, we discover that the vertebrate SET1 complex is targeted to actively transcribed gene promoters through CFP1, which engages in a form of multivalent chromatin reading that involves recognition of non-methylated DNA and histone H3 lysine 4 trimethylation (H3K4me3). CFP1 defines SET1 complex occupancy on chromatin, and its multivalent interactions are required for the SET1 complex to place H3K4me3. In the absence of CFP1, gene expression is perturbed, suggesting that normal targeting and function of the SET1 complex are central to creating an appropriately functioning vertebrate promoter-associated epigenome. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  6. The SET1 Complex Selects Actively Transcribed Target Genes via Multivalent Interaction with CpG Island Chromatin

    Directory of Open Access Journals (Sweden)

    David A. Brown

    2017-09-01

    Full Text Available Chromatin modifications and the promoter-associated epigenome are important for the regulation of gene expression. However, the mechanisms by which chromatin-modifying complexes are targeted to the appropriate gene promoters in vertebrates and how they influence gene expression have remained poorly defined. Here, using a combination of live-cell imaging and functional genomics, we discover that the vertebrate SET1 complex is targeted to actively transcribed gene promoters through CFP1, which engages in a form of multivalent chromatin reading that involves recognition of non-methylated DNA and histone H3 lysine 4 trimethylation (H3K4me3). CFP1 defines SET1 complex occupancy on chromatin, and its multivalent interactions are required for the SET1 complex to place H3K4me3. In the absence of CFP1, gene expression is perturbed, suggesting that normal targeting and function of the SET1 complex are central to creating an appropriately functioning vertebrate promoter-associated epigenome.

  7. Genomic determinants of sporulation in Bacilli and Clostridia: towards the minimal set of sporulation-specific genes.

    Science.gov (United States)

    Galperin, Michael Y; Mekhedov, Sergei L; Puigbo, Pere; Smirnov, Sergey; Wolf, Yuri I; Rigden, Daniel J

    2012-11-01

    Three classes of low-G+C Gram-positive bacteria (Firmicutes), Bacilli, Clostridia and Negativicutes, include numerous members that are capable of producing heat-resistant endospores. Spore-forming firmicutes include many environmentally important organisms, such as insect pathogens and cellulose-degrading industrial strains, as well as human pathogens responsible for such diseases as anthrax, botulism, gas gangrene and tetanus. In the best-studied model organism Bacillus subtilis, sporulation involves over 500 genes, many of which are conserved among other bacilli and clostridia. This work aimed to define the genomic requirements for sporulation through an analysis of the presence of sporulation genes in various firmicutes, including those with smaller genomes than B. subtilis. Cultivable spore-formers were found to have genomes larger than 2300 kb and encompass over 2150 protein-coding genes of which 60 are orthologues of genes that are apparently essential for sporulation in B. subtilis. Clostridial spore-formers lack, among others, spoIIB, sda, spoVID and safA genes and have non-orthologous displacements of spoIIQ and spoIVFA, suggesting substantial differences between bacilli and clostridia in the engulfment and spore coat formation steps. Many B. subtilis sporulation genes, particularly those encoding small acid-soluble spore proteins and spore coat proteins, were found only in the family Bacillaceae, or even in a subset of Bacillus spp. Phylogenetic profiles of sporulation genes, compiled in this work, confirm the presence of a common sporulation gene core, but also illuminate the diversity of the sporulation processes within various lineages. These profiles should help further experimental studies of uncharacterized widespread sporulation genes, which would ultimately allow delineation of the minimal set(s) of sporulation-specific genes in Bacilli and Clostridia. Published 2012. This article is a U.S. Government work and is in the public domain in the USA.

  8. Using Variable Precision Rough Set for Selection and Classification of Biological Knowledge Integrated in DNA Gene Expression

    Directory of Open Access Journals (Sweden)

    Calvo-Dmgz D.

    2012-12-01

    Full Text Available DNA microarrays have contributed to the exponential growth of genomic and experimental data in the last decade. This large amount of gene expression data has been used by researchers seeking diagnosis of diseases like cancer using machine learning methods. In turn, explicit biological knowledge about gene functions has also grown tremendously over the last decade. This work integrates explicit biological knowledge, provided as gene sets, into the classification process by means of Variable Precision Rough Set Theory (VPRS). The proposed model is able to highlight which part of the provided biological knowledge has been important for classification. This paper presents a novel model for microarray data classification which is able to incorporate prior biological knowledge in the form of gene sets. Based on this knowledge, we transform the input microarray data into supergenes, and then we apply rough set theory to select the most promising supergenes and to derive a set of easily interpretable classification rules. The proposed model is evaluated over three breast cancer microarray datasets, obtaining successful results compared to classical classification techniques. The experimental results show that there are no significant differences between our model and classical techniques, but it is able to provide a biologically interpretable explanation of how it classifies new samples.

  9. Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods

    Science.gov (United States)

    Väremo, Leif; Nielsen, Jens; Nookaew, Intawat

    2013-01-01

    Gene set analysis (GSA) is used to elucidate genome-wide data, in particular transcriptome data. A multitude of methods have been proposed for this step of the analysis, and many of them have been compared and evaluated. Unfortunately, there is no consolidated opinion regarding what methods should be preferred, and the variety of available GSA software and implementations poses a difficulty for the end-user who wants to try out different methods. To address this, we have developed the R package Piano that collects a range of GSA methods into the same system, for the benefit of the end-user. Further on we refine the GSA workflow by using modifications of the gene-level statistics. This enables us to divide the resulting gene set P-values into three classes, describing different aspects of gene expression directionality at gene set level. We use our fully implemented workflow to investigate the impact of the individual components of GSA by using microarray and RNA-seq data. The results show that the evaluated methods are globally similar and the major separation correlates well with our defined directionality classes. As a consequence of this, we suggest using a consensus scoring approach, based on multiple GSA runs. In combination with the directionality classes, this constitutes a more thorough basis for an enriched biological interpretation. PMID:23444143
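
    One simple way to picture a consensus score over multiple GSA runs is rank aggregation: rank the gene sets by P-value within each run, then average the ranks across runs. The Python sketch below illustrates that general idea only. It is not the Piano package's API (Piano is an R package), the function names and P-values are hypothetical, and using the mean rank as the aggregation rule is an assumption made here.

```python
from statistics import mean


def rank_p_values(p_values):
    """Map gene-set name -> rank (1 = smallest P-value); ties share the average rank."""
    ordered = sorted(p_values, key=p_values.get)
    ranks = {}
    i = 0
    while i < len(ordered):
        # Extend j over the block of gene sets tied with ordered[i].
        j = i
        while j + 1 < len(ordered) and p_values[ordered[j + 1]] == p_values[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for name in ordered[i:j + 1]:
            ranks[name] = average_rank
        i = j + 1
    return ranks


def consensus_ranks(runs):
    """Average the per-run ranks of every gene set that appears in all runs."""
    if not runs:
        return {}
    per_run = [rank_p_values(run) for run in runs]
    shared = set.intersection(*(set(r) for r in per_run))
    return {name: mean(r[name] for r in per_run) for name in shared}


# Hypothetical gene-set P-values from two different GSA methods.
run_one = {"set_1": 0.01, "set_2": 0.05, "set_3": 0.20}
run_two = {"set_1": 0.03, "set_2": 0.01, "set_3": 0.50}
consensus = consensus_ranks([run_one, run_two])
```

    A lower consensus rank indicates a gene set that scores well across methods rather than in one method only.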

  10. Platform dependence of inference on gene-wise and gene-set involvement in human lung development

    Directory of Open Access Journals (Sweden)

    Kho Alvin T

    2009-06-01

    Full Text Available Background: With the recent development of microarray technologies, the comparability of gene expression data obtained from different platforms poses an important problem. We evaluated two widely used platforms, Affymetrix U133 Plus 2.0 and the Illumina HumanRef-8 v2 Expression Bead Chips, for comparability in a biological system in which changes may be subtle, namely fetal lung tissue as a function of gestational age. Results: We performed the comparison via sequence-based probe matching between the two platforms. "Significance grouping" was defined as a measure of comparability. Using both expression correlation and significance grouping as measures of comparability, we demonstrated that despite overall cross-platform differences at the single gene level, increased correlation between the two platforms was found in genes with higher expression level, higher probe overlap, and lower p-value. We also demonstrated that biological function as determined via KEGG pathways or GO categories is more consistent across platforms than single gene analysis. Conclusion: We conclude that while the comparability of the platforms at the single gene level may be increased by increasing sample size, they are highly comparable ontologically even for subtle differences in a relatively small sample size. Biologically relevant inference should therefore be reproducible across laboratories using different platforms.

  11. Automated Detection of Cancer Associated Genes Using a Combined Fuzzy-Rough-Set-Based F-Information and Water Swirl Algorithm of Human Gene Expression Data.

    Directory of Open Access Journals (Sweden)

    Pugalendhi Ganesh Kumar

    Full Text Available This study describes a novel approach to reducing the challenges of highly nonlinear multiclass gene expression values for cancer diagnosis. To build a fruitful system for cancer diagnosis, in this study, we introduced two levels of gene selection such as filtering and embedding for selection of potential genes and the most relevant genes associated with cancer, respectively. The filter procedure was implemented by developing a fuzzy rough set (FR)-based method for redefining the criterion function of f-information (FI) to identify the potential genes without discretizing the continuous gene expression values. The embedded procedure is implemented by means of a water swirl algorithm (WSA), which attempts to optimize the rule set and membership function required to classify samples using a fuzzy-rule-based multiclassification system (FRBMS). Two novel update equations are proposed in WSA, which have better exploration and exploitation abilities while designing a self-learning FRBMS. The efficiency of our new approach was evaluated on 13 multicategory and 9 binary datasets of cancer gene expression. Additionally, the performance of the proposed FRFI-WSA method in designing an FRBMS was compared with existing methods for gene selection and optimization such as genetic algorithm (GA), particle swarm optimization (PSO), and artificial bee colony algorithm (ABC) on all the datasets. In the global cancer map with repeated measurements (GCM_RM) dataset, the FRFI-WSA showed the smallest number of 16 most relevant genes associated with cancer using a minimal number of 26 compact rules with the highest classification accuracy (96.45%). In addition, the statistical validation used in this study revealed that the biological relevance of the most relevant genes associated with cancer and their linguistics detected by the proposed FRFI-WSA approach are better than those in the other methods. The simple interpretable rules with most relevant genes and effectively

  12. Identification of self-consistent modulons from bacterial microarray expression data with the help of structured regulon gene sets

    KAUST Repository

    Permina, Elizaveta A.

    2013-01-01

    Identification of bacterial modulons from series of gene expression measurements on microarrays is a principal problem, especially relevant for inadequately studied but practically important species. Usage of a priori information on regulatory interactions helps to evaluate parameters for regulatory subnetwork inference. We suggest a procedure for modulon construction where a seed regulon is iteratively updated with genes having expression patterns similar to those for regulon member genes. A set of genes essential for a regulon is used to control modulon updating. Essential genes for a regulon were selected as a subset of regulon genes highly related by different measures to each other. Using Escherichia coli as a model, we studied how modulon identification depends on the data, including the microarray experiments set, the adopted relevance measure and the regulon itself. We have found that results of modulon identification are highly dependent on all parameters studied and thus the resulting modulon varies substantially depending on the identification procedure. Yet, modulons that were identified correctly displayed higher stability during iterations, which allows developing a procedure for reliable modulon identification in the case of less studied species where the known regulatory interactions are sparse. Copyright © 2013 Taylor & Francis.

  13. The Schizophrenia-Associated BRD1 Gene Regulates Behavior, Neurotransmission, and Expression of Schizophrenia Risk Enriched Gene Sets in Mice.

    Science.gov (United States)

    Qvist, Per; Christensen, Jane Hvarregaard; Vardya, Irina; Rajkumar, Anto Praveen; Mørk, Arne; Paternoster, Veerle; Füchtbauer, Ernst-Martin; Pallesen, Jonatan; Fryland, Tue; Dyrvig, Mads; Hauberg, Mads Engel; Lundsberg, Birgitte; Fejgin, Kim; Nyegaard, Mette; Jensen, Kimmo; Nyengaard, Jens Randel; Mors, Ole; Didriksen, Michael; Børglum, Anders Dupont

    2017-07-01

    The schizophrenia-associated BRD1 gene encodes a transcriptional regulator whose comprehensive chromatin interactome is enriched with schizophrenia risk genes. However, the biology underlying the disease association of BRD1 remains speculative. This study assessed the transcriptional drive of a schizophrenia-associated BRD1 risk variant in vitro. Accordingly, to examine the effects of reduced Brd1 expression, we generated a genetically modified Brd1 +/- mouse and subjected it to behavioral, electrophysiological, molecular, and integrative genomic analyses with focus on schizophrenia-relevant parameters. Brd1 +/- mice displayed cerebral histone H3K14 hypoacetylation and a broad range of behavioral changes with translational relevance to schizophrenia. These behaviors were accompanied by striatal dopamine/serotonin abnormalities and cortical excitation-inhibition imbalances involving loss of parvalbumin immunoreactive interneurons. RNA-sequencing analyses of cortical and striatal micropunches from Brd1 +/- and wild-type mice revealed differential expression of genes enriched for schizophrenia risk, including several schizophrenia genome-wide association study risk genes (e.g., calcium channel subunits [Cacna1c and Cacnb2], cholinergic muscarinic receptor 4 [Chrm4], dopamine receptor D 2 [Drd2], and transcription factor 4 [Tcf4]). Integrative analyses further found differentially expressed genes to cluster in functional networks and canonical pathways associated with mental illness and molecular signaling processes (e.g., glutamatergic, monoaminergic, calcium, cyclic adenosine monophosphate [cAMP], dopamine- and cAMP-regulated neuronal phosphoprotein 32 kDa [DARPP-32], and cAMP responsive element binding protein signaling [CREB]). Our study bridges the gap between genetic association and pathogenic effects and yields novel insights into the unfolding molecular changes in the brain of a new schizophrenia model that incorporates genetic risk at three levels: allelic

  14. A reference gene set for sex pheromone biosynthesis and degradation genes from the diamondback moth, Plutella xylostella, based on genome and transcriptome digital gene expression analyses

    OpenAIRE

    He, Peng; Zhang, Yun-Fei; Hong, Duan-Yang; Wang, Jun; Wang, Xing-Liang; Zuo, Ling-Hua; Tang, Xian-Fu; Xu, Wei-Ming; He, Ming

    2017-01-01

    Background: Female moths synthesize species-specific sex pheromone components and release them to attract male moths, which depend on a precise sex pheromone chemosensory system to locate females. Two types of genes involved in the sex pheromone biosynthesis and degradation pathways play essential roles in this important moth behavior. To understand the function of genes in the sex pheromone pathway, this study investigated the genome-wide and digital gene expression of sex pheromone biosynthesi...

  15. Gene-set analysis based on the pharmacological profiles of drugs to identify repurposing opportunities in schizophrenia.

    Science.gov (United States)

    de Jong, Simone; Vidler, Lewis R; Mokrab, Younes; Collier, David A; Breen, Gerome

    2016-08-01

    Genome-wide association studies (GWAS) have identified thousands of novel genetic associations for complex genetic disorders, leading to the identification of potential pharmacological targets for novel drug development. In schizophrenia, 108 conservatively defined loci that meet genome-wide significance have been identified and hundreds of additional sub-threshold associations harbour information on the genetic aetiology of the disorder. In the present study, we used gene-set analysis based on the known binding targets of chemical compounds to identify the 'drug pathways' most strongly associated with schizophrenia-associated genes, with the aim of identifying potential drug repositioning opportunities and clues for novel treatment paradigms, especially in multi-target drug development. We compiled 9389 gene sets (2496 with unique gene content) and interrogated gene-based p-values from the PGC2-SCZ analysis. Although no single drug exceeded experiment wide significance (corrected pneratinib. This is a proof of principle analysis showing the potential utility of GWAS data of schizophrenia for the direct identification of candidate drugs and molecules that show polypharmacy. © The Author(s) 2016.

  16. Identification and Validation of a New Set of Five Genes for Prediction of Risk in Early Breast Cancer

    Directory of Open Access Journals (Sweden)

    Giorgio Mustacchi

    2013-05-01

    Full Text Available Molecular tests predicting the outcome of breast cancer patients based on gene expression levels can be used to assist in making treatment decisions after consideration of conventional markers. In this study we identified a subset of 20 mRNAs differentially regulated in breast cancer by analyzing several publicly available array gene expression datasets using the R/Bioconductor package. Using RT-qPCR we evaluated 261 consecutive invasive breast cancer cases, not selected for age, adjuvant treatment, nodal and estrogen receptor status, from paraffin-embedded sections. The biological samples dataset was split into a training set (137 cases) and a validation set (124 cases). The gene signature was developed on the training set, and a multivariate stepwise Cox analysis selected five genes independently associated with DFS: FGF18 (HR = 1.13, p = 0.05), BCL2 (HR = 0.57, p = 0.001), PRC1 (HR = 1.51, p = 0.001), MMP9 (HR = 1.11, p = 0.08), and SERF1A (HR = 0.83, p = 0.007). These five genes were combined into a linear score (signature) weighted according to the coefficients of the Cox model, as: 0.125 × FGF18 − 0.560 × BCL2 + 0.409 × PRC1 + 0.104 × MMP9 − 0.188 × SERF1A (HR = 2.7, 95% CI = 1.9–4.0, p < 0.001). The signature was then evaluated on the validation set, assessing the discrimination ability by a Kaplan-Meier analysis using the same cut-offs classifying patients at low, intermediate or high risk of disease relapse as defined on the training set (p < 0.001). Our signature, after further clinical validation, could be proposed as a prognostic signature for disease-free survival in breast cancer patients where the indication for adjuvant chemotherapy added to endocrine treatment is uncertain.
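
    The linear score above is a weighted sum of the five expression values, which the following Python sketch computes. The coefficients are the ones printed in the abstract; the function name, the input format, and the example expression values are hypothetical, and the abstract does not state the expression scale or the risk cut-offs, so none are assumed here.

```python
# Cox-model coefficients as printed in the abstract above.
COEFFICIENTS = {
    "FGF18": 0.125,
    "BCL2": -0.560,
    "PRC1": 0.409,
    "MMP9": 0.104,
    "SERF1A": -0.188,
}


def signature_score(expression):
    """Return the weighted linear score for one patient's expression values."""
    missing = set(COEFFICIENTS) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Made-up expression values, for illustration only.
example_patient = {"FGF18": 1.0, "BCL2": 2.0, "PRC1": 1.0, "MMP9": 0.0, "SERF1A": 1.0}
score = signature_score(example_patient)
```

    The signs follow the hazard ratios: genes with HR above 1 (FGF18, PRC1, MMP9) raise the score, while BCL2 and SERF1A lower it.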

  17. Identification of a set of endogenous reference genes for miRNA expression studies in Parkinson's disease blood samples.

    Science.gov (United States)

    Serafin, Alice; Foco, Luisa; Blankenburg, Hagen; Picard, Anne; Zanigni, Stefano; Zanon, Alessandra; Pramstaller, Peter P; Hicks, Andrew A; Schwienbacher, Christine

    2014-10-10

    Research on microRNAs (miRNAs) is becoming an increasingly attractive field, as these small RNA molecules are involved in several physiological functions and diseases. To date, only a few studies have assessed the expression of blood miRNAs related to Parkinson's disease (PD) using microarray and quantitative real-time PCR (qRT-PCR). Measuring miRNA expression involves normalization of qRT-PCR data using endogenous reference genes for calibration, but their choice remains a delicate problem with serious impact on the resulting expression levels. The aim of the present study was to evaluate the suitability of a set of commonly used small RNAs as normalizers and to identify which of these miRNAs might be considered reliable reference genes in qRT-PCR expression analyses on PD blood samples. Commonly used reference genes snoRNA RNU24, snRNA RNU6B, snoRNA Z30 and miR-103a-3p were selected from the literature. We then analyzed the effect of using these genes as reference, alone or in any possible combination, on the measured expression levels of the target genes miR-30b-5p and miR-29a-3p, which have been previously reported to be deregulated in PD blood samples. We identified RNU24 and Z30 as a reliable and stable pair of reference genes in PD blood samples.
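
    To make "alone or in any possible combination" concrete, the sketch below enumerates every non-empty subset of the four candidate reference genes and computes a simple delta-Ct for a target. This is an illustration only: the abstract does not say how the references were combined, so averaging their Ct values is an assumption, and the function names and Ct numbers in the usage note are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Candidate reference genes named in the abstract.
REFERENCE_GENES = ["RNU24", "RNU6B", "Z30", "miR-103a-3p"]


def reference_sets(genes):
    """Every non-empty subset of the candidate reference genes."""
    return [
        combo
        for size in range(1, len(genes) + 1)
        for combo in combinations(genes, size)
    ]


def delta_ct(ct_values, target, references):
    """Target Ct minus the mean Ct of the chosen references (assumed scheme)."""
    return ct_values[target] - mean(ct_values[ref] for ref in references)
```

    Four candidates give 15 possible reference sets. With made-up values `{"miR-30b-5p": 28.0, "RNU24": 24.0, "Z30": 26.0}`, `delta_ct(..., "miR-30b-5p", ("RNU24", "Z30"))` evaluates to 3.0.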

  18. A set of vectors for introduction of antibiotic resistance genes by in vitro Cre-mediated recombination

    Directory of Open Access Journals (Sweden)

    Vassetzky Yegor S

    2008-12-01

    Background: Introduction of new antibiotic resistance genes into plasmids of interest is a frequent task in molecular cloning practice. Classical approaches involving digestion with restriction endonucleases and ligation are time-consuming. Findings: We have created a set of insertion vectors (pINS) carrying genes that provide resistance to various antibiotics (puromycin, blasticidin and G418) and containing a loxP site. Each vector (pINS-Puro, pINS-Blast or pINS-Neo) contains either a chloramphenicol or a kanamycin resistance gene and is unable to replicate in most E. coli strains as it contains a conditional R6Kγ replication origin. Introduction of the antibiotic resistance genes into the vector of interest is achieved by Cre-mediated recombination between the replication-incompetent pINS and a replication-competent target vector. The recombination mix is then transformed into E. coli and selected by the resistance marker (kanamycin or chloramphenicol) present in pINS, which allows recovery of the recombinant plasmids with 100% efficiency. Conclusion: Here we propose a simple strategy that allows the introduction of various antibiotic-resistance genes into any plasmid containing a replication origin, an ampicillin resistance gene and a loxP site.

  19. Identification of a novel set of genes reflecting different in vivo invasive patterns of human GBM cells

    Directory of Open Access Journals (Sweden)

    Monticone Massimiliano

    2012-08-01

    Background: Most patients affected by Glioblastoma multiforme (GBM, grade IV glioma) experience a recurrence of the disease because of the spreading of tumor cells beyond surgical boundaries. Unveiling mechanisms causing this process is a logical goal to impair the killing capacity of GBM cells by molecular targeting. We noticed that our long-term GBM cultures, established from different patients, may display two categories/types of growth behavior in an orthotopic xenograft model: expansion of the tumor mass and formation of tumor branches/nodules (nodular like, NL-type) or highly diffuse single tumor cell infiltration (HD-type). Methods: We determined by DNA microarrays the gene expression profiles of three NL-type and three HD-type long-term GBM cultures. Subsequently, individual genes with different expression levels between the two groups were identified using Significance Analysis of Microarrays (SAM). Real-time RT-PCR, immunofluorescence and immunoblot analyses were performed for a selected subgroup of regulated gene products to confirm the results obtained by the expression analysis. Results: Here, we report the identification of a set of 34 differentially expressed genes in the two types of GBM cultures. Twenty-three of these genes encode proteins localized to the plasma membrane and 9 of these encode proteins involved in the process of cell adhesion. Conclusions: This study suggests the participation in the diffuse infiltrative/invasive process of GBM cells within the CNS of a novel set of genes coding for membrane-associated proteins, which should thus be susceptible to an inhibition strategy by specific targeting. Massimiliano Monticone and Antonio Daga contributed equally to this work.

  20. Identification of a novel set of genes reflecting different in vivo invasive patterns of human GBM cells.

    Science.gov (United States)

    Monticone, Massimiliano; Daga, Antonio; Candiani, Simona; Romeo, Francesco; Mirisola, Valentina; Viaggi, Silvia; Melloni, Ilaria; Pedemonte, Simona; Zona, Gianluigi; Giaretti, Walter; Pfeffer, Ulrich; Castagnola, Patrizio

    2012-08-17

    Most patients affected by Glioblastoma multiforme (GBM, grade IV glioma) experience a recurrence of the disease because of the spreading of tumor cells beyond surgical boundaries. Unveiling mechanisms causing this process is a logical goal to impair the killing capacity of GBM cells by molecular targeting. We noticed that our long-term GBM cultures, established from different patients, may display two categories/types of growth behavior in an orthotopic xenograft model: expansion of the tumor mass and formation of tumor branches/nodules (nodular like, NL-type) or highly diffuse single tumor cell infiltration (HD-type). We determined by DNA microarrays the gene expression profiles of three NL-type and three HD-type long-term GBM cultures. Subsequently, individual genes with different expression levels between the two groups were identified using Significance Analysis of Microarrays (SAM). Real-time RT-PCR, immunofluorescence and immunoblot analyses were performed for a selected subgroup of regulated gene products to confirm the results obtained by the expression analysis. Here, we report the identification of a set of 34 differentially expressed genes in the two types of GBM cultures. Twenty-three of these genes encode proteins localized to the plasma membrane and 9 of these encode proteins involved in the process of cell adhesion. This study suggests the participation in the diffuse infiltrative/invasive process of GBM cells within the CNS of a novel set of genes coding for membrane-associated proteins, which should thus be susceptible to an inhibition strategy by specific targeting. Massimiliano Monticone and Antonio Daga contributed equally to this work.

  1. Identification of a novel set of genes reflecting different in vivo invasive patterns of human GBM cells

    International Nuclear Information System (INIS)

    Monticone, Massimiliano; Giaretti, Walter; Pfeffer, Ulrich; Daga, Antonio; Candiani, Simona; Romeo, Francesco; Mirisola, Valentina; Viaggi, Silvia; Melloni, Ilaria; Pedemonte, Simona; Zona, Gianluigi

    2012-01-01

    Most patients affected by Glioblastoma multiforme (GBM, grade IV glioma) experience a recurrence of the disease because of the spreading of tumor cells beyond surgical boundaries. Unveiling mechanisms causing this process is a logical goal to impair the killing capacity of GBM cells by molecular targeting. We noticed that our long-term GBM cultures, established from different patients, may display two categories/types of growth behavior in an orthotopic xenograft model: expansion of the tumor mass and formation of tumor branches/nodules (nodular like, NL-type) or highly diffuse single tumor cell infiltration (HD-type). We determined by DNA microarrays the gene expression profiles of three NL-type and three HD-type long-term GBM cultures. Subsequently, individual genes with different expression levels between the two groups were identified using Significance Analysis of Microarrays (SAM). Real-time RT-PCR, immunofluorescence and immunoblot analyses were performed for a selected subgroup of regulated gene products to confirm the results obtained by the expression analysis. Here, we report the identification of a set of 34 differentially expressed genes in the two types of GBM cultures. Twenty-three of these genes encode proteins localized to the plasma membrane and 9 of these encode proteins involved in the process of cell adhesion. This study suggests the participation in the diffuse infiltrative/invasive process of GBM cells within the CNS of a novel set of genes coding for membrane-associated proteins, which should thus be susceptible to an inhibition strategy by specific targeting. Massimiliano Monticone and Antonio Daga contributed equally to this work.

  2. Genetic investigation of 100 heart genes in sudden unexplained death victims in a forensic setting

    DEFF Research Database (Denmark)

    Christiansen, Sofie Lindgren; Hertz, Christin Løth; Ferrero, Laura

    2016-01-01

    indicate that broad genetic investigation of SUD victims increases the diagnostic outcome, and the investigation should comprise genes involved in both cardiomyopathies and cardiac channelopathies. European Journal of Human Genetics advance online publication, 21 September 2016; doi:10.1038/ejhg.2016.118....

  3. Comparative genomics identification of a novel set of temporally regulated hedgehog target genes in the retina.

    Science.gov (United States)

    McNeill, Brian; Perez-Iratxeta, Carol; Mazerolle, Chantal; Furimsky, Marosh; Mishina, Yuji; Andrade-Navarro, Miguel A; Wallace, Valerie A

    2012-03-01

    The hedgehog (Hh) signaling pathway is involved in numerous developmental and adult processes with many links to cancer. In vertebrates, the activity of the Hh pathway is mediated primarily through three Gli transcription factors (Gli1, 2 and 3) that can serve as transcriptional activators or repressors. The identification of Gli target genes is essential for the understanding of the Hh-mediated processes. We used a comparative genomics approach using the mouse and human genomes to identify 390 genes that contained conserved Gli binding sites. RT-qPCR validation of 46 target genes in E14.5 and P0.5 retinal explants revealed that Hh pathway activation resulted in the modulation of 30 of these targets, 25 of which demonstrated a temporal regulation. Further validation revealed that the expression of Bok, FoxA1, Sox8 and Wnt7a was dependent upon Sonic Hh (Shh) signaling in the retina and their regulation is under positive and negative controls by Gli2 and Gli3, respectively. We also show using chromatin immunoprecipitation that Gli2 binds to the Sox8 promoter, suggesting that Sox8 is an Hh-dependent direct target of Gli2. Finally, we demonstrate that the Hh pathway also modulates the expression of Sox9 and Sox10, which together with Sox8 make up the SoxE group. Previously, it has been shown that Hh and SoxE group genes promote Müller glial cell development in the retina. Our data are consistent with the possibility for a role of SoxE group genes downstream of Hh signaling on Müller cell development. Crown Copyright © 2012. Published by Elsevier Inc. All rights reserved.

  4. The imprinted brain: how genes set the balance between autism and psychosis.

    Science.gov (United States)

    Badcock, Christopher

    2011-06-01

    The imprinted brain theory proposes that autism spectrum disorder (ASD) represents a paternal bias in the expression of imprinted genes. This is reflected in a preference for mechanistic cognition and in the corresponding mentalistic deficits symptomatic of ASD. Psychotic spectrum disorder (PSD) would correspondingly result from an imbalance in favor of maternal and/or X-chromosome gene expression. If differences in gene expression were reflected locally in the human brain as mouse models and other evidence suggests they are, ASD would represent not so much an 'extreme male brain' as an extreme paternal one, with PSD correspondingly representing an extreme maternal brain. To the extent that copy number variation resembles imprinting and aneuploidy in nullifying or multiplying the expression of particular genes, it has been found to conform to the diametric model of mental illness peculiar to the imprinted brain theory. The fact that nongenetic factors such as nutrition in pregnancy can mimic and/or interact with imprinted gene expression suggests that the theory might even be able to explain the notable effect of maternal starvation on the risk of PSD - not to mention the 'autism epidemic' of modern affluent societies. Finally, the theory suggests that normality represents balanced cognition, and that genius is an extraordinary extension of cognitive configuration in both mentalistic and mechanistic directions. Were it to be proven correct, the imprinted brain theory would represent one of the biggest single advances in our understanding of the mind and of mental illness that has ever taken place, and would revolutionize psychiatric diagnosis, prevention and treatment - not to mention our understanding of epigenomics.

  5. An 80-gene set to predict response to preoperative chemoradiotherapy for rectal cancer by principle component analysis.

    Science.gov (United States)

    Empuku, Shinichiro; Nakajima, Kentaro; Akagi, Tomonori; Kaneko, Kunihiko; Hijiya, Naoki; Etoh, Tsuyoshi; Shiraishi, Norio; Moriyama, Masatsugu; Inomata, Masafumi

    2016-05-01

    Preoperative chemoradiotherapy (CRT) for locally advanced rectal cancer not only improves the postoperative local control rate, but also induces downstaging. However, it has not been established how to individually select patients who receive effective preoperative CRT. The aim of this study was to identify a predictor of response to preoperative CRT for locally advanced rectal cancer. This study is additional to our multicenter phase II study evaluating the safety and efficacy of preoperative CRT using oral fluorouracil (UMIN ID: 03396). From April, 2009 to August, 2011, 26 biopsy specimens obtained prior to CRT were analyzed by cyclopedic microarray analysis. Response to CRT was evaluated according to a histological grading system using surgically resected specimens. To decide on the number of genes for dividing into responder and non-responder groups, we statistically analyzed the data using a dimension reduction method, principal component analysis. Of the 26 cases, 11 were responders and 15 non-responders. No significant difference was found in clinical background data between the two groups. We determined that the optimal number of genes for the prediction of response was 80 of 40,000 and the functions of these genes were analyzed. When comparing non-responders with responders, genes expressed at a high level functioned in alternative splicing, whereas those expressed at a low level functioned in the septin complex. Thus, an 80-gene expression set that predicts response to preoperative CRT for locally advanced rectal cancer was identified using a novel statistical method.

  6. Comparative genomic analysis of SET domain family reveals the origin, expansion, and putative function of the arthropod-specific SmydA genes as histone modifiers in insects.

    Science.gov (United States)

    Jiang, Feng; Liu, Qing; Wang, Yanli; Zhang, Jie; Wang, Huimin; Song, Tianqi; Yang, Meiling; Wang, Xianhui; Kang, Le

    2017-06-01

    The SET domain is an evolutionarily conserved motif present in histone lysine methyltransferases, which are important in the regulation of chromatin and gene expression in animals. In this study, we searched for SET domain-containing genes (SET genes) in all of the 147 arthropod genomes sequenced at the time of carrying out this experiment to understand the evolutionary history by which SET domains have evolved in insects. Phylogenetic and ancestral state reconstruction analysis revealed an arthropod-specific SET gene family, named SmydA, that is ancestral to arthropod animals and specifically diversified during insect evolution. Considering that pseudogenization is the most probable fate of the new emerging gene copies, we provided experimental and evolutionary evidence to demonstrate their essential functions. Fluorescence in situ hybridization analysis and in vitro methyltransferase activity assays showed that the SmydA-2 gene was transcriptionally active and retained the original histone methylation activity. Expression knockdown by RNA interference significantly increased mortality, implying that the SmydA genes may be essential for insect survival. We further showed predominantly strong purifying selection on the SmydA gene family and a potential association between the regulation of gene expression and insect phenotypic plasticity by transcriptome analysis. Overall, these data suggest that the SmydA gene family retains essential functions that may possibly define novel regulatory pathways in insects. This work provides insights into the roles of lineage-specific domain duplication in insect evolution. © The Authors 2017. Published by Oxford University Press.

  7. The impact of ACE gene polymorphism on the incidence and phenotype of sarcoidosis in rural and urban settings.

    Science.gov (United States)

    Kieszko, Robert; Krawczyk, Paweł; Powrózek, Tomasz; Szudy-Szczyrek, Aneta; Szczyrek, Michał; Homa, Iwona; Daniluk, Jadwiga; Milanowski, Janusz

    2016-12-01

    Sarcoidosis is a multisystem granulomatous disease of unknown etiology. Current theory on the etiology of this disease involves participation of genetic factors and unknown antigens present in the patients' environment. The aim of the study was to evaluate the prevalence of different polymorphic forms of the ACE gene in healthy individuals and sarcoidosis patients, and to estimate the risk of sarcoidosis in carriers of different ACE genotypes living in rural and urban settings. The study group included 180 patients with pulmonary sarcoidosis. Assessment of the disease was based on clinical features, laboratory and imaging examinations, as well as bronchoscopy with bronchoalveolar lavage (BAL). ACE gene polymorphism was examined in DNA isolated from peripheral blood or BAL fluid (BALF) leukocytes. Incidence of sarcoidosis was not influenced by gender, age or place of residence of the patients. There were no differences in the frequency of particular genotypes in patients with sarcoidosis and in healthy individuals. The risk of disease did not depend on the ACE gene polymorphism. There were no differences in the frequencies of the different genotypes and alleles of the ACE gene in patients with sarcoidosis divided by gender, age and place of residence or by clinical manifestation of sarcoidosis. Our results do not support the previous concept which suggested a higher incidence of sarcoidosis in individuals living in rural areas and in carriers of selected ACE genotypes. It is possible that this is related to the changing environment of rural areas, increasing urbanization and pollution.

  8. The first set of EST resource for gene discovery and marker development in pigeonpea (Cajanus cajan L.)

    Directory of Open Access Journals (Sweden)

    Byregowda Munishamappa

    2010-03-01

    .8% in molecular function. Further, 19 genes were identified as differentially expressed between FW-responsive genotypes and 20 between SMD-responsive genotypes. Generated ESTs were compiled together with 908 ESTs available in the public domain at the time of analysis, and a set of 5,085 unigenes was defined and used for identification of molecular markers in pigeonpea. For instance, 3,583 simple sequence repeat (SSR) motifs were identified in 1,365 unigenes and 383 primer pairs were designed. Assessment of a set of 84 primer pairs on 40 elite pigeonpea lines showed polymorphism with 15 (28.8%) markers, with an average of four alleles per marker and an average polymorphic information content (PIC) value of 0.40. Similarly, in silico mining of 133 contigs with ≥ 5 sequences detected 102 single nucleotide polymorphisms (SNPs) in 37 contigs. As an example, a set of 10 contigs was used for confirming in silico predicted SNPs in a set of four genotypes using wet lab experiments. Occurrence of SNPs was confirmed for all 6 contigs for which scorable and sequenceable amplicons were generated. PCR amplicons were not obtained for 4 contigs. Recognition sites for restriction enzymes were identified for 102 SNPs in 37 contigs, which indicates the possibility of assaying SNPs in 37 genes using a cleaved amplified polymorphic sequences (CAPS) assay. Conclusion: The pigeonpea EST dataset generated here provides a transcriptomic resource for gene discovery and development of functional markers associated with biotic stress resistance. Sequence analyses of this dataset have shown conservation of a considerable number of pigeonpea transcripts across the legume and model plant species analysed, as well as some putative pigeonpea-specific genes. Validation of identified biotic stress responsive genes should provide candidate genes for allele mining as well as candidate markers for molecular breeding.
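
    The polymorphic information content (PIC) reported above is a per-marker statistic computed from allele frequencies. The abstract does not state which formula was used; the sketch below assumes the widely used Botstein definition, PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ², and the function name is hypothetical.

```python
from itertools import combinations


def pic(allele_frequencies):
    """Polymorphic information content of one marker (Botstein formula, assumed)."""
    freqs = list(allele_frequencies)
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability that two randomly drawn alleles are identical.
    homozygosity = sum(p * p for p in freqs)
    # Correction for matings that are uninformative despite heterozygosity.
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term
```

    Two equally frequent alleles give a PIC of 0.375, and four equally frequent alleles give about 0.70, so an average of 0.40 with four alleles per marker would imply that allele frequencies were typically far from equal.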

  9. Candidate genes for chronic obstructive pulmonary disease in two large data sets

    DEFF Research Database (Denmark)

    Bakke, P S; Zhu, G; Gulsvik, A

    2011-01-01

    Lack of reproducibility of findings has been a criticism of genetic association studies in complex diseases like chronic obstructive pulmonary disease (COPD). We selected 257 polymorphisms of 16 genes with reported or potential relationships to COPD and genotyped these variants in a case-control study which included 953 COPD cases and 956 control subjects. We explored the association of these polymorphisms to three COPD phenotypes: a COPD binary phenotype and two quantitative traits (post bronchodilator FEV1 in percent predicted and FEV1/FVC). The polymorphisms significantly associated to these phenotypes in this first study were tested in a second, family based, study that included 635 pedigrees with 1910 individuals. Significant associations to the binary COPD phenotype in both populations were seen for STAT1 (rs13010343) and NFKBIB/SIRT2 (rs2241704) (p

  10. Repression of Middle Sporulation Genes in Saccharomyces cerevisiae by the Sum1-Rfm1-Hst1 Complex Is Maintained by Set1 and H3K4 Methylation

    Science.gov (United States)

    Jaiswal, Deepika; Jezek, Meagan; Quijote, Jeremiah; Lum, Joanna; Choi, Grace; Kulkarni, Rushmie; Park, DoHwan; Green, Erin M.

    2017-01-01

    The conserved yeast histone methyltransferase Set1 targets H3 lysine 4 (H3K4) for mono, di, and trimethylation and is linked to active transcription due to the euchromatic distribution of these methyl marks and the recruitment of Set1 during transcription. However, loss of Set1 results in increased expression of multiple classes of genes, including genes adjacent to telomeres and middle sporulation genes, which are repressed under normal growth conditions because they function in meiotic progression and spore formation. The mechanisms underlying Set1-mediated gene repression are varied, and still unclear in some cases, although repression has been linked to both direct and indirect action of Set1, associated with noncoding transcription, and is often dependent on the H3K4me2 mark. We show that Set1, and particularly the H3K4me2 mark, are implicated in repression of a subset of middle sporulation genes during vegetative growth. In the absence of Set1, there is loss of the DNA-binding transcriptional regulator Sum1 and the associated histone deacetylase Hst1 from chromatin in a locus-specific manner. This is linked to increased H4K5ac at these loci and aberrant middle gene expression. These data indicate that, in addition to DNA sequence, histone modification status also contributes to proper localization of Sum1. Our results also show that the role for Set1 in middle gene expression control diverges as cells receive signals to undergo meiosis. Overall, this work dissects an unexplored role for Set1 in gene-specific repression, and provides important insights into a new mechanism associated with the control of gene expression linked to meiotic differentiation. PMID:29066473

  11. A summarization approach for Affymetrix GeneChip data using a reference training set from a large, biologically diverse database

    Directory of Open Access Journals (Sweden)

    Tripputi Mark

    2006-10-01

    Background: Many of the most popular pre-processing methods for Affymetrix expression arrays, such as RMA, gcRMA, and PLIER, simultaneously analyze data across a set of predetermined arrays to improve precision of the final measures of expression. One problem associated with these algorithms is that expression measurements for a particular sample are highly dependent on the set of samples used for normalization, and results obtained by normalization with a different set may not be comparable. A related problem is that an organization producing and/or storing large amounts of data in a sequential fashion will need to either re-run the pre-processing algorithm every time an array is added or store the arrays in batches that are pre-processed together. Furthermore, pre-processing of large numbers of arrays requires loading all the feature-level data into memory, which is a difficult task even with modern computers. We utilize a scheme that produces all the information necessary for pre-processing using a very large training set that can be used for summarization of samples outside of the training set. All subsequent pre-processing tasks can be done on an individual array basis. We demonstrate the utility of this approach by defining a new version of the Robust Multi-chip Averaging (RMA) algorithm, which we refer to as refRMA. Results: We assess performance based on multiple sets of samples processed over HG U133A Affymetrix GeneChip® arrays. We show that the refRMA workflow, when used in conjunction with a large, biologically diverse training set, results in the same general characteristics as that of RMA in its classic form when comparing overall data structure, sample-to-sample correlation, and variation. Further, we demonstrate that the refRMA workflow and reference set can be robustly applied to naïve organ types and to benchmark data where its performance indicates respectable results. Conclusion: Our results indicate that a biologically diverse
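
    The idea of freezing information from a training set so that later arrays can be processed one at a time can be illustrated with quantile normalization against a stored reference distribution. This is a generic sketch, not the refRMA implementation: the abstract gives no algorithmic detail, the function names are hypothetical, and background correction and probe-set summarization are left out.

```python
def build_reference(training_arrays):
    """Mean intensity at each rank across the training arrays.

    Computed once from the training set and then stored.
    """
    length = len(training_arrays[0])
    if any(len(array) != length for array in training_arrays):
        raise ValueError("all training arrays must have the same length")
    sorted_arrays = [sorted(array) for array in training_arrays]
    return [sum(column) / len(column) for column in zip(*sorted_arrays)]


def normalize_to_reference(array, reference):
    """Map one new array onto the stored reference distribution by rank.

    Ties are broken by order of appearance.
    """
    if len(array) != len(reference):
        raise ValueError("array and reference must have the same length")
    order = sorted(range(len(array)), key=lambda index: array[index])
    normalized = [0.0] * len(array)
    for rank, index in enumerate(order):
        normalized[index] = reference[rank]
    return normalized
```

    Once `build_reference` has been run, `normalize_to_reference` needs only the new array and the stored reference, so adding an array does not require reloading or reprocessing the training data.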

  12. A new sequence data set of SSU rRNA gene for Scleractinia and its phylogenetic and ecological applications

    KAUST Repository

    Arrigoni, Roberto; Vacherie, Benoît; Benzoni, Francesca; Stefani, Fabrizio; Karsenti, Eric; Jaillon, Olivier; Not, Fabrice; Nunes, Flavia; Payri, Claude; Wincker, Patrick; Barbe, Valérie

    2016-01-01

    Scleractinian corals (i.e. hard corals) play a fundamental role in building and maintaining coral reefs, one of the most diverse ecosystems on Earth. Nevertheless, their phylogenies remain largely unresolved and little is known about dispersal and survival of their planktonic larval phase. The small subunit ribosomal RNA (SSU rRNA) is a commonly used gene for DNA barcoding in several metazoans, and small variable regions of SSU rRNA are widely adopted as barcode marker to investigate marine plankton community structure worldwide. Here, we provide a large sequence data set of the complete SSU rRNA gene from 298 specimens, representing all known extant reef coral families and a total of 106 genera. The secondary structure was extremely conserved within the order with few exceptions due to insertions or deletions occurring in the variable regions. Remarkable differences in SSU rRNA length and base composition were detected between and within acroporids (Acropora, Montipora, Isopora and Alveopora) compared to other corals. The V4 and V9 regions seem to be promising barcode loci because variation at commonly used barcode primer binding sites was extremely low, while their levels of divergence allowed families and genera to be distinguished. A time-calibrated phylogeny of Scleractinia is provided, and mutation rate heterogeneity is demonstrated across main lineages. The use of this data set as a valuable reference for investigating aspects of ecology, biology, molecular taxonomy and evolution of scleractinian corals is discussed.

  13. Transcriptional differences between normal and glioma-derived glial progenitor cells identify a core set of dysregulated genes.

    Science.gov (United States)

    Auvergne, Romane M; Sim, Fraser J; Wang, Su; Chandler-Militello, Devin; Burch, Jaclyn; Al Fanek, Yazan; Davis, Danielle; Benraiss, Abdellatif; Walter, Kevin; Achanta, Pragathi; Johnson, Mahlon; Quinones-Hinojosa, Alfredo; Natesan, Sridaran; Ford, Heide L; Goldman, Steven A

    2013-06-27

    Glial progenitor cells (GPCs) are a potential source of malignant gliomas. We used A2B5-based sorting to extract tumorigenic GPCs from human gliomas spanning World Health Organization grades II-IV. Messenger RNA profiling identified a cohort of genes that distinguished A2B5+ glioma tumor progenitor cells (TPCs) from A2B5+ GPCs isolated from normal white matter. A core set of genes and pathways was substantially dysregulated in A2B5+ TPCs, which included the transcription factor SIX1 and its principal cofactors, EYA1 and DACH2. Small hairpin RNAi silencing of SIX1 inhibited the expansion of glioma TPCs in vitro and in vivo, suggesting a critical and unrecognized role of the SIX1-EYA1-DACH2 system in glioma genesis or progression. By comparing the expression patterns of glioma TPCs with those of normal GPCs, we have identified a discrete set of pathways by which glial tumorigenesis may be better understood and more specifically targeted. Copyright © 2013 The Authors. Published by Elsevier Inc. All rights reserved.

  14. A new sequence data set of SSU rRNA gene for Scleractinia and its phylogenetic and ecological applications

    KAUST Repository

    Arrigoni, Roberto

    2016-11-27

    Scleractinian corals (i.e. hard corals) play a fundamental role in building and maintaining coral reefs, one of the most diverse ecosystems on Earth. Nevertheless, their phylogenies remain largely unresolved and little is known about dispersal and survival of their planktonic larval phase. The small subunit ribosomal RNA (SSU rRNA) is a commonly used gene for DNA barcoding in several metazoans, and small variable regions of SSU rRNA are widely adopted as barcode marker to investigate marine plankton community structure worldwide. Here, we provide a large sequence data set of the complete SSU rRNA gene from 298 specimens, representing all known extant reef coral families and a total of 106 genera. The secondary structure was extremely conserved within the order with few exceptions due to insertions or deletions occurring in the variable regions. Remarkable differences in SSU rRNA length and base composition were detected between and within acroporids (Acropora, Montipora, Isopora and Alveopora) compared to other corals. The V4 and V9 regions seem to be promising barcode loci because variation at commonly used barcode primer binding sites was extremely low, while their levels of divergence allowed families and genera to be distinguished. A time-calibrated phylogeny of Scleractinia is provided, and mutation rate heterogeneity is demonstrated across main lineages. The use of this data set as a valuable reference for investigating aspects of ecology, biology, molecular taxonomy and evolution of scleractinian corals is discussed.

  15. Reconstruction of gene regulatory modules from RNA silencing of IFN-α modulators: experimental set-up and inference method.

    Science.gov (United States)

    Grassi, Angela; Di Camillo, Barbara; Ciccarese, Francesco; Agnusdei, Valentina; Zanovello, Paola; Amadori, Alberto; Finesso, Lorenzo; Indraccolo, Stefano; Toffolo, Gianna Maria

    2016-03-12

    Inference of gene regulation from expression data may help to unravel regulatory mechanisms involved in complex diseases or in the action of specific drugs. A challenging task for many researchers working in the field of systems biology is to design an experiment with a limited budget and produce a dataset suitable for reconstructing putative regulatory modules worthy of biological validation. Here, we focus on small-scale gene expression screens and we introduce a novel experimental set-up and a customized method of analysis to make inference on regulatory modules starting from genetic perturbation data, e.g. knockdown and overexpression data. To illustrate the utility of our strategy, it was applied to produce and analyze a dataset of quantitative real-time RT-PCR data, in which the interferon-α (IFN-α) transcriptional response in endothelial cells is investigated by RNA silencing of two candidate IFN-α modulators, STAT1 and IFIH1. A putative regulatory module was reconstructed by our method, revealing an intriguing feed-forward loop, in which STAT1 regulates IFIH1 and they both negatively regulate IFNAR1. STAT1 regulation of IFNAR1 was the object of experimental validation at the protein level. A detailed description of the experimental set-up and of the analysis procedure is reported, with the intent of guiding other scientists who want to run similar experiments to reconstruct gene regulatory modules starting from perturbations of possible regulators. Application of our approach to the study of IFN-α transcriptional response modulators in endothelial cells has led to many interesting novel findings and new biological hypotheses worthy of validation.

  16. Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification

    Science.gov (United States)

    2018-01-01

    One of the goals of cancer research is to identify a set of genes that cause or control disease progression. However, although multiple such gene sets were published, these are usually in very poor agreement with each other, and very few of the genes proved to be functional therapeutic targets. Furthermore, recent findings from a breast cancer gene-expression cohort showed that sets of genes selected randomly can be used to predict survival with a much higher probability than expected. These results imply that many of the genes identified in breast cancer gene expression analysis may not be causal of cancer progression, even though they can still be highly predictive of prognosis. We performed a similar analysis on all the cancer types available in the cancer genome atlas (TCGA), namely, estimating the predictive power of random gene sets for survival. Our work shows that most cancer types exhibit the property that random selections of genes are more predictive of survival than expected. In contrast to previous work, this property is not removed by using a proliferation signature, which implies that proliferation may not always be the confounder that drives this property. We suggest one possible solution in the form of data-driven sub-classification to reduce this property significantly. Our results suggest that the predictive power of random gene sets may be used to identify the existence of sub-classes in the data, and thus may allow better understanding of patient stratification. Furthermore, by reducing the observed bias this may allow more direct identification of biologically relevant, and potentially causal, genes. PMID:29470520

  17. Gene Sets for Utilization of Primary and Secondary Nutrition Supplies in the Distal Gut of Endangered Iberian Lynx

    Science.gov (United States)

    Alcaide, María; Messina, Enzo; Richter, Michael; Bargiela, Rafael; Peplies, Jörg; Huws, Sharon A.; Newbold, Charles J.; Golyshin, Peter N.; Simón, Miguel A.; López, Guillermo; Yakimov, Michail M.; Ferrer, Manuel

    2012-01-01

    Recent studies have indicated the existence of an extensive trans-genomic trans-mural co-metabolism between gut microbes and animal hosts that is diet-, host phylogeny- and provenance-influenced. Here, we analyzed the biodiversity at the level of small subunit rRNA gene sequence and the metabolic composition of 18 Mbp of consensus metagenome sequences and activity characteristics of bacterial intra-cellular extracts, in wild Iberian lynx (Lynx pardinus) fecal samples. Bacterial signatures (14.43% of all of the Firmicutes reads and 6.36% of total reads) related to the uncultured anaerobic commensals Anaeroplasma spp., which are typically found in ovine and bovine rumen, were first identified. The lynx gut was further characterized by an over-representation of ‘presumptive’ aquaporin aqpZ genes and genes encoding ‘active’ lysosomal-like digestive enzymes that are possibly needed to acquire glycerol, sugars and amino acids from glycoproteins, glyco(amino)lipids, glyco(amino)glycans and nucleoside diphosphate sugars. Lynx gut was highly enriched (28% of the total glycosidases) in genes encoding α-amylase and related enzymes, although it exhibited low rate of enzymatic activity indicative of starch degradation. The preponderance of β-xylosidase activity in protein extracts further suggests lynx gut microbes being most active for the metabolism of β-xylose containing plant N-glycans, although β-xylosidases sequences constituted only 1.5% of total glycosidases. These collective and unique bacterial, genetic and enzymatic activity signatures suggest that the wild lynx gut microbiota not only harbors gene sets underpinning sugar uptake from primary animal tissues (with the monotypic dietary profile of the wild lynx consisting of 80–100% wild rabbits) but also for the hydrolysis of prey-derived plant biomass. Although, the present investigation corresponds to a single sample and some of the statements should be considered qualitative, the data most likely

  18. Gene sets for utilization of primary and secondary nutrition supplies in the distal gut of endangered Iberian lynx.

    Directory of Open Access Journals (Sweden)

    María Alcaide

    Recent studies have indicated the existence of an extensive trans-genomic trans-mural co-metabolism between gut microbes and animal hosts that is diet-, host phylogeny- and provenance-influenced. Here, we analyzed the biodiversity at the level of small subunit rRNA gene sequence and the metabolic composition of 18 Mbp of consensus metagenome sequences and activity characteristics of bacterial intra-cellular extracts, in wild Iberian lynx (Lynx pardinus) fecal samples. Bacterial signatures (14.43% of all of the Firmicutes reads and 6.36% of total reads) related to the uncultured anaerobic commensals Anaeroplasma spp., which are typically found in ovine and bovine rumen, were first identified. The lynx gut was further characterized by an over-representation of 'presumptive' aquaporin aqpZ genes and genes encoding 'active' lysosomal-like digestive enzymes that are possibly needed to acquire glycerol, sugars and amino acids from glycoproteins, glyco(amino)lipids, glyco(amino)glycans and nucleoside diphosphate sugars. Lynx gut was highly enriched (28% of the total glycosidases) in genes encoding α-amylase and related enzymes, although it exhibited low rate of enzymatic activity indicative of starch degradation. The preponderance of β-xylosidase activity in protein extracts further suggests lynx gut microbes being most active for the metabolism of β-xylose containing plant N-glycans, although β-xylosidases sequences constituted only 1.5% of total glycosidases. These collective and unique bacterial, genetic and enzymatic activity signatures suggest that the wild lynx gut microbiota not only harbors gene sets underpinning sugar uptake from primary animal tissues (with the monotypic dietary profile of the wild lynx consisting of 80-100% wild rabbits) but also for the hydrolysis of prey-derived plant biomass. Although, the present investigation corresponds to a single sample and some of the statements should be considered qualitative, the data most likely

  19. Optimization to the Culture Conditions for Phellinus Production with Regression Analysis and Gene-Set Based Genetic Algorithm

    Science.gov (United States)

    Li, Zhongwei; Xin, Yuezhen; Wang, Xun; Sun, Beibei; Xia, Shengyu; Li, Hui

    2016-01-01

    Phellinus is a kind of fungus known as one of the elemental components of drugs used against cancers. To find optimized culture conditions for Phellinus production in the laboratory, many single-factor experiments were carried out, generating a large volume of experimental data. In this work, we use the data collected from these experiments for regression analysis and obtain a mathematical model for predicting Phellinus production. Subsequently, a gene-set based genetic algorithm is developed to optimize the values of the parameters involved in the culture conditions, including inoculum size, pH value, initial liquid volume, temperature, seed age, fermentation time, and rotation speed. The optimized parameter values agree with biological experimental results, which indicates that our method predicts well for culture condition optimization. PMID:27610365
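    The optimization step described in this abstract can be illustrated with a minimal sketch. It is not the authors' gene-set based algorithm, whose details the abstract does not give: a plain real-coded genetic algorithm is substituted, the parameter ranges in BOUNDS are invented for illustration, and predicted_yield is a placeholder standing in for the fitted regression model.

```python
import random

# Hypothetical search ranges for the seven culture parameters named in the
# abstract; the real ranges are not given there.
BOUNDS = {
    "inoculum_size": (1.0, 15.0),
    "ph": (4.0, 8.0),
    "liquid_volume": (50.0, 150.0),
    "temperature": (20.0, 35.0),
    "seed_age": (1.0, 10.0),
    "fermentation_time": (3.0, 12.0),
    "rotation_speed": (100.0, 200.0),
}
NAMES = list(BOUNDS)


def predicted_yield(individual):
    """Placeholder for the regression model: a concave quadratic that peaks
    (value 0) when every parameter sits at the midpoint of its range."""
    score = 0.0
    for name, value in zip(NAMES, individual):
        lo, hi = BOUNDS[name]
        mid, half = (lo + hi) / 2, (hi - lo) / 2
        score -= ((value - mid) / half) ** 2
    return score


def random_individual(rng):
    return [rng.uniform(*BOUNDS[name]) for name in NAMES]


def crossover(a, b, rng):
    # Uniform crossover: each parameter is copied from one parent at random.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(individual, rng, rate=0.2):
    # Gaussian mutation, clipped back into the allowed range.
    out = []
    for name, value in zip(NAMES, individual):
        lo, hi = BOUNDS[name]
        if rng.random() < rate:
            value += rng.gauss(0.0, 0.1 * (hi - lo))
        out.append(min(hi, max(lo, value)))
    return out


def optimize(generations=60, pop_size=40, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=predicted_yield, reverse=True)
        elite = population[: pop_size // 4]  # keep the best quarter unchanged
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    best = max(population, key=predicted_yield)
    return dict(zip(NAMES, best)), predicted_yield(best)


if __name__ == "__main__":
    params, score = optimize()
    print(params, score)
```

    Because the elite individuals are carried over unchanged, the best predicted value never decreases from one generation to the next; swapping predicted_yield for a real fitted model is the only change needed to reuse the loop.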

  20. NetGen: a novel network-based probabilistic generative model for gene set functional enrichment analysis.

    Science.gov (United States)

    Sun, Duanchen; Liu, Yinliang; Zhang, Xiang-Sun; Wu, Ling-Yun

    2017-09-21

    High-throughput experimental techniques have been dramatically improved and widely applied in the past decades. However, biological interpretation of high-throughput experimental results, such as differentially expressed gene sets derived from microarray or RNA-seq experiments, is still a challenging task. Gene Ontology (GO) is commonly used in functional enrichment studies. The GO terms identified via current functional enrichment analysis tools often contain direct parent or descendant terms in the GO hierarchical structure. Highly redundant terms make it difficult for users to analyze the underlying biological processes. In this paper, a novel network-based probabilistic generative model, NetGen, was proposed to perform the functional enrichment analysis. An additional protein-protein interaction (PPI) network was explicitly used to assist the identification of significantly enriched GO terms. NetGen outperformed the existing methods in the simulation studies. The effectiveness of NetGen was explored further on four real datasets. Notably, several GO terms which were not directly linked with the active gene list for each disease were identified. These terms were closely related to the corresponding diseases when checked against the curated literature. NetGen has been implemented in the R package CopTea, publicly available at GitHub ( http://github.com/wulingyun/CopTea/ ). Our procedure leads to a more reasonable and interpretable result of the functional enrichment analysis. As a novel term combination-based functional enrichment analysis method, NetGen is complementary to current individual term-based methods, and can help to explore the underlying pathogenesis of complex diseases.
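    For context, the individual term-based approach that this abstract contrasts with NetGen is commonly a one-sided hypergeometric test applied to each GO term separately. The sketch below shows that baseline only, not NetGen itself; the gene identifiers and the function name hypergeom_enrichment_p are made up for illustration.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_term, n_selected, n_overlap):
    """Return P(X >= n_overlap), where X counts term-annotated genes among
    n_selected genes drawn without replacement from a universe of n_universe
    genes, n_term of which are annotated to the term."""
    upper = min(n_term, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy data (hypothetical gene identifiers).
universe = {f"g{i}" for i in range(20)}
term_genes = {"g0", "g1", "g2", "g3", "g4"}   # genes annotated to one GO term
selected = {"g0", "g1", "g2", "g10"}          # e.g. differentially expressed genes

p_value = hypergeom_enrichment_p(
    len(universe), len(term_genes), len(selected), len(term_genes & selected)
)
print(p_value)
```

    Testing every term independently in this way is what tends to return a parent term together with its descendants, the redundancy the abstract describes.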

  1. Comparative genomic analysis of Brucella abortus vaccine strain 104M reveals a set of candidate genes associated with its virulence attenuation.

    Science.gov (United States)

    Yu, Dong; Hui, Yiming; Zai, Xiaodong; Xu, Junjie; Liang, Long; Wang, Bingxiang; Yue, Junjie; Li, Shanhu

    2015-01-01

    The Brucella abortus strain 104M, a spontaneously attenuated strain, has been used as a vaccine strain in humans against brucellosis for six decades in China. Despite many studies, the molecular mechanisms that cause the attenuation are still unclear. Here, we determined the whole-genome sequence of 104M and conducted a comprehensive comparative analysis against the whole-genome sequences of the virulent strain, A13334, and other reference strains. This analysis revealed a highly similar genome structure between 104M and A13334. Further comparative genomic analysis between 104M and A13334 revealed a set of genes missing in 104M. Some of these genes were identified to be directly or indirectly associated with virulence. Similarly, a set of mutations in the virulence-related genes was also identified, which may be related to virulence alteration. This study provides a set of candidate genes associated with virulence attenuation in B. abortus vaccine strain 104M.

  2. Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies

    Science.gov (United States)

    Medina, Ignacio; Montaner, David; Bonifaci, Nuria; Pujana, Miguel Angel; Carbonell, José; Tarraga, Joaquin; Al-Shahrour, Fatima; Dopazo, Joaquin

    2009-01-01

    Genome-wide association studies have become a popular strategy to find associations of genes to traits of interest. Despite the high resolution available today to carry out genotyping studies, the success of their application in real studies has been limited by the testing strategy used. As an alternative to brute force solutions involving the use of very large cohorts, we propose the use of Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), which is a simple implementation of the GSA strategy for the analysis of genome-wide association studies, provides a significant increase in testing power for this type of study. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap/ PMID:19502494

  3. A MultiSite Gateway™ vector set for the functional analysis of genes in the model Saccharomyces cerevisiae

    Directory of Open Access Journals (Sweden)

    Nagels Durand Astrid

    2012-09-01

    Background: Recombinatorial cloning using the Gateway™ technology has been the method of choice for high-throughput omics projects, resulting in the availability of entire ORFeomes in Gateway™ compatible vectors. The MultiSite Gateway™ system allows combining multiple genetic fragments such as promoter, ORF and epitope tag in one single reaction. To date, this technology has not been accessible in the yeast Saccharomyces cerevisiae, one of the most widely used experimental systems in molecular biology, due to the lack of appropriate destination vectors. Results: Here, we present a set of three-fragment MultiSite Gateway™ destination vectors that have been developed for gene expression in S. cerevisiae and that allow the assembly of any promoter, open reading frame, epitope tag arrangement in combination with any of four auxotrophic markers and three distinct replication mechanisms. As an example of its applicability, we used yeast three-hybrid to provide evidence for the assembly of a ternary complex of plant proteins involved in jasmonate signalling and consisting of the JAZ, NINJA and TOPLESS proteins. Conclusion: Our vectors make MultiSite Gateway™ cloning accessible in S. cerevisiae and implement a fast and versatile cloning method for the high-throughput functional analysis of (heterologous) proteins in one of the most widely used model organisms for molecular biology research.

  4. Association of Protein Translation and Extracellular Matrix Gene Sets with Breast Cancer Metastasis: Findings Uncovered on Analysis of Multiple Publicly Available Datasets Using Individual Patient Data Approach.

    Directory of Open Access Journals (Sweden)

    Nilotpal Chowdhury

    Microarray analysis has revolutionized the role of genomic prognostication in breast cancer. However, most studies are single series studies, and suffer from methodological problems. We sought to use a meta-analytic approach in combining multiple publicly available datasets, while correcting for batch effects, to reach a more robust oncogenomic analysis. The aim of the present study was to find gene sets associated with distant metastasis free survival (DMFS) in systemically untreated, node-negative breast cancer patients, from publicly available genomic microarray datasets. Four microarray series (having 742 patients) were selected after a systematic search and combined. Cox regression for each gene was done for the combined dataset (univariate, as well as multivariate - adjusted for expression of Cell cycle related genes) and for the 4 major molecular subtypes. The centre and microarray batch effects were adjusted by including them as random effects variables. The Cox regression coefficients for each analysis were then ranked and subjected to a Gene Set Enrichment Analysis (GSEA). Gene sets representing protein translation were independently negatively associated with metastasis in the Luminal A and Luminal B subtypes, but positively associated with metastasis in Basal tumors. Proteinaceous extracellular matrix (ECM) gene set expression was positively associated with metastasis, after adjustment for expression of cell cycle related genes on the combined dataset. Finally, the positive association of the proliferation-related genes with metastases was confirmed. To the best of our knowledge, the results depicting mixed prognostic significance of protein translation in breast cancer subtypes are being reported for the first time. We attribute this to our study combining multiple series and performing a more robust meta-analytic Cox regression modeling on the combined dataset, thus discovering 'hidden' associations.
This methodology seems to yield new and

  5. Association of Protein Translation and Extracellular Matrix Gene Sets with Breast Cancer Metastasis: Findings Uncovered on Analysis of Multiple Publicly Available Datasets Using Individual Patient Data Approach.

    Science.gov (United States)

    Chowdhury, Nilotpal; Sapru, Shantanu

    2015-01-01

    Microarray analysis has revolutionized the role of genomic prognostication in breast cancer. However, most studies are single series studies, and suffer from methodological problems. We sought to use a meta-analytic approach in combining multiple publicly available datasets, while correcting for batch effects, to reach a more robust oncogenomic analysis. The aim of the present study was to find gene sets associated with distant metastasis free survival (DMFS) in systemically untreated, node-negative breast cancer patients, from publicly available genomic microarray datasets. Four microarray series (having 742 patients) were selected after a systematic search and combined. Cox regression for each gene was done for the combined dataset (univariate, as well as multivariate - adjusted for expression of Cell cycle related genes) and for the 4 major molecular subtypes. The centre and microarray batch effects were adjusted by including them as random effects variables. The Cox regression coefficients for each analysis were then ranked and subjected to a Gene Set Enrichment Analysis (GSEA). Gene sets representing protein translation were independently negatively associated with metastasis in the Luminal A and Luminal B subtypes, but positively associated with metastasis in Basal tumors. Proteinaceous extracellular matrix (ECM) gene set expression was positively associated with metastasis, after adjustment for expression of cell cycle related genes on the combined dataset. Finally, the positive association of the proliferation-related genes with metastases was confirmed. To the best of our knowledge, the results depicting mixed prognostic significance of protein translation in breast cancer subtypes are being reported for the first time. We attribute this to our study combining multiple series and performing a more robust meta-analytic Cox regression modeling on the combined dataset, thus discovering 'hidden' associations. 
This methodology seems to yield new and interesting
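    The step in which ranked Cox regression coefficients are subjected to GSEA can be sketched as a running-sum enrichment score. This is a simplified illustration of the general GSEA statistic under stated assumptions, not the authors' pipeline: the function name enrichment_score, the weighting exponent and the toy gene names are all hypothetical, and the permutation step needed for significance is omitted.

```python
def enrichment_score(ranked, gene_set, weight=1.0):
    """Running-sum enrichment score for a ranked gene list.

    ranked   -- list of (gene, score) pairs sorted by descending score,
                e.g. per-gene Cox regression coefficients
    gene_set -- set of gene names to test
    weight   -- exponent applied to |score| when a gene-set member is hit
    """
    in_set = [gene in gene_set for gene, _ in ranked]
    n_hit = sum(in_set)
    n_miss = len(ranked) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    hit_weights = [abs(s) ** weight if hit else 0.0
                   for (_, s), hit in zip(ranked, in_set)]
    hit_total = sum(hit_weights)
    if hit_total == 0.0:
        raise ValueError("all gene-set members have a zero score")

    running = 0.0
    best = 0.0
    for w, hit in zip(hit_weights, in_set):
        # Step up at gene-set members (weighted), down at all other genes.
        running += w / hit_total if hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the maximum deviation from zero
    return best


# Toy ranking by coefficient (hypothetical genes and values).
toy_ranked = [("a", 3.0), ("b", 2.0), ("c", -1.0), ("d", -2.0)]
print(enrichment_score(toy_ranked, {"a", "b"}))
print(enrichment_score(toy_ranked, {"c", "d"}))
```

    A positive score means the gene set clusters at the top of the ranking (here, genes with the largest coefficients) and a negative score means it clusters at the bottom.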

  6. The map-1 gene family in root-knot nematodes, Meloidogyne spp.: a set of taxonomically restricted genes specific to clonal species.

    Directory of Open Access Journals (Sweden)

    Iva Tomalova

    Taxonomically restricted genes (TRGs), i.e., genes that are restricted to a limited subset of phylogenetically related organisms, may be important in adaptation. In parasitic organisms, TRG-encoded proteins are possible determinants of the specificity of host-parasite interactions. In the root-knot nematode (RKN) Meloidogyne incognita, the map-1 gene family encodes expansin-like proteins that are secreted into plant tissues during parasitism, thought to act as effectors to promote successful root infection. MAP-1 proteins exhibit a modular architecture, with variable number and arrangement of 58 and 13-aa domains in their central part. Here, we address the evolutionary origins of this gene family using a combination of bioinformatics and molecular biology approaches. Map-1 genes were solely identified in one single member of the phylum Nematoda, i.e., the genus Meloidogyne, and not detected in any other nematode, thus indicating that the map-1 gene family is indeed a TRG family. A phylogenetic analysis of the distribution of map-1 genes in RKNs further showed that these genes are specifically present in species that reproduce by mitotic parthenogenesis, with the exception of M. floridensis, and could not be detected in RKNs reproducing by either meiotic parthenogenesis or amphimixis. These results highlight the divergence between mitotic and meiotic RKN species as a critical transition in the evolutionary history of these parasites. Analysis of the sequence conservation and organization of repeated domains in map-1 genes suggests that gene duplication(s) together with domain loss/duplication have contributed to the evolution of the map-1 family, and that some strong selection mechanism may be acting upon these genes to maintain their functional role(s) in the specificity of the plant-RKN interactions.

  7. A random set scoring model for prioritization of disease candidate genes using protein complexes and data-mining of GeneRIF, OMIM and PubMed records

    DEFF Research Database (Denmark)

    Jiang, Li; Edwards, Stefan M.; Thomsen, Bo

    2014-01-01

    from PubMed abstracts, OMIM, and GeneRIF records. We also investigated the validity of several vocabulary filters and different likelihood thresholds for predicted protein-protein interactions in terms of their effect on the network-based gene-prioritization approach, which relies on text...

  8. Interaction between Social/Psychosocial Factors and Genetic Variants on Body Mass Index: A Gene-Environment Interaction Analysis in a Longitudinal Setting.

    Science.gov (United States)

    Zhao, Wei; Ware, Erin B; He, Zihuai; Kardia, Sharon L R; Faul, Jessica D; Smith, Jennifer A

    2017-09-29

    Obesity, which develops over time, is one of the leading causes of chronic diseases such as cardiovascular disease. However, hundreds of BMI (body mass index)-associated genetic loci identified through large-scale genome-wide association studies (GWAS) only explain about 2.7% of BMI variation. Most common human traits are believed to be influenced by both genetic and environmental factors. Past studies suggest a variety of environmental features that are associated with obesity, including socioeconomic status and psychosocial factors. This study combines both gene/regions and environmental factors to explore whether social/psychosocial factors (childhood and adult socioeconomic status, social support, anger, chronic burden, stressful life events, and depressive symptoms) modify the effect of sets of genetic variants on BMI in European American and African American participants in the Health and Retirement Study (HRS). In order to incorporate longitudinal phenotype data collected in the HRS and investigate entire sets of single nucleotide polymorphisms (SNPs) within gene/region simultaneously, we applied a novel set-based test for gene-environment interaction in longitudinal studies (LGEWIS). Childhood socioeconomic status (parental education) was found to modify the genetic effect in the gene/region around SNP rs9540493 on BMI in European Americans in the HRS. The most significant SNP (rs9540488) by childhood socioeconomic status interaction within the rs9540493 gene/region was suggestively replicated in the Multi-Ethnic Study of Atherosclerosis (MESA) (p = 0.07).

  9. Interaction between Social/Psychosocial Factors and Genetic Variants on Body Mass Index: A Gene-Environment Interaction Analysis in a Longitudinal Setting

    Directory of Open Access Journals (Sweden)

    Wei Zhao

    2017-09-01

    Obesity, which develops over time, is one of the leading causes of chronic diseases such as cardiovascular disease. However, hundreds of BMI (body mass index)-associated genetic loci identified through large-scale genome-wide association studies (GWAS) only explain about 2.7% of BMI variation. Most common human traits are believed to be influenced by both genetic and environmental factors. Past studies suggest a variety of environmental features that are associated with obesity, including socioeconomic status and psychosocial factors. This study combines both gene/regions and environmental factors to explore whether social/psychosocial factors (childhood and adult socioeconomic status, social support, anger, chronic burden, stressful life events, and depressive symptoms) modify the effect of sets of genetic variants on BMI in European American and African American participants in the Health and Retirement Study (HRS). In order to incorporate longitudinal phenotype data collected in the HRS and investigate entire sets of single nucleotide polymorphisms (SNPs) within gene/region simultaneously, we applied a novel set-based test for gene-environment interaction in longitudinal studies (LGEWIS). Childhood socioeconomic status (parental education) was found to modify the genetic effect in the gene/region around SNP rs9540493 on BMI in European Americans in the HRS. The most significant SNP (rs9540488) by childhood socioeconomic status interaction within the rs9540493 gene/region was suggestively replicated in the Multi-Ethnic Study of Atherosclerosis (MESA) (p = 0.07).

  10. Differential gene expression in granulosa cells from polycystic ovary syndrome patients with and without insulin resistance: identification of susceptibility gene sets through network analysis.

    Science.gov (United States)

    Kaur, Surleen; Archer, Kellie J; Devi, M Gouri; Kriplani, Alka; Strauss, Jerome F; Singh, Rita

    2012-10-01

    Polycystic ovary syndrome (PCOS) is a heterogeneous, genetically complex, endocrine disorder of uncertain etiology in women. Our aim was to compare the gene expression profiles in stimulated granulosa cells of PCOS women with and without insulin resistance vs. matched controls. This study included 12 normal ovulatory women (controls), 12 women with PCOS without evidence for insulin resistance (PCOS non-IR), and 16 women with insulin resistance (PCOS-IR) undergoing in vitro fertilization. Granulosa cell gene expression profiling was accomplished using Affymetrix Human Genome-U133 arrays. Differentially expressed genes were classified according to gene ontology using ingenuity pathway analysis tools. Microarray results for selected genes were confirmed by real-time quantitative PCR. A total of 211 genes were differentially expressed in PCOS non-IR and PCOS-IR granulosa cells (fold change≥1.5; P≤0.001) vs. matched controls. Diabetes mellitus and inflammation genes were significantly increased in PCOS-IR patients. Real-time quantitative PCR confirmed higher expression of NCF2 (2.13-fold), TCF7L2 (1.92-fold), and SERPINA1 (5.35-fold). Increased expression of inflammation genes ITGAX (3.68-fold) and TAB2 (1.86-fold) was confirmed in PCOS non-IR. Different cardiometabolic disease genes were differentially expressed in the two groups. Decreased expression of CAV1 (-3.58-fold) in PCOS non-IR and SPARC (-1.88-fold) in PCOS-IR was confirmed. Differential expression of genes involved in TGF-β signaling (IGF2R, increased; and HAS2, decreased), and oxidative stress (TXNIP, increased) was confirmed in both groups. Microarray analysis demonstrated differential expression of genes linked to diabetes mellitus, inflammation, cardiovascular diseases, and infertility in the granulosa cells of PCOS women with and without insulin resistance. Because these dysregulated genes are also involved in oxidative stress, lipid metabolism, and insulin signaling, we hypothesize that these

  11. Interaction between dopamine D2 receptor genotype and parental rule-setting in adolescent alcohol use: evidence for a gene-parenting interaction.

    NARCIS (Netherlands)

    Zwaluw, C.S. van der; Engels, R.C.E.M.; Vermulst, A.A.; Franke, B.; Buitelaar, J.K.; Verkes, R.J.; Scholte, R.H.

    2010-01-01

    Association studies investigating the link between the dopamine D2 receptor gene (DRD2) and alcohol (mis)use have shown inconsistent results. This may be due to lack of attention for environmental factors. High levels of parental rule-setting are associated with lower levels of adolescent alcohol

  12. Bridging cancer biology with the clinic: relative expression of a GRHL2-mediated gene-set pair predicts breast cancer metastasis.

    Directory of Open Access Journals (Sweden)

    Xinan Yang

    Identification and characterization of crucial gene target(s) that will allow focused therapeutics development remains a challenge. We have interrogated the putative therapeutic targets associated with the transcription factor Grainy head-like 2 (GRHL2), a critical epithelial regulatory factor. We demonstrate the possibility to define the molecular functions of critical genes in terms of their personalized expression profiles, allowing appropriate functional conclusions to be derived. A novel methodology, relative expression analysis with gene-set pairs (RXA-GSP), is designed to explore the potential clinical utility of cancer-biology discovery. Observing that Grhl2-overexpression leads to increased metastatic potential in vitro, we established a model assuming Grhl2-induced or -inhibited genes confer poor or favorable prognosis respectively for cancer metastasis. Training on public gene expression profiles of 995 breast cancer patients, this method prioritized one gene-set pair (GRHL2, CDH2, FN1, CITED2, MKI67 versus CTNNB1 and CTNNA3) from all 2717 possible gene-set pairs (GSPs). The identified GSP significantly dichotomized 295 independent patients for metastasis-free survival (log-rank tested p = 0.002; severe empirical p = 0.035). It also showed evidence of clinical prognostication in another independent 388 patients collected from three studies (log-rank tested p = 3.3e-6). This GSP is independent of most traditional prognostic indicators, and is only significantly associated with the histological grade of breast cancer (p = 0.0017), a GRHL2-associated clinical character (p = 6.8e-6, Spearman correlation), suggesting that this GSP is reflective of GRHL2-mediated events. Furthermore, a literature review indicates the therapeutic potential of the identified genes. This research demonstrates a novel strategy to integrate both biological experiments and clinical gene expression profiles for extracting and elucidating the genomic

  13. A random set scoring model for prioritization of disease candidate genes using protein complexes and data-mining of GeneRIF, OMIM and PubMed records

    DEFF Research Database (Denmark)

    Jiang, Li; Edwards, Stefan M.; Thomsen, Bo

    2014-01-01

    from PubMed abstracts, OMIM, and GeneRIF records. We also investigated the validity of several vocabulary filters and different likelihood thresholds for predicted protein-protein interactions in terms of their effect on the network-based gene-prioritization approach, which relies on text-mining......Background: Prioritizing genetic variants is a challenge because disease susceptibility loci are often located in genes of unknown function or the relationship with the corresponding phenotype is unclear. A global data-mining exercise on the biomedical literature can establish the phenotypic...

  14. Selection and validation of a set of reliable reference genes for quantitative RT-PCR studies in the brain of the Cephalopod Mollusc Octopus vulgaris

    Directory of Open Access Journals (Sweden)

    Biffali Elio

    2009-07-01

    Background: Quantitative real-time polymerase chain reaction (RT-qPCR) is valuable for studying the molecular events underlying physiological and behavioral phenomena. Normalization of real-time PCR data is critical for a reliable mRNA quantification. Here we identify reference genes to be utilized in RT-qPCR experiments to normalize and monitor the expression of target genes in the brain of the cephalopod mollusc Octopus vulgaris, an invertebrate. Such an approach is novel for this taxon and of advantage in future experiments given the complexity of the behavioral repertoire of this species when compared with its relatively simple neural organization. Results: We chose 16S and 18S rRNA, actB, EEF1A, tubA and ubi as candidate reference genes (housekeeping genes, HKG). The expression of 16S and 18S was highly variable and did not meet the requirements of candidate HKG. The expression of the other genes was almost stable and uniform among samples. We analyzed the expression of HKG in two different sets of animals using tissues taken from the central nervous system (brain parts) and mantle (here considered as control tissue) by BestKeeper, geNorm and NormFinder. We found that HKG expressions differed considerably with respect to brain area and octopus samples in an HKG-specific manner. However, when the mantle is treated as control tissue and the entire central nervous system is considered, NormFinder revealed tubA and ubi as the most suitable HKG pair. These two genes were utilized to evaluate the relative expression of the genes FoxP, creb, dat and TH in O. vulgaris. Conclusion: We analyzed the expression profiles of some genes here identified for O. vulgaris by applying RT-qPCR analysis for the first time in cephalopods. We validated candidate reference genes and found the expression of ubi and tubA to be the most appropriate to evaluate the expression of target genes in the brain of different octopuses. Our results also underline the
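
    The normalization against a pair of reference genes described above can be sketched as follows. This is a minimal illustration of the common 2^-ddCt calculation, not the authors' pipeline: the function name, the Ct values, and the assumption of 100% amplification efficiency are all hypothetical. Averaging the Ct values of the two reference genes is equivalent to normalizing against the geometric mean of their quantities.

```python
from statistics import mean


def relative_expression(target_ct, reference_cts, calib_target_ct, calib_reference_cts):
    """Fold change of a target gene by the 2^-ddCt method.

    Assumes the amount of product doubles every cycle (100% efficiency),
    which is an assumption of this sketch and not a claim from the abstract.
    """
    # Delta Ct: target minus the mean Ct of the reference genes.
    d_ct_sample = target_ct - mean(reference_cts)
    d_ct_calib = calib_target_ct - mean(calib_reference_cts)
    # Delta-delta Ct: sample relative to the calibrator (control) tissue.
    return 2.0 ** -(d_ct_sample - d_ct_calib)


# Hypothetical Ct values: a target gene in a brain sample versus mantle
# (control tissue), with two reference genes standing in for tubA and ubi.
fold = relative_expression(24.0, [18.0, 20.0], 26.0, [18.5, 19.5])
```

    With these made-up values the target is four-fold more abundant in the sample than in the calibrator tissue.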

  15. Bi-directional gene set enrichment and canonical correlation analysis identify key diet-sensitive pathways and biomarkers of metabolic syndrome

    Directory of Open Access Journals (Sweden)

    Gaora Peadar Ó

    2010-10-01

    Background: Currently, a number of bioinformatics methods are available to generate appropriate lists of genes from a microarray experiment. While these lists represent an accurate primary analysis of the data, fewer options exist to contextualise those lists. The development and validation of such methods is crucial to the wider application of microarray technology in the clinical setting. Two key challenges in clinical bioinformatics involve appropriate statistical modelling of dynamic transcriptomic changes, and extraction of clinically relevant meaning from very large datasets. Results: Here, we apply an approach to gene set enrichment analysis that allows for detection of bi-directional enrichment within a gene set. Furthermore, we apply canonical correlation analysis and Fisher's exact test, using plasma marker data with known clinical relevance to aid identification of the most important gene and pathway changes in our transcriptomic dataset. After a 28-day dietary intervention with high-CLA beef, a range of plasma markers indicated a marked improvement in the metabolic health of genetically obese mice. Tissue transcriptomic profiles indicated that the effects were most dramatic in liver (1270 genes significantly changed; p ... Conclusion: Bi-directional gene set enrichment analysis more accurately reflects dynamic regulatory behaviour in biochemical pathways, and as such highlighted biologically relevant changes that were not detected using a traditional approach. In such cases where transcriptomic response to treatment is exceptionally large, canonical correlation analysis in conjunction with Fisher's exact test highlights the subset of pathways showing strongest correlation with the clinical markers of interest. In this case, we have identified selenoamino acid metabolism and steroid biosynthesis as key pathways mediating the observed relationship between metabolic health and high-CLA beef. These results indicate that this type of
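
    The use of Fisher's exact test to ask whether a pathway is over-represented among changed genes can be sketched as below. This is a generic one-sided (hypergeometric) enrichment test written with the standard library only; the function name and the gene counts are hypothetical, and the sketch does not reproduce the bi-directional or canonical correlation steps of the study.

```python
from math import comb


def enrichment_p_value(n_universe, n_in_set, n_changed, n_overlap):
    """One-sided Fisher's exact p-value: P(overlap >= n_overlap).

    n_universe: genes measured; n_in_set: genes in the pathway;
    n_changed: significantly changed genes; n_overlap: changed genes in the pathway.
    """
    total = comb(n_universe, n_changed)
    upper = min(n_in_set, n_changed)
    # Sum the hypergeometric probabilities of every overlap at least as large.
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_changed - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes measured, 5 in the pathway, 5 changed, 3 shared.
p = enrichment_p_value(20, 5, 5, 3)
```

    A small p-value would suggest that the pathway contains more changed genes than expected by chance; in practice the values would still need correction for the number of pathways tested.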

  16. A tandem sequence motif acts as a distance-dependent enhancer in a set of genes involved in translation by binding the proteins NonO and SFPQ

    Directory of Open Access Journals (Sweden)

    Roepcke Stefan

    2011-12-01

    Background: Bioinformatic analyses of expression control sequences in promoters of co-expressed or functionally related genes enable the discovery of common regulatory sequence motifs that might be involved in co-ordinated gene expression. By studying promoter sequences of the human ribosomal protein genes we recently identified a novel highly specific Localized Tandem Sequence Motif (LTSM). In this work we sought to identify additional genes and LTSM-binding proteins to elucidate potential regulatory mechanisms. Results: Genome-wide analyses identified a considerable number of additional LTSM-positive genes, the products of which are involved in translation, among them translation initiation and elongation factors, and 5S rRNA. Electromobility shift assays then showed specific signals demonstrating the binding of protein complexes to LTSM in ribosomal protein gene promoters. Pull-down assays with LTSM-containing oligonucleotides and subsequent mass spectrometric analysis identified the related multifunctional nucleotide binding proteins NonO and SFPQ in the binding complex. Functional characterization then revealed that LTSM enhances the transcriptional activity of the promoters depending on the distance from the transcription start site. Conclusions: Our data demonstrate the power of bioinformatic analyses for the identification of biologically relevant sequence motifs. LTSM and the LTSM-binding proteins NonO and SFPQ found here were discovered through a synergistic combination of bioinformatic and biochemical methods and are regulators of the expression of a set of genes of the translational apparatus in a distance-dependent manner.

  17. Taxonomically Different Co-Microsymbionts of a Relict Legume, Oxytropis popoviana, Have Complementary Sets of Symbiotic Genes and Together Increase the Efficiency of Plant Nodulation.

    Science.gov (United States)

    Safronova, Vera I; Belimov, Andrey A; Sazanova, Anna L; Chirak, Elizaveta R; Verkhozina, Alla V; Kuznetsova, Irina G; Andronov, Evgeny E; Puhalsky, Jan V; Tikhonovich, Igor A

    2018-06-20

    Ten rhizobial strains were isolated from root nodules of a relict legume Oxytropis popoviana Peschkova. For identification of the isolates, sequencing of rrs, the internal transcribed spacer region, and housekeeping genes recA, glnII, and rpoB was used. Nine fast-growing isolates were Mesorhizobium-related; eight strains were identified as M. japonicum and one isolate belonged to M. kowhaii. The only slow-growing isolate was identified as a Bradyrhizobium sp. Two strains, M. japonicum Opo-242 and Bradyrhizobium sp. strain Opo-243, were isolated from the same nodule. Symbiotic genes of these isolates were searched throughout the whole-genome sequences. The common nodABC genes and other symbiotic genes required for plant nodulation and nitrogen fixation were present in the isolate Opo-242. Strain Opo-243 did not contain the principal nod, nif, and fix genes; however, five genes (nodP, nodQ, nifL, nolK, and noeL) affecting the specificity of plant-rhizobia interactions but absent in isolate Opo-242 were detected. Strain Opo-243 could not induce nodules but significantly accelerated the root nodule formation after coinoculation with isolate Opo-242. Thus, we demonstrated that taxonomically different strains of the archaic symbiotic system can be co-microsymbionts infecting the same nodule and promoting the nodulation process due to complementary sets of symbiotic genes.

  18. Identification and Construction of Combinatory Cancer Hallmark-Based Gene Signature Sets to Predict Recurrence and Chemotherapy Benefit in Stage II Colorectal Cancer.

    Science.gov (United States)

    Gao, Shanwu; Tibiche, Chabane; Zou, Jinfeng; Zaman, Naif; Trifiro, Mark; O'Connor-McCourt, Maureen; Wang, Edwin

    2016-01-01

    Decisions regarding adjuvant therapy in patients with stage II colorectal cancer (CRC) have been among the most challenging and controversial in oncology over the past 20 years. To develop robust combinatory cancer hallmark-based gene signature sets (CSS sets) that more accurately predict prognosis and identify a subset of patients with stage II CRC who could gain survival benefits from adjuvant chemotherapy. Thirteen retrospective studies of patients with stage II CRC who had clinical follow-up and adjuvant chemotherapy were analyzed. Respective totals of 162 and 843 patients from 2 and 11 independent cohorts were used as the discovery and validation cohorts, respectively. A total of 1005 patients with stage II CRC were included in the 13 cohorts. Among them, 84 of 416 patients in 3 independent cohorts received fluorouracil-based adjuvant chemotherapy. Identification of CSS sets to predict relapse-free survival and identify a subset of patients with stage II CRC who could gain substantial survival benefits from fluorouracil-based adjuvant chemotherapy. Eight cancer hallmark-based gene signatures (30 genes each) were identified and used to construct CSS sets for determining prognosis. The CSS sets were validated in 11 independent cohorts of 767 patients with stage II CRC who did not receive adjuvant chemotherapy. The CSS sets accurately stratified patients into low-, intermediate-, and high-risk groups. Five-year relapse-free survival rates were 94%, 78%, and 45%, respectively, representing 60%, 28%, and 12% of patients with stage II disease. The 416 patients with CSS set-defined high-risk stage II CRC who received fluorouracil-based adjuvant chemotherapy showed a substantial gain in survival benefits from the treatment (ie, recurrence reduced by 30%-40% in 5 years). The CSS sets substantially outperformed other prognostic predictors of stage II CRC. They are more accurate and robust for prognostic predictions and facilitate the identification of patients with stage

  19. AnovArray: a set of SAS macros for the analysis of variance of gene expression data

    Directory of Open Access Journals (Sweden)

    Renard Jean-Paul

    2005-06-01

    Background: Analysis of variance is a powerful approach to identify differentially expressed genes in a complex experimental design for microarray and macroarray data. The advantage of the ANOVA model is the possibility to evaluate multiple sources of variation in an experiment. Results: AnovArray is a package implementing ANOVA for gene expression data using SAS® statistical software. The originality of the package is (1) to quantify the different sources of variation on all genes together, (2) to provide a quality control of the model, and (3) to propose two models for a gene's variance estimation and to perform a correction for multiple comparisons. Conclusion: AnovArray is freely available at http://www-mig.jouy.inra.fr/stat/AnovArray and requires only SAS® statistical software.
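
    The core idea of testing one gene for differential expression across conditions with ANOVA can be sketched as below. This is a plain one-way F statistic in Python, not the AnovArray SAS macros; the function name and expression values are hypothetical, and the sketch omits the multi-factor models, variance modelling, and multiple-comparison correction that the package provides.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for one gene measured under several conditions.

    groups: list of lists, one list of expression values per condition.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of conditions
    n = len(all_values)   # total number of measurements
    # Variation explained by the condition (between groups).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Residual variation (within groups).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical log-expression values for one gene in three conditions.
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```

    A large F value indicates that the condition explains much more variation than the residual noise; applied gene by gene, the resulting p-values would then be corrected for multiple comparisons.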

  20. Two new loci and gene sets related to sex determination and cancer progression are associated with susceptibility to testicular germ cell tumor.

    Science.gov (United States)

    Kristiansen, Wenche; Karlsson, Robert; Rounge, Trine B; Whitington, Thomas; Andreassen, Bettina K; Magnusson, Patrik K; Fosså, Sophie D; Adami, Hans-Olov; Turnbull, Clare; Haugen, Trine B; Grotmol, Tom; Wiklund, Fredrik

    2015-07-15

    Genome-wide association (GWA) studies have reported 19 distinct susceptibility loci for testicular germ cell tumor (TGCT). A GWA study for TGCT was performed by genotyping 610 240 single-nucleotide polymorphisms (SNPs) in 1326 cases and 6687 controls from Sweden and Norway. No novel genome-wide significant associations were observed in this discovery stage. We put forward 27 SNPs from 15 novel regions and 12 SNPs previously reported, for replication in 710 case-parent triads and 289 cases and 290 controls. Predefined biological pathways and processes, in addition to a custom-built sex-determination gene set, were subject to enrichment analyses using Meta-Analysis Gene Set Enrichment of Variant Associations (M) and Improved Gene Set Enrichment Analysis for Genome-wide Association Study (I). In the combined meta-analysis, we observed genome-wide significant association for rs7501939 on chromosome 17q12 (OR = 0.78, 95% CI = 0.72-0.84, P = 1.1 × 10^-9) and rs2195987 on chromosome 19p12 (OR = 0.76, 95% CI: 0.69-0.84, P = 3.2 × 10^-8). The marker rs7501939 on chromosome 17q12 is located in an intron of the HNF1B gene, encoding a member of the homeodomain-containing superfamily of transcription factors. The sex-determination gene set (false discovery rate, FDRM ... cancer and apoptosis, was associated with TGCT (FDR ... utero are implicated in the development of TGCT. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  1. Heat Stress and Lipopolysaccharide Stimulation of Chicken Macrophage-Like Cell Line Activates Expression of Distinct Sets of Genes.

    Directory of Open Access Journals (Sweden)

    Anna Slawinska

    Acute heat stress requires immediate adjustment of the stressed individual to sudden changes of ambient temperatures. Chickens are particularly sensitive to heat stress due to development of insufficient physiological mechanisms to mitigate its effects. One of the symptoms of heat stress is endotoxemia that results from release of the lipopolysaccharide (LPS) from the guts. Heat-related cytotoxicity is mitigated by the innate immune system, which is comprised mostly of phagocytic cells such as monocytes and macrophages. The objective of this study was to analyze the molecular responses of the chicken macrophage-like HD11 cell line to combined heat stress and lipopolysaccharide treatment in vitro. The cells were heat-stressed and then allowed a temperature-recovery period, during which the gene expression was investigated. LPS was added to the cells to mimic the heat-stress-related endotoxemia. Semi high-throughput gene expression analysis was used to study a gene panel comprised of heat shock proteins, stress-related genes, signaling molecules and immune response genes. The HD11 cell line responded to heat stress with increased mRNA abundance of the HSP25, HSPA2 and HSPH1 chaperones as well as DNAJA4 and DNAJB6 co-chaperones. The anti-apoptotic gene BAG3 was also highly up-regulated, providing evidence that the cells expressed pro-survival processes. The immune response of the HD11 cell line to LPS in the heat stress environment (up-regulation of CCL4, CCL5, IL1B, IL8 and iNOS) was higher than in thermoneutral conditions. However, the peak in the transcriptional regulation of the immune genes was after two hours of temperature-recovery. Therefore, we propose the potential influence of the extracellular heat shock proteins not only in mitigating effects of abiotic stress but also in triggering the higher level of the immune responses. Finally, use of correlation networks for the data analysis aided in discovering subtle differences in the gene

  2. Literature mining, gene-set enrichment and pathway analysis for target identification in Behçet's disease.

    Science.gov (United States)

    Wilson, Paul; Larminie, Christopher; Smith, Rona

    2016-01-01

    To use literature mining to catalogue Behçet's associated genes, and advanced computational methods to improve the understanding of the pathways and signalling mechanisms that lead to the typical clinical characteristics of Behçet's patients. To extend this technique to identify potential treatment targets for further experimental validation. Text mining methods combined with gene enrichment tools, pathway analysis and causal analysis algorithms. This approach identified 247 human genes associated with Behçet's disease and the resulting disease map, comprising 644 nodes and 19220 edges, captured important details of the relationships between these genes and their associated pathways, as described in diverse data repositories. Pathway analysis has identified how Behçet's associated genes are likely to participate in innate and adaptive immune responses. Causal analysis algorithms have identified a number of potential therapeutic strategies for further investigation. Computational methods have captured pertinent features of the prominent disease characteristics presented in Behçet's disease and have highlighted NOD2, ICOS and IL18 signalling as potential therapeutic strategies.

  3. Sexual and asexual oogenesis require the expression of unique and shared sets of genes in the insect Acyrthosiphon pisum

    Directory of Open Access Journals (Sweden)

    Gallot Aurore

    2012-02-01

    Background: Although sexual reproduction is dominant within eukaryotes, asexual reproduction is widespread and has evolved independently as a derived trait in almost all major taxa. How asexuality evolved in sexual organisms is unclear. Aphids, such as Acyrthosiphon pisum, alternate between asexual and sexual reproductive means, as the production of parthenogenetic viviparous females or sexual oviparous females and males varies in response to seasonal photoperiodism. Consequently, sexual and asexual development in aphids can be analyzed simultaneously in genetically identical individuals. Results: We compared the transcriptomes of aphid embryos in the stages of development during which the trajectory of oogenesis is determined for producing sexual or asexual gametes. This study design aimed at identifying genes involved in the onset of the divergent mechanisms that result in the sexual or asexual phenotype. We detected 33 genes that were differentially transcribed in sexual and asexual embryos. Functional annotation by gene ontology (GO) showed a biological signature of oogenesis, cell cycle regulation, epigenetic regulation and RNA maturation. In situ hybridizations demonstrated that 16 of the differentially-transcribed genes were specifically expressed in germ cells and/or oocytes of asexual and/or sexual ovaries, and therefore may contribute to aphid oogenesis. We categorized these 16 genes by their transcription patterns in the two types of ovaries; they were: (i) expressed during sexual and asexual oogenesis; (ii) expressed during sexual and asexual oogenesis but with different localizations; or (iii) expressed only during sexual or asexual oogenesis. Conclusions: Our results show that asexual and sexual oogenesis in aphids share common genetic programs but diverge by adapting specificities in their respective gene expression profiles in germ cells and oocytes.

  4. gsSKAT: Rapid gene set analysis and multiple testing correction for rare-variant association studies using weighted linear kernels.

    Science.gov (United States)

    Larson, Nicholas B; McDonnell, Shannon; Cannon Albright, Lisa; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan; Schleutker, Johanna; Carpten, John D; Powell, Isaac; Bailey-Wilson, Joan E; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham G; MacInnis, Robert J; Maier, Christiane; Whittemore, Alice S; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J; Olson, Timothy M; Klein, Christopher J; Thibodeau, Stephen N; Schaid, Daniel J

    2017-05-01

    Next-generation sequencing technologies have afforded unprecedented characterization of low-frequency and rare genetic variation. Due to low power for single-variant testing, aggregative methods are commonly used to combine observed rare variation within a single gene. Causal variation may also aggregate across multiple genes within relevant biomolecular pathways. Kernel-machine regression and adaptive testing methods for aggregative rare-variant association testing have been demonstrated to be powerful approaches for pathway-level analysis, although these methods tend to be computationally intensive at high-variant dimensionality and require access to complete data. An additional analytical issue in scans of large pathway definition sets is multiple testing correction. Gene set definitions may exhibit substantial genic overlap, and the impact of the resultant correlation in test statistics on Type I error rate control for large agnostic gene set scans has not been fully explored. Herein, we first outline a statistical strategy for aggregative rare-variant analysis using component gene-level linear kernel score test summary statistics as well as derive simple estimators of the effective number of tests for family-wise error rate control. We then conduct extensive simulation studies to characterize the behavior of our approach relative to direct application of kernel and adaptive methods under a variety of conditions. We also apply our method to two case-control studies, respectively, evaluating rare variation in hereditary prostate cancer and schizophrenia. Finally, we provide open-source R code for public use to facilitate easy application of our methods to existing rare-variant analysis results. © 2017 WILEY PERIODICALS, INC.

  5. Cogena, a novel tool for co-expressed gene-set enrichment analysis, applied to drug repositioning and drug mode of action discovery.

    Science.gov (United States)

    Jia, Zhilong; Liu, Ying; Guan, Naiyang; Bo, Xiaochen; Luo, Zhigang; Barnes, Michael R

    2016-05-27

    Drug repositioning, finding new indications for existing drugs, has gained much recent attention as a potentially efficient and economical strategy for accelerating new therapies into the clinic. Although improvement in the sensitivity of computational drug repositioning methods has identified numerous credible repositioning opportunities, few have been progressed. Arguably the "black box" nature of drug action in a new indication is one of the main blocks to progression, highlighting the need for methods that inform on the broader target mechanism in the disease context. We demonstrate that the analysis of co-expressed genes may be a critical first step towards illumination of both disease pathology and mode of drug action. We achieve this using a novel framework, co-expressed gene-set enrichment analysis (cogena) for co-expression analysis of gene expression signatures and gene set enrichment analysis of co-expressed genes. The cogena framework enables simultaneous, pathway driven, disease and drug repositioning analysis. Cogena can be used to illuminate coordinated changes within disease transcriptomes and identify drugs acting mechanistically within this framework. We illustrate this using a psoriatic skin transcriptome, as an exemplar, and recover two widely used Psoriasis drugs (Methotrexate and Ciclosporin) with distinct modes of action. Cogena out-performs the results of Connectivity Map and NFFinder webservers in similar disease transcriptome analyses. Furthermore, we investigated the literature support for the other top-ranked compounds to treat psoriasis and showed how the outputs of cogena analysis can contribute new insight to support the progression of drugs into the clinic. We have made cogena freely available within Bioconductor or https://github.com/zhilongjia/cogena . In conclusion, by targeting co-expressed genes within disease transcriptomes, cogena offers novel biological insight, which can be effectively harnessed for drug discovery and

  6. Redundancy control in pathway databases (ReCiPa): an application for improving gene-set enrichment analysis in Omics studies and "Big data" biology.

    Science.gov (United States)

    Vivar, Juan C; Pemu, Priscilla; McPherson, Ruth; Ghosh, Sujoy

    2013-08-01

    Unparalleled technological advances have fueled an explosive growth in the scope and scale of biological data and have propelled life sciences into the realm of "Big Data" that cannot be managed or analyzed by conventional approaches. Big Data in the life sciences are driven primarily via a diverse collection of 'omics'-based technologies, including genomics, proteomics, metabolomics, transcriptomics, metagenomics, and lipidomics. Gene-set enrichment analysis is a powerful approach for interrogating large 'omics' datasets, leading to the identification of biological mechanisms associated with observed outcomes. While several factors influence the results from such analysis, the impact from the contents of pathway databases is often under-appreciated. Pathway databases often contain variously named pathways that overlap with one another to varying degrees. Ignoring such redundancies during pathway analysis can lead to the designation of several pathways as being significant due to high content-similarity, rather than truly independent biological mechanisms. Statistically, such dependencies also result in correlated p values and overdispersion, leading to biased results. We investigated the level of redundancies in multiple pathway databases and observed large discrepancies in the nature and extent of pathway overlap. This prompted us to develop the application, ReCiPa (Redundancy Control in Pathway Databases), to control redundancies in pathway databases based on user-defined thresholds. Analysis of genomic and genetic datasets, using ReCiPa-generated overlap-controlled versions of KEGG and Reactome pathways, led to a reduction in redundancy among the top-scoring gene-sets and allowed for the inclusion of additional gene-sets representing possibly novel biological mechanisms. Using obesity as an example, bioinformatic analysis further demonstrated that gene-sets identified from overlap-controlled pathway databases show stronger evidence of prior association
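
    The idea of controlling pathway redundancy with a user-defined overlap threshold can be sketched as below. This is only an illustration of the general principle: the overlap metric (shared genes divided by the size of the smaller set), the greedy merging order, the function name, and the example pathways are all assumptions of this sketch and are not taken from the published ReCiPa procedure.

```python
def merge_redundant_sets(gene_sets, threshold):
    """Greedily merge gene sets whose overlap fraction reaches the threshold.

    gene_sets: dict mapping a pathway name to a non-empty collection of genes.
    Overlap fraction here is |A & B| / min(|A|, |B|) (an assumed metric).
    """
    sets = {name: set(genes) for name, genes in gene_sets.items()}
    merged = True
    while merged:
        merged = False
        names = sorted(sets)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                overlap = len(sets[a] & sets[b]) / min(len(sets[a]), len(sets[b]))
                if overlap >= threshold:
                    # Replace the two redundant pathways by their union.
                    sets[a + "+" + b] = sets.pop(a) | sets.pop(b)
                    merged = True
                    break
            if merged:
                break  # restart the scan with the updated collection
    return sets


# Hypothetical pathways: P1 and P2 share three of their four genes.
pathways = {
    "P1": {"A", "B", "C", "D"},
    "P2": {"B", "C", "D", "E"},
    "P3": {"X", "Y", "Z"},
}
controlled = merge_redundant_sets(pathways, threshold=0.7)
```

    With a threshold of 0.7 the two overlapping pathways collapse into a single entry while the unrelated pathway is left alone, so a downstream enrichment analysis would count the shared signal once rather than twice.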

  7. Assessment of topoisomerase II-alpha gene status by dual color chromogenic in situ hybridization in a set of Iraqi patients with invasive breast carcinoma

    Directory of Open Access Journals (Sweden)

    Rasha Abd Alraouf Neama

    2017-01-01

    Background: The human epidermal growth factor receptor 2 (HER2) proto-oncogene is overexpressed or amplified in approximately 15%–25% of invasive breast cancers. Approximately 35% of HER2-amplified breast cancers have coamplification of the topoisomerase II-alpha (TOP2A) gene encoding an enzyme that is a major target of anthracyclines. Hence, the determination of genetic alteration (amplification or deletion) of both genes is considered as an important predictive factor that determines the response of breast cancer patients to treatment. The aims of this study are to determine TOP2A gene amplification status in a set of Iraqi patients with breast cancer that have had an equivocal (2+) and positive HER2/neu by immunohistochemistry (IHC) and to compare the results with estrogen receptor (ER) and progesterone receptor (PR) and HER2/neu status. Patients and Methods: A cross-sectional prospective study was done on 53 patients with invasive breast carcinoma. Twenty-six out of total 53 cases were positive HER2/neu (3+); the remaining 27 equivocal HER2-IHC (2+) cases were reanalyzed using dual-color chromogenic in situ hybridization (ZytoVision probe kit) for further identification of HER2/neu gene amplification. Using chromogenic in situ hybridization (CISH), TOP2A gene status determination was done for all cases. Results: There is a direct significant correlation between TOP2A gene amplification and HER2/neu positivity, P < 0.05, in that 15 (39.4%) out of 38 positive HER2/neu cases were associated with topoisomerase gene amplification. Regarding relation of topoisomerase gene to hormone receptor status (ER and PR), there was a significant negative relationship between the gene and ER receptor status. The higher level of gene amplification was noticed in ER and PR negative cases in about 13 (43.3%) and 14 (48.2%) for ER and PR, respectively. Conclusion: TOP2A gene status has a significantly positive correlation with HER2/neu status while it has a significantly negative

  8. Genetic variations in the CLNK gene and ZNF518B gene are associated with gout in case-control sample sets.

    Science.gov (United States)

    Jin, Tian-Bo; Ren, Yongchao; Shi, Xugang; Jiri, Mutu; He, Na; Feng, Tian; Yuan, Dongya; Kang, Longli

    2015-07-01

    A genome-wide association study of gout in European populations identified 12 genetic variants strongly associated with risk of gout, but it is unknown whether these variants are also associated with gout risk in Chinese populations. A total of 145 patients with gout and 310 healthy control patients were recruited for a case-control association study. Twelve SNPs of CLNK and ZNF518B gene were genotyped, and association analysis was performed. Odds ratios (ORs) with 95 % confidence intervals (CIs) were used to assess the association. Overall, we found four risk alleles for gout in patients: the allele "G" of rs2041215 and rs1686947 in the CLNK gene by dominant model (OR 1.66; 95 % CI 1.04-2.63; p = 0.031) (OR 2.19; 95 % CI 1.38-3.46; p = 0.001) and additive model (OR 1.39; 95 % CI 1.00-1.93; p = 0.049) (OR 1.67; 95 % CI 1.19-2.32; p = 0.003), respectively, and the allele "A" of rs10938799 and rs10016022 in ZNF518B gene by recessive model (OR 4.66; 95 % CI 1.44-15.09; p = 0.008) (OR 4.54; 95 % CI 1.23-16.76; p = 0.020). Further haplotype analysis showed that the TCATTCTGA haplotype of CLNK was more frequent among patients with gout (adjusted OR 0.48; 95 % CI 0.24-0.95; p = 0.036). Additionally, polymorphisms of rs2041215, rs10938799, and rs17467273 were also correlated with clinical pathological parameters. This study provides evidence for gout susceptibility genes, CLNK and ZNF518B, in a Chinese population, which may have potential as diagnostic and prognostic marker for gout patients.
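
    The odds ratios and 95% confidence intervals reported above can be illustrated with a minimal sketch of the unadjusted calculation from a 2x2 table (Woolf's logit method). The function name and the carrier counts are hypothetical; only the group sizes (145 cases, 310 controls) come from the abstract, and the study's own estimates were obtained under specific genetic models, so this is not a reproduction of its analysis.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_exposed, case_unexposed, control_exposed, control_unexposed, z=1.96):
    """Odds ratio with a Wald confidence interval on the log scale.

    z = 1.96 gives an approximate 95% interval. All four counts must be positive.
    """
    odds_ratio = (case_exposed * control_unexposed) / (case_unexposed * control_exposed)
    # Standard error of log(OR) from the four cell counts.
    se = sqrt(1 / case_exposed + 1 / case_unexposed
              + 1 / control_exposed + 1 / control_unexposed)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical carrier counts for a risk allele under a dominant model:
# 80 of 145 cases and 130 of 310 controls carry at least one copy.
or_value, ci_low, ci_high = odds_ratio_ci(80, 65, 130, 180)
```

    An interval that excludes 1 corresponds to a nominally significant association at the 5% level, which is how the reported intervals in the abstract can be read.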

  9. Genetic analysis and fine mapping of LH1 and LH2, a set of complementary genes controlling late heading in rice (Oryza sativa L.).

    Science.gov (United States)

    Liu, Shuang; Wang, Feng; Gao, Li Jun; Li, Jin Hua; Li, Rong Bai; Gao, Han Liang; Deng, Guo Fu; Yang, Jin Shui; Luo, Xiao Jin

    2012-12-01

    Heading date in rice (Oryza sativa L.) is a critical agronomic trait with a complex inheritance. To investigate the genetic basis and mechanism of gene interaction in heading date, we conducted genetic analysis on segregation populations derived from crosses among the indica cultivars Bo B, Yuefeng B and Baoxuan 2. A set of dominant complementary genes controlling late heading, designated LH1 and LH2, were detected by molecular marker mapping. Genetic analysis revealed that Baoxuan 2 contains both dominant genes, while Bo B and Yuefeng B each possess either LH1 or LH2. Using larger populations with segregant ratios of 3 : 1, we fine-mapped LH1 to a 63-kb region near the centromere of chromosome 7 flanked by markers RM5436 and RM8034, and LH2 to a 177-kb region on the short arm of chromosome 8 between flanking markers Indel22468-3 and RM25. Some candidate genes were identified through sequencing of Bo B and Yuefeng B in these target regions. Our work provides a solid foundation for further study on gene interaction in heading date and has application in marker-assisted breeding of photosensitive hybrid rice in China.
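
    Checking whether a population segregates in the expected 3 : 1 ratio is typically done with a chi-square goodness-of-fit test, sketched below. The function name and plant counts are hypothetical and are not taken from the study; the same function could be used with a 9 : 7 ratio, which is the classical expectation for two complementary dominant genes in an F2 population.

```python
def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for phenotype counts.

    observed: counts per phenotype class; expected_ratio: e.g. [3, 1].
    """
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    # Expected counts under the hypothesized segregation ratio.
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 152 late-heading and 48 early-heading plants.
chi2 = chi_square_gof([152, 48], [3, 1])

# With two phenotype classes there is 1 degree of freedom;
# the 5% critical value is 3.841.
fits_ratio = chi2 < 3.841
```

    A statistic below the critical value means the observed counts are consistent with single-locus segregation, which is the precondition for fine mapping each gene separately.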

  10. Genome-wide methylation analysis identifies a core set of hypermethylated genes in CIMP-H colorectal cancer.

    Science.gov (United States)

    McInnes, Tyler; Zou, Donghui; Rao, Dasari S; Munro, Francesca M; Phillips, Vicky L; McCall, John L; Black, Michael A; Reeve, Anthony E; Guilford, Parry J

    2017-03-28

    Aberrant DNA methylation profiles are a characteristic of all known cancer types, epitomized by the CpG island methylator phenotype (CIMP) in colorectal cancer (CRC). Hypermethylation has been observed at CpG islands throughout the genome, but it is unclear which factors determine whether an individual island becomes methylated in cancer. DNA methylation in CRC was analysed using the Illumina HumanMethylation450K array. Differentially methylated loci were identified using Significance Analysis of Microarrays (SAM) and the Wilcoxon Signed Rank (WSR) test. Unsupervised hierarchical clustering was used to identify methylation subtypes in CRC. In this study we characterized the DNA methylation profiles of 94 CRC tissues and their matched normal counterparts. Consistent with previous studies, unsupervised hierarchical clustering of genome-wide methylation data identified three subtypes within the tumour samples, designated CIMP-H, CIMP-L and CIMP-N, that showed high, low and very low methylation levels, respectively. Differential methylation between normal and tumour samples was analysed at the individual CpG level, and at the gene level. The distribution of hypermethylation in CIMP-N tumours showed high inter-tumour variability and appeared to be highly stochastic in nature, whereas CIMP-H tumours exhibited consistent hypermethylation at a subset of genes, in addition to a highly variable background of hypermethylated genes. EYA4, TFPI2 and TLX1 were hypermethylated in more than 90% of all tumours examined. One-hundred thirty-two genes were hypermethylated in 100% of CIMP-H tumours studied and these were highly enriched for functions relating to skeletal system development (Bonferroni adjusted p value =2.88E-15), segment specification (adjusted p value =9.62E-11), embryonic development (adjusted p value =1.52E-04), mesoderm development (adjusted p value =1.14E-20), and ectoderm development (adjusted p value =7.94E-16). Our genome-wide characterization of DNA

  11. Gene

    Data.gov (United States)

    U.S. Department of Health & Human Services — Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes,...

  12. Poster: Observing change in crowded data sets in 3D space - Visualizing gene expression in human tissues

    KAUST Repository

    Rogowski, Marcin

    2013-03-01

    We have been confronted with a real-world problem of visualizing and observing change of gene expression between different human tissues. In this paper, we are presenting a universal representation space based on two-dimensional gel electrophoresis as opposed to force-directed layouts encountered most often in similar problems. We are discussing the methods we devised to make observing change more convenient in a 3D virtual reality environment. © 2013 IEEE.

  13. Independent evolution of the core and accessory gene sets in the genus Neisseria: insights gained from the genome of Neisseria lactamica isolate 020-06

    Directory of Open Access Journals (Sweden)

    White Brian

    2010-11-01

    Background: The genus Neisseria contains two important yet very different pathogens, N. meningitidis and N. gonorrhoeae, in addition to non-pathogenic species, of which N. lactamica is the best characterized. Genomic comparisons of these three bacteria will provide insights into the mechanisms and evolution of pathogenesis in this group of organisms, which are applicable to understanding these processes more generally. Results: Non-pathogenic N. lactamica exhibits very similar population structure and levels of diversity to the meningococcus, whilst gonococci are essentially recent descendants of a single clone. All three species share a common core gene set estimated to comprise around 1190 CDSs, corresponding to about 60% of the genome. However, some of the nucleotide sequence diversity within this core genome is particular to each group, indicating that cross-species recombination is rare in this shared core gene set. Other than the meningococcal cps region, which encodes the polysaccharide capsule, relatively few members of the large accessory gene pool are exclusive to one species group, and cross-species recombination within this accessory genome is frequent. Conclusion: The three Neisseria species groups represent coherent biological and genetic groupings which appear to be maintained by low rates of inter-species horizontal genetic exchange within the core genome. There is extensive evidence for exchange among positively selected genes and the accessory genome and some evidence of hitch-hiking of housekeeping genes with other loci. It is not possible to define a 'pathogenome' for this group of organisms and the disease causing phenotypes are therefore likely to be complex, polygenic, and different among the various disease-associated phenotypes observed.

  14. UMD-USHbases: a comprehensive set of databases to record and analyse pathogenic mutations and unclassified variants in seven Usher syndrome causing genes.

    Science.gov (United States)

    Baux, David; Faugère, Valérie; Larrieu, Lise; Le Guédard-Méreuze, Sandie; Hamroun, Dalil; Béroud, Christophe; Malcolm, Sue; Claustres, Mireille; Roux, Anne-Françoise

    2008-08-01

    Using the Universal Mutation Database (UMD) software, we have constructed "UMD-USHbases", a set of relational databases of nucleotide variations for seven genes involved in Usher syndrome (MYO7A, CDH23, PCDH15, USH1C, USH1G, USH3A and USH2A). Mutations in the Usher syndrome type I causing genes are also recorded in non-syndromic hearing loss cases and mutations in USH2A in non-syndromic retinitis pigmentosa. Usher syndrome provides a particular challenge for molecular diagnostics because of the clinical and molecular heterogeneity. As many mutations are missense changes, and all the genes also contain apparently non-pathogenic polymorphisms, well-curated databases are crucial for accurate interpretation of pathogenicity. Tools are provided to assess the pathogenicity of mutations, including conservation of amino acids and analysis of splice-sites. Reference amino acid alignments are provided. Apparently non-pathogenic variants in patients with Usher syndrome, at both the nucleotide and amino acid level, are included. The UMD-USHbases currently contain more than 2,830 entries including disease causing mutations, unclassified variants or non-pathogenic polymorphisms identified in over 938 patients. In addition to data collected from 89 publications, 15 novel mutations identified in our laboratory are recorded in MYO7A (6), CDH23 (8), or PCDH15 (1) genes. Information is given on the relative involvement of the seven genes, the number and distribution of variants in each gene. UMD-USHbases give access to a software package that provides specific routines and optimized multicriteria research and sorting tools. These databases should assist clinicians and geneticists seeking information about mutations responsible for Usher syndrome.

  15. Alpha-gliadin genes from the A, B, and D genomes of wheat contain different sets of celiac disease epitopes

    Directory of Open Access Journals (Sweden)

    van Veelen Peter A

    2006-01-01

    Background: Bread wheat (Triticum aestivum) is an important staple food. However, wheat gluten proteins cause celiac disease (CD) in 0.5 to 1% of the general population. Among these proteins, the α-gliadins contain several peptides that are associated with the disease. Results: We obtained 230 distinct α-gliadin gene sequences from several diploid wheat species representing the ancestral A, B, and D genomes of the hexaploid bread wheat. The large majority of these sequences (87%) contained an internal stop codon. All α-gliadin sequences could be distinguished according to the genome of origin on the basis of sequence similarity, of the average length of the polyglutamine repeats, and of the differences in the presence of four peptides that have been identified as T cell stimulatory epitopes in CD patients through binding to HLA-DQ2/8. By sequence similarity, α-gliadins from the public database of hexaploid T. aestivum could be assigned directly to chromosome 6A, 6B, or 6D. T. monococcum (A genome) sequences, as well as those from chromosome 6A of bread wheat, almost invariably contained epitope glia-α9 and glia-α20, but never the intact epitopes glia-α and glia-α2. A number of sequences from T. speltoides, as well as a number of sequences from chromosome 6B of bread wheat, did not contain any of the four T cell epitopes screened for. The sequences from T. tauschii (D genome), as well as those from chromosome 6D of bread wheat, were found to contain all of these T cell epitopes in variable combinations per gene. The differences in epitope composition resulted mainly from point mutations. These substitutions appeared to be genome specific. Conclusion: Our analysis shows that α-gliadin sequences from the three genomes of bread wheat form distinct groups. The four known T cell stimulatory epitopes are distributed non-randomly across the sequences, indicating that the three genomes contribute differently to epitope content. A systematic

  16. Genetic variation in a microRNA-502 binding site in SET8 gene confers clinical outcome of non-small cell lung cancer in a Chinese population.

    Directory of Open Access Journals (Sweden)

    Jiali Xu

    BACKGROUND: Genetic variants may influence microRNA-target interaction by modulating binding affinity or by creating or destroying miRNA-binding sites. SET8, a member of the SET domain-containing methyltransferase family, has been implicated in a variety of biological processes. METHODS: Using a TaqMan assay, we genotyped the polymorphism rs16917496 T>C within the miR-502 binding site in the 3'-untranslated region of the SET8 gene in 576 non-small cell lung cancer (NSCLC) patients. Functions of rs16917496 were investigated using a luciferase activity assay and validated by immunostaining. RESULTS: Log-rank test and Cox regression indicated that the CC genotype was associated with longer survival and a reduced risk of death for NSCLC [58.0 vs. 41.0 months, P = 0.031; hazard ratio = 0.44, 95% confidence interval: 0.26-0.74]. Further stepwise regression analysis suggested that rs16917496 was an independent favorable prognostic factor and that the protective effect was more prominent in never smokers, patients without diabetes and patients who received chemotherapy. A significant interaction was observed between rs16917496 and smoking status in relation to NSCLC survival (PC located at miR-502 binding site contributes to NSCLC survival by altering SET8 expression through modulating miRNA-target interaction.

  17. HU participates in expression of a specific set of genes required for growth and survival at acidic pH in Escherichia coli.

    Science.gov (United States)

    Bi, Hongkai; Sun, Lianle; Fukamachi, Toshihiko; Saito, Hiromi; Kobayashi, Hiroshi

    2009-05-01

    The major histone-like Escherichia coli protein, HU, is composed of alpha and beta subunits respectively encoded by hupA and hupB in Escherichia coli. A mutant deficient in both hupA and hupB grew at a slightly slower rate than the wild type at pH 7.5. Growth of the mutant diminished with a decrease in pH, and no growth was observed at pH 4.6. Mutants of either hupA or hupB grew at all pH levels tested. The arginine-dependent survival at pH 2.5 was diminished approximately 60-fold by the deletion of both hupA and hupB, whereas the survival was slightly affected by the deletion of either hupA or hupB. The mRNA levels of adiA and adiC, which respectively encode arginine decarboxylase and arginine/agmatine antiporter, were low in the mutant deficient in both hupA and hupB. The deletion of both hupA and hupB had little effect on survival at pH 2.5 in the presence of glutamate or lysine, and expression of the genes for glutamate and lysine decarboxylases was not impaired by the deletion of the HU genes. These results suggest that HU regulates expression of the specific set of genes required for growth and survival in acidic environments.

  18. Fine-scale linkage mapping reveals a small set of candidate genes influencing honey bee grooming behavior in response to Varroa mites.

    Directory of Open Access Journals (Sweden)

    Miguel E Arechavaleta-Velasco

    Populations of honey bees in North America have been experiencing high annual colony mortality for 15-20 years. Many apicultural researchers believe that introduced parasites called Varroa mites (V. destructor) are the most important factor in colony deaths. One important resistance mechanism that limits mite population growth in colonies is the ability of some lines of honey bees to groom mites from their bodies. To search for genes influencing this trait, we used an Illumina Bead Station genotyping array to determine the genotypes of several hundred worker bees at over a thousand single-nucleotide polymorphisms in a family that was apparently segregating for alleles influencing this behavior. Linkage analyses provided a genetic map with 1,313 markers anchored to genome sequence. Genotypes were analyzed for association with grooming behavior, measured as the time that individual bees took to initiate grooming after mites were placed on their thoraces. Quantitative-trait-locus interval mapping identified a single chromosomal region that was significant at the chromosome-wide level (p<0.05) on chromosome 5 with a LOD score of 2.72. The 95% confidence interval for quantitative trait locus location contained only 27 genes (honey bee official gene annotation set 2) including Atlastin, Ataxin and Neurexin-1 (AmNrx1), which have potential neurodevelopmental and behavioral effects. Atlastin and Ataxin homologs are associated with neurological diseases in humans. AmNrx1 codes for a presynaptic protein with many alternatively spliced isoforms. Neurexin-1 influences the growth, maintenance and maturation of synapses in the brain, as well as the type of receptors most prominent within synapses. Neurexin-1 has also been associated with autism spectrum disorder and schizophrenia in humans, and self-grooming behavior in mice.
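
    A LOD score is the base-10 logarithm of a likelihood ratio, so the LOD of 2.72 reported above means the data are roughly 525 times more likely under the linked model than under the null. The sketch below illustrates the definition with the textbook two-point formula for a backcross-style family. It is not the interval-mapping model used in the study, and the recombinant counts are hypothetical.

```python
import math


def two_point_lod(recombinants, total, theta):
    """LOD score comparing recombination fraction theta against free recombination (0.5)."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    non_recombinants = total - recombinants
    log_likelihood_linked = (
        recombinants * math.log10(theta)
        + non_recombinants * math.log10(1.0 - theta)
    )
    log_likelihood_null = total * math.log10(0.5)
    return log_likelihood_linked - log_likelihood_null


def lod_to_odds(lod):
    """Convert a LOD score back to a likelihood ratio (odds in favour of linkage)."""
    return 10.0 ** lod


# Hypothetical family: 10 recombinants among 40 informative meioses, theta = 0.25.
example_lod = two_point_lod(10, 40, 0.25)
print(f"LOD = {example_lod:.2f}, odds = {lod_to_odds(example_lod):.0f} : 1")
print(f"LOD 2.72 corresponds to odds of about {lod_to_odds(2.72):.0f} : 1")
```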

  19. Pathway profiles based on gene-set enrichment analysis in the honey bee Apis mellifera under brood rearing-suppressed conditions.

    Science.gov (United States)

    Kim, Kyungmun; Kim, Ju Hyeon; Kim, Young Ho; Hong, Seong-Eui; Lee, Si Hyeock

    2018-01-01

    Perturbation of normal behaviors in honey bee colonies by any external factor can immediately reduce the colony's capacity for brood rearing, which can eventually lead to colony collapse. To investigate the effects of brood-rearing suppression on the biology of honey bee workers, gene-set enrichment analysis of the transcriptomes of worker bees with or without suppressed brood rearing was performed. When brood rearing was suppressed, pathways associated with both protein degradation and synthesis were simultaneously over-represented in both nurses and foragers, and their overall pathway representation profiles resembled those of normal foragers and nurses, respectively. Thus, obstruction of normal labor induced over-representation in pathways related with reshaping of worker bee physiology, suggesting that transition of labor is physiologically reversible. In addition, some genes associated with the regulation of neuronal excitability, cellular and nutritional stress and aggressiveness were over-expressed under brood rearing suppression perhaps to manage in-hive stress under unfavorable conditions. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. Thy1.2 driven expression of transgenic His₆-SUMO2 in the brain of mice alters a restricted set of genes.

    Science.gov (United States)

    Rossner, Moritz J; Tirard, Marilyn

    2014-08-05

    Protein SUMOylation is a post-translational protein modification with a key regulatory role in nerve cell development and function, but its function in mammals in vivo has only been studied cursorily. We generated two new transgenic mouse lines that express His6-tagged SUMO1 and SUMO2 driven by the Thy1.2 promoter. The brains of mice of the two lines express transgenic His6-SUMO peptides and conjugate them to substrates in vivo but cytoarchitecture and synaptic organization of adult transgenic mouse brains are indistinguishable from the wild-type situation. We investigated the impact of transgenic SUMO expression on gene transcription in the hippocampus by performing genome wide analyses using microarrays. Surprisingly, no changes were observed in Thy1.2::His6-SUMO1 transgenic mice and only a restricted set of genes were upregulated in Thy1.2::His6-SUMO2 mice. Among these, Penk1 (Preproenkephalin 1), which encodes Met-enkephalin neuropeptides, showed the highest degree of alteration. Accordingly, a significant increase in Met-enkephalin peptide levels in the hippocampus of Thy1.2::His6-SUMO2 was detected, but the expression levels and cellular localization of Met-enkephalin receptors were not changed. Thus, transgenic neuronal expression of His6-SUMO1 or His6-SUMO2 only induces very minor phenotypical changes in mice. Copyright © 2014 Elsevier B.V. All rights reserved.

  1. A set of genes previously implicated in the hypoxia response might be an important modulator in the rat ear tissue response to mechanical stretch

    Directory of Open Access Journals (Sweden)

    Orgill Dennis

    2007-11-01

    Background: Wounds are increasingly important in our aging societies. Pathologies such as diabetes predispose patients to chronic wounds that can cause pain, infection, and amputation. The vacuum assisted closure device shows remarkable outcomes in wound healing. Its mechanism of action is unclear despite several hypotheses advanced. We previously hypothesized that micromechanical forces can heal wounds. To understand better the biological response of soft tissue to forces, rat ears in vivo were stretched and their gene expression patterns over time obtained. The absolute enrichment (AE) algorithm that obtains a combined up and down regulated picture of the expression analysis was implemented. Results: With the use of AE, the hypoxia gene set was the most important at a highly significant level. A co-expression network analysis showed that important co-regulated members of the hypoxia pathway include a glucose transporter (slc2a8), heme oxygenase, and nitric oxide synthase2 among others. Conclusion: It appears that the hypoxia pathway may be an important modulator of response of soft tissue to forces. This finding gives us insights not only into the underlying biology, but also into clinical interventions that could be designed to mimic within wounded tissue the effects of forces without all the negative effects that forces themselves create.

  2. Set points, settling points and some alternative models: theoretical options to understand how genes and environments combine to regulate body adiposity

    Directory of Open Access Journals (Sweden)

    John R. Speakman

    2011-11-01

    The close correspondence between energy intake and expenditure over prolonged time periods, coupled with an apparent protection of the level of body adiposity in the face of perturbations of energy balance, has led to the idea that body fatness is regulated via mechanisms that control intake and energy expenditure. Two models have dominated the discussion of how this regulation might take place. The set point model is rooted in physiology, genetics and molecular biology, and suggests that there is an active feedback mechanism linking adipose tissue (stored energy) to intake and expenditure via a set point, presumably encoded in the brain. This model is consistent with many of the biological aspects of energy balance, but struggles to explain the many significant environmental and social influences on obesity, food intake and physical activity. More importantly, the set point model does not effectively explain the ‘obesity epidemic’ – the large increase in body weight and adiposity of a large proportion of individuals in many countries since the 1980s. An alternative model, called the settling point model, is based on the idea that there is passive feedback between the size of the body stores and aspects of expenditure. This model accommodates many of the social and environmental characteristics of energy balance, but struggles to explain some of the biological and genetic aspects. The shortcomings of these two models reflect their failure to address the gene-by-environment interactions that dominate the regulation of body weight. We discuss two additional models – the general intake model and the dual intervention point model – that address this issue and might offer better ways to understand how body fatness is controlled.

  3. CELF family RNA-binding protein UNC-75 regulates two sets of mutually exclusive exons of the unc-32 gene in neuron-specific manners in Caenorhabditis elegans.

    Directory of Open Access Journals (Sweden)

    Hidehito Kuroyanagi

    An enormous number of alternative pre-mRNA splicing patterns in multicellular organisms are coordinately defined by a limited number of regulatory proteins and cis elements. Mutually exclusive alternative splicing should be strictly regulated and is a challenging model for elucidating regulation mechanisms. Here we provide models of the regulation of two sets of mutually exclusive exons, 4a-4c and 7a-7b, of the Caenorhabditis elegans uncoordinated (unc)-32 gene, encoding the a subunit of V0 complex of vacuolar-type H(+)-ATPases. We visualize selection patterns of exon 4 and exon 7 in vivo by utilizing a trio and a pair of symmetric fluorescence splicing reporter minigenes, respectively, to demonstrate that they are regulated in tissue-specific manners. Genetic analyses reveal that RBFOX family RNA-binding proteins ASD-1 and FOX-1 and a UGCAUG stretch in intron 7b are involved in the neuron-specific selection of exon 7a. Through further forward genetic screening, we identify UNC-75, a neuron-specific CELF family RNA-binding protein of unknown function, as an essential regulator for the exon 7a selection. Electrophoretic mobility shift assays specify a short fragment in intron 7a as the recognition site for UNC-75 and demonstrate that UNC-75 specifically binds via its three RNA recognition motifs to the element including a UUGUUGUGUUGU stretch. The UUGUUGUGUUGU stretch in the reporter minigenes is actually required for the selection of exon 7a in the nervous system. We compare the amounts of partially spliced RNAs in the wild-type and unc-75 mutant backgrounds and raise a model for the mutually exclusive selection of unc-32 exon 7 by the RBFOX family and UNC-75. The neuron-specific selection of unc-32 exon 4b is also regulated by UNC-75 and the unc-75 mutation suppresses the Unc phenotype of the exon-4b-specific allele of unc-32 mutants. Taken together, UNC-75 is the neuron-specific splicing factor and regulates both sets of the mutually exclusive

  4. The PR/SET Domain Zinc Finger Protein Prdm4 Regulates Gene Expression in Embryonic Stem Cells but Plays a Nonessential Role in the Developing Mouse Embryo

    Science.gov (United States)

    Bogani, Debora; Morgan, Marc A. J.; Nelson, Andrew C.; Costello, Ita; McGouran, Joanna F.; Kessler, Benedikt M.

    2013-01-01

    Prdm4 is a highly conserved member of the Prdm family of PR/SET domain zinc finger proteins. Many well-studied Prdm family members play critical roles in development and display striking loss-of-function phenotypes. Prdm4 functional contributions have yet to be characterized. Here, we describe its widespread expression in the early embryo and adult tissues. We demonstrate that DNA binding is exclusively mediated by the Prdm4 zinc finger domain, and we characterize its tripartite consensus sequence via SELEX (systematic evolution of ligands by exponential enrichment) and ChIP-seq (chromatin immunoprecipitation-sequencing) experiments. In embryonic stem cells (ESCs), Prdm4 regulates key pluripotency and differentiation pathways. Two independent strategies, namely, targeted deletion of the zinc finger domain and generation of a EUCOMM LacZ reporter allele, resulted in functional null alleles. However, homozygous mutant embryos develop normally and adults are healthy and fertile. Collectively, these results strongly suggest that Prdm4 functions redundantly with other transcriptional partners to cooperatively regulate gene expression in the embryo and adult animal. PMID:23918801

  5. The AHL- and BDSF-dependent quorum sensing systems control specific and overlapping sets of genes in Burkholderia cenocepacia H111.

    Directory of Open Access Journals (Sweden)

    Nadine Schmid

    Quorum sensing in Burkholderia cenocepacia H111 involves two signalling systems that depend on different signal molecules, namely N-acyl homoserine lactones (AHLs) and the diffusible signal factor cis-2-dodecenoic acid (BDSF). Previous studies have shown that AHLs and BDSF control similar phenotypic traits, including biofilm formation, proteolytic activity and pathogenicity. In this study we mapped the BDSF stimulon by RNA-Seq and shotgun proteomics analysis. We demonstrate that a set of the identified BDSF-regulated genes or proteins are also controlled by AHLs, suggesting that the two regulons partially overlap. The detailed analysis of two mutually regulated operons, one encoding three lectins and the other one encoding the large surface protein BapA and its type I secretion machinery, revealed that both AHLs and BDSF are required for full expression, suggesting that the two signalling systems operate in parallel. In accordance with this, we show that both AHLs and BDSF are required for biofilm formation and protease production.

  6. Polymorphisms in sodium-dependent vitamin C transporter genes and plasma, aqueous humor and lens nucleus ascorbate concentrations in an ascorbate depleted setting.

    Science.gov (United States)

    Senthilkumari, Srinivasan; Talwar, Badri; Dharmalingam, Kuppamuthu; Ravindran, Ravilla D; Jayanthi, Ramamurthy; Sundaresan, Periasamy; Saravanan, Charu; Young, Ian S; Dangour, Alan D; Fletcher, Astrid E

    2014-07-01

    We have previously reported low concentrations of plasma ascorbate and low dietary vitamin C intake in the older Indian population and a strong inverse association of these with cataract. Little is known about ascorbate levels in aqueous humor and lens in populations habitually depleted of ascorbate and no studies in any setting have investigated whether genetic polymorphisms influence ascorbate levels in ocular tissues. Our objectives were to investigate relationships between ascorbate concentrations in plasma, aqueous humor and lens and whether these relationships are influenced by Single Nucleotide Polymorphisms (SNPs) in sodium-dependent vitamin C transporter genes (SLC23A1 and SLC23A2). We enrolled sixty patients (equal numbers of men and women, mean age 63 years) undergoing small incision cataract surgery in southern India. We measured ascorbate concentrations in plasma, aqueous humor and lens nucleus using high performance liquid chromatography. SLC23A1 SNPs (rs4257763, rs6596473) and SLC23A2 SNPs (rs1279683 and rs12479919) were genotyped using a TaqMan assay. Patients were interviewed for lifestyle factors which might influence ascorbate. Plasma vitamin C was normalized by a log10 transformation. Statistical analysis used linear regression with the slope of the within-subject associations estimated using beta (β) coefficients. The ascorbate concentrations (μmol/L) were: plasma ascorbate, median and inter-quartile range (IQR), 15.2 (7.8, 34.5), mean (SD) of aqueous humor ascorbate, 1074 (545) and lens nucleus ascorbate, 0.42 (0.16) (μmol/g lens nucleus wet weight). Minimum allele frequencies were: rs1279683 (0.28), rs12479919 (0.30), rs659647 (0.48). Decreasing concentrations of ocular ascorbate from the common to the rare genotype were observed for rs6596473 and rs12479919. The per allele difference in aqueous humor ascorbate for rs6596473 was -217 μmol/L, p humor ascorbate were higher for the GG genotype of rs6596473: GG, β = 1460 compared to

  7. Selecting a set of housekeeping genes for quantitative real-time PCR in normal and tetraploid haemocytes of soft-shell clams, Mya arenaria.

    Science.gov (United States)

    Siah, A; Dohoo, C; McKenna, P; Delaporte, M; Berthe, F C J

    2008-09-01

    The transcripts involved in the molecular mechanisms of haemic neoplasia in relation to the haemocyte ploidy status of the soft-shell clam, Mya arenaria, have yet to be identified. For this purpose, real-time quantitative RT-PCR constitutes a sensitive and efficient technique, which can help determine the gene expression involved in haemocyte tetraploid status in clams affected by haemic neoplasia. One of the critical steps in comparing transcription profiles is the stability of selected housekeeping genes, as well as an accurate normalization. In this study, we selected five reference genes, S18, L37, EF1, EF2 and actin, generally used as single control genes. Their expression was analyzed by real-time quantitative RT-PCR at different levels of haemocyte ploidy status in order to select the most stable genes. Using the geNorm software, our results showed that L37, EF1 and S18 represent the most stable gene expressions related to various ploidy status ranging from 0 to 78% of tetraploid haemocytes in clams sampled in North River (Prince Edward Island, Canada). However, actin gene expression appeared to be highly regulated. Hence, using it as a housekeeping gene in tetraploid haemocytes can result in inaccurate data. To compare gene expression levels related to haemocyte ploidy status in Mya arenaria, using L37, EF1 and S18 as housekeeping genes for accurate normalization is therefore recommended.
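
    geNorm ranks candidate reference genes by a stability measure M: for each gene, the average standard deviation of its log2 expression ratio against every other candidate across samples, with lower M meaning more stable. The following is a minimal sketch of that measure, not the geNorm software itself. The relative quantities are hypothetical values constructed so that actin varies most, mirroring the abstract's conclusion; EF2 is omitted for brevity.

```python
import math
import statistics


def genorm_m(expression):
    """Return the geNorm-style stability measure M for each gene.

    expression: dict mapping gene name -> list of relative quantities
    (linear scale, one value per sample, same sample order for every gene).
    """
    genes = list(expression)
    m_values = {}
    for gene in genes:
        pairwise_sds = []
        for other in genes:
            if other == gene:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            # Variation of this gene relative to one other candidate.
            pairwise_sds.append(statistics.stdev(log_ratios))
        m_values[gene] = sum(pairwise_sds) / len(pairwise_sds)
    return m_values


# Hypothetical relative quantities for four clams of differing ploidy status.
quantities = {
    "L37": [1.00, 1.10, 0.90, 1.00],
    "EF1": [2.00, 2.10, 1.90, 2.10],
    "S18": [0.50, 0.55, 0.48, 0.50],
    "actin": [1.00, 3.00, 0.40, 2.00],
}
stability = genorm_m(quantities)
for gene, m in sorted(stability.items(), key=lambda item: item[1]):
    print(f"{gene}: M = {m:.3f}")
```

    Sorting by M lists the most stable candidates first; in the full geNorm procedure the least stable gene is removed and M is recomputed until the best pair remains.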

  8. Using logistic regression to improve the prognostic value of microarray gene expression data sets: application to early-stage squamous cell carcinoma of the lung and triple negative breast carcinoma.

    Science.gov (United States)

    Mount, David W; Putnam, Charles W; Centouri, Sara M; Manziello, Ann M; Pandey, Ritu; Garland, Linda L; Martinez, Jesse D

    2014-06-10

    Numerous microarray-based prognostic gene expression signatures of primary neoplasms have been published but often with little concurrence between studies, thus limiting their clinical utility. We describe a methodology using logistic regression, which circumvents limitations of conventional Kaplan-Meier analysis. We applied this approach to a thrice-analyzed and published squamous cell carcinoma (SQCC) of the lung data set, with the objective of identifying gene expressions predictive of early death versus long survival in early-stage disease. A similar analysis was applied to a data set of triple negative breast carcinoma cases, which present similar clinical challenges. Important to our approach is the selection of homogeneous patient groups for comparison. In the lung study, we selected two groups (including only stages I and II), equal in size, of earliest deaths and longest survivors. Genes varying at least four-fold were tested by logistic regression for accuracy of prediction (area under a ROC plot). The gene list was refined by applying two sliding-window analyses and by validations using a leave-one-out approach and model building with validation subsets. In the breast study, a similar logistic regression analysis was used after selecting appropriate cases for comparison. A total of 8594 variable genes were tested for accuracy in predicting earliest deaths versus longest survivors in SQCC. After applying the two sliding-window and the leave-one-out analyses, 24 prognostic genes were identified; most of them were B-cell related. When the same data set of stage I and II cases was analyzed using a conventional Kaplan-Meier (KM) approach, we identified fewer immune-related genes among the most statistically significant hits; when stage III cases were included, most of the prognostic genes were missed. Interestingly, logistic regression analysis of the breast cancer data set identified many immune-related genes predictive of clinical outcome. Stratification of
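
    The per-gene scoring step described in this abstract (area under a ROC plot for early deaths versus long survivors) can be illustrated with a small sketch. This is not the authors' code: the function name and the expression values are hypothetical. It relies on the fact that, for a single gene, the fitted probability of a one-predictor logistic regression is monotone in expression, so the model's AUC equals the rank-based AUC of the raw expression values or its complement.

```python
def roc_auc(scores_pos, scores_neg):
    """Rank-based ROC AUC: the fraction of (positive, negative) pairs in which
    the positive sample has the higher score; ties count as half a win."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical expression values of one gene in two equal-sized groups.
long_survivors = [3.0, 4.0, 5.0]
early_deaths = [1.0, 2.0, 3.0]

auc = roc_auc(long_survivors, early_deaths)
# A gene is informative when the AUC is far from 0.5 in either direction.
accuracy = max(auc, 1.0 - auc)
```

    Repeating this calculation for every gene that passes the fold-change filter would give a ranked candidate list; the sliding-window and leave-one-out refinements mentioned in the abstract are not reproduced here.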

  9. PCOTH, a novel gene overexpressed in prostate cancers, promotes prostate cancer cell growth through phosphorylation of oncoprotein TAF-Ibeta/SET.

    Science.gov (United States)

    Anazawa, Yoshio; Nakagawa, Hidewaki; Furihara, Mutsuo; Ashida, Shingo; Tamura, Kenji; Yoshioka, Hiroki; Shuin, Taro; Fujioka, Tomoaki; Katagiri, Toyomasa; Nakamura, Yusuke

    2005-06-01

    Through genome-wide cDNA microarray analysis coupled with microdissection of prostate cancer cells, we identified a novel gene, prostate collagen triple helix (PCOTH), showing overexpression in prostate cancer cells and their precursor cells, prostatic intraepithelial neoplasia (PIN). Immunohistochemical analysis using polyclonal anti-PCOTH antibody confirmed elevated expression of PCOTH, a 100-amino-acid protein containing collagen triple-helix repeats, in prostate cancer cells and PINs. Knocking down PCOTH expression by small interfering RNA (siRNA) resulted in drastic attenuation of prostate cancer cell growth, and concordantly, LNCaP derivative cells that were designed to constitutively express exogenous PCOTH showed a higher growth rate than LNCaP cells transfected with mock vector, suggesting a growth-promoting effect of PCOTH on prostate cancer cells. To investigate the biological mechanisms of this growth-promoting effect, we applied two-dimensional differential gel electrophoresis (2D-DIGE) to analyze the phospho-protein fractions in LNCaP cells transfected with PCOTH. We found that the phosphorylation level of oncoprotein TAF-Ibeta/SET was significantly higher in LNCaP cells transfected with PCOTH than in control LNCaP cells, and these findings were confirmed by Western blotting and in-gel kinase assay. Furthermore, knockdown of endogenous TAF-Ibeta expression by siRNA also attenuated viability of prostate cancer cells. These findings suggest that PCOTH is involved in growth and survival of prostate cancer cells through, in part, the TAF-Ibeta pathway, and that this molecule should be a promising target for development of new therapeutic strategies for prostate cancers.

  10. ASPIC: a novel method to predict the exon-intron structure of a gene that is optimally compatible to a set of transcript sequences

    Directory of Open Access Journals (Sweden)

    Pesole Graziano

    2005-10-01

    Full Text Available Abstract Background: Currently available methods to predict splice sites are mainly based on the independent and progressive alignment of transcript data (mostly ESTs) to the genomic sequence. Apart from often being computationally expensive, this approach is vulnerable to several problems – hence the need to develop novel strategies. Results: We propose a method, based on a novel multiple genome-EST alignment algorithm, for the detection of splice sites. To avoid limitations of splice site prediction (mainly, over-predictions due to independent single EST alignments to the genomic sequence), our approach performs a multiple alignment of transcript data to the genomic sequence based on the combined analysis of all available data. We recast the problem of predicting constitutive and alternative splicing as an optimization problem, where the optimal multiple transcript alignment minimizes the number of exons and hence of splice site observations. We have implemented a splice site predictor based on this algorithm in the software tool ASPIC (Alternative Splicing PredICtion). It is distinguished from other methods based on BLAST-like tools by the incorporation of entirely new ad hoc procedures for accurate and computationally efficient transcript alignment and adopts dynamic programming for the refinement of intron boundaries. ASPIC also provides the minimal set of non-mergeable transcript isoforms compatible with the detected splicing events. The ASPIC web resource is dynamically interconnected with the Ensembl and Unigene databases and also implements an upload facility. Conclusion: Extensive benchmarking shows that ASPIC outperforms other existing methods in the detection of novel splicing isoforms and in the minimization of over-predictions. ASPIC also requires a lower computation time for processing a single gene and an EST cluster. The ASPIC web resource is available at http://aspic.algo.disco.unimib.it/aspic-devel/.

  11. RBiomirGS: an all-in-one miRNA gene set analysis solution featuring target mRNA mapping and expression profile integration

    Directory of Open Access Journals (Sweden)

    Jing Zhang

    2018-01-01

    Full Text Available Background With the continuous discovery of microRNA’s (miRNA) association with a wide range of biological and cellular processes, expression profile-based functional characterization of such post-transcriptional regulation is crucial for revealing its significance behind particular phenotypes. Profound advancement in bioinformatics has been made to enable in-depth investigation of miRNA’s role in regulating cellular and molecular events, resulting in a huge quantity of software packages covering different aspects of miRNA functional analysis. Therefore, an all-in-one software solution is in demand for a comprehensive yet highly efficient workflow. Here we present RBiomirGS, an R package for miRNA gene set (GS) analysis. Methods The package utilizes multiple databases for target mRNA mapping, estimates miRNA effect on the target mRNAs through the miRNA expression profile and conducts a logistic regression-based GS enrichment. Additionally, human ortholog Entrez ID conversion functionality is included for target mRNAs. Results By incorporating all the core steps into one package, RBiomirGS eliminates the need for switching between different software packages. The modular structure of RBiomirGS enables various access points to the analysis, with which users can choose the most relevant functionalities for their workflow. Conclusions With RBiomirGS, users are able to assess the functional significance of the miRNA expression profile under the corresponding experimental condition with minimal input and intervention. Accordingly, RBiomirGS provides an all-in-one solution for miRNA GS analysis. RBiomirGS is available on GitHub (http://github.com/jzhangc/RBiomirGS). More information, including instructions and examples, can be found on the website (http://kenstoreylab.com/?page_id=2865).

  12. Activated Glucocorticoid Receptor Interacts with the INHAT Component Set/TAF-Iβ and Releases it from a Glucocorticoid-responsive Gene Promoter, Relieving Repression: Implications for the Pathogenesis of Glucocorticoid Resistance in Acute Undifferentiated Leukemia with Set-Can Translocation

    Science.gov (United States)

    Ichijo, Takamasa; Chrousos, George P.; Kino, Tomoshige

    2008-01-01

    SUMMARY Set/template-activating factor (TAF)-Iβ, part of the Set-Can oncogene product found in acute undifferentiated leukemia, is a component of the inhibitor of acetyltransferases (INHAT) complex. Set/TAF-Iβ interacted with the DNA-binding domain of the glucocorticoid receptor (GR) in yeast two-hybrid screening, and repressed GR-induced transcriptional activity of a chromatin-integrated glucocorticoid-responsive and a natural promoter. Set/TAF-Iβ was co-precipitated with glucocorticoid response elements (GREs) of these promoters in the absence of dexamethasone, while addition of the hormone caused dissociation of Set/TAF-Iβ from and attraction of the p160-type coactivator GRIP1 to the promoter GREs. Set-Can fusion protein, on the other hand, did not interact with GR, was constitutively co-precipitated with GREs and suppressed GRIP1-induced enhancement of GR transcriptional activity and histone acetylation. Thus, Set/TAF-Iβ acts as a ligand-activated GR-responsive transcriptional repressor, while Set-Can does not retain physiologic responsiveness to ligand-bound GR, possibly contributing to the poor responsiveness of Set-Can-harboring leukemic cells to glucocorticoids. PMID:18096310

  13. Activated glucocorticoid receptor interacts with the INHAT component Set/TAF-Ibeta and releases it from a glucocorticoid-responsive gene promoter, relieving repression: implications for the pathogenesis of glucocorticoid resistance in acute undifferentiated leukemia with Set-Can translocation.

    Science.gov (United States)

    Ichijo, Takamasa; Chrousos, George P; Kino, Tomoshige

    2008-02-13

    Set/template-activating factor (TAF)-Ibeta, part of the Set-Can oncogene product found in acute undifferentiated leukemia, is a component of the inhibitor of acetyltransferases (INHAT) complex. Set/TAF-Ibeta interacted with the DNA-binding domain of the glucocorticoid receptor (GR) in yeast two-hybrid screening, and repressed GR-induced transcriptional activity of a chromatin-integrated glucocorticoid-responsive and a natural promoter. Set/TAF-Ibeta was co-precipitated with glucocorticoid response elements (GREs) of these promoters in the absence of dexamethasone, while addition of the hormone caused dissociation of Set/TAF-Ibeta from and attraction of the p160-type coactivator GRIP1 to the promoter GREs. Set-Can fusion protein, on the other hand, did not interact with GR, was constitutively co-precipitated with GREs and suppressed GRIP1-induced enhancement of GR transcriptional activity and histone acetylation. Thus, Set/TAF-Ibeta acts as a ligand-activated GR-responsive transcriptional repressor, while Set-Can does not retain physiologic responsiveness to ligand-bound GR, possibly contributing to the poor responsiveness of Set-Can-harboring leukemic cells to glucocorticoids.

  14. Enumeration of minimal stoichiometric precursor sets in metabolic networks.

    Science.gov (United States)

    Andrade, Ricardo; Wannagat, Martin; Klein, Cecilia C; Acuña, Vicente; Marchetti-Spaccamela, Alberto; Milreu, Paulo V; Stougie, Leen; Sagot, Marie-France

    2016-01-01

    What an organism needs at least from its environment to produce a set of metabolites, e.g. target(s) of interest and/or biomass, has been called a minimal precursor set. Early approaches to enumerate all minimal precursor sets took into account only the topology of the metabolic network (topological precursor sets). Due to cycles and the stoichiometric values of the reactions, it is often not possible to produce the target(s) from a topological precursor set in the sense that there is no feasible flux. Although considering the stoichiometry makes the problem harder, it makes it possible to obtain biologically reasonable precursor sets that we call stoichiometric. Recently a method to enumerate all minimal stoichiometric precursor sets was proposed in the literature. The relationship between topological and stoichiometric precursor sets had however not yet been studied. This relationship is highlighted here. We also present two algorithms that enumerate all minimal stoichiometric precursor sets. The first one is of theoretical interest only and is based on the above-mentioned relationship. The second approach solves a series of mixed integer linear programming problems. We compared the computed minimal precursor sets to experimentally obtained growth media of several Escherichia coli strains using genome-scale metabolic networks. The results show that the second approach efficiently enumerates minimal precursor sets taking stoichiometry into account, and allows for broad in silico studies of strains or species interactions that may help to understand e.g. pathotype and niche-specific metabolic capabilities. sasita is written in Java, uses cplex as LP solver and can be downloaded together with all networks and input files used in this paper at http://www.sasita.gforge.inria.fr.
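
    To make the notion of a minimal precursor set concrete, the sketch below enumerates minimal source sets on a toy network using topology only. It is an illustration under strong assumptions and is not the method of the paper: it uses plain forward propagation (a reaction fires once all its substrates are available), ignores stoichiometry, does not allow cycles to regenerate their own intermediates, and searches by brute force rather than by mixed integer linear programming. The network, metabolite names and function names are all hypothetical.

```python
from itertools import combinations


def reachable(reactions, seeds):
    """Forward propagation: repeatedly fire every reaction whose substrates
    are all available, until no new metabolite appears."""
    have = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= have and not set(products) <= have:
                have |= set(products)
                changed = True
    return have


def minimal_precursor_sets(reactions, sources, target):
    """Enumerate inclusion-minimal subsets of `sources` that reach `target`.
    Candidates are visited by increasing size, so any superset of an
    already accepted set can be skipped."""
    found = []
    for size in range(1, len(sources) + 1):
        for candidate in combinations(sorted(sources), size):
            cand = set(candidate)
            if any(smaller <= cand for smaller in found):
                continue
            if target in reachable(reactions, cand):
                found.append(cand)
    return found


# Toy network (hypothetical): A + B -> C, C -> T, D -> T.
toy_reactions = [(("A", "B"), ("C",)), (("C",), ("T",)), (("D",), ("T",))]
toy_result = minimal_precursor_sets(toy_reactions, {"A", "B", "D"}, "T")
```

    On this toy network the two minimal sets are {D} and {A, B}. A stoichiometric variant would additionally require a feasible flux producing the target, which is what the abstract's second algorithm checks with a series of optimization problems.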

  15. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    Directory of Open Access Journals (Sweden)

    Alamar Santiago

    2009-09-01

    Full Text Available Abstract Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion The new

  16. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    Science.gov (United States)

    Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A

    2009-01-01

    Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion The new EST collection denotes an

  17. DNA methylation polymorphism in a set of elite rice cultivars and its possible contribution to inter-cultivar differential gene expression.

    Science.gov (United States)

    Wang, Yongming; Lin, Xiuyun; Dong, Bo; Wang, Yingdian; Liu, Bao

    2004-01-01

    RAPD (randomly amplified polymorphic DNA) and ISSR (inter-simple sequence repeat) fingerprinting on HpaII/MspI-digested genomic DNA of nine elite japonica rice cultivars implies inter-cultivar DNA methylation polymorphism. Using both DNA fragments isolated from RAPD or ISSR gels and selected low-copy sequences as probes, methylation-sensitive Southern blot analysis confirms the existence of extensive DNA methylation polymorphism in both genes and DNA repeats among the rice cultivars. The cultivar-specific methylation patterns are stably maintained, and can be used as reliable molecular markers. Transcriptional analysis of four selected sequences (RdRP, AC9, HSP90 and MMR) on leaves and roots from normal and 5-azacytidine-treated seedlings of three representative cultivars shows an association between the transcriptional activity of one of the genes, the mismatch repair (MMR) gene, and its CG methylation patterns.

  18. Chronic activation of the epithelial immune system of the fruit fly's salivary glands has a negative effect on organismal growth and induces a peculiar set of target genes

    Directory of Open Access Journals (Sweden)

    Abdelsadik Ahmed

    2010-04-01

    Full Text Available Abstract Background Epithelial and especially mucosal immunity represents the first line of defence against the plethora of potential pathogens trying to invade via the gastrointestinal tract. The salivary glands of the fruit fly are an indispensable part of the gastrointestinal tract, but their contribution to the mucosal immunity has almost completely been neglected. Our major goal was to elucidate if the fly's salivary glands are able to mount an immune response and what the major characteristics of this immune response are. Results Ectopic activation of the IMD-pathway within the salivary gland cells is able to induce an immune response, indicating that the salivary glands are indeed immune competent. This reaction is characterized by the concurrent expression of numerous antimicrobial peptide genes. In addition, ectopic activation of the salivary gland's immune response induces morphological changes such as dwarfism throughout all developmental stages and a significantly decreased length of the salivary glands themselves. DNA-microarray analyses of the reaction revealed a complex pattern of up- and downregulated genes. Gene ontology analyses of regulated genes revealed a significant increase in genes associated with ribosomal and proteasomal function. On the other hand, genes coding for peptide receptors and some potassium channels are downregulated. In addition, the comparison of the transcriptional events induced following IMD-activation in the trachea and the salivary glands shows also only a small overlap, indicating that the general IMD-activated core transcriptome is rather small and that the tissue specific component of this response is dominating. Among the regulated genes, those that code for signaling associated protease activity are significantly modulated. Conclusions The salivary glands are immune-competent and they contribute to the overall intestinal immune system. Although they produce antimicrobial peptides, their overall

  19. Definition of the low molecular weight glutenin subunit gene family members in a set of standard bread wheat (Triticum aestivum L.) varieties

    Science.gov (United States)

    Low-molecular-weight glutenin subunits (LMW-GS) are a class of seed storage proteins that play a major role in the determination of the viscoelastic properties of wheat dough. Most of the LMW-GSs are encoded by a multi-gene family located on the short arms of the homoeologous group 1 chromosomes, at...

  20. Multiplex reverse transcription-polymerase chain reaction combined with on-chip electrophoresis as a rapid screening tool for candidate gene sets

    DEFF Research Database (Denmark)

    Wittig, Rainer; Salowsky, Rüdiger; Blaich, Stephanie

    2005-01-01

    Combining multiplex reverse transcription-polymerase chain reaction (mRT-PCR) with microfluidic amplicon analysis, we developed an assay for the rapid and reliable semiquantitative expression screening of 11 candidate genes for drug resistance in human malignant melanoma. The functionality of thi...

  1. Characterization of the CrbS/R Two-Component System in Pseudomonas fluorescens Reveals a New Set of Genes under Its Control and a DNA Motif Required for CrbR-Mediated Transcriptional Activation

    Directory of Open Access Journals (Sweden)

    Edgardo Sepulveda

    2017-11-01

    Full Text Available The CrbS/R system is a two-component signal transduction system that regulates acetate utilization in Vibrio cholerae, P. aeruginosa, and P. entomophila. CrbS is a hybrid histidine kinase that belongs to a recently identified family, in which the signaling domain is fused to an SLC5 solute symporter domain through a STAC domain. Upon activation by CrbS, CrbR activates transcription of the acs gene, which encodes an acetyl-CoA synthase (ACS), and the actP gene, which encodes an acetate/solute symporter. In this work, we characterized the CrbS/R system in Pseudomonas fluorescens SBW25. Through the quantitative proteome analysis of different mutants, we were able to identify a new set of genes under its control, which play an important role during growth on acetate. These results led us to the identification of a conserved DNA motif in the putative promoter region of acetate-utilization genes in the Gammaproteobacteria that is essential for the CrbR-mediated transcriptional activation of genes under acetate-utilizing conditions. Finally, we took advantage of the existence of a second SLC5-containing two-component signal transduction system in P. fluorescens, CbrA/B, to demonstrate that the activation of the response regulator by the histidine kinase is not dependent on substrate transport through the SLC5 domain.

  2. Gene expression analysis by cDNA-AFLP highlights a set of new signaling networks and translational control during seed dormancy breaking in Nicotiana plumbaginifolia.

    Science.gov (United States)

    Bove, Jérôme; Lucas, Philippe; Godin, Béatrice; Ogé, Laurent; Jullien, Marc; Grappin, Philippe

    2005-03-01

    Seed dormancy in Nicotiana plumbaginifolia is characterized by an abscisic acid accumulation linked to a pronounced germination delay. Dormancy can be released by a 1-year after-ripening treatment. Using a cDNA-amplified fragment length polymorphism (cDNA-AFLP) approach, we compared the gene expression patterns of dormant and after-ripened seeds, air-dry or during one day of imbibition, and analyzed 15,000 cDNA fragments. Among them, 1020 were found to be differentially regulated by dormancy. Of 412 sequenced cDNA fragments, 83 were assigned to a known function by similarity searches against public databases. The functional categories of the identified dormancy maintenance and breaking responsive genes give evidence that after-ripening switches the air-dry seed to a new developmental program that modulates, at the RNA level, components of translational control, signaling networks, transcriptional control and regulated proteolysis.

  3. Setting the pace: host rhythmic behaviour and gene expression patterns in the facultatively symbiotic cnidarian Aiptasia are determined largely by Symbiodinium.

    Science.gov (United States)

    Sorek, Michal; Schnytzer, Yisrael; Ben-Asher, Hiba Waldman; Caspi, Vered Chalifa; Chen, Chii-Shiarng; Miller, David J; Levy, Oren

    2018-05-09

    All organisms employ biological clocks to anticipate physical changes in the environment; however, the integration of biological clocks in symbiotic systems has received limited attention. In corals, the interpretation of rhythmic behaviours is complicated by the daily oscillations in tissue oxygen tension resulting from the photosynthetic and respiratory activities of the associated algal endosymbiont Symbiodinium. In order to better understand the integration of biological clocks in cnidarian hosts of Symbiodinium, daily rhythms of behaviour and gene expression were studied in symbiotic and aposymbiotic morphs of the sea-anemone Aiptasia diaphana. The results showed that whereas circatidal (approx. 12-h) cycles of activity and gene expression predominated in aposymbiotic morphs, circadian (approx. 24-h) patterns were the more common in symbiotic morphs, where the expression of a significant number of genes shifted from a 12- to 24-h rhythm. The behavioural experiments on symbiotic A. diaphana displayed diel (24-h) rhythmicity in body and tentacle contraction under the light/dark cycles, whereas aposymbiotic morphs showed approximately 12-h (circatidal) rhythmicity. Reinfection experiments represent an important step in understanding the hierarchy of endogenous clocks in symbiotic associations, where the aposymbiotic Aiptasia morphs returned to a 24-h behavioural rhythm after repopulation with algae. Whilst some modification of host metabolism is to be expected, the extent to which the presence of the algae modified host endogenous behavioural and transcriptional rhythms implies that it is the symbionts that influence the pace. Our results clearly demonstrate the importance of the endosymbiotic algae in determining the timing and the duration of the extension and contraction of the body and tentacles and temporal gene expression.

  4. Sampling gene diversity across the supergroup Amoebozoa: large EST data sets from Acanthamoeba castellanii, Hartmannella vermiformis, Physarum polycephalum, Hyperamoeba dachnaya and Hyperamoeba sp.

    Science.gov (United States)

    Watkins, Russell F; Gray, Michael W

    2008-04-01

    From comparative analysis of EST data for five taxa within the eukaryotic supergroup Amoebozoa, including two free-living amoebae (Acanthamoeba castellanii, Hartmannella vermiformis) and three slime molds (Physarum polycephalum, Hyperamoeba dachnaya and Hyperamoeba sp.), we obtained new broad-range perspectives on the evolution and biosynthetic capacity of this assemblage. Together with genome sequences for the amoebozoans Dictyostelium discoideum and Entamoeba histolytica, and including partial genome sequence available for A. castellanii, we used the EST data to identify genes that appear to be exclusive to the supergroup, and to specific clades therein. Many of these genes are likely involved in cell-cell communication or differentiation. In examining on a broad scale a number of characters that previously have been considered in simpler cross-species comparisons, typically between Dictyostelium and Entamoeba, we find that Amoebozoa as a whole exhibits striking variation in the number and distribution of biosynthetic pathways, for example, ones for certain critical stress-response molecules, including trehalose and mannitol. Finally, we report additional compelling cases of lateral gene transfer within Amoebozoa, further emphasizing that although this process has influenced genome evolution in all examined amoebozoan taxa, it has done so to a variable extent.

  5. Witnessing stressful events induces glutamatergic synapse pathway alterations and gene set enrichment of positive EPSP regulation within the VTA of adult mice: An ontology based approach

    Science.gov (United States)

    Brewer, Jacob S.

    It is well known that exposure to severe stress increases the risk for developing mood disorders. Currently, the neurobiological and genetic mechanisms underlying the functional effects of psychological stress are poorly understood. A major obstacle to the study of psychological stress is the inability of current animal models of stress to distinguish between physical and psychological stressors. A novel paradigm recently developed by Warren et al. is able to tease apart the effects of physical and psychological stress in adult mice by allowing these mice to "witness" the social defeat of another mouse, thus removing confounding variables associated with physical stressors. Using this "witness" model of stress and RNA-Seq technology, the current study aims to study the genetic effects of psychological stress. After the mice witnessed the social defeat of another mouse, VTA tissue was extracted, sequenced, and analyzed for differential expression. Since genes often work together in complex networks, a pathway and gene ontology (GO) analysis was performed using data from the differential expression analysis. The pathway and GO analyses revealed a perturbation of the glutamatergic synapse pathway and an enrichment of positive excitatory post-synaptic potential regulation. This is consistent with the excitatory synapse theory of depression. Together these findings demonstrate a dysregulation of the mesolimbic reward pathway at the gene level as a result of psychological stress, potentially contributing to depressive-like behaviors.

  6. Counting SET-free sets

    OpenAIRE

    Harman, Nate

    2016-01-01

    We consider the following counting problem related to the card game SET: How many $k$-element SET-free sets are there in an $n$-dimensional SET deck? Through a series of algebraic reformulations and reinterpretations, we show the answer to this question satisfies two polynomiality conditions.
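
    The counting problem in this abstract can be checked by brute force for very small parameters. The sketch below is only an illustration and is unrelated to the algebraic approach of the paper: it assumes the usual model of an $n$-dimensional SET deck as the $3^n$ vectors with coordinates in {0, 1, 2}, in which three distinct cards form a SET exactly when their coordinates sum to 0 modulo 3 in every position. The function names are hypothetical.

```python
from itertools import combinations, product


def is_set(a, b, c):
    """Three distinct cards form a SET when, in every coordinate, the values
    are all equal or all different, i.e. they sum to 0 modulo 3."""
    return all((x + y + z) % 3 == 0 for x, y, z in zip(a, b, c))


def count_set_free(n, k):
    """Count k-element subsets of the n-dimensional deck containing no SET.
    Exhaustive search: feasible only for very small n and k."""
    deck = list(product(range(3), repeat=n))
    count = 0
    for subset in combinations(deck, k):
        if not any(is_set(a, b, c) for a, b, c in combinations(subset, 3)):
            count += 1
    return count


# Counts for the 9-card, 2-dimensional deck and k = 0..5.
counts_n2 = [count_set_free(2, k) for k in range(6)]
```

    For the 2-dimensional deck this gives 1, 9, 36, 72, 54 and 0 for k = 0 to 5: the deck contains 12 SETs, so 84 - 12 = 72 triples are SET-free, and no SET-free set has more than four cards.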

  7. Reprogramming the body weight set point by a reciprocal interaction of hypothalamic leptin sensitivity and Pomc gene expression reverts extreme obesity

    Directory of Open Access Journals (Sweden)

    Kavaljit H. Chhabra

    2016-10-01

    Conclusions: Pomc reactivation in previously obese, calorie-restricted ArcPomc−/− mice normalized energy homeostasis, suggesting that their body weight set point was restored to control levels. In contrast, massively obese and hyperleptinemic ArcPomc−/− mice or those weight-matched and treated with PASylated leptin to maintain extreme hyperleptinemia prior to Pomc reactivation converged to an intermediate set point relative to lean control and obese ArcPomc−/− mice. We conclude that restoration of hypothalamic leptin sensitivity and Pomc expression is necessary for obese ArcPomc−/− mice to achieve and sustain normal metabolic homeostasis; whereas deficits in either parameter set a maladaptive allostatic balance that defends increased adiposity and body weight.

  8. A gene-environment investigation on personality traits in two independent clinical sets of adult patients with personality disorder and attention deficit/hyperactive disorder.

    Science.gov (United States)

    Jacob, Christian P; Nguyen, Thuy Trang; Dempfle, Astrid; Heine, Monika; Windemuth-Kieselbach, Christine; Baumann, Katarina; Jacob, Florian; Prechtl, Julian; Wittlich, Maike; Herrmann, Martin J; Gross-Lesch, Silke; Lesch, Klaus-Peter; Reif, Andreas

    2010-06-01

    While an interactive effect of genes with adverse life events is increasingly appreciated in current concepts of depression etiology, no data are presently available on interactions between genetic and environmental (G x E) factors with respect to personality and related disorders. The present study therefore aimed to detect main effects as well as interactions of serotonergic candidate genes (coding for the serotonin transporter, 5-HTT; the serotonin autoreceptor, HTR1A; and the enzyme which synthesizes serotonin in the brain, TPH2) with the burden of life events (#LE) in two independent samples consisting of 183 patients suffering from personality disorders and 123 patients suffering from adult attention deficit/hyperactivity disorder (aADHD). Simple analyses ignoring possible G x E interactions revealed no evidence for associations of either #LE or of the considered polymorphisms in 5-HTT and TPH2. Only the G allele of HTR1A rs6295 seemed to increase the risk of emotional-dramatic cluster B personality disorders (p = 0.019, in the personality disorder sample) and to decrease the risk of anxious-fearful cluster C personality disorders (p = 0.016, in the aADHD sample). We extended the initial simple model by taking a G x E interaction term into account, since this approach may better fit the data indicating that the effect of a gene is modified by stressful life events or, vice versa, that stressful life events only have an effect in the presence of a susceptibility genotype. By doing so, we observed nominal evidence for G x E effects as well as main effects of 5-HTT-LPR and the TPH2 SNP rs4570625 on the occurrence of personality disorders. Further replication studies, however, are necessary to validate the apparent complexity of G x E interactions in disorders of human personality.

  9. High Diversity of Myocyanophage in Various Aquatic Environments Revealed by High-Throughput Sequencing of Major Capsid Protein Gene With a New Set of Primers

    Directory of Open Access Journals (Sweden)

    Weiguo Hou

    2018-05-01

    Myocyanophages, a group of viruses infecting cyanobacteria, are abundant and play important roles in elemental cycling. Here we investigated the particle-associated viral communities retained on 0.2 μm filters and in sediment samples (representing ancient cyanophage communities) from four ocean and three lake locations, using high-throughput sequencing and a newly designed primer pair targeting a gene fragment (∼145 bp in length) encoding the cyanophage gp23 major capsid protein (MCP). Diverse viral communities were detected in all samples. Fragments of 142, 145, and 148 bp in length were most abundant in the amplicons, and most sequences (>92%) belonged to cyanophages. Additionally, different sequencing depths resulted in different diversity estimates of the viral community. Operational taxonomic units obtained from deep sequencing of the MCP gene covered the majority of those obtained from shallow sequencing, suggesting that deep sequencing provided a more complete picture of the cyanophage community than shallow sequencing. Our results also revealed a wide geographic distribution of marine myocyanophages, i.e., higher dissimilarities between myocyanophage communities corresponded with larger distances between the sampling sites. Collectively, this study suggests that the newly designed primer pair can be effectively used to study the community and diversity of myocyanophages from different environments, and that high-throughput sequencing is a good method for understanding viral diversity.

  10. Expression patterns of porcine Toll-like receptors family set of genes (TLR1-10) in gut-associated lymphoid tissues alter with age.

    Science.gov (United States)

    Uddin, Muhammad Jasim; Kaewmala, Kanokwan; Tesfaye, Dawit; Tholen, Ernst; Looft, Christian; Hoelker, Michael; Schellander, Karl; Cinar, Mehmet Ulas

    2013-08-01

    The aim was to study the expression pattern of the porcine TLR family (TLR1-10) genes in gut-associated lymphoid tissues (GALT) at different ages. A total of nine clinically healthy pigs in three age groups (1 day, 2 months and 5 months old) were selected for this experiment (three pigs in each group). Tissues from intestinal mucosa in stomach, duodenum, jejunum and ileum and mesenteric lymph node (MLN) were used. mRNA expression of TLRs (1-10) was detectable in all tissues, and TLR3 showed the highest mRNA abundance among the TLRs. TLR3 expression in stomach, and TLR1 and TLR6 expression in MLN, were higher in adult than in newborn pigs. The western blot results for TLR2, 3 and 9 did not, in some cases, coincide with the mRNA expression results. Protein localization of TLR2, 3 and 9 showed that TLR-expressing cells were abundant in the lamina propria and Peyer's patches in the intestine, and around and within the lymphoid follicles in the MLN. This expression study sheds the first light on the expression patterns of all TLR genes in GALT at different ages of pigs. Copyright © 2013 Elsevier Ltd. All rights reserved.

  11. Genomic characteristics comparisons of 12 food-related filamentous fungi in tRNA gene set, codon usage and amino acid composition.

    Science.gov (United States)

    Chen, Wanping; Xie, Ting; Shao, Yanchun; Chen, Fusheng

    2012-04-10

    Filamentous fungi are widely exploited in the food industry because they secrete large amounts of enzymes and metabolites. The recent availability of fungal genome sequences has provided an opportunity to explore the genomic characteristics of these food-related filamentous fungi. In this paper, we selected 12 filamentous fungi representative of food processing and safety (Aspergillus clavatus, A. flavus, A. fumigatus, A. nidulans, A. niger, A. oryzae, A. terreus, Monascus ruber, Neurospora crassa, Penicillium chrysogenum, Rhizopus oryzae and Trichoderma reesei) and compared their genomic characteristics in terms of tRNA gene distribution, codon usage pattern and amino acid composition. The results showed that copy numbers differed greatly among isoaccepting tRNA genes and that their distribution seemed to be related to the translation process. The results also revealed that genome compositional variation probably constrained the base choice at the third codon position and affected the overall amino acid composition, but seemed to have little effect on the integrated physicochemical characteristics of the overall amino acids. Further analysis suggested that wobble pairing and base modification were important mechanisms in codon-anticodon interaction. To the authors' knowledge, this is the first report on the genomic characteristics of food-related filamentous fungi; it should be informative for the analysis of filamentous fungal genome evolution and for practical applications in the food industry. Copyright © 2012 Elsevier B.V. All rights reserved.

  12. Toolbox Approaches Using Molecular Markers and 16S rRNA Gene Amplicon Data Sets for Identification of Fecal Pollution in Surface Water.

    Science.gov (United States)

    Ahmed, W; Staley, C; Sadowsky, M J; Gyawali, P; Sidhu, J P S; Palmer, A; Beale, D J; Toze, S

    2015-10-01

    In this study, host-associated molecular markers and bacterial 16S rRNA gene community analysis using high-throughput sequencing were used to identify the sources of fecal pollution in environmental waters in Brisbane, Australia. A total of 92 fecal and composite wastewater samples were collected from different host groups (cat, cattle, dog, horse, human, and kangaroo), and 18 water samples were collected from six sites (BR1 to BR6) along the Brisbane River in Queensland, Australia. Bacterial communities in the fecal, wastewater, and river water samples were sequenced. Water samples were also tested for the presence of bird-associated (GFD), cattle-associated (CowM3), horse-associated, and human-associated (HF183) molecular markers, to provide multiple lines of evidence regarding the possible presence of fecal pollution associated with specific hosts. Among the 18 water samples tested, 83%, 33%, 17%, and 17% were real-time PCR positive for the GFD, HF183, CowM3, and horse markers, respectively. Among the potential sources of fecal pollution in water samples from the river, DNA sequencing tended to show relatively small contributions from wastewater treatment plants (up to 13% of sequence reads). Contributions from other animal sources were rarely detected and were very small (molecular markers showed variable agreement. A lack of relationships among fecal indicator bacteria, host-associated molecular markers, and 16S rRNA gene community analysis data was also observed. Nonetheless, we show that bacterial community and host-associated molecular marker analyses can be combined to identify potential sources of fecal pollution in an urban river. This study is a proof of concept, and based on the results, we recommend using bacterial community analysis (where possible) along with PCR detection or quantification of host-associated molecular markers to provide information on the sources of fecal pollution in waterways. 
    Copyright © 2015, American Society for Microbiology

  13. A Molecular Approach to Nested RT-PCR Using a New Set of Primers for the Detection of the Human Immunodeficiency Virus Protease Gene.

    Science.gov (United States)

    Zarei, Mohammad; Ravanshad, Mehrdad; Bagban, Ashraf; Fallahi, Shahab

    2016-07-01

    Human immunodeficiency virus type 1 (HIV-1) is the etiologic agent of AIDS. The infection can be transmitted via blood in the window period before antibodies to the virus develop. Thus, an appropriate method for the detection of HIV-1 during this window period is very important. This descriptive study proposes a sensitive, efficient, inexpensive, and easy method to detect HIV-1. In this study, 25 serum samples from patients under treatment, as well as 10 positive and 10 negative control samples, were studied. The 25 patient blood samples were obtained from HIV-1-infected individuals who were receiving treatment at the acquired immune deficiency syndrome (AIDS) research center of Imam Khomeini hospital in Tehran. HIV-1-positive samples were identified by using reverse transcription to produce complementary deoxyribonucleic acid (cDNA) and then optimizing the nested polymerase chain reaction (PCR) method. Two pairs of primers were designed specifically for the protease gene fragment amplified by nested reverse transcription PCR (RT-PCR). Electrophoresis was used to examine the PCR products. The results were analyzed using statistical tests, including Fisher's exact test, in SPSS 17 software. The 325 bp band of the protease gene was observed in all the positive control samples and in none of the negative control samples. The proposed method correctly identified HIV-1 in 23 of the 25 samples. These results suggest that, in comparison with viral culture, antibody detection by enzyme-linked immunosorbent assay (ELISA), and conventional PCR methods, the proposed method has high sensitivity and specificity for the detection of HIV-1.
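
    The detection figures quoted in the abstract can be turned into proportions with a few lines of arithmetic. The sketch below is only an illustration: it assumes that all 25 patient samples are true positives (the abstract describes them as coming from HIV-1-infected individuals), and the function and variable names are hypothetical.

```python
from fractions import Fraction


def proportion(hits: int, total: int) -> Fraction:
    """Return hits/total as an exact fraction."""
    return Fraction(hits, total)


# Counts taken from the abstract; treating every patient sample as a
# true positive is an assumption made for this illustration.
patient_detection = proportion(23, 25)        # patient samples detected
positive_control_rate = proportion(10, 10)    # positive controls with the 325 bp band
negative_control_rate = proportion(10, 10)    # negative controls without the band

for label, value in [
    ("patient samples detected", patient_detection),
    ("positive controls detected", positive_control_rate),
    ("negative controls clean", negative_control_rate),
]:
    print(f"{label}: {float(value):.0%}")
```

    Under that assumption the method detected 92% of the patient samples, with both control groups behaving as expected.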

  14. Automatic sets and Delone sets

    International Nuclear Information System (INIS)

    Barbe, A; Haeseler, F von

    2004-01-01

    Automatic sets $D \subset \mathbb{Z}^m$ are characterized by having a finite number of decimations. They are equivalently generated by fixed points of certain substitution systems, or by certain finite automata. As examples, two-dimensional versions of the Thue-Morse, Baum-Sweet, Rudin-Shapiro and paperfolding sequences are presented. We give a necessary and sufficient condition for an automatic set $D \subset \mathbb{Z}^m$ to be a Delone set in $\mathbb{R}^m$. The result is then extended to automatic sets that are defined as fixed points of certain substitutions. The morphology of automatic sets is discussed by means of examples.
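
    The equivalence between substitution fixed points and finite automata can be illustrated with the classical one-dimensional Thue-Morse sequence. The sketch below is not taken from the paper (which treats subsets of $\mathbb{Z}^m$ and two-dimensional examples); it is a minimal one-dimensional illustration, and all function names are hypothetical. It builds the sequence by iterating the substitution 0 -> 01, 1 -> 10, recomputes each term with a two-state automaton reading binary digits, and then inspects the gaps between positions holding a 1 on a finite window.

```python
def thue_morse_by_substitution(iterations: int) -> list[int]:
    """Iterate the substitution 0 -> 01, 1 -> 10, starting from [0]."""
    rules = {0: [0, 1], 1: [1, 0]}
    word = [0]
    for _ in range(iterations):
        word = [symbol for letter in word for symbol in rules[letter]]
    return word


def thue_morse_by_automaton(n: int) -> int:
    """Two-state automaton on the binary digits of n: reading a 1 flips the state."""
    state = 0
    while n > 0:
        state ^= n & 1
        n >>= 1
    return state


word = thue_morse_by_substitution(6)  # first 64 terms
positions = [i for i, value in enumerate(word) if value == 1]
gaps = [b - a for a, b in zip(positions, positions[1:])]

print(word[:16])
print("smallest gap:", min(gaps), "largest gap:", max(gaps))
```

    On this window the gaps between consecutive points stay between 1 and 3. Bounded gaps of this kind are the one-dimensional counterpart of the relative density required of a Delone set, although a finite check like this one proves nothing about the infinite set.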

  15. A common multiple cloning site in a set of vectors for expression of eukaryotic genes in mammalian, insect and bacterial cells

    DEFF Research Database (Denmark)

    Pallisgaard, N; Pedersen, FS; Birkelund, Svend

    1994-01-01

    ... a start Met codon was included in the same reading frame as in lambda gt11Sfi-Not to support expression of partial cDNA clones. Thus a cDNA insert of lambda gt11Sfi-Not could be shuttled among the new vectors for expression. The other set of vectors, without a start codon, was suitable for expression of cDNA carrying its own start Met codon. By Western blot analysis and by transactivation of a reporter plasmid in co-transfections we show that cDNA is very efficiently expressed in NIH 3T3 cells under control of the elongation factor 1 alpha promoter....

  16. RRM domain of Arabidopsis splicing factor SF1 is important for pre-mRNA splicing of a specific set of genes

    KAUST Repository

    Lee, Keh Chien

    2017-04-11

    The RNA recognition motif of Arabidopsis splicing factor SF1 affects the alternative splicing of FLOWERING LOCUS M pre-mRNA and a heat shock transcription factor HsfA2 pre-mRNA. Splicing factor 1 (SF1) plays a crucial role in 3′ splice site recognition by binding directly to the intron branch point. Although plant SF1 proteins possess an RNA recognition motif (RRM) domain that is absent in their fungal and metazoan counterparts, the role of the RRM domain in SF1 function has not been characterized. Here, we show that the RRM domain differentially affects the full function of the Arabidopsis thaliana AtSF1 protein under different experimental conditions. For example, deletion of the RRM domain influences AtSF1-mediated control of flowering time, but not the abscisic acid sensitivity response during seed germination. The alternative splicing of FLOWERING LOCUS M (FLM) pre-mRNA is involved in flowering time control. We found that the RRM domain of the AtSF1 protein alters the production of alternatively spliced FLM-β transcripts. We also found that the RRM domain affects the alternative splicing of a heat shock transcription factor HsfA2 pre-mRNA, thereby mediating the heat stress response. Taken together, our results suggest the importance of the RRM domain for AtSF1-mediated alternative splicing of a subset of genes involved in the regulation of flowering and adaptation to heat stress.

  17. Bioinformatic Description of Immunotherapy Targets for Pediatric T-Cell Leukemia and the Impact of Normal Gene Sets Used for Comparison

    Directory of Open Access Journals (Sweden)

    Rimas J Orentas

    2014-06-01

    Pediatric lymphoid leukemia has the highest cure rate of all pediatric malignancies, yet due to its prevalence, it still accounts for the majority of childhood cancer deaths and requires long-term, highly toxic therapy. The ability to target B-cell ALL with immunoglobulin-like binders, whether anti-CD22 antibody or anti-CD19 CAR-Ts, has impacted treatment options for some patients. The development of new ways to target B cell antigens continues at a rapid pace. T-cell ALL accounts for up to 20% of childhood leukemia but has yet to see a set of high-value immunotherapeutic targets identified. To find new targets for T-ALL immunotherapy, we employed a bioinformatic comparison to broad normal tissue arrays, hematopoietic stem cells (HSC), and mature lymphocytes, then filtered the results for transcripts encoding plasma membrane proteins. T-ALL bears a core T cell signature, and transcripts encoding TCR/CD3 components and canonical markers of T cell development predominate, especially when comparison was made to normal tissue or HSC. However, when comparison to mature lymphocytes was also undertaken, we identified two antigens that may drive, or be associated with, leukemogenesis: TALLA-1 and hedgehog interacting protein (HHIP). In addition, TCR subfamilies, CD1, activation and adhesion markers, membrane organizing molecules, and receptors linked to metabolism and inflammation were also identified. Of these, only CD52, CD37, and CD98 are currently being targeted clinically. This work provides a set of targets to be considered for future development of immunotherapies for T-ALL.

  18. Codominant expression of genes coding for different sets of inducible salivary polypeptides associated with parotid hypertrophy in two inbred mouse strains.

    Science.gov (United States)

    López-Solís, Remigio O; Kemmerling, Ulrike

    2005-05-01

    Experimental mouse parotid hypertrophy has been associated with the expression of a number of isoproterenol-induced salivary proline-rich polypeptides (IISPs). Mouse salivary proline-rich proteins (PRPs) have been mapped to both chromosomes 6 and 8. Recently, mice of two inbred strains (A/Snell and A. Swiss) have been found to differ drastically in the IISPs. In this study, mice of both strains were used in cross-breeding experiments designed to define the pattern of inheritance of the IISP phenotype and to establish whether the IISPs are coded on a single chromosome or on several. The IISP phenotype of individual mice was assessed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) of whole saliva collected after three daily stimulations by isoproterenol. Parental A/Snell and A. Swiss mice were homogeneous for distinctive strain-associated IISP patterns. First filial generation (F1) mice obtained from the cross of A/Snell with A. Swiss mice expressed, without exception, both the A/Snell and A. Swiss IISPs (coexpression). In the second filial generation (F2), both parental IISP phenotypes reappeared together with a majority of mice expressing the F1-hybrid phenotype (1:2:1 ratio). Backcrosses of F1 x A/Snell and F1 x A. Swiss produced offspring displaying the F1 and the corresponding parental phenotypes in a 1:1 ratio. No recombinants were observed among F2 mice or among mice resulting from backcrosses. Thus, genes coding for the IISPs that are expressed differentially in both mouse strains are located on the same chromosome, probably at the same locus (alleles) or at quite closely linked loci (nonalleles). 2005 Wiley-Liss, Inc.
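
    The 1:2:1 and 1:1 ratios reported above are what a single codominant locus (or a block of tightly linked loci inherited as a unit) would predict. A minimal sketch of that expectation follows; the allele labels "A" (A/Snell) and "S" (A. Swiss) and the function name are hypothetical, chosen only for illustration, and no data from the study are used.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> Counter:
    """Count offspring genotypes over all equally likely gamete pairings."""
    return Counter(
        "".join(sorted(gamete1 + gamete2))
        for gamete1, gamete2 in product(parent1, parent2)
    )


# Hypothetical genotypes: "AA" for A/Snell, "SS" for A. Swiss.
f1 = cross("AA", "SS")               # every F1 mouse is the "AS" hybrid
f2 = cross("AS", "AS")               # F1 x F1
backcross_snell = cross("AS", "AA")  # F1 x A/Snell
backcross_swiss = cross("AS", "SS")  # F1 x A. Swiss

print("F1:", dict(f1))
print("F2:", dict(f2))
print("F1 x A/Snell:", dict(backcross_snell))
print("F1 x A. Swiss:", dict(backcross_swiss))
```

    With codominant expression each genotype has its own visible phenotype, so the genotype counts translate directly into the phenotype ratios described in the abstract.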

  19. Genome-wide association analysis for heat tolerance at flowering detected a large set of genes involved in adaptation to thermal and other stresses.

    Directory of Open Access Journals (Sweden)

    Tanguy Lafarge

    Fertilization sensitivity to heat in rice is a major issue within climate change scenarios in the tropics. A panel of 167 indica landraces and improved varieties was phenotyped for spikelet sterility (SPKST) under 38°C during anthesis and for several secondary traits potentially affecting panicle micro-climate and thus the fertilization process. The panel was genotyped with an average density of one marker per 29 kb using genotyping by sequencing. Genome-wide association analyses (GWAS) were conducted using three methods based on single marker regression, haplotype regression and simultaneous fitting of all markers, respectively. Fourteen loci significantly associated with SPKST under at least two GWAS methods were detected. A large number of associations were also detected for the secondary traits. Analysis of co-localization of SPKST-associated loci with QTLs detected in progenies of bi-parental crosses reported in the literature made it possible to narrow down the position of eight of those QTLs, including the most documented one, qHTSF4.1. Gene families underlying loci associated with SPKST corresponded to functions ranging from sensing abiotic stresses and regulating plant response, such as wall-associated kinases and heat shock proteins, to cell division and gametophyte development. Analysis of diversity in the vicinity of loci associated with SPKST within the rice three thousand genomes revealed widespread distribution of the favourable alleles across O. sativa genetic groups. However, few accessions combined the favourable alleles at all loci. Effective donors included the heat-tolerant variety N22 and some Indian and Taiwanese varieties. These results provide a basis for breeding for heat tolerance during anthesis and for functional validation of major loci governing this trait.

  20. Niche-Specific Requirement for Hyphal Wall protein 1 in Virulence of Candida albicans

    Science.gov (United States)

    Staab, Janet F.; Datta, Kausik; Rhee, Peter

    2013-01-01

    Specialized Candida albicans cell surface proteins called adhesins mediate binding of the fungus to host cells. The mammalian transglutaminase (TG) substrate and adhesin, Hyphal wall protein 1 (Hwp1), is expressed on the hyphal form of C. albicans, where it mediates fungal adhesion to epithelial cells. Hwp1 is also required for biofilm formation and mating; thus the protein functions in both fungal-host and self-interactions. Hwp1 is required for full virulence of C. albicans in murine models of disseminated candidiasis and of esophageal candidiasis. Previous studies correlated TG activity on the surface of oral epithelial cells, produced by epithelial TG (TG1), with tight binding of C. albicans via Hwp1 to the host cell surfaces. However, the contribution of other TGs, specifically tissue TG (TG2), to disseminated candidiasis mediated by Hwp1 was not known. A newly created hwp1 null strain in the wild type SC5314 background was as virulent as the parental strain in C57BL/6 mice, and virulence was retained in C57BL/6 mice deleted for Tgm2 (TG2). Further, the hwp1 null strains displayed modestly reduced virulence in BALB/c mice, as did strain DD27-U1, an independently created hwp1Δ/Δ in CAI4 corrected for its ura3Δ defect at the URA3 locus. Hwp1 was still needed to produce wild type biofilms and to persist on murine tongues in an oral model of oropharyngeal candidiasis, consistent with previous studies by us and others. Finally, lack of Hwp1 affected the translocation of C. albicans from the mouse intestine into the bloodstream of mice. Taken together, Hwp1 appears to have a minor role in disseminated candidiasis, independent of tissue TG, but a key function in host- and self-association to the surface of oral mucosa. PMID:24260489

  1. Influence of Niche-Specific Nutrients on Secondary Metabolism in Vibrionaceae

    DEFF Research Database (Denmark)

    Giubergia, Sonia; Phippen, Christopher; Gotfredsen, Charlotte Held

    2016-01-01

    Many factors, such as the substrate and the growth phase, influence biosynthesis of secondary metabolites in microorganisms. Therefore, it is crucial to consider these factors when establishing a bioprospecting strategy. Mimicking the conditions of the natural environment has been suggested as a means of inducing or influencing microbial secondary metabolite production. The purpose of the present study was to determine how the bioactivity of Vibrionaceae was influenced by carbon sources typical of their natural environment. We determined how mannose and chitin, compared to glucose, influenced... was responsible for the antibacterial activity of Vibrio furnissii and Vibrio fluvialis. These results suggest a role of chitin in the regulation of secondary metabolism in vibrios and demonstrate that considering bacterial ecophysiology during development of screening strategies will facilitate bioprospecting...

  2. Stem cell niche-specific Ebf3 maintains the bone marrow cavity.

    Science.gov (United States)

    Seike, Masanari; Omatsu, Yoshiki; Watanabe, Hitomi; Kondoh, Gen; Nagasawa, Takashi

    2018-03-01

    Bone marrow is the tissue filling the space between bone surfaces. Hematopoietic stem cells (HSCs) are maintained by special microenvironments known as niches within bone marrow cavities. Mesenchymal cells, termed CXC chemokine ligand 12 (CXCL12)-abundant reticular (CAR) cells or leptin receptor-positive (LepR+) cells, are a major cellular component of HSC niches that gives rise to osteoblasts in bone marrow. However, it remains unclear how osteogenesis is prevented in most CAR/LepR+ cells to maintain HSC niches and marrow cavities. Here, using lineage tracing, we found that the transcription factor early B-cell factor 3 (Ebf3) is preferentially expressed in CAR/LepR+ cells and that Ebf3-expressing cells are self-renewing mesenchymal stem cells in adult marrow. When Ebf3 is deleted in CAR/LepR+ cells, HSC niche function is severely impaired, and bone marrow is osteosclerotic with increased bone in aged mice. In mice lacking Ebf1 and Ebf3, CAR/LepR+ cells exhibiting a normal morphology are abundantly present, but their niche function is markedly impaired with depleted HSCs in infant marrow. Subsequently, the mutants become progressively more osteosclerotic, leading to the complete occlusion of marrow cavities in early adulthood. CAR/LepR+ cells differentiate into bone-producing cells with reduced HSC niche factor expression in the absence of Ebf1/Ebf3. Thus, HSC cellular niches express Ebf3 that is required to create HSC niches, to inhibit their osteoblast differentiation, and to maintain spaces for HSCs. © 2018 Seike et al.; Published by Cold Spring Harbor Laboratory Press.

  3. Genes and Gene Therapy

    Science.gov (United States)

    ... correctly, a child can have a genetic disorder. Gene therapy is an experimental technique that uses genes to ... or prevent disease. The most common form of gene therapy involves inserting a normal gene to replace an ...

  4. Essential Bacillus subtilis genes

    DEFF Research Database (Denmark)

    Kobayashi, K.; Ehrlich, S.D.; Albertini, A.

    2003-01-01

    To estimate the minimal gene set required to sustain bacterial life in nutritious conditions, we carried out a systematic inactivation of Bacillus subtilis genes. Among the approximately 4,100 genes of the organism, only 192 were shown to be indispensable by this or previous work. Another 79 genes were predicted to be essential. The vast majority of essential genes were categorized in relatively few domains of cell metabolism, with about half involved in information processing, one-fifth involved in the synthesis of cell envelope and the determination of cell shape and division, and one-tenth related to cell energetics. Only 4% of essential genes encode unknown functions. Most essential genes are present throughout a wide range of Bacteria, and almost 70% can also be found in Archaea and Eucarya. However, essential genes related to cell envelope, shape, division, and respiration tend to be lost from...

  5. Multidrug-Resistant CTX-M-(15, 9, 2)- and KPC-2-Producing Enterobacter hormaechei and Enterobacter asburiae Isolates Possessed a Set of Acquired Heavy Metal Tolerance Genes Including a Chromosomal sil Operon (for Acquired Silver Resistance).

    Science.gov (United States)

    Andrade, Leonardo N; Siqueira, Thiago E S; Martinez, Roberto; Darini, Ana Lucia C

    2018-01-01

    Bacterial resistance to antibiotics is a concern in healthcare-associated infections. On the other hand, bacterial tolerance to other antimicrobials, like heavy metals, has been neglected and underestimated in hospital pathogens. Silver has long been used as an antimicrobial agent and it seems to be an important indicator of heavy metal tolerance. To explore this perspective, we searched for the presence of acquired silver resistance genes (sil operon: silE, silS, silR, silC, silF, silB, silA, and silP) and acquired extended-spectrum cephalosporin and carbapenem resistance genes (blaCTX-M and blaKPC) in Enterobacter cloacae Complex (EcC) (n = 27) and Enterobacter aerogenes (n = 8) isolated from inpatients at a general hospital. Moreover, the genetic background of silA (silver-efflux pump) and the presence of other acquired heavy metal tolerance genes, pcoD (copper-efflux pump), arsB (arsenite-efflux pump), terF (tellurite resistance protein), and merA (mercuric reductase), were also investigated. Notably, 21/27 (78%) EcC isolates harbored a silA gene located on the chromosome. A complete sil operon was found in 19/21 silA-positive EcC isolates. Interestingly, 8/20 (40%) E. hormaechei and 5/6 (83%) E. asburiae co-harbored silA/pcoD genes and blaCTX-M-(15, 2, or 9) and/or blaKPC-2 genes. Frequent occurrences of arsB, terF, and merA genes were detected, especially in silA/pcoD-positive, multidrug-resistant (MDR) and/or CTX-M-producing isolates. Our study showed co-presence of antibiotic and heavy metal tolerance genes in MDR EcC isolates. In our view, there are few studies regarding bacterial heavy metal tolerance, and we call for more investigation and discussion of this issue in different hospital pathogens.

  6. Multidrug-Resistant CTX-M-(15, 9, 2)- and KPC-2-Producing Enterobacter hormaechei and Enterobacter asburiae Isolates Possessed a Set of Acquired Heavy Metal Tolerance Genes Including a Chromosomal sil Operon (for Acquired Silver Resistance)

    Directory of Open Access Journals (Sweden)

    Leonardo N. Andrade

    2018-03-01

    Bacterial resistance to antibiotics is a concern in healthcare-associated infections. On the other hand, bacterial tolerance to other antimicrobials, like heavy metals, has been neglected and underestimated in hospital pathogens. Silver has long been used as an antimicrobial agent and it seems to be an important indicator of heavy metal tolerance. To explore this perspective, we searched for the presence of acquired silver resistance genes (sil operon: silE, silS, silR, silC, silF, silB, silA, and silP) and acquired extended-spectrum cephalosporin and carbapenem resistance genes (blaCTX-M and blaKPC) in Enterobacter cloacae Complex (EcC) (n = 27) and Enterobacter aerogenes (n = 8) isolated from inpatients at a general hospital. Moreover, the genetic background of silA (silver-efflux pump) and the presence of other acquired heavy metal tolerance genes, pcoD (copper-efflux pump), arsB (arsenite-efflux pump), terF (tellurite resistance protein), and merA (mercuric reductase), were also investigated. Notably, 21/27 (78%) EcC isolates harbored a silA gene located on the chromosome. A complete sil operon was found in 19/21 silA-positive EcC isolates. Interestingly, 8/20 (40%) E. hormaechei and 5/6 (83%) E. asburiae co-harbored silA/pcoD genes and blaCTX-M-(15, 2, or 9) and/or blaKPC-2 genes. Frequent occurrences of arsB, terF, and merA genes were detected, especially in silA/pcoD-positive, multidrug-resistant (MDR) and/or CTX-M-producing isolates. Our study showed co-presence of antibiotic and heavy metal tolerance genes in MDR EcC isolates. In our view, there are few studies regarding bacterial heavy metal tolerance, and we call for more investigation and discussion of this issue in different hospital pathogens.

  7. Gene Ontology

    Directory of Open Access Journals (Sweden)

    Gaston K. Mazandu

    2012-01-01

    The wide coverage and biological relevance of the Gene Ontology (GO), confirmed through its successful use in protein function prediction, have led to the growth in its popularity. In order to exploit the extent of biological knowledge that GO offers in describing genes or groups of genes, there is a need for an efficient, scalable similarity measure for GO terms and GO-annotated proteins. While several GO similarity measures exist, none adequately addresses all issues surrounding the design and usage of the ontology. We introduce a new metric for measuring the distance between two GO terms using the intrinsic topology of the GO-DAG, thus enabling the measurement of functional similarities between proteins based on their GO annotations. We assess the performance of this metric using a ROC analysis on human protein-protein interaction datasets and correlation coefficient analysis on the selected set of protein pairs from the CESSM online tool. This metric achieves good performance compared to the existing annotation-based GO measures. We used this new metric to assess functional similarity between orthologues, and show that it is effective at determining whether orthologues are annotated with similar functions and identifying cases where annotation is inconsistent between orthologues.
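
    To make the idea of a topology-based term similarity concrete, here is a minimal sketch on a toy directed acyclic graph. It is not the metric introduced in the paper: it swaps in a much simpler stand-in, the Jaccard overlap of ancestor sets, purely to show how similarity can be read off the graph structure alone. The term names and the parent map are hypothetical and do not correspond to real GO identifiers.

```python
def ancestors(term: str, parents: dict[str, list[str]]) -> set[str]:
    """Return the term together with all of its ancestors in the DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def ancestor_jaccard(a: str, b: str, parents: dict[str, list[str]]) -> float:
    """Shared ancestors divided by all ancestors of either term."""
    anc_a, anc_b = ancestors(a, parents), ancestors(b, parents)
    return len(anc_a & anc_b) / len(anc_a | anc_b)


# Hypothetical toy ontology: each term maps to its direct parents.
toy_dag = {
    "root": [],
    "binding": ["root"],
    "catalysis": ["root"],
    "dna_binding": ["binding"],
    "rna_binding": ["binding"],
    "kinase": ["catalysis", "binding"],  # a term with two parents
}

print(ancestor_jaccard("dna_binding", "rna_binding", toy_dag))
print(ancestor_jaccard("dna_binding", "kinase", toy_dag))
```

    Sibling terms that share a specific parent score higher than terms that only meet near the root, which is the qualitative behaviour any topology-based measure is expected to show.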

  8. UpSet: Visualization of Intersecting Sets

    Science.gov (United States)

    Lex, Alexander; Gehlenborg, Nils; Strobelt, Hendrik; Vuillemot, Romain; Pfister, Hanspeter

    2016-01-01

    Understanding relationships between sets is an important analysis task that has received widespread attention in the visualization community. The major challenge in this context is the combinatorial explosion of the number of set intersections if the number of sets exceeds a trivial threshold. In this paper we introduce UpSet, a novel visualization technique for the quantitative analysis of sets, their intersections, and aggregates of intersections. UpSet is focused on creating task-driven aggregates, communicating the size and properties of aggregates and intersections, and a duality between the visualization of the elements in a dataset and their set membership. UpSet visualizes set intersections in a matrix layout and introduces aggregates based on groupings and queries. The matrix layout enables the effective representation of associated data, such as the number of elements in the aggregates and intersections, as well as additional summary statistics derived from subset or element attributes. Sorting according to various measures enables a task-driven analysis of relevant intersections and aggregates. The elements represented in the sets and their associated attributes are visualized in a separate view. Queries based on containment in specific intersections, aggregates or driven by attribute filters are propagated between both views. We also introduce several advanced visual encodings and interaction methods to overcome the problems of varying scales and to address scalability. UpSet is web-based and open source. We demonstrate its general utility in multiple use cases from various domains. PMID:26356912
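
    The data behind a matrix layout of set intersections can be sketched in a few lines. The example below is an illustration only, not code from the UpSet tool: it assumes that each matrix row stands for an exclusive intersection (the elements that belong to exactly the listed sets), and the set names, contents and function names are hypothetical.

```python
from collections import Counter


def exclusive_intersections(sets: dict[str, set]) -> Counter:
    """Count elements by the exact combination of sets they belong to."""
    membership: dict = {}
    for name, members in sets.items():
        for element in members:
            membership.setdefault(element, set()).add(name)
    return Counter(frozenset(names) for names in membership.values())


def matrix_rows(sets: dict[str, set]) -> list[tuple[str, int]]:
    """One row per non-empty combination: a membership pattern and its size."""
    names = sorted(sets)
    counts = exclusive_intersections(sets)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], sorted(item[0])))
    return [
        ("".join("x" if name in combo else "." for name in names), size)
        for combo, size in ordered
    ]


# Hypothetical example data.
example = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {4, 5, 6, 7}}

counts = exclusive_intersections(example)
rows = matrix_rows(example)
for pattern, size in rows:
    print(pattern, size)
```

    With n sets there can be up to 2^n - 1 non-empty combinations, which is the combinatorial explosion the abstract refers to; sorting rows by size, as done here, is one simple way to surface the largest intersections first.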

  9. Discovering genes underlying QTL

    Energy Technology Data Exchange (ETDEWEB)

    Vanavichit, Apichart [Kasetsart University, Kamphaengsaen, Nakorn Pathom (Thailand)]

    2002-02-01

    A map-based approach has allowed scientists to discover only a few genes at a time. In addition, the reproductive barrier between cultivated rice and wild relatives has prevented us from utilizing the germ plasm by a map-based approach. Most genetic traits important to agriculture or human diseases are manifested as observable, quantitative phenotypes called Quantitative Trait Loci (QTL). In many instances, the complexity of the phenotype/genotype interaction and the general lack of clearly identifiable gene products render the direct molecular cloning approach ineffective; thus additional strategies like genome mapping are required to identify the QTL in question. Genome mapping requires no prior knowledge of the gene function, but utilizes statistical methods to identify the most likely gene location. To completely characterize genes of interest, the initially mapped region of a gene location will have to be narrowed down to a size that is suitable for cloning and sequencing. Strategies for gene identification within the critical region have to be applied after the sequencing of a potentially large clone or set of clones that contains the gene(s). Tremendous success of positional cloning has been shown for cloning many genes responsible for human diseases, including cystic fibrosis and muscular dystrophy as well as plant disease resistance genes. Genome and QTL mapping, positional cloning: the pre-genomics era, comparative approaches to gene identification, and positional cloning: the genomics era are discussed in the report. (M. Suetake)

  10. Gene Therapy

    Science.gov (United States)

    Gene therapy Overview Gene therapy involves altering the genes inside your body's cells in an effort to treat or stop disease. Genes contain your ... that don't work properly can cause disease. Gene therapy replaces a faulty gene or adds a new ...

  11. A new set of ESTs from chickpea (Cicer arietinum L.) embryo reveals two novel F-box genes, CarF-box_PP2 and CarF-box_LysM, with potential roles in seed development.

    Directory of Open Access Journals (Sweden)

    Shefali Gupta

    Considering the economic importance of chickpea (C. arietinum L.) seeds, it is important to understand the mechanisms underlying seed development for which a cDNA library was constructed from 6 day old chickpea embryos. A total of 8,186 ESTs were obtained from which 4,048 high quality ESTs were assembled into 1,480 unigenes that majorly encoded genes involved in various metabolic and regulatory pathways. Of these, 95 ESTs were found to be involved in ubiquitination related protein degradation pathways and 12 ESTs coded specifically for putative F-box proteins. Differential transcript accumulation of these putative F-box genes was observed in chickpea tissues as evidenced by quantitative real-time PCR. Further, to explore the role of F-box proteins in chickpea seed development, two F-box genes were selected for molecular characterization. These were named as CarF-box_PP2 and CarF-box_LysM depending on their C-terminal domains, PP2 and LysM, respectively. Their highly conserved structures led us to predict their target substrates. Subcellular localization experiment revealed that CarF-box_PP2 was localized in the cytoplasm and CarF-box_LysM was localized in the nucleus. We demonstrated their physical interactions with SKP1 protein, which validated that they function as F-box proteins in the formation of SCF complexes. Sequence analysis of their promoter regions revealed certain seed specific cis-acting elements that may be regulating their preferential transcript accumulation in the seed. Overall, the study helped in expanding the EST database of chickpea, which was further used to identify two novel F-box genes having a potential role in seed development.

  12. A new set of ESTs from chickpea (Cicer arietinum L.) embryo reveals two novel F-box genes, CarF-box_PP2 and CarF-box_LysM, with potential roles in seed development.

    Science.gov (United States)

    Gupta, Shefali; Garg, Vanika; Bhatia, Sabhyata

    2015-01-01

    Considering the economic importance of chickpea (C. arietinum L.) seeds, it is important to understand the mechanisms underlying seed development for which a cDNA library was constructed from 6 day old chickpea embryos. A total of 8,186 ESTs were obtained from which 4,048 high quality ESTs were assembled into 1,480 unigenes that majorly encoded genes involved in various metabolic and regulatory pathways. Of these, 95 ESTs were found to be involved in ubiquitination related protein degradation pathways and 12 ESTs coded specifically for putative F-box proteins. Differential transcript accumulation of these putative F-box genes was observed in chickpea tissues as evidenced by quantitative real-time PCR. Further, to explore the role of F-box proteins in chickpea seed development, two F-box genes were selected for molecular characterization. These were named as CarF-box_PP2 and CarF-box_LysM depending on their C-terminal domains, PP2 and LysM, respectively. Their highly conserved structures led us to predict their target substrates. Subcellular localization experiment revealed that CarF-box_PP2 was localized in the cytoplasm and CarF-box_LysM was localized in the nucleus. We demonstrated their physical interactions with SKP1 protein, which validated that they function as F-box proteins in the formation of SCF complexes. Sequence analysis of their promoter regions revealed certain seed specific cis-acting elements that may be regulating their preferential transcript accumulation in the seed. Overall, the study helped in expanding the EST database of chickpea, which was further used to identify two novel F-box genes having a potential role in seed development.

  13. Fuzzy sets, rough sets, multisets and clustering

    CERN Document Server

    Dahlbom, Anders; Narukawa, Yasuo

    2017-01-01

    This book is dedicated to Prof. Sadaaki Miyamoto and presents cutting-edge papers in some of the areas in which he contributed. Bringing together contributions by leading researchers in the field, it concretely addresses clustering, multisets, rough sets and fuzzy sets, as well as their applications in areas such as decision-making. The book is divided in four parts, the first of which focuses on clustering and classification. The second part puts the spotlight on multisets, bags, fuzzy bags and other fuzzy extensions, while the third deals with rough sets. Rounding out the coverage, the last part explores fuzzy sets and decision-making.

  14. Rapid and simple method by combining FTA™ card DNA extraction with two set multiplex PCR for simultaneous detection of non-O157 Shiga toxin-producing Escherichia coli strains and virulence genes in food samples.

    Science.gov (United States)

    Kim, S A; Park, S H; Lee, S I; Ricke, S C

    2017-12-01

    The aim of this research was to optimize two multiplex polymerase chain reaction (PCR) assays that could simultaneously detect six non-O157 Shiga toxin-producing Escherichia coli (STEC) as well as the three virulence genes. We also investigated the potential of combining the FTA™ card-based DNA extraction with the multiplex PCR assays. Two multiplex PCR assays were optimized using six primer pairs for each non-O157 STEC serogroup and three primer pairs for virulence genes respectively. Each STEC strain specific primer pair only amplified 155, 238, 321, 438, 587 and 750 bp product for O26, O45, O103, O111, O121 and O145 respectively. Three virulence genes were successfully multiplexed: 375 bp for eae, 655 bp for stx1 and 477 bp for stx2. When two multiplex PCR assays were validated with ground beef samples, distinctive bands were also successfully produced. Since the two multiplex PCR examined here can be conducted under the same PCR conditions, the six non-O157 STEC and their virulence genes could be concurrently detected with one run on the thermocycler. In addition, all bands clearly appeared to be amplified by FTA card DNA extraction in the multiplex PCR assay from the ground beef sample, suggesting that an FTA card could be a viable sampling approach for rapid and simple DNA extraction to reduce time and labour and therefore may have practical use for the food industry. Two multiplex polymerase chain reaction (PCR) assays were optimized for discrimination of six non-O157 Shiga toxin-producing Escherichia coli (STEC) and identification of their major virulence genes within a single reaction, simultaneously. This study also determined the successful ability of the FTA™ card as an alternative to commercial DNA extraction method for conducting multiplex STEC PCR assays. The FTA™ card combined with multiplex PCR holds promise for the food industry by offering a simple and rapid DNA sample method for reducing time, cost and labour for detection of STEC in

  15. Fusion of NUP98 and the SET binding protein 1 (SETBP1) gene in a paediatric acute T cell lymphoblastic leukaemia with t(11;18)(p15;q12)

    DEFF Research Database (Denmark)

    Panagopoulos, Ioannis; Kerndrup, Gitte; Carlsen, Niels

    2007-01-01

    Three NUP98 chimaeras have previously been reported in T cell acute lymphoblastic leukaemia (T-ALL): NUP98/ADD3, NUP98/CCDC28A, and NUP98/RAP1GDS1. We report a T-ALL with t(11;18)(p15;q12) resulting in a novel NUP98 fusion. Fluorescent in situ hybridisation showed NUP98 and SET binding protein 1(... in leukaemias; however, it encodes a protein that specifically interacts with SET, fused to NUP214 in a case of acute undifferentiated leukaemia.

  16. In-vivo expression profiling of Pseudomonas aeruginosa infections reveals niche-specific and strain-independent transcriptional programs.

    Directory of Open Access Journals (Sweden)

    Piotr Bielecki

    Pseudomonas aeruginosa is a threatening, opportunistic pathogen causing disease in immunocompromised individuals. The hallmark of P. aeruginosa virulence is its multi-factorial and combinatorial nature. It renders such bacteria infectious for many organisms and it is often resistant to antibiotics. To gain insights into the physiology of P. aeruginosa during infection, we assessed the transcriptional programs of three different P. aeruginosa strains directly after isolation from burn wounds of humans. We compared the programs to those of the same strains using two infection models: a plant model, which consisted of the infection of the midrib of lettuce leaves, and a murine tumor model, which was obtained by infection of mice with an induced tumor in the abdomen. All control conditions of P. aeruginosa cells growing in suspension and as a biofilm were added to the analysis. We found that these different P. aeruginosa strains express a pool of distinct genetic traits that are activated under particular infection conditions regardless of their genetic variability. The knowledge herein generated will advance our understanding of P. aeruginosa virulence and provide valuable cues for the definition of prospective targets to develop novel intervention strategies.

  17. In-Vivo Expression Profiling of Pseudomonas aeruginosa Infections Reveals Niche-Specific and Strain-Independent Transcriptional Programs

    NARCIS (Netherlands)

    Bielecki, P.; Puchalka, J.; Wos-Oxley, M.L.; Martins Dos Santos, V.A.P.

    2011-01-01

    Pseudomonas aeruginosa is a threatening, opportunistic pathogen causing disease in immunocompromised individuals. The hallmark of P. aeruginosa virulence is its multi-factorial and combinatorial nature. It renders such bacteria infectious for many organisms and it is often resistant to antibiotics.

  18. Toxin-antitoxin systems are important for niche-specific colonization and stress resistance of uropathogenic Escherichia coli.

    Directory of Open Access Journals (Sweden)

    J Paul Norton

    Toxin-antitoxin (TA) systems are prevalent in many bacterial genomes and have been implicated in biofilm and persister cell formation, but the contribution of individual chromosomally encoded TA systems during bacterial pathogenesis is not well understood. Of the known TA systems encoded by Escherichia coli, only a subset is associated with strains of extraintestinal pathogenic E. coli (ExPEC). These pathogens colonize diverse niches and are a major cause of sepsis, meningitis, and urinary tract infections. Using a murine infection model, we show that two TA systems (YefM-YoeB and YbaJ-Hha) independently promote colonization of the bladder by the reference uropathogenic ExPEC isolate CFT073, while a third TA system comprised of the toxin PasT and the antitoxin PasI is critical to ExPEC survival within the kidneys. The PasTI TA system also enhances ExPEC persister cell formation in the presence of antibiotics and markedly increases pathogen resistance to nutrient limitation as well as oxidative and nitrosative stresses. On its own, low-level expression of PasT protects ExPEC from these stresses, whereas overexpression of PasT is toxic and causes bacterial stasis. PasT-induced stasis can be rescued by overexpression of PasI, indicating that PasTI is a bona fide TA system. By mutagenesis, we find that the stress resistance and toxic effects of PasT can be uncoupled and mapped to distinct domains. Toxicity was specifically linked to sequences within the N-terminus of PasT, a region that also promotes the development of persister cells. These results indicate discrete, multipurpose functions for a TA-associated toxin and demonstrate that individual TA systems can provide bacteria with pronounced fitness advantages dependent on toxin expression levels and the specific environmental niche occupied.

  19. SETS reference manual

    International Nuclear Information System (INIS)

    Worrell, R.B.

    1985-05-01

    The Set Equation Transformation System (SETS) is used to achieve the symbolic manipulation of Boolean equations. Symbolic manipulation involves changing equations from their original forms into more useful forms - particularly by applying Boolean identities. The SETS program is an interpreter which reads, interprets, and executes SETS user programs. The user writes a SETS user program specifying the processing to be achieved and submits it, along with the required data, for execution by SETS. Because of the general nature of SETS, i.e., the capability to manipulate Boolean equations regardless of their origin, the program has been used for many different kinds of analysis.
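
    To make "applying Boolean identities" concrete, here is a minimal Python sketch. It is not SETS syntax and does not reproduce the SETS program; it only assumes that an equation is held in sum-of-products form, with each product stored as a set of event names, and it applies two standard identities: idempotence (A + A = A) and absorption (A + A*B = A). The function names are hypothetical.

```python
def absorb(terms):
    """Reduce a sum-of-products expression with the absorption identity
    A + A*B = A. Each term is a collection of event names (one product)."""
    # Storing products in a set applies idempotence: A + A = A.
    products = {frozenset(t) for t in terms}
    # A product is redundant if some other product is a proper subset of it.
    return {p for p in products if not any(q < p for q in products)}


def and_terms(left, right):
    """AND two sum-of-products expressions by distributing, then reduce."""
    left = [frozenset(t) for t in left]
    right = [frozenset(t) for t in right]
    return absorb(l | r for l in left for r in right)


if __name__ == "__main__":
    # (A + B) * (A + C) reduces to A + B*C.
    reduced = and_terms([{"A"}, {"B"}], [{"A"}, {"C"}])
    print(sorted(sorted(p) for p in reduced))
```

    Holding each product as a frozenset makes the subset test `q < p` a direct statement of the absorption rule, at the cost of a quadratic comparison over the products.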

  20. Rapid spread of influenza A(H1N1)pdm09 viruses with a new set of specific mutations in the internal genes in the beginning of 2015/2016 epidemic season in Moscow and Saint Petersburg (Russian Federation).

    Science.gov (United States)

    Komissarov, Andrey; Fadeev, Artem; Sergeeva, Maria; Petrov, Sergey; Sintsova, Kseniya; Egorova, Anna; Pisareva, Maria; Buzitskaya, Zhanna; Musaeva, Tamila; Danilenko, Daria; Konovalova, Nadezhda; Petrova, Polina; Stolyarov, Kirill; Smorodintseva, Elizaveta; Burtseva, Elena; Krasnoslobodtsev, Kirill; Kirillova, Elena; Karpova, Lyudmila; Eropkin, Mikhail; Sominina, Anna; Grudinin, Mikhail

    2016-07-01

    A dramatic increase of influenza activity in Russia since week 3 of 2016 significantly differs from previous seasons in terms of the incidence of influenza and acute respiratory infection (ARI) and in number of lethal cases. We performed antigenic analysis of 108 and whole-genome sequencing of 77 influenza A(H1N1)pdm09 viruses from Moscow and Saint Petersburg. Most of the viruses were antigenically related to the vaccine strain. Whole-genome analysis revealed a composition of specific mutations in the internal genes (D2E and M83I in NEP, E125D in NS1, M105T in NP, Q208K in M1, and N204S in PA-X) that probably emerged before the beginning of 2015/2016 epidemic season. © 2016 The Authors. Influenza and Other Respiratory Viruses Published by John Wiley & Sons Ltd.

  1. Role of the urate transporter SLC2A9 gene in susceptibility to gout in New Zealand Māori, Pacific Island, and Caucasian case-control sample sets.

    Science.gov (United States)

    Hollis-Moffatt, Jade E; Xu, Xin; Dalbeth, Nicola; Merriman, Marilyn E; Topless, Ruth; Waddell, Chloe; Gow, Peter J; Harrison, Andrew A; Highton, John; Jones, Peter B B; Stamp, Lisa K; Merriman, Tony R

    2009-11-01

    To examine the role of genetic variation in the renal urate transporter SLC2A9 in gout in New Zealand sample sets of Māori, Pacific Island, and Caucasian ancestry and to determine if the Māori and Pacific Island samples could be useful for fine-mapping. Patients (n = 56 Māori, 69 Pacific Island, and 131 Caucasian) were recruited from rheumatology outpatient clinics and satisfied the American College of Rheumatology criteria for gout. The control samples comprised 125 Māori subjects, 41 Pacific Island subjects, and 568 Caucasian subjects without arthritis. SLC2A9 single-nucleotide polymorphisms rs16890979 (V253I), rs5028843, rs11942223, and rs12510549 were genotyped (possible etiologic variants in Caucasians). Association of the major allele of rs16890979, rs11942223, and rs5028843 with gout was observed in all sample sets (P = 3.7 × 10^-7, 1.6 × 10^-6, and 7.6 × 10^-5 for rs11942223 in the Māori, Pacific Island, and Caucasian samples, respectively). One 4-marker haplotype (1/1/2/1; more prevalent in the Māori and Pacific Island control samples) was not observed in a single gout case. Our data confirm a role of SLC2A9 in gout susceptibility in a New Zealand Caucasian sample set, with the effect on risk (odds ratio >2.0) greater than previous estimates. We also demonstrate association of SLC2A9 with gout in samples of Māori and Pacific Island ancestry and a consistent pattern of haplotype association. The presence of both alleles of rs16890979 on susceptibility and protective haplotypes in the Māori and Pacific Island sample is evidence against a role for this nonsynonymous variant as the sole etiologic agent. More extensive linkage disequilibrium in Māori and Pacific Island samples suggests that Caucasian samples may be more useful for fine-mapping.

  2. Sets in Coq, Coq in Sets

    Directory of Open Access Journals (Sweden)

    Bruno Barras

    2010-01-01

    This work is about formalizing models of various type theories of the Calculus of Constructions family. Here we focus on set theoretical models. The long-term goal is to build a formal set theoretical model of the Calculus of Inductive Constructions, so we can be sure that Coq is consistent with the language used by most mathematicians. One aspect of this work is to axiomatize several set theories: ZF possibly with inaccessible cardinals, and HF, the theory of hereditarily finite sets. On top of these theories we have developed a piece of the usual set theoretical construction of functions, ordinals and fixpoint theory. We then proved sound several models of the Calculus of Constructions, its extension with an infinite hierarchy of universes, and its extension with the inductive type of natural numbers where recursion follows the type-based termination approach. The other aspect is to try and discharge (most of) these assumptions. The goal here is rather to compare the theoretical strengths of all these formalisms. As already noticed by Werner, the replacement axiom of ZF in its general form seems to require a type-theoretical axiom of choice (TTAC).

  3. Hierarchical sets: analyzing pangenome structure through scalable set visualizations

    Science.gov (United States)

    2017-01-01

    Abstract Motivation: The increase in available microbial genome sequences has resulted in an increase in the size of the pangenomes being analyzed. Current pangenome visualizations are not intended for the pangenome sizes possible today and new approaches are necessary in order to convert the increase in available information to increase in knowledge. As the pangenome data structure is essentially a collection of sets we explore the potential for scalable set visualization as a tool for pangenome analysis. Results: We present a new hierarchical clustering algorithm based on set arithmetics that optimizes the intersection sizes along the branches. The intersection and union sizes along the hierarchy are visualized using a composite dendrogram and icicle plot, which, in pangenome context, shows the evolution of pangenome and core size along the evolutionary hierarchy. Outlying elements, i.e. elements whose presence pattern does not correspond with the hierarchy, can be visualized using hierarchical edge bundles. When applied to pangenome data this plot shows putative horizontal gene transfers between the genomes and can highlight relationships between genomes that are not represented by the hierarchy. We illustrate the utility of hierarchical sets by applying it to a pangenome based on 113 Escherichia and Shigella genomes and find it provides a powerful addition to pangenome analysis. Availability and Implementation: The described clustering algorithm and visualizations are implemented in the hierarchicalSets R package available from CRAN (https://cran.r-project.org/web/packages/hierarchicalSets). Contact: thomasp85@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28130242
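
    The "intersection and union sizes along the hierarchy" can be illustrated with a short Python sketch. It is not the hierarchicalSets clustering algorithm and does not use the R package: it assumes the hierarchy is already given as nested tuples of genome names, and it computes the core (intersection) and pangenome (union) gene sets at a node. The genome names and gene labels are invented for the example.

```python
def core_and_pan(node, genomes):
    """Return (core, pan) gene sets for a node of a given hierarchy.

    A node is either a genome name (leaf) or a tuple of child nodes.
    core = genes present in every genome below the node (intersection);
    pan  = genes present in at least one genome below the node (union).
    """
    if isinstance(node, str):
        genes = set(genomes[node])
        return genes, genes
    cores, pans = zip(*(core_and_pan(child, genomes) for child in node))
    return set.intersection(*cores), set.union(*pans)


# Hypothetical gene presence data for three genomes.
example_genomes = {
    "g1": {"a", "b", "c"},
    "g2": {"a", "b", "d"},
    "g3": {"a", "e"},
}
example_hierarchy = (("g1", "g2"), "g3")

if __name__ == "__main__":
    for node in (example_hierarchy[0], example_hierarchy):
        core, pan = core_and_pan(node, example_genomes)
        print(node, "core size:", len(core), "pan size:", len(pan))
```

    Moving from the inner node to the root, the core can only shrink and the pangenome can only grow, which is the trend that a combined dendrogram and icicle plot would display.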

  4. Invariant sets for Windows

    CERN Document Server

    Morozov, Albert D; Dragunov, Timothy N; Malysheva, Olga V

    1999-01-01

    This book deals with the visualization and exploration of invariant sets (fractals, strange attractors, resonance structures, patterns etc.) for various kinds of nonlinear dynamical systems. The authors have created a special Windows 95 application called WInSet, which allows one to visualize the invariant sets. A WInSet installation disk is enclosed with the book. The book consists of two parts. Part I contains a description of WInSet and a list of the built-in invariant sets which can be plotted using the program. This part is intended for a wide audience with interests ranging from dynamical

  5. Evidence for homosexuality gene

    Energy Technology Data Exchange (ETDEWEB)

    Pool, R.

    1993-07-16

    A genetic analysis of 40 pairs of homosexual brothers has uncovered a region on the X chromosome that appears to contain a gene or genes for homosexuality. When analyzing the pedigrees of homosexual males, the researchers found evidence that the trait has a higher likelihood of being passed through maternal genes. This led them to search the X chromosome for genes predisposing to homosexuality. The researchers examined the X chromosomes of pairs of homosexual brothers for regions of DNA that most or all had in common. Of the 40 sets of brothers, 33 shared a set of five markers in the q28 region of the long arm of the X chromosome. The linkage has a LOD score of 4.0, which translates into a 99.5% certainty that there is a gene or genes in this area that predispose males to homosexuality. The chief researcher warns, however, that this one site cannot explain all instances of homosexuality, since there were some cases where the trait seemed to be passed paternally. And even among those brothers where there was no evidence that the trait was passed paternally, seven sets of brothers did not share the Xq28 markers. It seems likely that homosexuality arises from a variety of causes.

  6. Value Set Authority Center

    Data.gov (United States)

    U.S. Department of Health & Human Services — The VSAC provides downloadable access to all official versions of vocabulary value sets contained in the 2014 Clinical Quality Measures (CQMs). Each value set...

  7. Settings for Suicide Prevention

    Science.gov (United States)

    ... Suicide Populations Racial/Ethnic Groups Older Adults Adolescents LGBT Military/Veterans Men Effective Prevention Comprehensive Approach Identify ... Based Prevention Settings American Indian/Alaska Native Settings Schools Colleges and Universities Primary Care Emergency Departments Behavioral ...

  8. Alternate superior Julia sets

    International Nuclear Information System (INIS)

    Yadav, Anju; Rani, Mamta

    2015-01-01

    Alternate Julia sets have been studied in Picard iterative procedures. The purpose of this paper is to study the quadratic and cubic maps using superior iterates to obtain Julia sets with different alternate structures. Analytically, graphically and computationally it has been shown that alternate superior Julia sets can be connected, disconnected and totally disconnected, and also fatter than the corresponding alternate Julia sets. A few examples have been studied by applying different types of alternate structures.
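
    As a rough illustration of the kind of iteration involved, the Python sketch below computes an escape time for one starting point. It rests on assumptions not stated in the abstract: "superior iterates" is taken to mean a Mann-type iteration z_{n+1} = (1 - beta) z_n + beta f(z_n), the "alternate" structure is taken to mean switching between two constants c_even and c_odd on even and odd steps, and the escape radius is an arbitrary large value chosen for the example, not a proven bound. All names are hypothetical.

```python
def escape_time(z0, c_even, c_odd, beta=0.5, power=2, max_iter=100, radius=1e3):
    """Count iterations until |z| exceeds `radius` (or return max_iter).

    Each step applies f(z) = z**power + c, with c alternating between
    c_even and c_odd, blended with the previous point by the weight beta.
    Setting beta = 1 gives the ordinary (Picard) iteration z -> f(z).
    """
    z = z0
    for n in range(max_iter):
        if abs(z) > radius:
            return n
        c = c_even if n % 2 == 0 else c_odd
        z = (1 - beta) * z + beta * (z ** power + c)
    return max_iter


if __name__ == "__main__":
    # Coarse text rendering over a small grid of starting points.
    for im in range(6, -7, -2):
        row = ""
        for re in range(-10, 11):
            z0 = complex(re / 5, im / 5)
            bounded = escape_time(z0, -0.4 + 0.3j, -0.6 + 0.1j) == 100
            row += "#" if bounded else "."
        print(row)
```

    Points whose orbit stays bounded for all `max_iter` steps are treated as belonging to the filled set; `power=3` switches the sketch from the quadratic to the cubic map.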

  9. Sets, Planets, and Comets

    Science.gov (United States)

    Baker, Mark; Beltran, Jane; Buell, Jason; Conrey, Brian; Davis, Tom; Donaldson, Brianna; Detorre-Ozeki, Jeanne; Dibble, Leila; Freeman, Tom; Hammie, Robert; Montgomery, Julie; Pickford, Avery; Wong, Justine

    2013-01-01

    Sets in the game "Set" are lines in a certain four-dimensional space. Here we introduce planes into the game, leading to interesting mathematical questions, some of which we solve, and to a wonderful variation on the game "Set," in which every tableau of nine cards must contain at least one configuration for a player to pick up.
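
    The statement that sets are lines in a four-dimensional space can be checked with a small Python sketch. It assumes the usual encoding, which the abstract does not spell out: each card is a 4-tuple over {0, 1, 2}, one coordinate per attribute, so the deck is the space of 81 points over the integers mod 3. Three cards form a set when every attribute is all the same or all different, which is equivalent to each coordinate summing to 0 mod 3. The function names are chosen for this example.

```python
from itertools import combinations, product

# The full deck: every 4-tuple with entries in {0, 1, 2}.
DECK = list(product(range(3), repeat=4))


def is_set(a, b, c):
    """True if, in every attribute, the three values are all equal or all
    different, i.e. each coordinate sums to 0 modulo 3."""
    return all((x + y + z) % 3 == 0 for x, y, z in zip(a, b, c))


def third_card(a, b):
    """The unique card that completes a set with two distinct cards."""
    return tuple((-x - y) % 3 for x, y in zip(a, b))


def find_sets(cards):
    """All three-card sets contained in a collection of cards."""
    return [trio for trio in combinations(cards, 3) if is_set(*trio)]


if __name__ == "__main__":
    a, b = (0, 1, 2, 0), (0, 2, 2, 1)
    print(third_card(a, b))
    print(len(find_sets(DECK)))
```

    Because any two distinct cards determine exactly one third card, the deck contains 81 × 80 / 6 = 1080 sets, which is what `find_sets(DECK)` counts by brute force.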

  10. Axiomatic set theory

    CERN Document Server

    Suppes, Patrick

    1972-01-01

    This clear and well-developed approach to axiomatic set theory is geared toward upper-level undergraduates and graduate students. It examines the basic paradoxes and history of set theory and advanced topics such as relations and functions, equipollence, finite sets and cardinal numbers, rational and real numbers, and other subjects. 1960 edition.

  11. Paired fuzzy sets

    DEFF Research Database (Denmark)

    Rodríguez, J. Tinguaro; Franco de los Ríos, Camilo; Gómez, Daniel

    2015-01-01

    In this paper we want to stress the relevance of paired fuzzy sets, as already proposed in previous works of the authors, as a family of fuzzy sets that offers a unifying view for different models based upon the opposition of two fuzzy sets, simply allowing the existence of different types...

  12. Elements of set theory

    CERN Document Server

    Enderton, Herbert B

    1977-01-01

    This is an introductory undergraduate textbook in set theory. In mathematics these days, essentially everything is a set. Some knowledge of set theory is a necessary part of the background everyone needs for further study of mathematics. It is also possible to study set theory for its own interest--it is a subject with intriguing results about simple objects. This book starts with material that nobody can do without. There is no end to what can be learned of set theory, but here is a beginning.

  13. Radionuclide reporter gene imaging for cardiac gene therapy

    International Nuclear Information System (INIS)

    Inubushi, Masayuki; Tamaki, Nagara

    2007-01-01

    In the field of cardiac gene therapy, angiogenic gene therapy has been most extensively investigated. The first clinical trial of cardiac angiogenic gene therapy was reported in 1998, and at the peak, more than 20 clinical trial protocols were under evaluation. However, most trials have ceased owing to the lack of decisive proof of therapeutic effects and the potential risks of viral vectors. In order to further advance cardiac angiogenic gene therapy, remaining open issues need to be resolved: there needs to be improvement of gene transfer methods, regulation of gene expression, development of much safer vectors and optimisation of therapeutic genes. For these purposes, imaging of gene expression in living organisms is of great importance. In radionuclide reporter gene imaging, "reporter genes" transferred into cell nuclei encode for a protein that retains a complementary "reporter probe" of a positron or single-photon emitter; thus expression of the reporter genes can be imaged with positron emission tomography or single-photon emission computed tomography. Accordingly, in the setting of gene therapy, the location, magnitude and duration of the therapeutic gene co-expression with the reporter genes can be monitored non-invasively. In the near future, gene therapy may evolve into combination therapy with stem/progenitor cell transplantation, so-called cell-based gene therapy or gene-modified cell therapy. Radionuclide reporter gene imaging is now expected to contribute in providing evidence on the usefulness of this novel therapeutic approach, as well as in investigating the molecular mechanisms underlying neovascularisation and safety issues relevant to further progress in conventional gene therapy. (orig.)

  14. Social Set Analysis

    DEFF Research Database (Denmark)

    Vatrapu, Ravi; Mukkamala, Raghava Rao; Hussain, Abid

    2016-01-01

    , conceptual and formal models of social data, and an analytical framework for combining big social data sets with organizational and societal data sets. Three empirical studies of big social data are presented to illustrate and demonstrate social set analysis in terms of fuzzy set-theoretical sentiment...... automata and agent-based modeling). However, when it comes to organizational and societal units of analysis, there exists no approach to conceptualize, model, analyze, explain, and predict social media interactions as individuals' associations with ideas, values, identities, and so on. To address...... analysis, crisp set-theoretical interaction analysis, and event-studies-oriented set-theoretical visualizations. Implications for big data analytics, current limitations of the set-theoretical approach, and future directions are outlined....

  15. Haar meager sets revisited

    Czech Academy of Sciences Publication Activity Database

    Doležal, Martin; Rmoutil, M.; Vejnar, B.; Vlasák, V.

    2016-01-01

    Vol. 440, No. 2 (2016), pp. 922-939. ISSN 0022-247X. Institutional support: RVO:67985840. Keywords: Haar meager set; Haar null set; Polish group. Subject RIV: BA - General Mathematics. Impact factor: 1.064, year: 2016. http://www.sciencedirect.com/science/article/pii/S0022247X1600305X

  16. Setting goals in psychotherapy

    DEFF Research Database (Denmark)

    Emiliussen, Jakob; Wagoner, Brady

    2013-01-01

    The present study is concerned with the ethical dilemmas of setting goals in therapy. The main questions that it aims to answer are: who is to set the goals for therapy and who is to decide when they have been reached? The study is based on four semi-structured, phenomenological interviews...

  17. Pseudo-set framing.

    Science.gov (United States)

    Barasz, Kate; John, Leslie K; Keenan, Elizabeth A; Norton, Michael I

    2017-10-01

    Pseudo-set framing, arbitrarily grouping items or tasks together as part of an apparent "set," motivates people to reach perceived completion points. Pseudo-set framing changes gambling choices (Study 1), effort (Studies 2 and 3), giving behavior (Field Data and Study 4), and purchase decisions (Study 5). These effects persist in the absence of any reward, when a cost must be incurred, and after participants are explicitly informed of the arbitrariness of the set. Drawing on Gestalt psychology, we develop a conceptual account that predicts what will, and will not, act as a pseudo-set, and defines the psychological process through which these pseudo-sets affect behavior: over and above typical reference points, pseudo-set framing alters perceptions of (in)completeness, making intermediate progress seem less complete. In turn, these feelings of incompleteness motivate people to persist until the pseudo-set has been fulfilled. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  18. Descriptive set theory

    CERN Document Server

    Moschovakis, YN

    1987-01-01

    Now available in paperback, this monograph is a self-contained exposition of the main results and methods of descriptive set theory. It develops all the necessary background material from logic and recursion theory, and treats both classical descriptive set theory and the effective theory developed by logicians.

  19. Possibility Fuzzy Soft Set

    Directory of Open Access Journals (Sweden)

    Shawkat Alkhazaleh

    2011-01-01

    We introduce the concept of the possibility fuzzy soft set and its operations, and study some of its properties. We give applications of this theory in solving a decision-making problem. We also introduce a similarity measure of two possibility fuzzy soft sets and discuss its application in a medical diagnosis problem.

  20. Archaeological predictive model set.

    Science.gov (United States)

    2015-03-01

    This report is the documentation for Task 7 of the Statewide Archaeological Predictive Model Set. The goal of this project is to develop a set of statewide predictive models to assist the planning of transportation projects. PennDOT is developing t...

  1. Large Data Set Mining

    NARCIS (Netherlands)

    Leemans, I.B.; Broomhall, Susan

    2017-01-01

    Digital emotion research has yet to make history. Until now large data set mining has not been a very active field of research in early modern emotion studies. This is indeed surprising since first, the early modern field has such rich, copyright-free, digitized data sets and second, emotion studies

  2. "Ready, Set, FLOW!"

    Science.gov (United States)

    Stroud, Wesley

    2018-01-01

    All educators want their classrooms to be inviting areas that support investigations. However, a common mistake is to fill learning spaces with items or objects that are set up by the teacher or are simply "for show." This type of setting, although it may create a comfortable space for students, fails to stimulate investigations and…

  3. Determining Semantically Related Significant Genes.

    Science.gov (United States)

    Taha, Kamal

    2014-01-01

    GO relation embodies some aspects of existence dependency. If GO term x is existence-dependent on GO term y, the presence of y implies the presence of x. Therefore, the genes annotated with the function of the GO term y are usually functionally and semantically related to the genes annotated with the function of the GO term x. A large number of gene set enrichment analysis methods have been developed in recent years for analyzing gene set enrichment. However, most of these methods overlook the structural dependencies between GO terms in the GO graph by not considering the concept of existence dependency. We propose in this paper a biological search engine called RSGSearch that identifies enriched sets of genes annotated with different functions using the concept of existence dependency. We observe that GO term x cannot be existence-dependent on GO term y if x and y have the same specificity (biological characteristics). After encoding into a numeric format the contributions of GO terms annotating target genes to the semantics of their lowest common ancestors (LCAs), RSGSearch uses microarray experiments to identify the most significant LCA that annotates the result genes. We evaluated RSGSearch experimentally and compared it with five gene set enrichment systems. Results showed marked improvement.

  4. Economic communication model set

    Science.gov (United States)

    Zvereva, Olga M.; Berg, Dmitry B.

    2017-06-01

    This paper details findings from research on economic communications using agent-based models. The agent-based model set was engineered to simulate economic communications. Money in the form of internal and external currencies was introduced into the models to support exchanges in communications. Every model, while based on the general concept, has its own peculiarities in algorithm and input data set, since it was engineered to solve a specific problem. Several data sets of different origin were used in experiments: theoretic sets were estimated on the basis of the static Leontief equilibrium equation, and the real set was constructed on the basis of statistical data. During simulation experiments, the communication process was observed in dynamics, and system macroparameters were estimated. This research confirmed that combining an agent-based and a mathematical model can produce a synergetic effect.

  5. Theory of random sets

    CERN Document Server

    Molchanov, Ilya

    2017-01-01

    This monograph, now in a thoroughly revised second edition, offers the latest research on random sets. It has been extended to include substantial developments achieved since 2005, some of them motivated by applications of random sets to econometrics and finance. The present volume builds on the foundations laid by Matheron and others, including the vast advances in stochastic geometry, probability theory, set-valued analysis, and statistical inference. It shows the various interdisciplinary relationships of random set theory within other parts of mathematics, and at the same time fixes terminology and notation that often vary in the literature, establishing it as a natural part of modern probability theory and providing a platform for future development. It is completely self-contained, systematic and exhaustive, with the full proofs that are necessary to gain insight. Aimed at research level, Theory of Random Sets will be an invaluable reference for probabilists; mathematicians working in convex and integ...

  6. Using RNA-Seq data to select reference genes for normalizing gene expression in apple roots

    Science.gov (United States)

    Gene expression in apple roots in response to various stress conditions is a less-explored research subject. Reliable reference genes for normalizing quantitative gene expression data have not been carefully investigated. In this study, the suitability of a set of 15 apple genes was evaluated for t...

  7. Gene expression

    International Nuclear Information System (INIS)

    Hildebrand, C.E.; Crawford, B.D.; Walters, R.A.; Enger, M.D.

    1983-01-01

    We prepared probes for isolating functional pieces of the metallothionein locus. The probes enabled a variety of experiments, eventually revealing two mechanisms for metallothionein gene expression, the order of the DNA coding units at the locus, and the location of the gene site in its chromosome. Once the switch regulating metallothionein synthesis was located, it could be joined by recombinant DNA methods to other, unrelated genes, then reintroduced into cells by gene-transfer techniques. The expression of these recombinant genes could then be induced by exposing the cells to Zn2+ or Cd2+. We would thus take advantage of the clearly defined switching properties of the metallothionein gene to manipulate the expression of other, perhaps normally constitutive, genes. Already, despite an incomplete understanding of how the regulatory switch of the metallothionein locus operates, such experiments have been performed successfully.

  8. Basic set theory

    CERN Document Server

    Levy, Azriel

    2002-01-01

    An advanced-level treatment of the basics of set theory, this text offers students a firm foundation, stopping just short of the areas employing model-theoretic methods. Geared toward upper-level undergraduate and graduate students, it consists of two parts: the first covers pure set theory, including the basic notions, order and well-foundedness, cardinal numbers, the ordinals, and the axiom of choice and some of its consequences; the second deals with applications and advanced topics such as point set topology, real spaces, Boolean algebras, and infinite combinatorics and large cardinals. An

  9. Set theory essentials

    CERN Document Server

    Milewski, Emil G

    2012-01-01

    REA's Essentials provide quick and easy access to critical information in a variety of different fields, ranging from the most basic to the most advanced. As its name implies, these concise, comprehensive study guides summarize the essentials of the field covered. Essentials are helpful when preparing for exams, doing homework and will remain a lasting reference source for students, teachers, and professionals. Set Theory includes elementary logic, sets, relations, functions, denumerable and non-denumerable sets, cardinal numbers, Cantor's theorem, axiom of choice, and order relations.

  10. Lebesgue Sets Immeasurable Existence

    Directory of Open Access Journals (Sweden)

    Diana Marginean Petrovai

    2012-12-01

    It is well known that the notions of measure and integral arose early, in close connection with practical problems of measuring geometric figures. The notion of measure was outlined in the early 20th century through the research of H. Lebesgue, founder of the modern theory of measure and integral. A technique for the integration of functions was developed concurrently. Gradually a specific area formed, today called measure and integral theory. Essential contributions to building this theory were made by a large number of mathematicians: C. Carathéodory, J. Radon, O. Nikodym, S. Bochner, J. Pettis, P. Halmos and many others. In the following we present several abstract sets and classes of sets. There exist sets which are not Lebesgue measurable, and sets which are Lebesgue measurable but not Borel measurable. Hence B ⊂ L ⊂ P(X).

  11. Leadership set-up

    DEFF Research Database (Denmark)

    Thude, Bettina Ravnborg; Stenager, Egon; von Plessen, Christian

    2018-01-01

    . Findings: The study found that the leadership set-up did not have any clear influence on interdisciplinary cooperation, as all wards had a high degree of interdisciplinary cooperation independent of which leadership set-up they had. Instead, the authors found a relation between leadership set-up and leader...... could influence legitimacy. Originality/value: The study shows that leadership set-up is not the predominant factor that creates interdisciplinary cooperation; but rather, leader legitimacy also should be considered. Additionally, the study shows that leader legitimacy can be difficult to establish...... and that it cannot be taken for granted. This is something chief executive officers should bear in mind when they plan and implement new leadership structures. Therefore, it would also be useful to look more closely at how to achieve legitimacy in cases where the leader is from a different profession to the staff....

  12. General Paleoclimatology Data Sets

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Data of past climate and environment derived from unusual proxy evidence. Parameter keywords describe what was measured in this data set. Additional summary...

  13. HEDIS Limited Data Set

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Healthcare Effectiveness Data and Information Set (HEDIS) is a tool used by more than 90 percent of America's health plans to measure performance on important...

  14. Genes2FANs: connecting genes through functional association networks

    Science.gov (United States)

    2012-01-01

    Background: Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent. Results: Genes2FANs is a web-based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user's PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting disease genes through FANs, separating diseases into two categories. Conclusions: Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our finding that disease genes in

  15. Set theory and physics

    Energy Technology Data Exchange (ETDEWEB)

    Svozil, K. [Univ. of Technology, Vienna (Austria)

    1995-11-01

    Inasmuch as physical theories are formalizable, set theory provides a framework for theoretical physics. Four speculations about the relevance of set theoretical modeling for physics are presented: the role of transcendental set theory (i) in chaos theory, (ii) for paradoxical decompositions of solid three-dimensional objects, (iii) in the theory of effective computability (Church-Turing thesis) related to the possible "solution of supertasks," and (iv) for weak solutions. Several approaches to set theory and their advantages and disadvantages for physical applications are discussed: Cantorian "naive" (i.e., nonaxiomatic) set theory, constructivism, and operationalism. In the author's opinion, an attitude of "suspended attention" (a term borrowed from psychoanalysis) seems most promising for progress. Physical and set theoretical entities must be operationalized wherever possible. At the same time, physicists should be open to "bizarre" or "mindboggling" new formalisms, which need not be operationalizable or testable at the time of their creation, but which may successfully lead to novel fields of phenomenology and technology.

  16. Setting conservation priorities.

    Science.gov (United States)

    Wilson, Kerrie A; Carwardine, Josie; Possingham, Hugh P

    2009-04-01

    A generic framework for setting conservation priorities based on the principles of classic decision theory is provided. This framework encapsulates the key elements of any problem, including the objective, the constraints, and knowledge of the system. Within the context of this framework the broad array of approaches for setting conservation priorities are reviewed. While some approaches prioritize assets or locations for conservation investment, it is concluded here that prioritization is incomplete without consideration of the conservation actions required to conserve the assets at particular locations. The challenges associated with prioritizing investments through time in the face of threats (and also spatially and temporally heterogeneous costs) can be aided by proper problem definition. Using the authors' general framework for setting conservation priorities, multiple criteria can be rationally integrated and where, how, and when to invest conservation resources can be scheduled. Trade-offs are unavoidable in priority setting when there are multiple considerations, and budgets are almost always finite. The authors discuss how trade-offs, risks, uncertainty, feedbacks, and learning can be explicitly evaluated within their generic framework for setting conservation priorities. Finally, they suggest ways that current priority-setting approaches may be improved.

  17. Gene function prediction based on Gene Ontology Hierarchy Preserving Hashing.

    Science.gov (United States)

    Zhao, Yingwen; Fu, Guangyuan; Wang, Jun; Guo, Maozu; Yu, Guoxian

    2018-02-23

    Gene Ontology (GO) uses structured vocabularies (or terms) to describe the molecular functions, biological roles, and cellular locations of gene products in a hierarchical ontology. GO annotations associate genes with GO terms and indicate that the given gene products carry out the biological functions described by the relevant terms. However, predicting correct GO annotations for genes from the massive set of GO terms defined by GO is a difficult challenge. To address this challenge, we introduce a Gene Ontology Hierarchy Preserving Hashing (HPHash) based semantic method for gene function prediction. HPHash first measures the taxonomic similarity between GO terms. It then uses a hierarchy preserving hashing technique to keep the hierarchical order between GO terms, and to optimize a series of hashing functions to encode massive GO terms via compact binary codes. After that, HPHash utilizes these hashing functions to project the gene-term association matrix into a low-dimensional one and performs semantic similarity based gene function prediction in the low-dimensional space. Experimental results on three model species (Homo sapiens, Mus musculus and Rattus norvegicus) for interspecies gene function prediction show that HPHash performs better than other related approaches and that it is robust to the number of hash functions. In addition, we also take HPHash as a plugin for BLAST based gene function prediction. In these experiments, HPHash again significantly improves the prediction performance. The codes of HPHash are available at: http://mlda.swu.edu.cn/codes.php?name=HPHash. Copyright © 2018 Elsevier Inc. All rights reserved.

  18. Quantum mechanics over sets

    Science.gov (United States)

    Ellerman, David

    2014-03-01

    In models of QM over finite fields (e.g., Schumacher's "modal quantum theory" MQT), one finite field stands out, Z2, since Z2 vectors represent sets. QM (finite-dimensional) mathematics can be transported to sets resulting in quantum mechanics over sets or QM/sets. This gives a full probability calculus (unlike MQT with only zero-one modalities) that leads to a fulsome theory of QM/sets including "logical" models of the double-slit experiment, Bell's Theorem, QIT, and QC. In QC over Z2 (where gates are non-singular matrices as in MQT), a simple quantum algorithm (one gate plus one function evaluation) solves the Parity SAT problem (finding the parity of the sum of all values of an n-ary Boolean function). Classically, the Parity SAT problem requires 2^n function evaluations in contrast to the one function evaluation required in the quantum algorithm. This is quantum speedup but with all the calculations over Z2 just like classical computing. This shows definitively that the source of quantum speedup is not in the greater power of computing over the complex numbers, and confirms the idea that the source is in superposition.
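
    For orientation, the classical side of the Parity SAT comparison in this abstract can be sketched in a few lines of Python. This is only an illustrative brute-force baseline, not the quantum algorithm over Z2 that the record describes; the function name `parity_of_function` and the example Boolean functions are assumptions introduced here.

```python
from itertools import product


def parity_of_function(f, n):
    """Classical baseline (illustrative): parity of the sum of f over all inputs.

    f is assumed to be an n-ary Boolean function returning 0 or 1.
    Every one of the 2**n inputs is evaluated once, and the number of
    evaluations is returned alongside the parity.
    """
    parity = 0
    evaluations = 0
    for bits in product((0, 1), repeat=n):
        parity ^= f(*bits) & 1  # addition modulo 2
        evaluations += 1
    return parity, evaluations


def and3(a, b, c):
    """Hypothetical example function: true on exactly one of the 8 inputs."""
    return a & b & c


def xor3(a, b, c):
    """Hypothetical example function: true on exactly four of the 8 inputs."""
    return a ^ b ^ c
```

    Calling `parity_of_function(and3, 3)` evaluates the function on all eight inputs, which is the exponential cost the abstract contrasts with a single evaluation in the quantum setting.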

  19. The Model Confidence Set

    DEFF Research Database (Denmark)

    Hansen, Peter Reinhard; Lunde, Asger; Nason, James M.

    The paper introduces the model confidence set (MCS) and applies it to the selection of models. A MCS is a set of models that is constructed such that it will contain the best model with a given level of confidence. The MCS is in this sense analogous to a confidence interval for a parameter. The MCS......, beyond the comparison of models. We apply the MCS procedure to two empirical problems. First, we revisit the inflation forecasting problem posed by Stock and Watson (1999), and compute the MCS for their set of inflation forecasts. Second, we compare a number of Taylor rule regressions and determine...... the MCS of the best in terms of in-sample likelihood criteria....

  20. Revitalizing the setting approach

    DEFF Research Database (Denmark)

    Bloch, Paul; Toft, Ulla; Reinbach, Helene Christine

    2014-01-01

    Background: The concept of health promotion rests on aspirations aiming at enabling people to increase control over and improve their health. Health promotion action is facilitated in settings such as schools, homes and work places. As a contribution to the promotion of healthy lifestyles, we have ...... approach is based on ecological and whole-systems thinking, and stipulates important principles and values of integration, participation, empowerment, context and knowledge-based development....... further developed the setting approach in an effort to harmonise it with contemporary realities (and complexities) of health promotion and public health action. The paper introduces a modified concept, the supersetting approach, which builds on the optimised use of diverse and valuable resources embedded...... in local community settings and on the strengths of social interaction and local ownership as drivers of change processes. Interventions based on a supersetting approach are first and foremost characterised by being integrated, but also participatory, empowering, context-sensitive and knowledge...

  1. Tuberculosis diagnosis in resource-limited settings: Clinical use of ...

    African Journals Online (AJOL)

    EB

    GeneXpert in the diagnosis of smear-negative PTB: a case report ... Background: The Xpert MTB/RIF test (GeneXpert) has recently been endorsed for use in resource-limited settings for the ... normochromic anemia (9.2g/dl), hypoalbuminemia.

  2. Set theory and logic

    CERN Document Server

    Stoll, Robert R

    1979-01-01

    Set Theory and Logic is the result of a course of lectures for advanced undergraduates, developed at Oberlin College for the purpose of introducing students to the conceptual foundations of mathematics. Mathematics, specifically the real number system, is approached as a unity whose operations can be logically ordered through axioms. One of the most complex and essential of modern mathematical innovations, the theory of sets (crucial to quantum mechanics and other sciences), is introduced in a most careful concept manner, aiming for the maximum in clarity and stimulation for further study in

  3. Nonmeasurable sets and functions

    CERN Document Server

    Kharazishvili, Alexander

    2004-01-01

    The book is devoted to various constructions of sets which are nonmeasurable with respect to invariant (more generally, quasi-invariant) measures. Our starting point is the classical Vitali theorem stating the existence of subsets of the real line which are not measurable in the Lebesgue sense. This theorem stimulated the development of the following interesting topics in mathematics: 1. Paradoxical decompositions of sets in finite-dimensional Euclidean spaces; 2. The theory of non-real-valued-measurable cardinals; 3. The theory of invariant (quasi-invariant) extensions of invariant (quasi-invaria

  4. Why quasi-sets?

    Directory of Open Access Journals (Sweden)

    Décio Krause

    2002-11-01

    Quasi-set theory was developed to deal with collections of indistinguishable objects. In standard mathematics, there are no such entities, for indistinguishability (agreement with respect to all properties) entails numerical identity. The main motivation underlying such a theory is of course quantum physics, for collections of indistinguishable ('identical' in the physicists' jargon) particles cannot be regarded as 'sets' of standard set theories, which are collections of distinguishable objects. In this paper, a rationale for the development of such a theory is presented, motivated by Heinz Post's claim that indistinguishability of quantum entities should be attributed 'right at the start'.

  5. Combinatorics of finite sets

    CERN Document Server

    Anderson, Ian

    2011-01-01

    Coherent treatment provides comprehensive view of basic methods and results of the combinatorial study of finite set systems. The Clements-Lindstrom extension of the Kruskal-Katona theorem to multisets is explored, as is the Greene-Kleitman result concerning k-saturated chain partitions of general partially ordered sets. Connections with Dilworth's theorem, the marriage problem, and probability are also discussed. Each chapter ends with a helpful series of exercises and outline solutions appear at the end. "An excellent text for a topics course in discrete mathematics." - Bulletin of the Ame

  6. Trichoderma genes

    Science.gov (United States)

    Foreman, Pamela [Los Altos, CA; Goedegebuur, Frits [Vlaardingen, NL; Van Solingen, Pieter [Naaldwijk, NL; Ward, Michael [San Francisco, CA

    2012-06-19

    Described herein are novel gene sequences isolated from Trichoderma reesei. Two genes encoding proteins comprising a cellulose binding domain, one encoding an arabinofuranosidase and one encoding an acetylxylan esterase are described. The sequences, CIP1 and CIP2, contain a cellulose binding domain. These proteins are especially useful in the textile and detergent industries and in the pulp and paper industry.

  7. Analysis of the real EADGENE data set:

    DEFF Research Database (Denmark)

    Jaffrézic, Florence; de Koning, Dirk-Jan; Boettcher, Paul J

    2007-01-01

    A large variety of methods has been proposed in the literature for microarray data analysis. The aim of this paper was to present techniques used by the EADGENE (European Animal Disease Genomics Network of Excellence) WP1.4 participants for data quality control, normalisation and statistical...... methods for the detection of differentially expressed genes in order to provide some more general data analysis guidelines. All the workshop participants were given a real data set obtained in an EADGENE funded microarray study looking at the gene expression changes following artificial infection with two...... quarters. Very little transcriptional variation was observed for the bacteria S. aureus. Lists of differentially expressed genes found by the different research teams were, however, quite dependent on the method used, especially concerning the data quality control step. These analyses also emphasised...

  8. Prices and Price Setting

    NARCIS (Netherlands)

    R.P. Faber (Riemer)

    2010-01-01

    This thesis studies price data and tries to unravel the underlying economic processes of why firms have chosen these prices. It focuses on three aspects of price setting. First, it studies whether the existence of a suggested price has a coordinating effect on the prices of firms.

  9. Cobham recursive set functions

    Czech Academy of Sciences Publication Activity Database

    Beckmann, A.; Buss, S.; Friedman, S.-D.; Müller, M.; Thapen, Neil

    2016-01-01

    Vol. 167, No. 3 (2016), pp. 335-369 ISSN 0168-0072 R&D Projects: GA ČR GBP202/12/G061 Institutional support: RVO:67985840 Keywords: set function * polynomial time * Cobham recursion Subject RIV: BA - General Mathematics Impact factor: 0.647, year: 2016 http://www.sciencedirect.com/science/article/pii/S0168007215001293

  10. SET-Routes programme

    CERN Multimedia

    Marietta Schupp, EMBL Photolab

    2008-01-01

    Dr Sabine Hentze, specialist in human genetics, giving an Insight Lecture entitled "Human Genetics – Diagnostics, Indications and Ethical Issues" on 23 September 2008 at EMBL Heidelberg. Activities in a school in Budapest during a visit of Angela Bekesi, Ambassadors for the SET-Routes programme.

  11. The Crystal Set

    Science.gov (United States)

    Greenslade, Thomas B., Jr.

    2014-01-01

    In past issues of this journal, the late H. R. Crane wrote a long series of articles under the running title of "How Things Work." In them, Dick dealt with many questions that physics teachers asked themselves, but did not have the time to answer. This article is my attempt to work through the physics of the crystal set, which I thought…

  12. State-set branching

    DEFF Research Database (Denmark)

    Jensen, Rune Møller; Veloso, Manuela M.; Bryant, Randal E.

    2008-01-01

    In this article, we present a framework called state-set branching that combines symbolic search based on reduced ordered Binary Decision Diagrams (BDDs) with best-first search, such as A* and greedy best-first search. The framework relies on an extension of these algorithms from expanding a sing...

  13. Generalized rough sets

    International Nuclear Information System (INIS)

    Rady, E.A.; Kozae, A.M.; Abd El-Monsef, M.M.E.

    2004-01-01

    Analyzing data under uncertainty is a main goal in many real-life problems. Statistical analysis of such data is an interesting area for research. The aim of this paper is to introduce a new method concerning the generalization and modification of the rough set theory introduced earlier by Pawlak [Int. J. Comput. Inform. Sci. 11 (1982) 314
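
    As background for this record, the classical lower and upper approximations of Pawlak's rough set theory can be sketched as follows. This is a minimal illustration of the standard definitions only, not the generalized method the paper proposes; the function name `approximations` and the sample partition are assumptions made for the example.

```python
def approximations(partition, target):
    """Classical rough-set approximations (illustrative sketch).

    partition: iterable of disjoint blocks (equivalence classes) of the universe.
    target:    the subset of the universe to approximate.
    Returns (lower, upper), where lower is the union of blocks fully inside
    target and upper is the union of blocks that intersect target.
    """
    target = set(target)
    lower, upper = set(), set()
    for block in partition:
        block = set(block)
        if block <= target:   # block lies entirely within the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


# Hypothetical universe {1, ..., 5} split into three indiscernibility classes.
example_partition = [{1, 2}, {3, 4}, {5}]
example_lower, example_upper = approximations(example_partition, {1, 2, 3})
```

    In this example the lower approximation is {1, 2} and the upper approximation is {1, 2, 3, 4}; the difference between the two is the boundary region in which membership is uncertain.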

  14. Therapists in Oncology Settings

    Science.gov (United States)

    Hendrick, Susan S.

    2013-01-01

    This article describes the author's experiences of working with cancer patients/survivors both individually and in support groups for many years, across several settings. It also documents current best-practice guidelines for the psychosocial treatment of cancer patients/survivors and their families. The author's view of the important qualities…

  15. Probabilistic Open Set Recognition

    Science.gov (United States)

    Jain, Lalit Prithviraj

    Real-world tasks in computer vision, pattern recognition and machine learning often touch upon the open set recognition problem: multi-class recognition with incomplete knowledge of the world and many unknown inputs. An obvious way to approach such problems is to develop a recognition system that thresholds probabilities to reject unknown classes. Traditional rejection techniques are not about the unknown; they are about the uncertain boundary and rejection around that boundary. Thus traditional techniques only represent the "known unknowns". However, a proper open set recognition algorithm is needed to reduce the risk from the "unknown unknowns". This dissertation examines this concept and finds existing probabilistic multi-class recognition approaches are ineffective for true open set recognition. We hypothesize the cause is weak ad hoc assumptions combined with closed-world assumptions made by existing calibration techniques. Intuitively, if we could accurately model just the positive data for any known class without overfitting, we could reject the large set of unknown classes even under this assumption of incomplete class knowledge. For this, we formulate the problem as one of modeling positive training data by invoking statistical extreme value theory (EVT) near the decision boundary of positive data with respect to negative data. We provide a new algorithm called the PI-SVM for estimating the unnormalized posterior probability of class inclusion. This dissertation also introduces a new open set recognition model called Compact Abating Probability (CAP), where the probability of class membership decreases in value (abates) as points move from known data toward open space. We show that CAP models improve open set recognition for multiple algorithms. Leveraging the CAP formulation, we go on to describe the novel Weibull-calibrated SVM (W-SVM) algorithm, which combines the useful properties of statistical EVT for score calibration with one-class and binary
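
    The "obvious" baseline mentioned in this abstract, thresholding class probabilities to reject unknown inputs, can be sketched as below. This is a hedged illustration of that traditional rejection idea only; it is not the PI-SVM, CAP, or W-SVM method of the dissertation, and the function name `classify_with_rejection` and the default threshold of 0.5 are assumptions.

```python
def classify_with_rejection(probabilities, threshold=0.5):
    """Threshold-based rejection (illustrative baseline, not PI-SVM or W-SVM).

    probabilities: dict mapping each known class label to an estimated
                   probability for the input being classified.
    threshold:     assumed cut-off below which the input is rejected.
    Returns the most probable label, or None to signal "unknown".
    """
    if not probabilities:
        return None
    label, best = max(probabilities.items(), key=lambda item: item[1])
    return label if best >= threshold else None
```

    For instance, `classify_with_rejection({"cat": 0.9, "dog": 0.1})` returns "cat", while an input whose best score is 0.4 is rejected. As the abstract notes, such a rule only captures uncertainty near the decision boundary between known classes.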

  16. Hesitant fuzzy sets theory

    CERN Document Server

    Xu, Zeshui

    2014-01-01

    This book provides readers with a thorough and systematic introduction to hesitant fuzzy theory. It presents the most recent research results and advanced methods in the field. These include hesitant fuzzy aggregation techniques, hesitant fuzzy preference relations, hesitant fuzzy measures, hesitant fuzzy clustering algorithms and hesitant fuzzy multi-attribute decision making methods. Since their introduction by Torra and Narukawa in 2009, hesitant fuzzy sets have become more and more popular and have been used for a wide range of applications, from decision-making problems to cluster analysis, from medical diagnosis to personnel appraisal and information retrieval. This book offers a comprehensive report on the state of the art in hesitant fuzzy sets theory and applications, aiming to become a reference guide for both researchers and practitioners in the area of fuzzy mathematics and other applied research fields (e.g. operations research, information science, management science and engineering) chara...

  17. Frame scaling function sets and frame wavelet sets in Rd

    International Nuclear Information System (INIS)

    Liu Zhanwei; Hu Guoen; Wu Guochang

    2009-01-01

    In this paper, we classify frame wavelet sets and frame scaling function sets in higher dimensions. First, we obtain a necessary condition for a set to be a frame wavelet set. Then, we present a necessary and sufficient condition for a set to be a frame scaling function set. We also give a property of frame scaling function sets. Corresponding examples are given in each section to support our theory.

  18. Investigation progress of PET reporter gene imaging

    International Nuclear Information System (INIS)

    Chen Yumei; Huang Gang

    2006-01-01

    Molecular imaging of gene therapy and gene expression has become increasingly attractive, as gene therapy has been widely investigated and intense research over the last two decades has brought it into the clinical setting. In vivo imaging with positron emission tomography (PET), combining an appropriate PET reporter gene with a PET reporter probe, could provide qualitative and quantitative information for gene therapy. PET imaging could also yield valuable parameters not available from other techniques. This technology is useful for understanding the process and development of gene therapy and how to apply it in clinical practice in the future. (authors)

  19. Setting Goals for Achievement in Physical Education Settings

    Science.gov (United States)

    Baghurst, Timothy; Tapps, Tyler; Kensinger, Weston

    2015-01-01

    Goal setting has been shown to improve student performance, motivation, and task completion in academic settings. Although goal setting is utilized by many education professionals to help students set realistic and proper goals, physical educators may not be using goal setting effectively. Without incorporating all three types of goals and…

  20. Genome-Wide Comparative Gene Family Classification

    Science.gov (United States)

    Frech, Christian; Chen, Nansheng

    2010-01-01

    Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done with the same high standard. This project has been designed to develop a strategy to effectively and accurately classify gene families across genomes. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter setting, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species. PMID:20976221

  1. Ageing genes

    DEFF Research Database (Denmark)

    Rattan, Suresh

    2018-01-01

    The idea of gerontogenes is in line with the evolutionary explanation of ageing as being an emergent phenomenon as a result of the imperfect maintenance and repair systems. Although evolutionary processes did not select for any specific ageing genes that restrict and determine the lifespan...... of an individual, the term ‘gerontogenes’ primarily refers to any genes that may seem to influence ageing and longevity, without being specifically selected for that role. Such genes can also be called ‘virtual gerontogenes’ by virtue of their indirect influence on the rate and process of ageing. More than 1000...... virtual gerontogenes have been associated with ageing and longevity in model organisms and humans. The ‘real’ genes, which do influence the essential lifespan of a species, and have been selected for in accordance with the evolutionary life history of the species, are known as the longevity assurance...

  2. Gene coexpression network analysis as a source of functional annotation for rice genes.

    Directory of Open Access Journals (Sweden)

    Kevin L Childs

    With the existence of large publicly available plant gene expression data sets, many groups have undertaken data analyses to construct gene coexpression networks and functionally annotate genes. Often, a large compendium of unrelated or condition-independent expression data is used to construct gene networks. Condition-dependent expression experiments consisting of well-defined conditions/treatments have also been used to create coexpression networks to help examine particular biological processes. Gene networks derived from either condition-dependent or condition-independent data can be difficult to interpret if a large number of genes and connections are present. However, algorithms exist to identify modules of highly connected and biologically relevant genes within coexpression networks. In this study, we have used publicly available rice (Oryza sativa) gene expression data to create gene coexpression networks using both condition-dependent and condition-independent data and have identified gene modules within these networks using the Weighted Gene Coexpression Network Analysis method. We compared the number of genes assigned to modules and the biological interpretability of gene coexpression modules to assess the utility of condition-dependent and condition-independent gene coexpression networks. For the purpose of providing functional annotation to rice genes, we found gene modules identified by coexpression analysis of condition-dependent gene expression experiments to be more useful than gene modules identified by analysis of a condition-independent data set. We have incorporated our results into the MSU Rice Genome Annotation Project database as additional expression-based annotation for 13,537 genes, 2,980 of which lack a functional annotation description. These results provide two new types of functional annotation for our database. Genes in modules are now associated with groups of genes that constitute a collective functional

  3. Cloning and selection of reference genes for gene expression ...

    African Journals Online (AJOL)

    Full length mRNA sequences of Ac-β-actin and Ac-gapdh, and partial mRNA sequences of Ac-18SrRNA and Ac-ubiquitin were cloned from pineapple in this study. The four genes were tested as housekeeping genes in three experimental sets. GeNorm and NormFinder analysis revealed that β-actin was the most ...

  4. Social Set Visualizer

    DEFF Research Database (Denmark)

    Flesch, Benjamin; Hussain, Abid; Vatrapu, Ravi

    2015-01-01

    This paper presents a state-of-the-art visual analytics dashboard, Social Set Visualizer (SoSeVi), of approximately 90 million Facebook actions from 11 different companies that have been mentioned in the traditional media in relation to garment factory accidents in Bangladesh. The enterprise... ...-edge open source visual analytics libraries from D3.js and creation of new visualizations (actor mobility across time, conversational comets etc). Evaluation of the dashboard consisting of technical testing, usability testing, and domain-specific testing with CSR students and yielded positive results.

  5. Social Set Analysis

    DEFF Research Database (Denmark)

    Vatrapu, Ravi; Hussain, Abid; Buus Lassen, Niels

    2015-01-01

    This paper argues that the basic premise of Social Network Analysis (SNA), namely that social reality is constituted by dyadic relations and that social interactions are determined by structural properties of networks, is neither necessary nor sufficient for Big Social Data analytics... of Facebook or Twitter data. However, there exists no other holistic computational social science approach beyond the relational sociology and graph theory of SNA. To address this limitation, this paper presents an alternative holistic approach to Big Social Data analytics called Social Set Analysis (SSA...

  6. Markov set-chains

    CERN Document Server

    Hartfiel, Darald J

    1998-01-01

    In this study extending classical Markov chain theory to handle fluctuating transition matrices, the author develops a theory of Markov set-chains and provides numerous examples showing how that theory can be applied. Chapters are concluded with a discussion of related research. Readers who can benefit from this monograph are those interested in, or involved with, systems whose data is imprecise or that fluctuate with time. A background equivalent to a course in linear algebra and one in probability theory should be sufficient.

  7. Analysis of successive data sets

    NARCIS (Netherlands)

    Spreeuwers, Lieuwe Jan; Breeuwer, Marcel; Haselhoff, Eltjo Hans

    2008-01-01

    The invention relates to the analysis of successive data sets. A local intensity variation is formed from such successive data sets, that is, from data values in successive data sets at corresponding positions in each of the data sets. A region of interest is localized in the individual data sets on

  8. Analysis of successive data sets

    NARCIS (Netherlands)

    Spreeuwers, Lieuwe Jan; Breeuwer, Marcel; Haselhoff, Eltjo Hans

    2002-01-01

    The invention relates to the analysis of successive data sets. A local intensity variation is formed from such successive data sets, that is, from data values in successive data sets at corresponding positions in each of the data sets. A region of interest is localized in the individual data sets on

  9. Dynamical basis set

    International Nuclear Information System (INIS)

    Blanco, M.; Heller, E.J.

    1985-01-01

    A new Cartesian basis set is defined that is suitable for the representation of molecular vibration-rotation bound states. The Cartesian basis functions are superpositions of semiclassical states generated through the use of classical trajectories that conform to the intrinsic dynamics of the molecule. Although semiclassical input is employed, the method becomes ab initio through the standard matrix diagonalization variational method. Special attention is given to classical-quantum correspondences for angular momentum. In particular, it is shown that the use of semiclassical information preferentially leads to angular momentum eigenstates with magnetic quantum number |M| equal to the total angular momentum J. The present method offers a reliable technique for representing highly excited vibrational-rotational states where perturbation techniques are no longer applicable

  10. Gene doping.

    Science.gov (United States)

    Haisma, H J; de Hon, O

    2006-04-01

    Together with the rapidly increasing knowledge on genetic therapies as a promising new branch of regular medicine, the issue has arisen whether these techniques might be abused in the field of sports. Previous experiences have shown that drugs that are still in the experimental phases of research may find their way into the athletic world. Both the World Anti-Doping Agency (WADA) and the International Olympic Committee (IOC) have expressed concerns about this possibility. As a result, the method of gene doping has been included in the list of prohibited classes of substances and prohibited methods. This review addresses the possible ways in which knowledge gained in the field of genetic therapies may be misused in elite sports. Many genes are readily available which may potentially have an effect on athletic performance. The sporting world will eventually be faced with the phenomenon of gene doping to improve athletic performance. A combination of developing detection methods based on gene arrays or proteomics and a clear education program on the associated risks seems to be the most promising preventive method to counteract the possible application of gene doping.

  11. Gene Locater

    DEFF Research Database (Denmark)

    Anwar, Muhammad Zohaib; Sehar, Anoosha; Rehman, Inayat-Ur

    2012-01-01

    software's for calculating recombination frequency is mostly limited to the range and flexibility of this type of analysis. GENE LOCATER is a fully customizable program for calculating recombination frequency, written in Java. Through an easy-to-use interface, GENE LOCATER allows users a high degree...... of flexibility in calculating genetic linkage and displaying linkage groups. Among other features, this software enables users to identify linkage groups with output visualized graphically. The program calculates interference and coefficient of coincidence with elevated accuracy in sample datasets. AVAILABILITY...
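    The quantities named in this record (recombination frequency, coefficient of coincidence, interference) follow the standard three-point-cross definitions. The sketch below is a minimal illustration of those textbook formulas only; it is not the GENE LOCATER implementation, and the function names and the example counts are hypothetical.

```python
def recombination_frequency(recombinants, total):
    """Fraction of offspring that are recombinant for a pair of loci."""
    if total <= 0:
        raise ValueError("total must be positive")
    return recombinants / total


def coincidence_and_interference(rf_ab, rf_bc, observed_double, total):
    """Coefficient of coincidence and interference for loci A-B-C.

    Expected double crossovers assume the two intervals recombine
    independently: rf_ab * rf_bc * total.
    """
    expected_double = rf_ab * rf_bc * total
    if expected_double == 0:
        raise ValueError("expected double crossovers is zero")
    coc = observed_double / expected_double
    return coc, 1.0 - coc


# Hypothetical counts: 1000 offspring, 100 A-B recombinants,
# 200 B-C recombinants, 10 observed double crossovers.
rf_ab = recombination_frequency(100, 1000)
rf_bc = recombination_frequency(200, 1000)
coc, interference = coincidence_and_interference(rf_ab, rf_bc, 10, 1000)
```

    With these made-up counts, 20 double crossovers are expected and 10 are observed, so both the coefficient of coincidence and the interference come out at about 0.5.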

  12. Setting the scene

    International Nuclear Information System (INIS)

    Curran, S.

    1977-01-01

    The reasons for the special meeting on the breeder reactor are outlined with some reference to the special Scottish interest in the topic. Approximately 30% of the electrical energy generated in Scotland is nuclear and the special developments at Dounreay make policy decisions on the future of the commercial breeder reactor urgent. The participants review the major questions arising in arriving at such decisions. In effect an attempt is made to respond to the wish of the Secretary of State for Energy to have informed debate. To set the scene the importance of energy availability as regards the strength of the national economy is stressed and the reasons for an increasing energy demand put forward. Examination of alternative sources of energy shows that none is definitely capable of filling the foreseen energy gap. This implies an integrated thermal/breeder reactor programme as the way to close the anticipated gap. The problems of disposal of radioactive waste and the safeguards in the handling of plutonium are outlined. Longer-term benefits, including the consumption of plutonium and naturally occurring radioactive materials, are examined. (author)

  13. Ready, set, move!

    CERN Multimedia

    Anaïs Schaeffer

    2012-01-01

    This year, the CERN Medical Service is launching a new public health campaign. Advertised by the catchphrase “Move! & Eat Better”, the particular aim of the campaign is to encourage people at CERN to take more regular exercise, of whatever kind. The CERN annual relay race is scheduled on 24 May this year. The CERN Medical Service will officially launch its “Move! & Eat Better” campaign at this popular sporting event. “We shall be on hand on the day of the race to strongly advocate regular physical activity,” explains Rachid Belkheir, one of the Medical Service doctors. “We really want to pitch our campaign and answer any questions people may have. Above all we want to set an example. So we are going to walk the same circuit as the runners to underline to people that they can easily incorporate movement into their daily routine.” An underlying concern has prompted this campaign: during their first few year...

  14. Synthetic promoter libraries- tuning of gene expression

    DEFF Research Database (Denmark)

    Hammer, Karin; Mijakovic, Ivan; Jensen, Peter Ruhdal

    2006-01-01

    The study of gene function often requires changing the expression of a gene and evaluating the consequences. In principle, the expression of any given gene can be modulated in a quasi-continuum of discrete expression levels but the traditional approaches are usually limited to two extremes: gene... knockout and strong overexpression. However, applications such as metabolic optimization and control analysis necessitate a continuous set of expression levels with only slight increments in strength to cover a specific window around the wildtype expression level of the studied gene; this requirement can...

  15. Set discrimination of quantum states

    International Nuclear Information System (INIS)

    Zhang Shengyu; Ying Mingsheng

    2002-01-01

    We introduce a notion of set discrimination, which is an interesting extension of quantum state discrimination. A state is secretly chosen from a number of quantum states, which are partitioned into some disjoint sets. A set discrimination is required to identify which set the given state belongs to. Several essential problems are addressed in this paper, including the condition of perfect set discrimination, unambiguous set discrimination, and in the latter case, the efficiency of the discrimination. This generalizes some important results on quantum state discrimination in the literature. A combination of state and set discrimination and the efficiency are also studied

  16. Soft sets combined with interval valued intuitionistic fuzzy sets of type-2 and rough sets

    Directory of Open Access Journals (Sweden)

    Anjan Mukherjee

    2015-03-01

    Fuzzy set theory, rough set theory and soft set theory are all mathematical tools dealing with uncertainties. The concept of type-2 fuzzy sets was introduced by Zadeh in 1975 and was extended to interval valued intuitionistic fuzzy sets of type-2 by the authors. This paper is devoted to the discussion of the combinations of interval valued intuitionistic sets of type-2, soft sets and rough sets. Three different types of new hybrid models, namely interval valued intuitionistic fuzzy soft sets of type-2, soft rough interval valued intuitionistic fuzzy sets of type-2 and soft interval valued intuitionistic fuzzy rough sets of type-2, are proposed and their properties are derived.

  17. Reranking candidate gene models with cross-species comparison for improved gene prediction

    Directory of Open Access Journals (Sweden)

    Pereira Fernando CN

    2008-10-01

    Background: Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc.). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and comparative genomics datasets may help to select among competing models of comparable probability by exploiting features likely to be associated with the correct gene models, such as conserved exon/intron structure or protein sequence features. Results: We have investigated the utility of a simple post-processing step for selecting among a set of alternative gene models, using global scoring rules to rerank competing models for more accurate prediction. For each gene locus, we first generate the K best candidate gene models using the gene finder Evigan, and then rerank these models using comparisons with putative orthologous genes from closely-related species. Candidate gene models with lower scores in the original gene finder may be selected if they exhibit strong similarity to probable orthologs in coding sequence, splice site location, or signal peptide occurrence. Experiments on Drosophila melanogaster demonstrate that reranking based on cross-species comparison outperforms the best gene models identified by Evigan alone, and also outperforms the comparative gene finders GeneWise and Augustus+. Conclusion: Reranking gene models with cross-species comparison improves gene prediction accuracy. This straightforward method can be readily adapted to incorporate additional lines of evidence, as it requires only a ranked source of candidate gene models.

  18. Special set linear algebra and special set fuzzy linear algebra

    OpenAIRE

    Kandasamy, W. B. Vasantha; Smarandache, Florentin; Ilanthenral, K.

    2009-01-01

    In this book the authors introduce the notions of special set linear algebra and special set fuzzy linear algebra, which extend the notions of set linear algebra and set fuzzy linear algebra. These concepts are best suited to applications in multi-expert models and cryptology. This book has five chapters. In chapter one the basic concepts of set linear algebra are given in order to make this book self-contained. The notion of special set linear algebra and their fuzzy analog...

  19. Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

    Science.gov (United States)

    Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

    2014-01-01

    Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
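    For readers unfamiliar with dynamic time warping, the distance between two expression time series can be sketched as below. This is the classic DTW recurrence with an absolute-difference local cost, shown purely as an illustration; it is not the DTW-GO measure proposed in this record, which additionally incorporates Gene Ontology information, and the function name is hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched to an already-used b[j-1]
                cost[i][j - 1],      # b[j-1] matched to an already-used a[i-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]
```

    Two series that differ only by a repeated time point, such as [1, 2, 3] and [1, 2, 2, 3], have distance 0 under this definition, which is why DTW tolerates shifts and stretches in expression profiles.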

  20. Re-Setting Music Education's "Default Settings"

    Science.gov (United States)

    Regelski, Thomas A.

    2013-01-01

    This paper explores the effects and problems of one highly influential default setting of the "normal style template" of music education and proposes some alternatives. These do not require abandoning all traditional templates for school music. But re-setting the default settings does depend on reconsidering the promised function of…

  1. Industrial scale gene synthesis.

    Science.gov (United States)

    Notka, Frank; Liss, Michael; Wagner, Ralf

    2011-01-01

    The most recent developments in the area of deep DNA sequencing and downstream quantitative and functional analysis are rapidly adding a new dimension to understanding biochemical pathways and metabolic interdependencies. These increasing insights pave the way to designing new strategies that address public needs, including environmental applications and therapeutic inventions, or novel cell factories for sustainable and reconcilable energy or chemicals sources. Adding yet another level is building upon nonnaturally occurring networks and pathways. Recent developments in synthetic biology have created economic and reliable options for designing and synthesizing genes, operons, and eventually complete genomes. Meanwhile, high-throughput design and synthesis of extremely comprehensive DNA sequences have evolved into an enabling technology already indispensable in various life science sectors today. Here, we describe the industrial perspective of modern gene synthesis and its relationship with synthetic biology. Gene synthesis contributed significantly to the emergence of synthetic biology by not only providing the genetic material in high quality and quantity but also enabling its assembly, according to engineering design principles, in a standardized format. Synthetic biology, on the other hand, added the need for assembling complex circuits and large complexes, thus fostering the development of appropriate methods and expanding the scope of applications. Synthetic biology has also stimulated interdisciplinary collaboration as well as integration of the broader public by addressing socioeconomic, philosophical, ethical, political, and legal opportunities and concerns. The demand-driven technological achievements of gene synthesis and the implemented processes are exemplified by an industrial setting of large-scale gene synthesis, describing production from order to delivery.

  2. SET oncoprotein accumulation regulates transcription through DNA demethylation and histone hypoacetylation.

    Science.gov (United States)

    Almeida, Luciana O; Neto, Marinaldo P C; Sousa, Lucas O; Tannous, Maryna A; Curti, Carlos; Leopoldino, Andreia M

    2017-04-18

    Epigenetic modifications are essential in the control of normal cellular processes and cancer development. DNA methylation and histone acetylation are major epigenetic modifications involved in gene transcription and abnormal events driving the oncogenic process. SET protein accumulates in many cancer types, including head and neck squamous cell carcinoma (HNSCC); SET is a member of the INHAT complex that inhibits gene transcription associating with histones and preventing their acetylation. We explored how SET protein accumulation impacts on the regulation of gene expression, focusing on DNA methylation and histone acetylation. DNA methylation profile of 24 tumour suppressors evidenced that SET accumulation decreased DNA methylation in association with loss of 5-methylcytidine, formation of 5-hydroxymethylcytosine and increased TET1 levels, indicating an active DNA demethylation mechanism. However, the expression of some suppressor genes was lowered in cells with high SET levels, suggesting that loss of methylation is not the main mechanism modulating gene expression. SET accumulation also downregulated the expression of 32 genes of a panel of 84 transcription factors, and SET directly interacted with chromatin at the promoter of the downregulated genes, decreasing histone acetylation. Gene expression analysis after cell treatment with 5-aza-2'-deoxycytidine (5-AZA) and Trichostatin A (TSA) revealed that histone acetylation reversed transcription repression promoted by SET. These results suggest a new function for SET in the regulation of chromatin dynamics. In addition, TSA diminished both SET protein levels and SET capability to bind to gene promoter, suggesting that administration of epigenetic modifier agents could be efficient to reverse SET phenotype in cancer.

  3. Reference Gene Screening for Analyzing Gene Expression Across Goat Tissue

    Directory of Open Access Journals (Sweden)

    Yu Zhang

    2013-12-01

    Real-time quantitative PCR (qRT-PCR) is one of the important methods for investigating changes in mRNA expression levels in cells and tissues. Selection of proper reference genes is very important when calibrating the results of real-time quantitative PCR. Studies on the selection of reference genes in goat tissues are limited, despite the economic importance of their meat and dairy products. We used real-time quantitative PCR to detect the expression levels of eight reference gene candidates (18S, TBP, HMBS, YWHAZ, ACTB, HPRT1, GAPDH and EEF1A2) in ten tissue types sourced from Boer goats. The optimal reference gene combination was selected according to the results determined by the geNorm, NormFinder and Bestkeeper software packages. The analyses showed that tissue is an important variability factor in gene expression stability. When all tissues were considered, 18S, TBP and HMBS formed the optimal reference combination for calibrating quantitative PCR analysis of gene expression from goat tissues. When the data set was divided by tissue, ACTB was the most stable in stomach, small intestine and ovary, 18S in heart and spleen, HMBS in uterus and lung, TBP in liver, HPRT1 in kidney and GAPDH in muscle. Overall, this study provides valuable information about goat reference genes that can be used for proper normalisation when relative quantification by qRT-PCR is undertaken.

  4. Complex community of nitrite-dependent anaerobic methane oxidation bacteria in coastal sediments of the Mai Po wetland by PCR amplification of both 16S rRNA and pmoA genes.

    Science.gov (United States)

    Chen, Jing; Zhou, Zhichao; Gu, Ji-Dong

    2015-02-01

    In the present work, both 16S rRNA and pmoA gene-based PCR primers were employed successfully to study the diversity and distribution of n-damo bacteria in the surface and lower layer sediments at the coastal Mai Po wetland. The occurrence of n-damo bacteria in both the surface and subsurface sediments with high diversity was confirmed in this study. Unlike the two other known n-damo communities from coastal areas, the pmoA gene-amplified sequences in the present work clustered not only with some freshwater subclusters but also within three newly erected marine subclusters mostly, indicating the unique niche specificity of n-damo bacteria in this wetland. Results suggested vegetation affected the distribution and community structures of n-damo bacteria in the sediments and n-damo could coexist with sulfate-reducing methanotrophs in the coastal ecosystem. Community structures of the Mai Po n-damo bacteria based on 16S rRNA gene were different from those of either the freshwater or the marine. In contrast, structures of the Mai Po n-damo communities based on pmoA gene grouped with the marine ones and were clearly distinguished from the freshwater ones. The abundance of n-damo bacteria at this wetland was quantified using 16S rRNA gene PCR primers to be 2.65-6.71 × 10^5 copies/g dry sediment. Ammonium and nitrite strongly affected the community structures and distribution of n-damo bacteria in the coastal Mai Po wetland sediments.

  5. Gene doping: gene delivery for olympic victory

    OpenAIRE

    Gould, David

    2012-01-01

    With one recently recommended gene therapy in Europe and a number of other gene therapy treatments now proving effective in clinical trials it is feasible that the same technologies will soon be adopted in the world of sport by unscrupulous athletes and their trainers in so called ‘gene doping’. In this article an overview of the successful gene therapy clinical trials is provided and the potential targets for gene doping are highlighted. Depending on whether a doping gene product is secreted...

  6. MOTIVATION: Goals and Goal Setting

    Science.gov (United States)

    Stratton, Richard K.

    2005-01-01

    Goal setting has a great impact on a team's performance. Goals enable a team to synchronize their efforts to achieve success. In this article, the author talks about goals and goal setting. This article complements Domain 5--Teaching and Communication (p.14) and discusses one of the benchmarks listed therein: "Teach the goal setting process and…

  7. Genes and Hearing Loss

    Science.gov (United States)

    ... mutation may only have dystopia canthorum. How Do Genes Work? Genes are a road map for the ...

  8. Multidimensional scaling for large genomic data sets

    Directory of Open Access Journals (Sweden)

    Lu Henry

    2008-04-01

    Background: Multi-dimensional scaling (MDS) aims to represent high dimensional data in a low dimensional space while preserving the similarities between data points. This reduction in dimensionality is crucial for analyzing and revealing the genuine structure hidden in the data. For noisy data, dimension reduction can effectively reduce the effect of noise on the embedded structure. For large data sets, dimension reduction can effectively reduce information retrieval complexity. Thus, MDS techniques are used in many applications of data mining and gene network research. However, although there have been a number of studies that applied MDS techniques to genomics research, the number of analyzed data points was restricted by the high computational complexity of MDS. In general, a non-metric MDS method is faster than a metric MDS, but it does not preserve the true relationships. The computational complexity of most metric MDS methods is over O(N^2), so that it is difficult to process a data set with a large number of genes N, such as in the case of whole genome microarray data. Results: We developed a new rapid metric MDS method with a low computational complexity, making metric MDS applicable for large data sets. Computer simulation showed that the new method of split-and-combine MDS (SC-MDS) is fast, accurate and efficient. Our empirical studies using microarray data on the yeast cell cycle showed that the performance of K-means in the reduced dimensional space is similar to or slightly better than that of K-means in the original space, and that the clustering results are obtained about three times faster. Our clustering results using SC-MDS are more stable than those in the original space. Hence, the proposed SC-MDS is useful for analyzing whole genome data. Conclusion: Our new method reduces the computational complexity from O(N^3) to O(N) when the dimension of the feature space is far less than the number of genes N, and it successfully
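
    To make the cost argument above concrete, the following is a minimal sketch of classical (metric) MDS in Python with NumPy. It is not the SC-MDS method of the record, whose split-and-combine details are not given in the abstract; the function names `classical_mds` and `pairwise_distances` are illustrative assumptions. The full eigendecomposition is the step that scales roughly as O(N^3) in the number of points N.

```python
import numpy as np

def classical_mds(dist, k=2):
    """Embed N points in k dimensions from an (N, N) matrix of pairwise distances."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # Full eigendecomposition: the roughly O(N^3) step that limits large N.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:k]
    # Clip tiny negative eigenvalues caused by rounding before taking roots.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)

def pairwise_distances(points):
    """Euclidean distance matrix for an (N, d) array of points."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    points = rng.normal(size=(50, 3))          # synthetic data, for illustration only
    embedded = classical_mds(pairwise_distances(points), k=3)
    # For Euclidean input, distances are preserved up to rotation and reflection.
    print(np.allclose(pairwise_distances(embedded), pairwise_distances(points)))
```

    A divide-and-merge scheme such as the one the abstract describes would presumably apply a routine like this to small overlapping subsets and then align the pieces, but that alignment step is not sketched here.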

  9. Mycobacterium abscessus in Healthcare Settings

    Science.gov (United States)

    ... bacterium are usually of the skin and the soft tissues under the skin. It is also a ...

  10. Gene expression and gene therapy imaging

    International Nuclear Information System (INIS)

    Rome, Claire; Couillaud, Franck; Moonen, Chrit T.W.

    2007-01-01

    The fast growing field of molecular imaging has achieved major advances in imaging gene expression, an important element of gene therapy. Gene expression imaging is based on specific probes or contrast agents that allow either direct or indirect spatio-temporal evaluation of gene expression. Direct evaluation is possible with, for example, contrast agents that bind directly to a specific target (e.g., receptor). Indirect evaluation may be achieved by using specific substrate probes for a target enzyme. The use of marker genes, also called reporter genes, is an essential element of MI approaches for gene expression in gene therapy. The marker gene may not have a therapeutic role itself, but by coupling the marker gene to a therapeutic gene, expression of the marker gene reports on the expression of the therapeutic gene. Nuclear medicine and optical approaches are highly sensitive (detection of probes in the picomolar range), whereas MRI and ultrasound imaging are less sensitive and require amplification techniques and/or accumulation of contrast agents in enlarged contrast particles. Recently developed MI techniques are particularly relevant for gene therapy. Amongst these are the possibility to track gene therapy vectors such as stem cells, and the techniques that allow spatiotemporal control of gene expression by non-invasive heating (with MRI guided focused ultrasound) and the use of temperature sensitive promoters. (orig.)

  11. Identification of Human HK Genes and Gene Expression Regulation Study in Cancer from Transcriptomics Data Analysis

    Science.gov (United States)

    Zhang, Zhang; Liu, Jingxing; Wu, Jiayan; Yu, Jun

    2013-01-01

    The regulation of gene expression is essential for eukaryotes, as it drives the processes of cellular differentiation and morphogenesis, leading to the creation of different cell types in multicellular organisms. RNA-Sequencing (RNA-Seq) provides researchers with a powerful toolbox for characterization and quantification of the transcriptome. Many different human tissue/cell transcriptome datasets from RNA-Seq technology are available in public data resources. The fundamental issue here is how to develop an effective analysis method to estimate expression pattern similarities between different tumor tissues and their corresponding normal tissues. We define the gene expression pattern from three directions: 1) expression breadth, which reflects gene expression on/off status, and mainly concerns ubiquitously expressed genes; 2) low/high or constant/variable expression genes, based on gene expression level and variation; and 3) the regulation of gene expression at the gene structure level. The cluster analysis indicates that gene expression pattern is more closely related to physiological condition than to tissue spatial distance. Two sets of human housekeeping (HK) genes are defined, one according to cell types and one according to tissue types. To characterize the gene expression pattern in terms of expression level and variation, we first apply an improved K-means algorithm and a gene expression variance model. We find that cancer-associated HK genes (HK genes that are specific to the cancer group but not to the normal group) are expressed at higher and more variable levels in the cancer condition than in the normal condition. Cancer-associated HK genes tend to be AT-rich, are enriched in functions related to cell cycle regulation, and constitute some cancer signatures. The expression of large genes is also avoided in the cancer group. These studies will help us understand which patterns of gene expression differ among cell types, particularly in cancer. PMID:23382867

  12. Lactobacillus plantarum gene clusters encoding putative cell-surface protein complexes for carbohydrate utilization are conserved in specific gram-positive bacteria

    Directory of Open Access Journals (Sweden)

    Muscariello Lidia

    2006-05-01

    Background: Genomes of gram-positive bacteria encode many putative cell-surface proteins, the majority of which have no known function. From the rapidly increasing number of available genome sequences it has become apparent that many cell-surface proteins are conserved, and frequently encoded in gene clusters or operons, suggesting common functions, and interactions of multiple components. Results: A novel gene cluster encoding exclusively cell-surface proteins was identified, which is conserved in a subgroup of gram-positive bacteria. Each gene cluster generally has one copy of four new gene families called cscA, cscB, cscC and cscD. Clusters encoding these cell-surface proteins were found only in complete genomes of Lactobacillus plantarum, Lactobacillus sakei, Enterococcus faecalis, Listeria innocua, Listeria monocytogenes, Lactococcus lactis ssp lactis and Bacillus cereus and in incomplete genomes of L. lactis ssp cremoris, Lactobacillus casei, Enterococcus faecium, Pediococcus pentosaceus, Lactobacillus brevis, Oenococcus oeni, Leuconostoc mesenteroides, and Bacillus thuringiensis. These genes are present neither in the genomes of streptococci, staphylococci and clostridia, nor in the Lactobacillus acidophilus group, suggesting a niche-specific distribution, possibly relating to association with plants. All encoded proteins have a signal peptide for secretion by the Sec-dependent pathway, while some have cell-surface anchors, novel WxL domains, and putative domains for sugar binding and degradation. Transcriptome analysis in L. plantarum shows that the cscA-D genes are co-expressed, supporting their operon organization. Many gene clusters are significantly up-regulated in a glucose-grown, ccpA-mutant derivative of L. plantarum, suggesting catabolite control. This is supported by the presence of predicted CRE-sites upstream or inside the up-regulated cscA-D gene clusters. Conclusion: We propose that the CscA, CscB, CscC and Csc

  13. Plant SET domain-containing proteins: structure, function and regulation

    Czech Academy of Sciences Publication Activity Database

    Ng, D.W.K.; Wang, T.; Chandrasekharan, M.B.; Aramayo, R.; Kertbundit, Sunee; Hall, T.C.

    2007-01-01

    Roč. 1769, 5-6 (2007), s. 316-329 ISSN 0167-4781 Institutional research plan: CEZ:AV0Z50380511 Keywords : arabidopsis SET genes * alternative splicing * epigenetics Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 1.704, year: 2007

  14. Healthcare priority setting in Kenya

    DEFF Research Database (Denmark)

    Bukachi, Salome A.; Onyango-Ouma, Washington; Siso, Jared Maaka

    2014-01-01

    In resource-poor settings, the accountability for reasonableness (A4R) has been identified as an important advance in priority setting that helps to operationalize fair priority setting in specific contexts. The four conditions of A4R are backed by theory, not evidence, that conformance with them...... improves the priority setting decisions. This paper describes the healthcare priority setting processes in Malindi district, Kenya, prior to the implementation of A4R in 2008 and evaluates the process for its conformance with the conditions for A4R. In-depth interviews and focus group discussions with key...... players in the Malindi district health system and a review of key policy documents and national guidelines show that the priority setting process in the district relies heavily on guidelines from the national level, making it more of a vertical, top-down orientation. Multilateral and donor agencies...

  15. Social settings and addiction relapse.

    Science.gov (United States)

    Walton, M A; Reischl, T M; Ramanthan, C S

    1995-01-01

    Despite addiction theorists' acknowledgment of the impact of environmental factors on relapse, researchers have not adequately investigated these influences. Ninety-six substance users provided data regarding their perceived risk for relapse, exposure to substances, and involvement in reinforcing activities. These three setting attributes were assessed in their home, work, and community settings. Reuse was assessed 3 months later. When controlling for confounding variables, aspects of the home settings significantly distinguished abstainers from reusers; perceived risk for relapse was the strongest predictor of reuse. Exposure to substances and involvement in reinforcing activities were not robust reuse indicators. The work and community settings were not significant determinants of reuse. These findings offer some initial support for the utility of examining social settings to better understand addiction relapse and recovery. Identification of setting-based relapse determinants provides concrete targets for relapse prevention interventions.

  16. Hierarchical Sets: Analyzing Pangenome Structure through Scalable Set Visualizations

    DEFF Research Database (Denmark)

    Pedersen, Thomas Lin

    2017-01-01

    of hierarchical sets by applying it to a pangenome based on 113 Escherichia and Shigella genomes and find it provides a powerful addition to pangenome analysis. The described clustering algorithm and visualizations are implemented in the hierarchicalSets R package available from CRAN (https...

  17. Rooted triple consensus and anomalous gene trees

    Directory of Open Access Journals (Sweden)

    Schmidt Heiko A

    2008-04-01

    Background: Anomalous gene trees (AGTs) are gene trees whose topology differs from that of the species tree yet which are more probable than congruent gene trees. In this paper we propose a rooted triple approach to finding the correct species tree in the presence of AGTs. Results: Based on simulated data we show that our method outperforms the extended majority rule consensus strategy, while still resolving the species tree. Applying both methods to a metazoan data set of 216 genes, we tested whether AGTs substantially interfere with the reconstruction of the metazoan phylogeny. Conclusion: Evidence of AGTs was not found in this data set, suggesting that erroneously reconstructed gene trees are the most significant challenge in the reconstruction of phylogenetic relationships among species with current data. The new method does however rule out the erroneous reconstruction of deep or poorly resolved splits in the presence of lineage sorting.
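
    As an illustration of what a rooted triple approach works with, here is a minimal Python sketch that extracts the rooted triples ab|c displayed by each gene tree and takes a majority vote per set of three taxa. It is an assumption-laden sketch, not the authors' method: trees are encoded as nested tuples of taxon names (a hypothetical encoding), ties are broken by first occurrence, and the step of assembling a species tree from the winning triples is not shown.

```python
from collections import Counter
from itertools import combinations

def clusters(tree):
    """Return the leaf sets (clades) of every internal node of a nested-tuple tree."""
    found = []

    def visit(node):
        if isinstance(node, str):              # a leaf is a taxon name
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        found.append(leaves)
        return leaves

    visit(tree)
    return found

def rooted_triples(tree):
    """Yield (frozenset({a, b}), c) for every rooted triple ab|c displayed by the tree."""
    clades = clusters(tree)
    taxa = sorted(max(clades, key=len))        # the root clade holds all taxa
    for trio in combinations(taxa, 3):
        for c in trio:
            pair = frozenset(trio) - {c}
            # ab|c holds if some clade contains a and b but excludes c.
            if any(pair <= clade and c not in clade for clade in clades):
                yield pair, c

def majority_triples(gene_trees):
    """For each set of three taxa, pick the triple seen in the most gene trees."""
    counts = Counter(t for tree in gene_trees for t in rooted_triples(tree))
    best = {}
    for (pair, c), n in counts.items():
        trio = pair | {c}
        if trio not in best or n > best[trio][1]:   # ties keep the first seen
            best[trio] = ((pair, c), n)
    return {trio: choice for trio, (choice, _) in best.items()}

if __name__ == "__main__":
    t1 = ((("A", "B"), "C"), "D")
    t2 = ((("A", "C"), "B"), "D")
    print(majority_triples([t1, t1, t2])[frozenset("ABC")])
```

    Because each vote concerns only three taxa, the most frequent triple agrees with the species tree under the coalescent, which is the property that makes triples attractive when anomalous gene trees are possible.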

  18. Imaging reporter gene for monitoring gene therapy

    International Nuclear Information System (INIS)

    Beco, V. de; Baillet, G.; Tamgac, F.; Tofighi, M.; Weinmann, P.; Vergote, J.; Moretti, J.L.; Tamgac, G.

    2002-01-01

    Scintigraphic images can be obtained to document gene function at the cellular level. This approach is presented here and the use of a reporter gene to monitor gene therapy is described. Two main ways are presented: either the use of a reporter gene coding for an enzyme, the action of which will be monitored by a radiolabeled pro-drug, or a cellular receptor gene, the action of which is documented by a radiolabeled cognate receptor ligand. (author)

  19. Fuzzy reasoning on Horn Set

    International Nuclear Information System (INIS)

    Liu, X.; Fang, K.

    1986-01-01

    A theoretical study in fuzzy reasoning on Horn Set is presented in this paper. The authors first introduce the concepts of λ-Horn Set of clauses and λ-Input Half Lock deduction. They then use the λ-resolution method to discuss fuzzy reasoning on λ-Horn set of clauses. It is proved that the proposed λ-Input Half Lock resolution method is complete with the rules in a certain format.

  20. A note on extreme sets

    Directory of Open Access Journals (Sweden)

    Radosław Cymer

    2017-10-01

    In decomposition theory, extreme sets have been studied extensively due to their connection to perfect matchings in a graph. In this paper, we first define extreme sets with respect to degree-matchings and next investigate some of their properties. In particular, we prove the generalized Decomposition Theorem and give a characterization for the set of all extreme vertices in a graph.

  1. APPLICATION OF GOAL SETTING THEORY

    OpenAIRE

    Yurtkoru, E. Serra; Bozkurt, Tulay; Bekta, Fatos; Ahmed, Mahir Jibril; Kola, Vehap

    2017-01-01

    The purpose of this study is to test the goal theory model originally developed by Locke and Latham in organizational setting in Turkey, and explain its influence on job satisfaction and affective commitment. Also mediating role of task specific strategy and moderating role of self-efficacy are examined. Locke and Latham’s goal setting measure is adapted to Turkish. Survey method is employed to collect data from 222 respondents from automotive industry. Goal setting dimensions predicted affective co...

  2. Algorithms over partially ordered sets

    DEFF Research Database (Denmark)

    Baer, Robert M.; Østerby, Ole

    1969-01-01

    in partially ordered sets, answer the combinatorial question of how many maximal chains might exist in a partially ordered set with n elements, and we give an algorithm for enumerating all maximal chains. We give (in § 3) algorithms which decide whether a partially ordered set is a (lower or upper) semi-lattice, and whether a lattice has distributive, modular, and Boolean properties. Finally (in § 4) we give Algol realizations of the various algorithms....
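
    The enumeration of maximal chains mentioned above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a transcription of the record's Algol realizations: the poset is finite and given as a list of elements plus a strict order predicate `less_than`, and the helper names are hypothetical. The sketch first derives the cover relation (the edges of the Hasse diagram) and then walks it from every minimal element.

```python
def covers(elements, less_than):
    """Map each element to the elements that cover it (its immediate successors)."""
    result = {}
    for x in elements:
        above = [y for y in elements if less_than(x, y)]
        # y covers x if nothing lies strictly between x and y.
        result[x] = [y for y in above
                     if not any(less_than(z, y) for z in above)]
    return result

def maximal_chains(elements, less_than):
    """Enumerate all maximal chains of a finite poset given a strict order."""
    cover = covers(elements, less_than)
    minimal = [x for x in elements
               if not any(less_than(y, x) for y in elements)]
    chains = []

    def extend(chain):
        successors = cover[chain[-1]]
        if not successors:                     # reached a maximal element
            chains.append(chain)
        for y in successors:
            extend(chain + [y])

    for x in minimal:
        extend([x])
    return chains

if __name__ == "__main__":
    divisors = [1, 2, 3, 4, 6, 12]
    strictly_divides = lambda a, b: a != b and b % a == 0
    for chain in maximal_chains(divisors, strictly_divides):
        print(chain)
```

    The number of maximal chains can grow exponentially with n, so the output list itself is the dominant cost for large posets.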

  3. Closed sets of nonlocal correlations

    International Nuclear Information System (INIS)

    Allcock, Jonathan; Linden, Noah; Brunner, Nicolas; Popescu, Sandu; Skrzypczyk, Paul; Vertesi, Tamas

    2009-01-01

    We present a fundamental concept - closed sets of correlations - for studying nonlocal correlations. We argue that sets of correlations corresponding to information-theoretic principles, or more generally to consistent physical theories, must be closed under a natural set of operations. Hence, studying the closure of sets of correlations gives insight into which information-theoretic principles are genuinely different, and which are ultimately equivalent. This concept also has implications for understanding why quantum nonlocality is limited, and for finding constraints on physical theories beyond quantum mechanics.

  4. A book of set theory

    CERN Document Server

    Pinter, Charles C

    2014-01-01

    Suitable for upper-level undergraduates, this accessible approach to set theory poses rigorous but simple arguments. Each definition is accompanied by commentary that motivates and explains new concepts. Starting with a repetition of the familiar arguments of elementary set theory, the level of abstract thinking gradually rises for a progressive increase in complexity. A historical introduction presents a brief account of the growth of set theory, with special emphasis on problems that led to the development of the various systems of axiomatic set theory. Subsequent chapters explore classes and

  5. Quantum set theory and applications

    International Nuclear Information System (INIS)

    Rodriguez, E.

    1984-01-01

    The work of von Neumann tells us that the logic of quantum mechanics is not Boolean. This suggests the formulation of a quantum theory of sets based on quantum logic much as modern set theory is based on Boolean logic. In the first part of this dissertation such a quantum set theory is developed. In the second part, quantum set theory is proposed as a universal language for physics. A quantum topology and the beginnings of a quantum geometry are developed in this language. Finally, a toy model is studied. It gives indications of possible lines for progress in this program.

  6. COGNATE: comparative gene annotation characterizer.

    Science.gov (United States)

    Wilbrandt, Jeanne; Misof, Bernhard; Niehuis, Oliver

    2017-07-17

    The comparison of gene and genome structures across species has the potential to reveal major trends of genome evolution. However, such a comparative approach is currently hampered by a lack of standardization (e.g., Elliott TA, Gregory TR, Philos Trans Royal Soc B: Biol Sci 370:20140331, 2015). For example, testing the hypothesis that the total amount of coding sequences is a reliable measure of potential proteome diversity (Wang M, Kurland CG, Caetano-Anollés G, PNAS 108:11954, 2011) requires the application of standardized definitions of coding sequence and genes to create both comparable and comprehensive data sets and corresponding summary statistics. However, such standard definitions either do not exist or are not consistently applied. These circumstances call for a standard at the descriptive level using a minimum of parameters as well as an undeviating use of standardized terms, and for software that infers the required data under these strict definitions. The acquisition of a comprehensive, descriptive, and standardized set of parameters and summary statistics for genome publications and further analyses can thus greatly benefit from the availability of an easy to use standard tool. We developed a new open-source command-line tool, COGNATE (Comparative Gene Annotation Characterizer), which uses a given genome assembly and its annotation of protein-coding genes for a detailed description of the respective gene and genome structure parameters. Additionally, we revised the standard definitions of gene and genome structures and provide the definitions used by COGNATE as a working draft suggestion for further reference. Complete parameter lists and summary statistics are inferred using this set of definitions to allow down-stream analyses and to provide an overview of the genome and gene repertoire characteristics. COGNATE is written in Perl and freely available at the ZFMK homepage ( https://www.zfmk.de/en/COGNATE ) and on github ( https

  7. prokaryote genome annotation with GeneScan and GLIMMER

    Indian Academy of Sciences (India)

    Unknown

    The number of false predictions (both positive and negative) is higher for GeneScan as compared to GLIMMER, but in a ... on whether they need to be trained on a set of genes in order to ..... FP has partial matches to the kdpA gene in C. jejuni.

  8. Current perspectives in Set7 mediated stem cell differentiation

    Directory of Open Access Journals (Sweden)

    Nazanin Karimnia

    2016-12-01

    Set7 is a key regulatory enzyme involved in the methylation of lysine residues of histone and non-histone proteins. This lysine methyltransferase is induced during stem cell differentiation and regulates lineage specific gene transcription and cell fate. In this article we discuss recent experimental evidence identifying regulatory targets under the control of Set7 as well as emerging evidence of regulation in stem cell differentiation. Furthermore, we discuss the function of non-coding RNAs regulated by Set7 implicated in cell plasticity.

  9. Selection of Phototransduction Genes in Homo sapiens.

    Science.gov (United States)

    Christopher, Mark; Scheetz, Todd E; Mullins, Robert F; Abràmoff, Michael D

    2013-08-13

    We investigated the evidence of recent positive selection in the human phototransduction system at single nucleotide polymorphism (SNP) and gene level. SNP genotyping data from the International HapMap Project for European, Eastern Asian, and African populations was used to discover differences in haplotype length and allele frequency between these populations. Numeric selection metrics were computed for each SNP and aggregated into gene-level metrics to measure evidence of recent positive selection. The level of recent positive selection in phototransduction genes was evaluated and compared to a set of genes shown previously to be under recent selection, and a set of highly conserved genes as positive and negative controls, respectively. Six of 20 phototransduction genes evaluated had gene-level selection metrics above the 90th percentile: RGS9, GNB1, RHO, PDE6G, GNAT1, and SLC24A1. The selection signal across these genes was found to be of similar magnitude to the positive control genes and much greater than the negative control genes. There is evidence for selective pressure in the genes involved in retinal phototransduction, and traces of this selective pressure can be demonstrated using SNP-level and gene-level metrics of allelic variation. We hypothesize that the selective pressure on these genes was related to their role in low light vision and retinal adaptation to ambient light changes. Uncovering the underlying genetics of evolutionary adaptations in phototransduction not only allows greater understanding of vision and visual diseases, but also the development of patient-specific diagnostic and intervention strategies.

  10. Simple Comparative Analyses of Differentially Expressed Gene Lists May Overestimate Gene Overlap.

    Science.gov (United States)

    Lawhorn, Chelsea M; Schomaker, Rachel; Rowell, Jonathan T; Rueppell, Olav

    2018-04-16

    Comparing the overlap between sets of differentially expressed genes (DEGs) within or between transcriptome studies is regularly used to infer similarities between biological processes. Significant overlap between two sets of DEGs is usually determined by a simple test. The number of potentially overlapping genes is compared to the number of genes that actually occur in both lists, treating every gene as equal. However, gene expression is controlled by transcription factors that bind to a variable number of transcription factor binding sites, leading to variation among genes in general variability of their expression. Neglecting this variability could therefore lead to inflated estimates of significant overlap between DEG lists. With computer simulations, we demonstrate that such biases arise from variation in the control of gene expression. Significant overlap commonly arises between two lists of DEGs that are randomly generated, assuming that the control of gene expression is variable among genes but consistent between corresponding experiments. More overlap is observed when transcription factors are specific to their binding sites and when the number of genes is considerably higher than the number of different transcription factors. In contrast, overlap between two DEG lists is always lower than expected when the genetic architecture of expression is independent between the two experiments. Thus, the current methods for determining significant overlap between DEGs are potentially confounding biologically meaningful overlap with overlap that arises due to variability in control of expression among genes, and more sophisticated approaches are needed.
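
    The abstract does not name the simple test it refers to. One common choice that treats every gene as equally likely to appear in a list is the hypergeometric (one-sided Fisher) test, sketched below in Python using only the standard library. The function name and the example numbers are illustrative assumptions, not values from the study.

```python
from math import comb

def overlap_p_value(n_genes, size_a, size_b, overlap):
    """P(X >= overlap) when list B is drawn uniformly at random from n_genes genes.

    size_a and size_b are the lengths of the two DEG lists; every gene is
    treated as equally likely to be differentially expressed, which is the
    assumption the record argues can inflate significance.
    """
    total = comb(n_genes, size_b)
    upper = min(size_a, size_b)
    tail = sum(comb(size_a, k) * comb(n_genes - size_a, size_b - k)
               for k in range(overlap, upper + 1))
    return tail / total

if __name__ == "__main__":
    # Hypothetical numbers: 20000 genes, lists of 500 and 400 DEGs sharing 25.
    print(overlap_p_value(20000, 500, 400, 25))
```

    If some genes are intrinsically more variable than others, the uniform sampling assumed here no longer describes the null distribution, which is the source of bias the simulations in the record explore.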

  11. Combined protein construct and synthetic gene engineering for heterologous protein expression and crystallization using Gene Composer

    Directory of Open Access Journals (Sweden)

    Walchli John

    2009-04-01

    Background: With the goal of improving yield and success rates of heterologous protein production for structural studies we have developed the database and algorithm software package Gene Composer. This freely available electronic tool facilitates the information-rich design of protein constructs and their engineered synthetic gene sequences, as detailed in the accompanying manuscript. Results: In this report, we compare heterologous protein expression levels from native sequences to those of codon engineered synthetic gene constructs designed by Gene Composer. A test set of proteins including a human kinase (P38α), a viral polymerase (HCV NS5B), and a bacterial structural protein (FtsZ) were expressed in both E. coli and a cell-free wheat germ translation system. We also compare the protein expression levels in E. coli for a set of 11 different proteins with greatly varied G:C content and codon bias. Conclusion: The results consistently demonstrate that protein yields from codon engineered Gene Composer designs are as good as or better than those achieved from the synonymous native genes. Moreover, structure guided N- and C-terminal deletion constructs designed with the aid of Gene Composer can lead to greater success in gene to structure work as exemplified by the X-ray crystallographic structure determination of FtsZ from Bacillus subtilis. These results validate the Gene Composer algorithms, and suggest that using a combination of synthetic gene and protein construct engineering tools can improve the economics of gene to structure research.

  12. Ranking candidate disease genes from gene expression and protein interaction: a Katz-centrality based approach.

    Directory of Open Access Journals (Sweden)

    Jing Zhao

    Many diseases have complex genetic causes, where a set of alleles can affect the propensity of getting the disease. The identification of such disease genes is important to understand the mechanistic and evolutionary aspects of pathogenesis, improve diagnosis and treatment of the disease, and aid in drug discovery. Current genetic studies typically identify chromosomal regions associated with specific diseases. But picking out an unknown disease gene from hundreds of candidates located on the same genomic interval is still challenging. In this study, we propose an approach to prioritize candidate genes by integrating data of gene expression level, protein-protein interaction strength and known disease genes. Our method is based on only two simple, biologically motivated assumptions--that a gene is a good disease-gene candidate if it is differentially expressed in cases and controls, or that it is close to other disease-gene candidates in its protein interaction network. We tested our method on 40 diseases in 58 gene expression datasets of the NCBI Gene Expression Omnibus database. On these datasets our method is able to predict unknown disease genes as well as to identify pleiotropic genes involved in the physiological cellular processes of many diseases. Our study not only provides an effective algorithm for prioritizing candidate disease genes but is also a way to discover phenotypic interdependency, cooccurrence and shared pathophysiology between different disorders.
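
    The two assumptions above map naturally onto Katz centrality, where each node has an intrinsic score and also inherits a damped share of its neighbours' scores. The sketch below is a generic Katz centrality in Python with NumPy, not the authors' exact scoring: using a per-gene expression score as the `beta` vector and interaction strengths as `adjacency` weights are assumptions made here for illustration.

```python
import numpy as np

def katz_centrality(adjacency, alpha=None, beta=None):
    """Solve x = alpha * A^T x + beta for a weighted adjacency matrix A.

    beta holds each node's intrinsic score (hypothetically, a measure of
    differential expression); alpha damps the contribution of neighbours.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    n = adjacency.shape[0]
    if beta is None:
        beta = np.ones(n)
    if alpha is None:
        # Stay below 1 / spectral radius so the underlying series converges.
        radius = np.max(np.abs(np.linalg.eigvals(adjacency)))
        alpha = 0.5 / radius if radius > 0 else 0.5
    return np.linalg.solve(np.eye(n) - alpha * adjacency.T, np.asarray(beta, dtype=float))

if __name__ == "__main__":
    # Toy interaction network: gene 0 interacts with genes 1, 2 and 3.
    network = np.array([[0, 1, 1, 1],
                        [1, 0, 0, 0],
                        [1, 0, 0, 0],
                        [1, 0, 0, 0]])
    expression_score = np.array([0.1, 1.0, 1.0, 1.0])   # made-up values
    print(katz_centrality(network, beta=expression_score))
```

    In the toy network, gene 0 has a low intrinsic score but is adjacent to three high-scoring genes, so it still receives a substantial centrality, which mirrors the second assumption in the abstract.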

  13. Setting up virtual private network

    International Nuclear Information System (INIS)

    Huang Hongmei; Zhang Chengjun

    2003-01-01

    Setting up a virtual private network for a business enterprise provides a low cost network foundation, increases the enterprise's network function and enlarges its private scope. The text introduces the principles, privileges and protocols used in virtual private networks. Finally, this paper introduces several LAN-based technologies for setting up a virtual private network.

  14. Setting up virtual private network

    International Nuclear Information System (INIS)

    Huang Hongmei; Zhang Chengjun

    2003-01-01

    Setting up a virtual private network for a business enterprise provides a low cost network foundation, increases enterprise network function and enlarges its private scope. This text introduces the principles, privileges and protocols applied in virtual private networks. Finally, this paper introduces several LAN-based technologies for setting up a virtual private network.

  15. Bankruptcy Prediction with Rough Sets

    NARCIS (Netherlands)

    J.C. Bioch (Cor); V. Popova (Viara)

    2001-01-01

    The bankruptcy prediction problem can be considered an ordinal classification problem. The classical theory of Rough Sets describes objects by discrete attributes, and does not take into account the ordering of the attribute values. This paper proposes a modification of the Rough Set

  16. Ranking Specific Sets of Objects.

    Science.gov (United States)

    Maly, Jan; Woltran, Stefan

    2017-01-01

    Ranking sets of objects based on an order between the single elements has been thoroughly studied in the literature. In particular, it has been shown that it is in general impossible to find a total ranking - jointly satisfying properties such as dominance and independence - on the whole power set of objects. However, in many applications certain elements from the entire power set might not be required and can be neglected in the ranking process. For instance, certain sets might be ruled out due to hard constraints or because they do not satisfy some background theory. In this paper, we treat the computational problem of whether an order on a given subset of the power set of elements satisfying different variants of dominance and independence can be found, given a ranking on the elements. We show that this problem is tractable for partial rankings and NP-complete for total rankings.

  17. Genes with minimal phylogenetic information are problematic for coalescent analyses when gene tree estimation is biased.

    Science.gov (United States)

    Xi, Zhenxiang; Liu, Liang; Davis, Charles C

    2015-11-01

    The development and application of coalescent methods are undergoing rapid changes. One little explored area that bears on the application of gene-tree-based coalescent methods to species tree estimation is gene informativeness. Here, we investigate the accuracy of these coalescent methods when genes have minimal phylogenetic information, including the implementation of the multilocus bootstrap approach. Using simulated DNA sequences, we demonstrate that genes with minimal phylogenetic information can produce unreliable gene trees (i.e., high error in gene tree estimation), which may in turn reduce the accuracy of species tree estimation using gene-tree-based coalescent methods. We demonstrate that this problem can be alleviated by sampling more genes, as is commonly done in large-scale phylogenomic analyses. This applies even when these genes are minimally informative. If gene tree estimation is biased, however, gene-tree-based coalescent analyses will produce inconsistent results, which cannot be remedied by increasing the number of genes. In this case, it is not the gene-tree-based coalescent methods that are flawed, but rather the input data (i.e., estimated gene trees). Along these lines, the commonly used program PhyML has a tendency to infer one particular bifurcating topology even though it is best represented as a polytomy. We additionally corroborate these findings by analyzing the 183-locus mammal data set assembled by McCormack et al. (2012) using ultra-conserved elements (UCEs) and flanking DNA. Lastly, we demonstrate that when employing the multilocus bootstrap approach on this 183-locus data set, there is no strong conflict between species trees estimated from concatenation and gene-tree-based coalescent analyses, as has been previously suggested by Gatesy and Springer (2014). Copyright © 2015 Elsevier Inc. All rights reserved.

  18. Multiclass gene selection using Pareto-fronts.

    Science.gov (United States)

    Rajapakse, Jagath C; Mundra, Piyushkumar A

    2013-01-01

    Filter methods are often used for selection of genes in multiclass sample classification by using microarray data. Such techniques usually tend to bias toward a few classes that are easily distinguishable from other classes due to imbalances of strong features and sample sizes of different classes. It could therefore lead to selection of redundant genes while missing the relevant genes, leading to poor classification of tissue samples. In this manuscript, we propose to decompose multiclass ranking statistics into class-specific statistics and then use Pareto-front analysis for selection of genes. This alleviates the bias induced by class intrinsic characteristics of dominating classes. The use of Pareto-front analysis is demonstrated on two filter criteria commonly used for gene selection: F-score and KW-score. A significant improvement in classification performance and reduction in redundancy among top-ranked genes were achieved in experiments with both synthetic and real-benchmark data sets.
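
    The selection step described in this abstract can be illustrated with a minimal sketch. It assumes each gene already has one score per class (higher meaning more discriminative for that class) and keeps the genes that no other gene dominates. The function names and the toy scores are hypothetical and are not taken from the paper, which may rank successive fronts differently.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if score vector a is at least as good as b in every class
    and strictly better in at least one class."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the genes whose class-specific scores are not dominated
    by any other gene (the first Pareto front), sorted by name."""
    front = []
    for gene, vec in scores.items():
        if not any(dominates(other, vec)
                   for name, other in scores.items() if name != gene):
            front.append(gene)
    return sorted(front)


# Hypothetical class-specific statistics for a three-class problem.
toy_scores = {
    "geneA": (9.0, 1.0, 1.0),  # strong for class 1 only
    "geneB": (2.0, 8.0, 2.0),  # strong for class 2 only
    "geneC": (1.0, 1.0, 0.5),  # weaker than geneA in every class
    "geneD": (3.0, 3.0, 7.0),  # strong for class 3
}
selected = pareto_front(toy_scores)
```

    In this toy example a single aggregate score would favour whichever class has the largest statistics, whereas the front retains one strong gene per class.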

  19. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then applied this strategy to a melanoma-associated region on chromosome 1 and identified MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.
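
    The idea of rates that "covary over time" can be sketched as a correlation between two genes' branch-specific evolutionary rates. The example below is a simplified illustration only: it assumes the per-branch relative rates have already been estimated, and it uses a plain Pearson correlation, which may differ from the normalization used in the published ERC calculation. All names and numbers are hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long rate vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("a constant vector has no defined correlation")
    return cov / math.sqrt(vx * vy)


# Hypothetical relative rates on the same five branches of a species tree.
gene1 = [1.2, 0.8, 1.5, 0.6, 1.0]
gene2 = [1.1, 0.9, 1.4, 0.7, 1.0]  # speeds up and slows down with gene1
gene3 = [0.7, 1.3, 0.6, 1.4, 1.0]  # moves in the opposite direction

covarying = pearson(gene1, gene2)
opposing = pearson(gene1, gene3)
```

    A high positive value for a gene pair is the kind of signal the abstract describes as evidence of a shared function.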

  20. First-Class Object Sets

    DEFF Research Database (Denmark)

    Ernst, Erik

    Typically, objects are monolithic entities with a fixed interface. To increase the flexibility in this area, this paper presents first-class object sets as a language construct. An object set offers an interface which is a disjoint union of the interfaces of its member objects. It may also be used...... for a special kind of method invocation involving multiple objects in a dynamic lookup process. With support for feature access and late-bound method calls object sets are similar to ordinary objects, only more flexible. The approach is made precise by means of a small calculus, and the soundness of its type...

  1. Predictions of Gene Family Distributions in Microbial Genomes: Evolution by Gene Duplication and Modification

    International Nuclear Information System (INIS)

    Yanai, Itai; Camacho, Carlos J.; DeLisi, Charles

    2000-01-01

    A universal property of microbial genomes is the considerable fraction of genes that are homologous to other genes within the same genome. The process by which these homologues are generated is not well understood, but sequence analysis of 20 microbial genomes unveils a recurrent distribution of gene family sizes. We show that a simple evolutionary model based on random gene duplication and point mutations fully accounts for these distributions and permits predictions for the number of gene families in genomes not yet complete. Our findings are consistent with the notion that a genome evolves from a set of precursor genes to a mature size by gene duplications and increasing modifications. (c) 2000 The American Physical Society

  2. Predictions of Gene Family Distributions in Microbial Genomes: Evolution by Gene Duplication and Modification

    Energy Technology Data Exchange (ETDEWEB)

    Yanai, Itai; Camacho, Carlos J.; DeLisi, Charles

    2000-09-18

    A universal property of microbial genomes is the considerable fraction of genes that are homologous to other genes within the same genome. The process by which these homologues are generated is not well understood, but sequence analysis of 20 microbial genomes unveils a recurrent distribution of gene family sizes. We show that a simple evolutionary model based on random gene duplication and point mutations fully accounts for these distributions and permits predictions for the number of gene families in genomes not yet complete. Our findings are consistent with the notion that a genome evolves from a set of precursor genes to a mature size by gene duplications and increasing modifications. (c) 2000 The American Physical Society.

  3. Allele frequencies in the VRN-A1, VRN-B1 and VRN-D1 vernalization response and PPD-B1 and PPD-D1 photoperiod sensitivity genes, and their effects on heading in a diverse set of wheat cultivars (Triticum aestivum L.).

    Science.gov (United States)

    Kiss, Tibor; Balla, Krisztina; Veisz, Ottó; Láng, László; Bedő, Zoltán; Griffiths, Simon; Isaac, Peter; Karsai, Ildikó

    2014-01-01

    Heading of cereals is determined by complex genetic and environmental factors in which genes responsible for vernalization and photoperiod sensitivity play a decisive role. Our aim was to use diagnostic molecular markers to determine the main allele types in VRN-A1, VRN-B1, VRN-D1, PPD-B1 and PPD-D1 in a worldwide wheat collection of 683 genotypes and to investigate the effect of these alleles on heading in the field. The dominant VRN-A1, VRN-B1 and VRN-D1 alleles were present at a low frequency. The PPD-D1a photoperiod-insensitive allele was carried by 57% of the cultivars and was most frequent in Asian and European cultivars. The PPD-B1 photoperiod-insensitive allele was carried by 22% of the genotypes from Asia, America and Europe. Nine versions of the PPD-B1-insensitive allele were identified based on gene copy number and intercopy structure. The allele compositions in PPD-D1, PPD-B1 and VRN-D1 significantly influenced heading and together explained 37.5% of the phenotypic variance. The role of gene model increased to 39.1% when PPD-B1 intercopy structure was taken into account instead of overall PPD-B1 type (sensitive vs. insensitive). As a single component, PPD-D1 had the most important role (28.0% of the phenotypic variance), followed by PPD-B1 (12.3% for PPD-B1_overall, and 15.1% for PPD-B1_intercopy) and VRN-D1 (2.2%). Significant gene interactions were identified between the marker alleles within PPD-B1 and between VRN-D1 and the two PPD1 genes. The earliest heading genotypes were those with the photoperiod-insensitive allele in PPD-D1 and PPD-B1, and with the spring allele for VRN-D1 and the winter alleles for VRN-A1 and VRN-B1. This combination could only be detected in genotypes from Southern Europe and Asia. Late-heading genotypes had the sensitivity alleles for both PPD1 genes, regardless of the allelic composition of the VRN1 genes. There was a 10-day difference in

  4. Guidelines for setting speed limits

    CSIR Research Space (South Africa)

    Wium, DJW

    1986-02-01

    Full Text Available , parking and loading manoeuvres, access to bounding properties, intersections, width of road without central median and clear roadside area. The method should result in greater uniformity in speed limits for similar circumstances as set by different...

  5. Test Program Set (TPS) Lab

    Data.gov (United States)

    Federal Laboratory Consortium — The ARDEC TPS Laboratory provides an organic Test Program Set (TPS) development, maintenance, and life cycle management capability for DoD LCMC materiel developers....

  6. Price setting in turbulent times

    DEFF Research Database (Denmark)

    Ólafsson, Tjörvi; Pétursdóttir, Ásgerdur; Vignisdóttir, Karen Á.

    This price setting survey among Icelandic firms aims to make two contributions to the literature. First, it studies price setting in an advanced economy within a more turbulent macroeconomic environment than has previously been done. The results indicate that price adjustments are to a larger...... extent driven by exchange rate fluctuations than in most other advanced countries. The median Icelandic firm reviews its prices every four months and changes them every six months. The main sources of price rigidity and the most commonly used price setting methods are the same as in most other countries....... A second contribution to the literature is our analysis of the nexus between price setting and exchange rate movements, a topic that has attracted surprisingly limited attention in this survey-based literature. A novel aspect of our approach is to base our analysis on a categorisation of firms...

  7. On Intuitionistic Fuzzy Sets Theory

    CERN Document Server

    Atanassov, Krassimir T

    2012-01-01

    This book aims to be a comprehensive and accurate survey of state-of-the-art research on intuitionistic fuzzy sets theory and could be considered a continuation and extension of the author's previous book on Intuitionistic Fuzzy Sets, published by Springer in 1999 (Atanassov, Krassimir T., Intuitionistic Fuzzy Sets, Studies in Fuzziness and Soft Computing, ISBN 978-3-7908-1228-2, 1999). Since the aforementioned book appeared, the research activity of the author within the area of intuitionistic fuzzy sets has expanded in many directions. The results of the author's most recent work covering the past 12 years, as well as the newest general ideas and open problems in this field, have therefore been collected in this new book.

  8. Agenda-setting the unknown

    DEFF Research Database (Denmark)

    Dannevig, Halvor

    -setting theory, it is concluded that agenda-setting of climate change adaptation requires human agency in providing local legitimacy and salience for the issue. The thesis also finds that boundary arrangements are needed to bridge the gap between local knowledge and scientific knowledge for adaptation governance....... Attempts at such boundary arrangements are already in place at the regional governance levels, but they must be strengthened if municipalities are to take further steps in implementing adaptation measures....

  9. Introduction to Fuzzy Set Theory

    Science.gov (United States)

    Kosko, Bart

    1990-01-01

    An introduction to fuzzy set theory is described. Topics covered include: neural networks and fuzzy systems; the dynamical systems approach to machine intelligence; intelligent behavior as adaptive model-free estimation; fuzziness versus probability; fuzzy sets; the entropy-subsethood theorem; adaptive fuzzy systems for backing up a truck-and-trailer; product-space clustering with differential competitive learning; and adaptive fuzzy system for target tracking.

  10. Julia Sets of Orthogonal Polynomials

    DEFF Research Database (Denmark)

    Christiansen, Jacob Stordal; Henriksen, Christian; Petersen, Henrik Laurberg

    2018-01-01

    For a probability measure with compact and non-polar support in the complex plane we relate dynamical properties of the associated sequence of orthogonal polynomials {P_n} to properties of the support. More precisely we relate the Julia set of P_n to the outer boundary of the support, the filled Julia...... set to the polynomial convex hull K of the support, and the Green's function associated with P_n to the Green's function for the complement of K....

  11. Sets with Prescribed Arithmetic Densities

    Czech Academy of Sciences Publication Activity Database

    Luca, F.; Pomerance, C.; Porubský, Štefan

    2008-01-01

    Roč. 3, č. 2 (2008), s. 67-80 ISSN 1336-913X R&D Projects: GA ČR GA201/07/0191 Institutional research plan: CEZ:AV0Z10300504 Keywords : generalized arithmetic density * generalized asymptotic density * generalized logarithmic density * arithmetical semigroup * weighted arithmetic mean * ratio set * R-dense set * Axiom A * delta-regularly varying function Subject RIV: BA - General Mathematics

  12. Systemic consultation and goal setting

    OpenAIRE

    Carr, Alan

    1993-01-01

    Over two decades of empirical research conducted within a positivist framework has shown that goal setting is a particularly useful method for influencing task performance in occupational and industrial contexts. The conditions under which goal setting is maximally effective are now clearly established. These include situations where there is a high level of acceptance and commitment, where goals are specific and challenging, where the task is relatively simple rather than ...

  13. Philosophical introduction to set theory

    CERN Document Server

    Pollard, Stephen

    2015-01-01

    The primary mechanism for ideological and theoretical unification in modern mathematics, set theory forms an essential element of any comprehensive treatment of the philosophy of mathematics. This unique approach to set theory offers a technically informed discussion that covers a variety of philosophical issues. Rather than focusing on intuitionist and constructive alternatives to the Cantorian/Zermelian tradition, the author examines the two most important aspects of the current philosophy of mathematics, mathematical structuralism and mathematical applications of plural reference and plural

  14. Evaluation of suitable reference genes for gene expression studies in bovine muscular tissue

    Directory of Open Access Journals (Sweden)

    Dunner Susana

    2008-09-01

    Background: Real-time reverse transcriptase quantitative polymerase chain reaction (real-time RTqPCR) is a technique used to measure mRNA species copy number as a way to determine key genes involved in different biological processes. However, the expression level of these key genes may vary among tissues or cells not only as a consequence of differential expression but also due to different factors, including choice of reference genes to normalize the expression levels of the target genes; thus the selection of reference genes is critical for expression studies. For this purpose, ten candidate reference genes were investigated in bovine muscular tissue. Results: The value of stability of ten candidate reference genes included in three groups was estimated: the so-called 'classical housekeeping' genes (18S, GAPDH and ACTB), a second set of genes used in expression studies conducted on other tissues (B2M, RPII, UBC and HMBS) and a third set of novel genes (SF3A1, EEF1A2 and CASC3). Three different statistical algorithms were used to rank the genes by their stability measures as produced by geNorm, NormFinder and Bestkeeper. The three methods tend to agree on the most and the least stably expressed genes in muscular tissue. EEF1A2 and HMBS followed by SF3A1, ACTB, and CASC3 can be considered as stable reference genes, and B2M, RPII, UBC and GAPDH would not be appropriate. Although the rRNA-18S stability measure seems to be within the range of acceptance, its use is not recommended because its synthesis regulation is not representative of mRNA levels. Conclusion: Based on the geNorm algorithm, we propose the use of three genes, SF3A1, EEF1A2 and HMBS, as references for normalization of real-time RTqPCR in muscle expression studies.
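
    A stability ranking of the kind mentioned above can be sketched as follows. The example computes a geNorm-style stability measure M: for each gene, the average over all other candidates of the standard deviation of the log2 expression ratio across samples, where a lower M means more stable expression. This is a simplified illustration, not the geNorm software itself; the use of the sample standard deviation, the gene labels and the expression values are all assumptions made for the example.

```python
import math
import statistics
from typing import Dict, List


def stability_m(expr: Dict[str, List[float]]) -> Dict[str, float]:
    """geNorm-style M value per gene from relative expression quantities.

    expr maps a gene name to its relative quantity in each sample
    (same sample order for every gene, all values > 0).
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_variation = []
        for k in genes:
            if k == j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            # Standard deviation of the log ratio across samples.
            pairwise_variation.append(statistics.stdev(log_ratios))
        m_values[j] = sum(pairwise_variation) / len(pairwise_variation)
    return m_values


# Hypothetical relative quantities in four samples.
toy_expr = {
    "ref1": [1.0, 2.0, 4.0, 8.0],
    "ref2": [2.0, 4.0, 8.0, 16.0],     # constant ratio to ref1
    "unstable": [1.0, 8.0, 2.0, 16.0],  # ratio to the others fluctuates
}
m = stability_m(toy_expr)
least_stable = max(m, key=m.get)
```

    Candidates would then be ranked by ascending M, with the highest-M gene being the first to exclude.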

  15. An introduction to random sets

    CERN Document Server

    Nguyen, Hung T

    2006-01-01

    The study of random sets is a large and rapidly growing area with connections to many areas of mathematics and applications in widely varying disciplines, from economics and decision theory to biostatistics and image analysis. The drawback to such diversity is that the research reports are scattered throughout the literature, with the result that in science and engineering, and even in the statistics community, the topic is not well known and much of the enormous potential of random sets remains untapped. An Introduction to Random Sets provides a friendly but solid initiation into the theory of random sets. It builds the foundation for studying random set data, which, viewed as imprecise or incomplete observations, are ubiquitous in today's technological society. The author, widely known for his best-selling A First Course in Fuzzy Logic text as well as his pioneering work in random sets, explores motivations, such as coarse data analysis and uncertainty analysis in intelligent systems, for studying random s...

  16. Setting MEPS for electronic products

    International Nuclear Information System (INIS)

    Siderius, Hans-Paul

    2014-01-01

    When analysing price, performance and efficiency data for 15 consumer electronic and information and communication technology products, we found that in general price did not relate to the efficiency of the product. Prices of electronic products with comparable performance decreased over time. For products where the data allowed fitting the relationship, we found an exponential decrease in price with an average time constant of −0.30 [1/year], meaning that every year the product became 26% cheaper on average. The results imply that the classical approach of setting minimum efficiency performance standards (MEPS) by means of life cycle cost calculations cannot be applied to electronic products. Therefore, an alternative approach based on the improvement of efficiency over time and the variation in efficiency of products on the market, is presented. The concept of a policy action window can provide guidance for the decision on whether setting MEPS for a certain product is appropriate. If the (formal) procedure for setting MEPS takes longer than the policy action window, this means that the efficiency improvement will also be achieved without setting MEPS. We found short, i.e. less than three years, policy action windows for graphic cards, network attached storage products, network switches and televisions. - Highlights: • For electronic consumer products price does not relate to efficiency. • Average price decrease of selected electronic products is 26 % per year. • We give an alternative approach to life cycle cost calculations for setting MEPS. • The policy action window indicates whether setting MEPS is appropriate
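
    The link between the fitted exponent of −0.30 per year and the quoted 26% annual price decrease can be checked with a short calculation. The sketch assumes the price follows p(t) = p0 · exp(rate · t), which is the natural reading of "exponential decrease" in the abstract; the function name is hypothetical.

```python
import math


def annual_price_drop(rate_per_year: float) -> float:
    """Fractional price decrease after one year for p(t) = p0 * exp(rate * t)."""
    return 1.0 - math.exp(rate_per_year)


def relative_price(rate_per_year: float, years: float) -> float:
    """Price after the given number of years, as a fraction of the initial price."""
    return math.exp(rate_per_year * years)


drop = annual_price_drop(-0.30)            # about 0.26, i.e. 26% cheaper per year
after_three_years = relative_price(-0.30, 3.0)
```

    Under the same assumption, a product would cost roughly 40% of its initial price after three years, which is the length of the short policy action windows mentioned above.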

  17. Dimerization of a Viral SET Protein Endows its Function

    Energy Technology Data Exchange (ETDEWEB)

    H Wei; M Zhou

    2011-12-31

    Histone modifications are regarded as the most indispensable phenomena in epigenetics. Of these modifications, lysine methylation is of the greatest complexity and importance as site- and state-specific lysine methylation exerts a plethora of effects on chromatin structure and gene transcription. Notably, Paramecium bursaria chlorella viruses encode a conserved SET domain methyltransferase, termed vSET, that functions to suppress host transcription by methylating histone H3 at lysine 27 (H3K27), a mark for eukaryotic gene silencing. Unlike mammalian lysine methyltransferases (KMTs), vSET functions only as a dimer, but the underlying mechanism has remained elusive. In this study, we demonstrate that dimeric vSET operates with negative cooperativity between the two active sites and engages in H3K27 methylation one site at a time. New atomic structures of vSET in the free form and a ternary complex with S-adenosyl homocysteine and a histone H3 peptide and biochemical analyses reveal the molecular origin for the negative cooperativity and explain the substrate specificity of H3K27 methyltransferases. Our study suggests a 'walking' mechanism, by which vSET acts all by itself to globally methylate host H3K27, which is accomplished by the mammalian EZH2 KMT only in the context of the Polycomb repressive complex.

  18. Set-Pi: Set Membership pi-Calculus

    DEFF Research Database (Denmark)

    Bruni, Alessandro; Mödersheim, Sebastian Alexander; Nielson, Flemming

    2015-01-01

    Communication protocols often rely on stateful mechanisms to ensure certain security properties. For example, counters and timestamps can be used to ensure authentication, or the security of communication can depend on whether a particular key is registered to a server or it has been revoked. Pro......Verif, like other state of the art tools for protocol analysis, achieves good performance by converting a formal protocol specification into a set of Horn clauses, that represent a monotonically growing set of facts that a Dolev-Yao attacker can derive from the system. Since this set of facts is not state...... method with three examples, a simple authentication protocol based on counters, a key registration protocol, and a model of the Yubikey security device....

  19. Imaging gene expression in gene therapy

    International Nuclear Information System (INIS)

    Wiebe, Leonard I.

    1997-01-01

    Full text. Gene therapy can be used to introduce new genes, or to supplement the function of indigenous genes. At the present time, however, there is no non-invasive test to demonstrate efficacy of the gene transfer and expression processes. It has been postulated that scintigraphic imaging can offer unique information on both the site at which the transferred gene is expressed, and the degree of expression, both of which are critical issues for safety and clinical efficacy. Many current studies are based on 'suicide gene therapy' of cancer. Cells modified to express these genes commit metabolic suicide in the presence of an enzyme encoded by the transferred gene and a specifically-convertible prodrug. Prodrug metabolism can lead to selective metabolic trapping, required for scintigraphy. Herpes simplex virus type-1 thymidine kinase (HSV-1 tk+) has been used for 'suicide' in vivo tumor gene therapy. It has been proposed that radiolabelled nucleosides can be used as radiopharmaceuticals to detect HSV-1 tk+ gene expression where the HSV-1 tk+ gene serves a reporter or therapeutic function. Animal gene therapy models have been studied using purine- ([18F]FHPG; [18F]-ACV) and pyrimidine- ([123/131I]IVRFU; [124/131I]) antiviral nucleosides. Principles of gene therapy and gene therapy imaging will be reviewed and experimental data for [123/131I]IVRFU imaging with the HSV-1 tk+ reporter gene will be presented.

  20. Imaging gene expression in gene therapy

    Energy Technology Data Exchange (ETDEWEB)

    Wiebe, Leonard I. [Alberta Univ., Edmonton (Canada). Noujaim Institute for Pharmaceutical Oncology Research

    1997-12-31

    Full text. Gene therapy can be used to introduce new genes, or to supplement the function of indigenous genes. At the present time, however, there is no non-invasive test to demonstrate efficacy of the gene transfer and expression processes. It has been postulated that scintigraphic imaging can offer unique information on both the site at which the transferred gene is expressed, and the degree of expression, both of which are critical issues for safety and clinical efficacy. Many current studies are based on 'suicide gene therapy' of cancer. Cells modified to express these genes commit metabolic suicide in the presence of an enzyme encoded by the transferred gene and a specifically-convertible prodrug. Prodrug metabolism can lead to selective metabolic trapping, required for scintigraphy. Herpes simplex virus type-1 thymidine kinase (HSV-1 tk+) has been used for 'suicide' in vivo tumor gene therapy. It has been proposed that radiolabelled nucleosides can be used as radiopharmaceuticals to detect HSV-1 tk+ gene expression where the HSV-1 tk+ gene serves a reporter or therapeutic function. Animal gene therapy models have been studied using purine- ([18F]FHPG; [18F]-ACV) and pyrimidine- ([123/131I]IVRFU; [124/131I]) antiviral nucleosides. Principles of gene therapy and gene therapy imaging will be reviewed and experimental data for [123/131I]IVRFU imaging with the HSV-1 tk+ reporter gene will be presented.

  1. Gene Ontology Consortium: going forward.

    Science.gov (United States)

    2015-01-01

    The Gene Ontology (GO; http://www.geneontology.org) is a community-based bioinformatics resource that supplies information about gene product function using ontologies to represent biological knowledge. Here we describe improvements and expansions to several branches of the ontology, as well as updates that have allowed us to more efficiently disseminate the GO and capture feedback from the research community. The Gene Ontology Consortium (GOC) has expanded areas of the ontology such as cilia-related terms, cell-cycle terms and multicellular organism processes. We have also implemented new tools for generating ontology terms based on a set of logical rules making use of templates, and we have made efforts to increase our use of logical definitions. The GOC has a new and improved web site summarizing new developments and documentation, serving as a portal to GO data. Users can perform GO enrichment analysis, and search the GO for terms, annotations to gene products, and associated metadata across multiple species using the all-new AmiGO 2 browser. We encourage and welcome the input of the research community in all biological areas in our continued effort to improve the Gene Ontology. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. A support vector machine based test for incongruence between sets of trees in tree space

    Science.gov (United States)

    2012-01-01

    Background: The increased use of multi-locus data sets for phylogenetic reconstruction has increased the need to determine whether a set of gene trees significantly deviates from the phylogenetic patterns of other genes. Such unusual gene trees may have been influenced by other evolutionary processes such as selection, gene duplication, or horizontal gene transfer. Results: Motivated by this problem we propose a nonparametric goodness-of-fit test for two empirical distributions of gene trees, and we developed the software GeneOut to estimate a p-value for the test. Our approach maps trees into a multi-dimensional vector space and then applies support vector machines (SVMs) to measure the separation between two sets of pre-defined trees. We use a permutation test to assess the significance of the SVM separation. To demonstrate the performance of GeneOut, we applied it to the comparison of gene trees simulated within different species trees across a range of species tree depths. Applied directly to sets of simulated gene trees with large sample sizes, GeneOut was able to detect very small differences between two sets of gene trees generated under different species trees. Our statistical test can also include tree reconstruction into its test framework through a variety of phylogenetic optimality criteria. When applied to DNA sequence data simulated from different sets of gene trees, results in the form of receiver operating characteristic (ROC) curves indicated that GeneOut performed well in the detection of differences between sets of trees with different distributions in a multi-dimensional space. Furthermore, it controlled false positive and false negative rates very well, indicating a high degree of accuracy. Conclusions: The non-parametric nature of our statistical test provides fast and efficient analyses, and makes it an applicable test for any scenario where evolutionary or other factors can lead to trees with different multi-dimensional distributions. The
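
    The permutation step described in this abstract can be sketched in a few lines. The example assumes the trees have already been mapped to numeric vectors, and it deliberately replaces the SVM-based separation measure with a much simpler stand-in, the distance between the two group centroids, so it illustrates only the permutation logic and not GeneOut itself. All names and data are hypothetical.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def centroid_distance(a: List[Point], b: List[Point]) -> float:
    """Euclidean distance between the centroids of two point sets
    (a stand-in for the SVM separation used in the paper)."""
    dim = len(a[0])
    ca = [sum(p[i] for p in a) / len(a) for i in range(dim)]
    cb = [sum(p[i] for p in b) / len(b) for i in range(dim)]
    return sum((x - y) ** 2 for x, y in zip(ca, cb)) ** 0.5


def permutation_p_value(a: List[Point], b: List[Point],
                        n_perm: int = 999, seed: int = 0) -> float:
    """Share of random relabelings whose separation is at least as
    large as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = centroid_distance(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if centroid_distance(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical 2-D embeddings of two sets of gene trees.
set_a = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.1, 0.1), (0.0, 0.2), (0.2, 0.0)]
set_b = [(x + 5.0, y + 5.0) for x, y in set_a]
p_value = permutation_p_value(set_a, set_b)
```

    Because the two toy sets are far apart, almost no relabeling reproduces the observed separation and the estimated p-value is small.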

  3. Annotation-based feature extraction from sets of SBML models.

    Science.gov (United States)

    Alm, Rebekka; Waltemath, Dagmar; Wolfien, Markus; Wolkenhauer, Olaf; Henkel, Ron

    2015-01-01

    Model repositories such as BioModels Database provide computational models of biological systems for the scientific community. These models contain rich semantic annotations that link model entities to concepts in well-established bio-ontologies such as Gene Ontology. Consequently, thematically similar models are likely to share similar annotations. Based on this assumption, we argue that semantic annotations are a suitable tool to characterize sets of models. These characteristics improve model classification, allow the identification of additional features for model retrieval tasks, and enable the comparison of sets of models. In this paper we discuss four methods for annotation-based feature extraction from model sets. We tested all methods on sets of models in SBML format which were composed from BioModels Database. To characterize each of these sets, we analyzed and extracted concepts from three frequently used ontologies, namely Gene Ontology, ChEBI and SBO. We find that three of the four methods are suitable to determine characteristic features for arbitrary sets of models: the selected features vary depending on the underlying model set, and they are also specific to the chosen model set. We show that the identified features map on concepts that are higher up in the hierarchy of the ontologies than the concepts used for model annotations. Our analysis also reveals that the information content of concepts in ontologies and their usage for model annotation do not correlate. Annotation-based feature extraction enables the comparison of model sets, as opposed to existing methods for model-to-keyword comparison, or model-to-model comparison.

  4. GeneBins: a database for classifying gene expression data, with application to plant genome arrays

    Directory of Open Access Journals (Sweden)

    Weiller Georg

    2007-03-01

    Background: To interpret microarray experiments, several ontological analysis tools have been developed. However, current tools are limited to specific organisms. Results: We developed a bioinformatics system to assign the probe set sequences of any organism to a hierarchical functional classification modelled on KEGG ontology. The GeneBins database currently supports the functional classification of expression data from four Affymetrix arrays: Arabidopsis thaliana, Oryza sativa, Glycine max and Medicago truncatula. An online analysis tool to identify relevant functions is also provided. Conclusion: GeneBins provides resources to interpret gene expression results from microarray experiments. It is available at http://bioinfoserver.rsbs.anu.edu.au/utils/GeneBins/

  5. Gene doping: gene delivery for olympic victory.

    Science.gov (United States)

    Gould, David

    2013-08-01

    With one recently recommended gene therapy in Europe and a number of other gene therapy treatments now proving effective in clinical trials, it is feasible that the same technologies will soon be adopted in the world of sport by unscrupulous athletes and their trainers in so-called 'gene doping'. In this article an overview of the successful gene therapy clinical trials is provided and the potential targets for gene doping are highlighted. Whether a doping gene product is secreted from the engineered cells or is retained locally to, or inside, the engineered cells will, to some extent, determine the likelihood of detection. It is clear that effective gene delivery technologies now exist and it is important that detection and prevention plans are in place. © 2012 The Author. British Journal of Clinical Pharmacology © 2012 The British Pharmacological Society.

  6. Maximal Abelian sets of roots

    CERN Document Server

    Lawther, R

    2018-01-01

    In this work the author lets $\Phi$ be an irreducible root system, with Coxeter group $W$. He considers subsets of $\Phi$ which are abelian, meaning that no two roots in the set have sum in $\Phi \cup \{0\}$. He classifies all maximal abelian sets (i.e., abelian sets properly contained in no other) up to the action of $W$: for each $W$-orbit of maximal abelian sets we provide an explicit representative $X$, identify the (setwise) stabilizer $W_X$ of $X$ in $W$, and decompose $X$ into $W_X$-orbits. Abelian sets of roots are closely related to abelian unipotent subgroups of simple algebraic groups, and thus to abelian $p$-subgroups of finite groups of Lie type over fields of characteristic $p$. Parts of the work presented here have been used to confirm the $p$-rank of $E_8(p^n)$, and (somewhat unexpectedly) to obtain for the first time the 2-ranks of the Monster and Baby Monster sporadic groups, together with the double cover of the latter. Root systems of classical type are dealt with quickly here; the vast majority of the present work con...

  7. Superior Cross-Species Reference Genes: A Blueberry Case Study

    Science.gov (United States)

    Die, Jose V.; Rowland, Lisa J.

    2013-01-01

    The advent of affordable Next Generation Sequencing technologies has had major impact on studies of many crop species, where access to genomic technologies and genome-scale data sets has been extremely limited until now. The recent development of genomic resources in blueberry will enable the application of high throughput gene expression approaches that should relatively quickly increase our understanding of blueberry physiology. These studies, however, require a highly accurate and robust workflow and make necessary the identification of reference genes with high expression stability for correct target gene normalization. To create a set of superior reference genes for blueberry expression analyses, we mined a publicly available transcriptome data set from blueberry for orthologs to a set of Arabidopsis genes that showed the most stable expression in a developmental series. In total, the expression stability of 13 putative reference genes was evaluated by qPCR and a set of new references with high stability values across a developmental series in fruits and floral buds of blueberry were identified. We also demonstrated the need to use at least two, preferably three, reference genes to avoid inconsistencies in results, even when superior reference genes are used. The new references identified here provide a valuable resource for accurate normalization of gene expression in Vaccinium spp. and may be useful for other members of the Ericaceae family as well. PMID:24058469

  8. Superior cross-species reference genes: a blueberry case study.

    Directory of Open Access Journals (Sweden)

    Jose V Die

    The advent of affordable Next Generation Sequencing technologies has had major impact on studies of many crop species, where access to genomic technologies and genome-scale data sets has been extremely limited until now. The recent development of genomic resources in blueberry will enable the application of high throughput gene expression approaches that should relatively quickly increase our understanding of blueberry physiology. These studies, however, require a highly accurate and robust workflow and make necessary the identification of reference genes with high expression stability for correct target gene normalization. To create a set of superior reference genes for blueberry expression analyses, we mined a publicly available transcriptome data set from blueberry for orthologs to a set of Arabidopsis genes that showed the most stable expression in a developmental series. In total, the expression stability of 13 putative reference genes was evaluated by qPCR and a set of new references with high stability values across a developmental series in fruits and floral buds of blueberry were identified. We also demonstrated the need to use at least two, preferably three, reference genes to avoid inconsistencies in results, even when superior reference genes are used. The new references identified here provide a valuable resource for accurate normalization of gene expression in Vaccinium spp. and may be useful for other members of the Ericaceae family as well.

  9. Mining gene expression data of multiple sclerosis.

    Directory of Open Access Journals (Sweden)

    Pi Guo

    Microarrays produce a large amount of gene expression data, containing various biological implications. The challenge is to detect a panel of discriminative genes associated with disease. This study proposed a robust classification model for gene selection using gene expression data, and performed an analysis to identify disease-related genes using multiple sclerosis as an example. Gene expression profiles based on the transcriptome of peripheral blood mononuclear cells from a total of 44 samples from 26 multiple sclerosis patients and 18 individuals with other neurological diseases (control) were analyzed. Feature selection algorithms including Support Vector Machine based on Recursive Feature Elimination, Receiver Operating Characteristic Curve, and Boruta algorithms were jointly performed to select candidate genes associated with multiple sclerosis. Multiple classification models categorized samples into two different groups based on the identified genes. Models' performance was evaluated using cross-validation methods, and an optimal classifier for gene selection was determined. An overlapping feature set was identified consisting of 8 genes that were differentially expressed between the two phenotype groups. The genes were significantly associated with the pathways of apoptosis and cytokine-cytokine receptor interaction. TNFSF10 was significantly associated with multiple sclerosis. A Support Vector Machine model was established based on the featured genes and gave a practical accuracy of ∼86%. This binary classification model also outperformed the other models in terms of Sensitivity, Specificity and F1 score. The combined analytical framework integrating feature ranking algorithms and Support Vector Machine model could be used for selecting genes for other diseases.

  10. GOBO: gene expression-based outcome for breast cancer online.

    Directory of Open Access Journals (Sweden)

    Markus Ringnér

    Microarray-based gene expression analysis holds promise of improving prognostication and treatment decisions for breast cancer patients. However, the heterogeneity of breast cancer emphasizes the need for validation of prognostic gene signatures in larger sample sets stratified into relevant subgroups. Here, we describe a multifunctional user-friendly online tool, GOBO (http://co.bmc.lu.se/gobo), allowing a range of different analyses to be performed in an 1881-sample breast tumor data set, and a 51-sample breast cancer cell line set, both generated on Affymetrix U133A microarrays. GOBO supports a wide range of applications including: (1) rapid assessment of gene expression levels in subgroups of breast tumors and cell lines, (2) identification of co-expressed genes for creation of potential metagenes, (3) association with outcome for gene expression levels of single genes, sets of genes, or gene signatures in multiple subgroups of the 1881-sample breast cancer data set. The design and implementation of GOBO facilitate easy incorporation of additional query functions and applications, as well as additional data sets irrespective of tumor type and array platform.

  11. Maximizing biomarker discovery by minimizing gene signatures

    Directory of Open Access Journals (Sweden)

    Chang Chang

    2011-12-01

    Background: The use of gene signatures can potentially be of considerable value in the field of clinical diagnosis. However, gene signatures defined with different methods can vary considerably even when applied to the same disease and the same endpoint. Previous studies have shown that the correct selection of subsets of genes from microarray data is key for the accurate classification of disease phenotypes, and a number of methods have been proposed for the purpose. However, these methods refine the subsets by only considering each single feature, and they do not confirm the association between the genes identified in each gene signature and the phenotype of the disease. We proposed a new method termed Minimize Feature's Size (MFS), based on multiple level similarity analyses and association between the genes and disease for breast cancer endpoints, by comparing classifier models generated from the second phase of MicroArray Quality Control (MAQC-II), trying to develop effective meta-analysis strategies to transform the MAQC-II signatures into a robust and reliable set of biomarkers for clinical applications. Results: We analyzed the similarity of the multiple gene signatures in an endpoint and between the two endpoints of breast cancer at probe and gene levels; the results indicate that disease-related genes can be preferably selected as the components of gene signature, and that the gene signatures for the two endpoints could be interchangeable. The minimized signatures were built at probe level by using MFS for each endpoint. By applying the approach, we generated a much smaller set of gene signatures with predictive power similar to that of the gene signatures from MAQC-II. Conclusions: Our results indicate that gene signatures of both large and small sizes could perform equally well in clinical applications. Besides, consistency and biological significance can be detected among different gene signatures, reflecting the

  12. Setting to earth for computer

    International Nuclear Information System (INIS)

    Gallego V, Luis Eduardo; Montana Ch, Johny Hernan; Tovar P, Andres Fernando; Amortegui, Francisco

    2000-01-01

    The GMT program analyzes grounding (earthing) systems for DC and low-frequency AC voltages in various configurations of interconnected cylindrical electrodes, in homogeneous or stratified (two-layer) soil. Among other aspects, the analysis covers: calculation of the grounding resistance, the ground potential rise (GPR) of the system, the current densities in the conductors, the potentials at any point on the soil surface (profiles and surfaces), and the step and touch voltages. It also interprets resistivity measurements made with the Wenner and Schlumberger methods, fitting a two-layer model
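
    The resistivity interpretation mentioned above starts from apparent resistivities computed from field readings. The following is a minimal sketch of the standard textbook geometric factors for the Wenner and Schlumberger arrays; it is not taken from the GMT program, and the function and parameter names are assumptions chosen for illustration.

```python
import math


def wenner_apparent_resistivity(spacing_m, resistance_ohm):
    """Apparent resistivity (ohm*m) for a Wenner array.

    Uses rho_a = 2 * pi * a * R, where a is the equal spacing between
    adjacent electrodes and R is the measured resistance (V / I).
    """
    if spacing_m <= 0:
        raise ValueError("electrode spacing must be positive")
    return 2.0 * math.pi * spacing_m * resistance_ohm


def schlumberger_apparent_resistivity(half_ab_m, half_mn_m, resistance_ohm):
    """Apparent resistivity (ohm*m) for a Schlumberger array.

    Uses rho_a = pi * (L**2 - l**2) / (2 * l) * R, where L = AB/2 is half the
    current-electrode separation and l = MN/2 is half the potential-electrode
    separation.
    """
    if not 0 < half_mn_m < half_ab_m:
        raise ValueError("require 0 < MN/2 < AB/2")
    geometric_factor = math.pi * (half_ab_m ** 2 - half_mn_m ** 2) / (2.0 * half_mn_m)
    return geometric_factor * resistance_ohm


# Hypothetical reading: 2 ohm measured with 1 m Wenner spacing.
rho_wenner = wenner_apparent_resistivity(1.0, 2.0)
# The same electrode positions expressed as a Schlumberger layout
# (AB = 3 m, MN = 1 m) give the same apparent resistivity.
rho_schlumberger = schlumberger_apparent_resistivity(1.5, 0.5, 2.0)
```

    Fitting a two-layer soil model to a series of such apparent resistivities at increasing spacings is a separate inversion step that this sketch does not attempt.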

  13. Recommendation Sets and Choice Queries

    DEFF Research Database (Denmark)

    Viappiani, Paolo Renato; Boutilier, Craig

    2011-01-01

    Utility elicitation is an important component of many applications, such as decision support systems and recommender systems. Such systems query users about their preferences and offer recommendations based on the system's belief about the user's utility function. We analyze the connection between...... the problem of generating optimal recommendation sets and the problem of generating optimal choice queries, considering both Bayesian and regret-based elicitation. Our results show that, somewhat surprisingly, under very general circumstances, the optimal recommendation set coincides with the optimal query....

  14. Evaluation of Appropriate Reference Genes for Gene Expression Normalization during Watermelon Fruit Development.

    Directory of Open Access Journals (Sweden)

    Qiusheng Kong

    Gene expression analysis in watermelon (Citrullus lanatus) fruit has drawn considerable attention with the availability of genome sequences to understand the regulatory mechanism of fruit development and to improve its quality. Real-time quantitative reverse-transcription PCR (qRT-PCR) is a routine technique for gene expression analysis. However, appropriate reference genes for transcript normalization in watermelon fruits have not been well characterized. The aim of this study was to evaluate the appropriateness of 12 genes for their potential use as reference genes in watermelon fruits. Expression variations of these genes were measured in 48 samples obtained from 12 successive developmental stages of parthenocarpic and fertilized fruits of two watermelon genotypes by using qRT-PCR analysis. Considering the effects of genotype, fruit setting method, and developmental stage, geNorm determined clathrin adaptor complex subunit (ClCAC), β-actin (ClACT), and alpha tubulin 5 (ClTUA5) as the multiple reference genes in watermelon fruit. Furthermore, ClCAC alone or together with SAND family protein (ClSAND) was ranked as the single or two best reference genes by NormFinder. By using the top-ranked reference genes to normalize the transcript abundance of phytoene synthase (ClPSY1), a good correlation between lycopene accumulation and ClPSY1 expression pattern was observed in ripening watermelon fruit. These validated reference genes will facilitate the accurate measurement of gene expression in the studies on watermelon fruit biology.

  15. Evaluation of Appropriate Reference Genes for Gene Expression Normalization during Watermelon Fruit Development.

    Science.gov (United States)

    Kong, Qiusheng; Yuan, Jingxian; Gao, Lingyun; Zhao, Liqiang; Cheng, Fei; Huang, Yuan; Bie, Zhilong

    2015-01-01

    Gene expression analysis in watermelon (Citrullus lanatus) fruit has drawn considerable attention with the availability of genome sequences to understand the regulatory mechanism of fruit development and to improve its quality. Real-time quantitative reverse-transcription PCR (qRT-PCR) is a routine technique for gene expression analysis. However, appropriate reference genes for transcript normalization in watermelon fruits have not been well characterized. The aim of this study was to evaluate the appropriateness of 12 genes for their potential use as reference genes in watermelon fruits. Expression variations of these genes were measured in 48 samples obtained from 12 successive developmental stages of parthenocarpic and fertilized fruits of two watermelon genotypes by using qRT-PCR analysis. Considering the effects of genotype, fruit setting method, and developmental stage, geNorm determined clathrin adaptor complex subunit (ClCAC), β-actin (ClACT), and alpha tubulin 5 (ClTUA5) as the multiple reference genes in watermelon fruit. Furthermore, ClCAC alone or together with SAND family protein (ClSAND) was ranked as the single or two best reference genes by NormFinder. By using the top-ranked reference genes to normalize the transcript abundance of phytoene synthase (ClPSY1), a good correlation between lycopene accumulation and ClPSY1 expression pattern was observed in ripening watermelon fruit. These validated reference genes will facilitate the accurate measurement of gene expression in the studies on watermelon fruit biology.

  16. Evolution of homeobox genes.

    Science.gov (United States)

    Holland, Peter W H

    2013-01-01

    Many homeobox genes encode transcription factors with regulatory roles in animal and plant development. Homeobox genes are found in almost all eukaryotes, and have diversified into 11 gene classes and over 100 gene families in animal evolution, and 10 to 14 gene classes in plants. The largest group in animals is the ANTP class which includes the well-known Hox genes, plus other genes implicated in development including ParaHox (Cdx, Xlox, Gsx), Evx, Dlx, En, NK4, NK3, Msx, and Nanog. Genomic data suggest that the ANTP class diversified by extensive tandem duplication to generate a large array of genes, including an NK gene cluster and a hypothetical ProtoHox gene cluster that duplicated to generate Hox and ParaHox genes. Expression and functional data suggest that NK, Hox, and ParaHox gene clusters acquired distinct roles in patterning the mesoderm, nervous system, and gut. The PRD class is also diverse and includes Pax2/5/8, Pax3/7, Pax4/6, Gsc, Hesx, Otx, Otp, and Pitx genes. PRD genes are not generally arranged in ancient genomic clusters, although the Dux, Obox, and Rhox gene clusters arose in mammalian evolution as did several non-clustered PRD genes. Tandem duplication and genome duplication expanded the number of homeobox genes, possibly contributing to the evolution of developmental complexity, but homeobox gene loss must not be ignored. Evolutionary changes to homeobox gene expression have also been documented, including Hox gene expression patterns shifting in concert with segmental diversification in vertebrates and crustaceans, and deletion of a Pitx1 gene enhancer in pelvic-reduced sticklebacks. WIREs Dev Biol 2013, 2:31-45. doi: 10.1002/wdev.78 The author declares that he has no conflicts of interest. Copyright © 2012 Wiley Periodicals, Inc.

  17. Gene cluster statistics with gene families.

    Science.gov (United States)

    Raghupathy, Narayanan; Durand, Dannie

    2009-05-01

    Identifying genomic regions that descended from a common ancestor is important for understanding the function and evolution of genomes. In distantly related genomes, clusters of homologous gene pairs are evidence of candidate homologous regions. Demonstrating the statistical significance of such "gene clusters" is an essential component of comparative genomic analyses. However, currently there are no practical statistical tests for gene clusters that model the influence of the number of homologs in each gene family on cluster significance. In this work, we demonstrate empirically that failure to incorporate gene family size in gene cluster statistics results in overestimation of significance, leading to incorrect conclusions. We further present novel analytical methods for estimating gene cluster significance that take gene family size into account. Our methods do not require complete genome data and are suitable for testing individual clusters found in local regions, such as contigs in an unfinished assembly. We consider pairs of regions drawn from the same genome (paralogous clusters), as well as regions drawn from two different genomes (orthologous clusters). Determining cluster significance under general models of gene family size is computationally intractable. By assuming that all gene families are of equal size, we obtain analytical expressions that allow fast approximation of cluster probabilities. We evaluate the accuracy of this approximation by comparing the resulting gene cluster probabilities with cluster probabilities obtained by simulating a realistic, power-law distributed model of gene family size, with parameters inferred from genomic data. Surprisingly, despite the simplicity of the underlying assumption, our method accurately approximates the true cluster probabilities. It slightly overestimates these probabilities, yielding a conservative test. We present additional simulation results indicating the best choice of parameter values for data

  18. GOseek: a gene ontology search engine using enhanced keywords.

    Science.gov (United States)

    Taha, Kamal

    2013-01-01

    We propose in this paper a biological search engine called GOseek, which overcomes the limitation of current gene similarity tools. Given a set of genes, GOseek returns the most significant genes that are semantically related to the given genes. These returned genes are usually annotated to one of the Lowest Common Ancestors (LCA) of the Gene Ontology (GO) terms annotating the given genes. Most genes have several annotation GO terms. Therefore, there may be more than one LCA for the GO terms annotating the given genes. The LCA annotating the genes that are most semantically related to the given gene is the one that receives the most aggregate semantic contribution from the GO terms annotating the given genes. To identify this LCA, GOseek quantifies the contribution of the GO terms annotating the given genes to the semantics of their LCAs. That is, it encodes the semantic contribution into a numeric format. GOseek uses microarray experiment data to rank result genes based on their significance. We evaluated GOseek experimentally and compared it with a comparable gene prediction tool. Results showed marked improvement over the tool.
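
    The abstract notes that a set of GO terms can have more than one lowest common ancestor, because the ontology is a directed acyclic graph rather than a tree. The following is a minimal sketch of that idea only; it is not GOseek's implementation and omits its semantic-contribution scoring. The toy ontology and all term names are hypothetical.

```python
def ancestors(term, parents):
    """Return the term itself plus every term reachable through parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def lowest_common_ancestors(terms, parents):
    """Return the common ancestors that have no other common ancestor below them."""
    terms = list(terms)
    if not terms:
        return set()
    common = set.intersection(*(ancestors(t, parents) for t in terms))
    # Drop any common ancestor that is a proper ancestor of another common ancestor.
    return {
        c for c in common
        if not any(c != d and c in ancestors(d, parents) for d in common)
    }


# Hypothetical toy ontology: child -> list of parents.
toy_parents = {
    "root": [],
    "term_a": ["root"],
    "term_b": ["root"],
    "term_c": ["term_a", "term_b"],
    "term_d": ["term_a", "term_b"],
}

# term_c and term_d share two lowest common ancestors: term_a and term_b.
toy_lcas = lowest_common_ancestors(["term_c", "term_d"], toy_parents)
```

    Choosing among several such ancestors is where a scoring scheme such as the aggregate semantic contribution described in the abstract would come in.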

  19. Carboxylesterase 1 genes

    DEFF Research Database (Denmark)

    Rasmussen, Henrik Berg; Madsen, Majbritt Busk

    2018-01-01

    The carboxylesterase 1 gene (CES1) encodes a hydrolase that metabolizes commonly used drugs. The CES1-related pseudogene, carboxylesterase 1 pseudogene 1 (CES1P1), has been implicated in gene exchange with CES1 and in the formation of hybrid genes including the carboxylesterase 1A2 gene (CES1A2...

  20. Identification and validation of suitable endogenous reference genes for gene expression studies in human peripheral blood

    Directory of Open Access Journals (Sweden)

    Turner Renee J

    2009-08-01

    Background: Gene expression studies require appropriate normalization methods. One such method uses stably expressed reference genes. Since suitable reference genes appear to be unique for each tissue, we have identified an optimal set of the most stably expressed genes in human blood that can be used for normalization. Methods: Whole-genome Affymetrix Human 2.0 Plus arrays were examined from 526 samples of males and females ages 2 to 78, including control subjects and patients with Tourette syndrome, stroke, migraine, muscular dystrophy, and autism. The top 100 most stably expressed genes with a broad range of expression levels were identified. To validate the best candidate genes, we performed quantitative RT-PCR on a subset of 10 genes (TRAP1, DECR1, FPGS, FARP1, MAPRE2, PEX16, GINS2, CRY2, CSNK1G2 and A4GALT), 4 commonly employed reference genes (GAPDH, ACTB, B2M and HMBS) and PPIB, previously reported to be stably expressed in blood. Expression stability and ranking analysis were performed using GeNorm and NormFinder algorithms. Results: Reference genes were ranked based on their expression stability, and the minimum number of genes needed for normalization, as calculated using GeNorm, showed that the fewest, most stably expressed genes needed for accurate normalization in RNA expression studies of human whole blood are a combination of TRAP1, FPGS, DECR1 and PPIB. We confirmed the ranking of the best candidate control genes by using an alternative algorithm (NormFinder). Conclusion: The reference genes identified in this study are stably expressed in whole blood of humans of both genders with multiple disease conditions and ages 2 to 78. Importantly, they also have different functions within cells and thus should be expressed independently of each other. These genes should be useful as normalization genes for microarray and RT-PCR whole blood studies of human physiology, metabolism and disease.
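
    Ranking candidate reference genes by expression stability, as done here with GeNorm, is usually described in terms of a stability value M: for each gene, the average over all other candidates of the standard deviation of their log2 expression ratios across samples. The following is a minimal sketch of that measure as commonly described, not the GeNorm software itself; the gene names and expression values are hypothetical.

```python
import math
import statistics


def stability_m(expression):
    """Compute a geNorm-style stability value M for each gene.

    `expression` maps gene name -> list of positive relative quantities,
    one per sample, in the same sample order for every gene.
    A lower M indicates more stable expression relative to the other genes.
    """
    genes = list(expression)
    if len(genes) < 2:
        raise ValueError("need at least two candidate genes")
    m_values = {}
    for gene_j in genes:
        pairwise_sd = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene_j], expression[gene_k])
            ]
            # Sample standard deviation of the log ratios across samples.
            pairwise_sd.append(statistics.stdev(log_ratios))
        m_values[gene_j] = statistics.mean(pairwise_sd)
    return m_values


# Hypothetical data: gene_a and gene_b keep a constant ratio, gene_c does not.
toy_expression = {
    "gene_a": [1.0, 2.0, 4.0],
    "gene_b": [2.0, 4.0, 8.0],
    "gene_c": [1.0, 4.0, 1.0],
}
toy_m = stability_m(toy_expression)
ranked = sorted(toy_m, key=toy_m.get)  # most stable candidates first
```

    In this toy input the gene whose ratio to the others fluctuates receives the largest M and ranks last. The stepwise exclusion of the least stable gene and the pairwise-variation criterion used to pick the number of reference genes are not shown.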

  1. Gene panel testing for inherited cancer risk.

    Science.gov (United States)

    Hall, Michael J; Forman, Andrea D; Pilarski, Robert; Wiesner, Georgia; Giri, Veda N

    2014-09-01

    Next-generation sequencing technologies have ushered in the capability to assess multiple genes in parallel for genetic alterations that may contribute to inherited risk for cancers in families. Thus, gene panel testing is now an option in the setting of genetic counseling and testing for cancer risk. This article describes the many gene panel testing options clinically available to assess inherited cancer susceptibility, the potential advantages and challenges associated with various types of panels, clinical scenarios in which gene panels may be particularly useful in cancer risk assessment, and testing and counseling considerations. Given the potential issues for patients and their families, gene panel testing for inherited cancer risk is recommended to be offered in conjunction or consultation with an experienced cancer genetic specialist, such as a certified genetic counselor or geneticist, as an integral part of the testing process. Copyright © 2014 by the National Comprehensive Cancer Network.

  2. Parameter setting and input reduction

    NARCIS (Netherlands)

    Evers, A.; van Kampen, N.J.

    2008-01-01

    The language acquisition procedure identifies certain properties of the target grammar before others. The evidence from the input is processed in a stepwise order. Section 1 equates that order and its typical effects with an order of parameter setting. The question is how the acquisition procedure

  3. Filtration set for gaseous fluids

    International Nuclear Information System (INIS)

    Lebrun, B.; Couvrat-Desvergnes, A.

    1988-01-01

    This filtration unit consists of a cylindrical vessel containing, from top to bottom: the gas inlet, a sealed floor allowing inspection by personnel, a horizontal granular filter bed, a cloth with a porosity finer than the grain size of the filter bed, a thin support layer of coarser material, gas-permeable tubes, and an annular collector connecting the tubes to the outlet

  4. Fuzzy-Set Case Studies

    Science.gov (United States)

    Mikkelsen, Kim Sass

    2017-01-01

    Contemporary case studies rely on verbal arguments and set theory to build or evaluate theoretical claims. While existing procedures excel in the use of qualitative information (information about kind), they ignore quantitative information (information about degree) at central points of the analysis. Effectively, contemporary case studies rely on…

  5. The IRI marketing data set

    NARCIS (Netherlands)

    Bronnenberg, B.J.; Kruger, M.W.; Mela, C.

    2008-01-01

    This paper describes a new data set available to academic researchers (at the following website: http://mktsci.pubs.informs.org). These data are comprised of store sales and consumer panel data for 30 product categories. The store sales data contain 5 years of product sales, pricing, and promotion

  6. Repeated Interaction in Standard Setting

    NARCIS (Netherlands)

    Larouche, Pierre; Schütt, Florian

    2016-01-01

    As part of the standard-setting process, certain patents become essential. This may allow the owners of these standard-essential patents to hold up implementers of the standard, who can no longer turn to substitute technologies. However, many real-world standards evolve over time, with several

  7. Overcoming Barriers in Unhealthy Settings

    Directory of Open Access Journals (Sweden)

    Michael K. Lemke

    2016-03-01

    We investigated the phenomenon of sustained health-supportive behaviors among long-haul commercial truck drivers, who belong to an occupational segment with extreme health disparities. With a focus on setting-level factors, this study sought to discover ways in which individuals exhibit resiliency while immersed in endemically obesogenic environments, as well as understand setting-level barriers to engaging in health-supportive behaviors. Using a transcendental phenomenological research design, 12 long-haul truck drivers who met screening criteria were selected using purposeful maximum sampling. Seven broad themes were identified: access to health resources, barriers to health behaviors, recommended alternative settings, constituents of health behavior, motivation for health behaviors, attitude toward health behaviors, and trucking culture. We suggest applying ecological theories of health behavior and settings approaches to improve driver health. We also propose the Integrative and Dynamic Healthy Commercial Driving (IDHCD) paradigm, grounded in complexity science, as a new theoretical framework for improving driver health outcomes.

  8. Behavior Management in Afterschool Settings

    Science.gov (United States)

    Mahoney, Joseph L.

    2014-01-01

    Although behavioral management is one of the most challenging aspects of working in an afterschool setting, staff do not typically receive formal training in evidence-based approaches to handling children's behavior problems. Common approaches to behavioral management such as punishment or time-out are temporary solutions because they do not…

  9. Blocking sets in Desarguesian planes

    NARCIS (Netherlands)

    Blokhuis, A.; Miklós, D.; Sós, V.T.; Szönyi, T.

    1996-01-01

    We survey recent results concerning the size of blocking sets in Desarguesian projective and affine planes, and implications of these results and the technique to prove them, to related problems, such as the size of maximal partial spreads, small complete arcs, small strong representative systems

  10. Sex hormones and gene expression signatures in peripheral blood from postmenopausal women - the NOWAC postgenome study

    Directory of Open Access Journals (Sweden)

    Rylander Charlotta

    2011-03-01

    Background: Postmenopausal hormone therapy (HT) influences endogenous hormone concentrations and increases the risk of breast cancer. Gene expression profiling may reveal the mechanisms behind this relationship. Our objective was to explore potential associations between sex hormones and gene expression in whole blood from a population-based, random sample of postmenopausal women. Methods: Gene expression, as measured by the Applied Biosystems microarray platform, was compared between hormone therapy (HT) users and non-users and between high and low hormone plasma concentrations using both gene-wise analysis and gene set analysis. Gene sets found to be associated with HT use were further analysed for enrichment in functional clusters and network predictions. The gene expression matrix included 285 samples and 16185 probes and was adjusted for significant technical variables. Results: Gene-wise analysis revealed several genes significantly associated with different types of HT use. The functional cluster analyses provided limited information on these genes. Gene set analysis revealed 22 gene sets that were enriched between high and low estradiol concentration (HT-users excluded). Among these were seven oestrogen related gene sets, including our gene list associated with systemic estradiol use, which thereby represents a novel oestrogen signature. Seven gene sets were related to immune response. Among the 15 gene sets enriched for progesterone, 11 overlapped with estradiol. No significant gene expression patterns were found for testosterone, follicle stimulating hormone (FSH) or sex hormone binding globulin (SHBG). Conclusions: Distinct gene expression patterns associated with sex hormones are detectable in a random group of postmenopausal women, as demonstrated by the finding of a novel oestrogen signature.

  11. Renal Gene Expression Database (RGED): a relational database of gene expression profiles in kidney disease.

    Science.gov (United States)

    Zhang, Qingzhou; Yang, Bo; Chen, Xujiao; Xu, Jing; Mei, Changlin; Mao, Zhiguo

    2014-01-01

    We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore certain gene profiles and the relationships between genes of interest, and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. Availability and implementation: Website is implemented in PHP, R, MySQL and Nginx and freely available from http://rged.wall-eva.net. © The Author(s) 2014. Published by Oxford University Press.

  12. Renal Gene Expression Database (RGED): a relational database of gene expression profiles in kidney disease

    Science.gov (United States)

    Zhang, Qingzhou; Yang, Bo; Chen, Xujiao; Xu, Jing; Mei, Changlin; Mao, Zhiguo

    2014-01-01

    We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore certain gene profiles and the relationships between genes of interest, and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. Availability and implementation: The website is implemented in PHP, R, MySQL and Nginx and is freely available from http://rged.wall-eva.net. Database URL: http://rged.wall-eva.net PMID:25252782

  13. Good genes, complementary genes and human mate preferences.

    Science.gov (United States)

    Roberts, S Craig; Little, Anthony C

    2008-09-01

    The past decade has witnessed a rapidly growing interest in the biological basis of human mate choice. Here we review recent studies that demonstrate preferences for traits which might reveal genetic quality to prospective mates, with potential but still largely unknown influence on offspring fitness. These include studies assessing visual, olfactory and auditory preferences for potential good-gene indicator traits, such as dominance or bilateral symmetry. Individual differences in these robust preferences mainly arise through within and between individual variation in condition and reproductive status. Another set of studies have revealed preferences for traits indicating complementary genes, focussing on discrimination of dissimilarity at genes in the major histocompatibility complex (MHC). As in animal studies, we are only just beginning to understand how preferences for specific traits vary and inter-relate, how consideration of good and compatible genes can lead to substantial variability in individual mate choice decisions and how preferences expressed in one sensory modality may reflect those in another. Humans may be an ideal model species in which to explore these interesting complexities.

  14. Candidate Gene Identification of Flowering Time Genes in Cotton

    Directory of Open Access Journals (Sweden)

    Corrinne E. Grover

    2015-07-01

    Full Text Available Flowering time control is critically important to all sexually reproducing angiosperms in both natural ecological and agronomic settings. Accordingly, there is much interest in defining the genes involved in the complex flowering-time network and how these respond to natural and artificial selection, the latter often entailing transitions in day-length responses. Here we describe a candidate gene analysis in the cotton genus Gossypium, which uses homologs from the well-described Arabidopsis flowering network to bioinformatically and phylogenetically identify orthologs in the published genome sequence from Gossypium raimondii Ulbr., one of the two model diploid progenitors of the commercially important allopolyploid cottons, G. hirsutum L. and G. barbadense L. Presence and patterns of expression were evaluated from 13 aboveground tissues related to flowering for each of the candidate genes using allopolyploid G. hirsutum as a model. Furthermore, we use a comparative context to determine copy number variability of each key gene family across 10 published angiosperm genomes. Data suggest a pattern of repeated loss of duplicates following ancient whole-genome doubling events in diverse lineages. The data presented here provide a foundation for understanding both the parallel evolution of day-length neutrality in domesticated cottons and the flowering-time network, in general, in this important crop plant.

  15. Duplicability of self-interacting human genes.

    LENUS (Irish Health Repository)

    Pérez-Bercoff, Asa

    2010-01-01

    BACKGROUND: There is increasing interest in the evolution of protein-protein interactions because this should ultimately be informative of the patterns of evolution of new protein functions within the cell. One model proposes that the evolution of new protein-protein interactions and protein complexes proceeds through the duplication of self-interacting genes. This model is supported by data from yeast. We examined the relationship between gene duplication and self-interaction in the human genome. RESULTS: We investigated the patterns of self-interaction and duplication among 34808 interactions encoded by 8881 human genes, and show that self-interacting proteins are encoded by genes with higher duplicability than genes whose proteins lack this type of interaction. We show that this result is robust against the system used to define duplicate genes. Finally we compared the presence of self-interactions amongst proteins whose genes have duplicated either through whole-genome duplication (WGD) or small-scale duplication (SSD), and show that the former tend to have more interactions in general. After controlling for age differences between the two sets of duplicates this result can be explained by the time since the gene duplication. CONCLUSIONS: Genes encoding self-interacting proteins tend to have higher duplicability than proteins lacking self-interactions. Moreover these duplicate genes have more often arisen through whole-genome rather than small-scale duplication. Finally, self-interacting WGD genes tend to have more interaction partners in general in the PIN, which can be explained by their overall greater age. This work adds to our growing knowledge of the importance of contextual factors in gene duplicability.

  16. upSET, the Drosophila homologue of SET3, Is Required for Viability and the Proper Balance of Active and Repressive Chromatin Marks

    Directory of Open Access Journals (Sweden)

    Kyle A. McElroy

    2017-02-01

    Full Text Available Chromatin plays a critical role in faithful implementation of gene expression programs. Different post-translational modifications (PTMs) of histone proteins reflect the underlying state of gene activity, and many chromatin proteins write, erase, bind, or are repelled by, these histone marks. One such protein is UpSET, the Drosophila homolog of yeast Set3 and mammalian KMT2E (MLL5). Here, we show that UpSET is necessary for the proper balance between active and repressed states. Using CRISPR/Cas-9 editing, we generated S2 cells that are mutant for upSET. We found that loss of UpSET is tolerated in S2 cells, but that heterochromatin is misregulated, as evidenced by a strong decrease in H3K9me2 levels assessed by bulk histone PTM quantification. To test whether this finding was consistent in the whole organism, we deleted the upSET coding sequence using CRISPR/Cas-9, which we found to be lethal in both sexes in flies. We were able to rescue this lethality using a tagged upSET transgene, and found that UpSET protein localizes to transcriptional start sites (TSS) of active genes throughout the genome. Misregulated heterochromatin is apparent by suppressed position effect variegation of the wm4 allele in heterozygous upSET-deleted flies. Using nascent-RNA sequencing in the upSET-mutant S2 lines, we show that this result applies to heterochromatin genes generally. Our findings support a critical role for UpSET in maintaining heterochromatin, perhaps by delimiting the active chromatin environment.

  17. No Evidence That Schizophrenia Candidate Genes Are More Associated With Schizophrenia Than Noncandidate Genes.

    Science.gov (United States)

    Johnson, Emma C; Border, Richard; Melroy-Greif, Whitney E; de Leeuw, Christiaan A; Ehringer, Marissa A; Keller, Matthew C

    2017-11-15

    A recent analysis of 25 historical candidate gene polymorphisms for schizophrenia in the largest genome-wide association study conducted to date suggested that these commonly studied variants were no more associated with the disorder than would be expected by chance. However, the same study identified other variants within those candidate genes that demonstrated genome-wide significant associations with schizophrenia. As such, it is possible that variants within historic schizophrenia candidate genes are associated with schizophrenia at levels above those expected by chance, even if the most-studied specific polymorphisms are not. The present study used association statistics from the largest schizophrenia genome-wide association study conducted to date as input to a gene set analysis to investigate whether variants within schizophrenia candidate genes are enriched for association with schizophrenia. As a group, variants in the most-studied candidate genes were no more associated with schizophrenia than were variants in control sets of noncandidate genes. While a small subset of candidate genes did appear to be significantly associated with schizophrenia, these genes were not particularly noteworthy given the large number of more strongly associated noncandidate genes. The history of schizophrenia research should serve as a cautionary tale to candidate gene investigators examining other phenotypes: our findings indicate that the most investigated candidate gene hypotheses of schizophrenia are not well supported by genome-wide association studies, and it is likely that this will be the case for other complex traits as well. Copyright © 2017 Society of Biological Psychiatry. Published by Elsevier Inc. All rights reserved.

  18. Learning gene regulatory networks from gene expression data using weighted consensus

    KAUST Repository

    Fujii, Chisato; Kuwahara, Hiroyuki; Yu, Ge; Guo, Lili; Gao, Xin

    2016-01-01

    An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.
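    The contrast the abstract draws between an average-ranking consensus and a weighted consensus can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the edge scores, the method names, and the weights are made up, and the weights are simply supplied by hand, whereas the paper derives them with a linear program that is not reproduced here.

```python
def ranks(scores):
    """Map each candidate edge to its rank within one method (1 = highest score)."""
    ordered = sorted(scores, key=lambda edge: -scores[edge])
    return {edge: position + 1 for position, edge in enumerate(ordered)}


def consensus(method_scores, weights=None):
    """Order edges by their (weighted) mean rank across inference methods.

    With weights=None every method counts equally, which is the
    average-ranking consensus. Hypothetical per-method weights turn it
    into a weighted consensus.
    """
    if weights is None:
        weights = [1.0] * len(method_scores)
    total = sum(weights)
    per_method = [ranks(scores) for scores in method_scores]
    mean_rank = {
        edge: sum(w * r[edge] for w, r in zip(weights, per_method)) / total
        for edge in per_method[0]
    }
    # Lower mean rank means stronger support; ties broken by edge name.
    return sorted(mean_rank, key=lambda edge: (mean_rank[edge], edge))


# Hypothetical confidence scores from three inference methods.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.1, ("g1", "g3"): 0.8, ("g2", "g3"): 0.6}
method_c = {("g1", "g2"): 0.7, ("g1", "g3"): 0.3, ("g2", "g3"): 0.4}

unweighted = consensus([method_a, method_b, method_c])
weighted = consensus([method_a, method_b, method_c], weights=[0.1, 0.8, 0.1])
print(unweighted)
print(weighted)
```

    In this toy input the unweighted consensus follows the two methods that agree, while giving most of the weight to the second method reverses the order of the top and bottom edges.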

  19. Learning gene regulatory networks from gene expression data using weighted consensus

    KAUST Repository

    Fujii, Chisato

    2016-08-25

    An accurate determination of the network structure of gene regulatory systems from high-throughput gene expression data is an essential yet challenging step in studying how the expression of endogenous genes is controlled through a complex interaction of gene products and DNA. While numerous methods have been proposed to infer the structure of gene regulatory networks, none of them seem to work consistently over different data sets with high accuracy. A recent study to compare gene network inference methods showed that an average-ranking-based consensus method consistently performs well under various settings. Here, we propose a linear programming-based consensus method for the inference of gene regulatory networks. Unlike the average-ranking-based one, which treats the contribution of each individual method equally, our new consensus method assigns a weight to each method based on its credibility. As a case study, we applied the proposed consensus method on synthetic and real microarray data sets, and compared its performance to that of the average-ranking-based consensus and individual inference methods. Our results show that our weighted consensus method achieves superior performance over the unweighted one, suggesting that assigning weights to different individual methods rather than giving them equal weights improves the accuracy. © 2016 Elsevier B.V.

  20. Hypergraphs combinatorics of finite sets

    CERN Document Server

    Berge, C

    1989-01-01

    Graph Theory has proved to be an extremely useful tool for solving combinatorial problems in such diverse areas as Geometry, Algebra, Number Theory, Topology, Operations Research and Optimization. It is natural to attempt to generalise the concept of a graph, in order to attack additional combinatorial problems. The idea of looking at a family of sets from this standpoint took shape around 1960. In regarding each set as a ``generalised edge'' and in calling the family itself a ``hypergraph'', the initial idea was to try to extend certain classical results of Graph Theory such as the theorems of Turán and König. It was noticed that this generalisation often led to simplification; moreover, one single statement, sometimes remarkably simple, could unify several theorems on graphs. This book presents what seems to be the most significant work on hypergraphs.

  1. Abstraction by Set-Membership

    DEFF Research Database (Denmark)

    Mödersheim, Sebastian Alexander

    2010-01-01

    The abstraction and over-approximation of protocols and web services by a set of Horn clauses is a very successful method in practice. It has however limitations for protocols and web services that are based on databases of keys, contracts, or even access rights, where revocation is possible, so that the set of true facts does not monotonically grow with the transitions. We extend the scope of these over-approximation methods by defining a new way of abstraction that can handle such databases, and we formally prove that the abstraction is sound. We realize a translator from a convenient specification language to standard Horn clauses and use the verifier ProVerif and the theorem prover SPASS to solve them. We show by a number of examples that this approach is practically feasible for a wide variety of verification problems of security protocols and web services.

  2. Food systems in correctional settings

    DEFF Research Database (Denmark)

    Smoyer, Amy; Kjær Minke, Linda

    Food is a central component of life in correctional institutions and plays a critical role in the physical and mental health of incarcerated people and the construction of prisoners' identities and relationships. An understanding of the role of food in correctional settings and the effective management of food systems may improve outcomes for incarcerated people and help correctional administrators to maximize their health and safety. This report summarizes existing research on food systems in correctional settings and provides examples of food programmes in prison and remand facilities, including a case study of food-related innovation in the Danish correctional system. It offers specific conclusions for policy-makers, administrators of correctional institutions and prison-food-service professionals, and makes proposals for future research.

  3. First-Class Object Sets

    DEFF Research Database (Denmark)

    Ernst, Erik

    2009-01-01

    Typically, an object is a monolithic entity with a fixed interface.  To increase flexibility in this area, this paper presents first-class object sets as a language construct.  An object set offers an interface which is a disjoint union of the interfaces of its member objects.  It may also be use...... to a mainstream virtual machine in order to improve on the support for family polymorphism.  The approach is made precise by means of a small calculus, and the soundness of its type system has been shown by a mechanically checked proof in Coq....

  4. Compositional models for credal sets

    Czech Academy of Sciences Publication Activity Database

    Vejnarová, Jiřina

    2017-01-01

    Vol. 90, No. 1 (2017), pp. 359-373. ISSN 0888-613X. R&D Projects: GA ČR(CZ) GA16-12010S. Institutional support: RVO:67985556. Keywords: Imprecise probabilities * Credal sets * Multidimensional models * Conditional independence. Subject RIV: BA - General Mathematics. OECD field: Pure mathematics. Impact factor: 2.845 (2016). http://library.utia.cas.cz/separaty/2017/MTR/vejnarova-0483288.pdf

  5. The Value of Value Sets

    DEFF Research Database (Denmark)

    Sløk-Madsen, Stefan Kirkegaard; Christensen, Jesper

    The world over, business school classrooms teach that corporate values can impact performance. The argument is typically that culture matters more than strategy plans and that culture can be influenced, and indeed changed, by a shared corporate value set. While the claim seems intuitively a...... a unique contribution to the effects of investment in shared company values, and to whether agent rationality can be fundamentally changed by committed organizational efforts....

  6. Some remarks on good sets

    Indian Academy of Sciences (India)

    A subset S of is said to be full if S is a maximal good set in 1S × 2S × ··· × nS ([3], p. 183). Two points .... assume there is a p for which $|n_p| = |m_p|$. Then $\sum_{j=1}^{k} n_j x_j = 0$ ..... M G Nadkarni for suggesting the problems and for encouragement and ...

  7. Scale setting in lattice QCD

    International Nuclear Information System (INIS)

    Sommer, Rainer

    2014-02-01

    The principles of scale setting in lattice QCD as well as the advantages and disadvantages of various commonly used scales are discussed. After listing criteria for good scales, I concentrate on the main presently used ones with an emphasis on scales derived from the Yang-Mills gradient flow. For these I discuss discretisation errors, statistical precision and mass effects. A short review on numerical results also brings me to an unpleasant disagreement which remains to be explained.

  8. Set Reordering for Paletted Data

    KAUST Repository

    Schneider, Jens

    2011-03-01

    We present a novel method to recycle bits of paletted data sets. We exploit the fact that the codebook of such data can be reordered without affecting the content. Enumerating all possible permutations of N codebook entries yields an additional O(N log2 N) bits that can be used without storage overhead for the lossless encoding of a limited amount of tags, meta-information, or part of the actual data. © 2011 IEEE.
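    The counting argument behind this abstract is that N distinct codebook entries admit N! orderings, so one ordering can carry floor(log2(N!)) bits, which grows as O(N log2 N). The Python sketch below illustrates that idea with a factorial-number-system (Lehmer code) encoding. It is an assumption chosen for illustration, not necessarily the scheme used in the paper; the function names `capacity_bits`, `hide` and `recover` are hypothetical, and a real encoder would also have to remap the per-pixel indices to the new codebook order so the content stays unchanged.

```python
import math


def capacity_bits(n):
    """Whole bits that one ordering of n distinct entries can carry: floor(log2(n!))."""
    return math.factorial(n).bit_length() - 1


def hide(palette, payload):
    """Return the palette reordered so that its order encodes the integer payload."""
    pool = sorted(palette)  # canonical order shared by encoder and decoder
    if not 0 <= payload < math.factorial(len(pool)):
        raise ValueError("payload does not fit into this palette")
    reordered = []
    for remaining in range(len(pool), 0, -1):
        # Next digit of the payload in the factorial number system.
        index, payload = divmod(payload, math.factorial(remaining - 1))
        reordered.append(pool.pop(index))
    return reordered


def recover(reordered):
    """Read the hidden integer back from the order of the palette entries."""
    pool = sorted(reordered)
    payload = 0
    for position, entry in enumerate(reordered):
        index = pool.index(entry)
        pool.pop(index)
        payload += index * math.factorial(len(reordered) - 1 - position)
    return payload


# Hypothetical four-colour palette: 4! = 24 orderings, so 4 whole bits.
palette = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (0, 0, 0)]
stego = hide(palette, 13)
print(capacity_bits(len(palette)), stego, recover(stego))
```

    The sketch assumes all codebook entries are distinct; duplicate entries would reduce the number of distinguishable orderings.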

  9. Exemestane in the prevention setting

    OpenAIRE

    Litton, Jennifer Keating; Bevers, Therese B.; Arun, Banu K.

    2012-01-01

    Aromatase inhibitors are well-established therapies in the neoadjuvant, adjuvant and metastatic settings for breast cancer. In adjuvant trials, this class of drugs has shown preventative properties by decreasing the rate of contralateral breast cancer. Recently, the National Cancer Institute of Canada Clinical Trials Group MAP.3 study evaluated exemestane as a breast cancer prevention agent for women with specified higher risks of developing breast cancer. We review the history of exemestane ...

  10. Scale setting in lattice QCD

    Energy Technology Data Exchange (ETDEWEB)

    Sommer, Rainer [DESY, Zeuthen (Germany). John von Neumann-Inst. fuer Computing NIC

    2014-02-15

    The principles of scale setting in lattice QCD as well as the advantages and disadvantages of various commonly used scales are discussed. After listing criteria for good scales, I concentrate on the main presently used ones with an emphasis on scales derived from the Yang-Mills gradient flow. For these I discuss discretisation errors, statistical precision and mass effects. A short review on numerical results also brings me to an unpleasant disagreement which remains to be explained.

  11. Setting priorities for safeguards upgrades

    International Nuclear Information System (INIS)

    Al-Ayat, R.A.; Judd, B.R.; Patenaude, C.J.; Sicherman, A.

    1987-01-01

    This paper describes an analytic approach and a computer program for setting priorities among safeguards upgrades. The approach provides safeguards decision makers with a systematic method for allocating their limited upgrade resources. The priorities are set based on the upgrades' cost and their contribution to safeguards effectiveness. Safeguards effectiveness is measured by the probability of defeat for a spectrum of potential insider and outsider adversaries. The computer program, MI$ER, can be used alone or as a companion to ET and SAVI, programs designed to evaluate safeguards effectiveness against insider and outsider threats, respectively. Setting the priorities requires judgments about the relative importance (threat likelihoods and consequences) of insider and outsider threats. Although these judgments are inherently subjective, MI$ER can analyze the sensitivity of the upgrade priorities to these weights and determine whether or not they are critical to the priority ranking. MI$ER produces tabular and graphical results for comparing benefits and identifying the most cost-effective upgrades for a given expenditure. This framework provides decision makers with an explicit and consistent analysis to support their upgrades decisions and to allocate the safeguards resources in a cost-effective manner.
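    The kind of analysis described here, ranking upgrades by weighted benefit per unit cost and then checking whether the ranking depends on the insider/outsider weights, can be sketched in a few lines of Python. This is not MI$ER's algorithm, which the abstract does not specify; the scoring formula, the upgrade names, the costs, and the effectiveness gains are all hypothetical values chosen only to show the shape of the calculation.

```python
def rank_upgrades(upgrades, w_insider):
    """Rank upgrades by weighted effectiveness gain per unit cost.

    Each upgrade is (name, cost, gain_vs_insider, gain_vs_outsider), where a
    gain is an assumed increase in the probability of defeating that adversary.
    """
    w_outsider = 1.0 - w_insider

    def benefit_per_cost(upgrade):
        _, cost, gain_in, gain_out = upgrade
        return (w_insider * gain_in + w_outsider * gain_out) / cost

    ordered = sorted(upgrades, key=benefit_per_cost, reverse=True)
    return [name for name, _, _, _ in ordered]


def weight_sensitive(upgrades, steps=11):
    """True if the ranking changes anywhere as the insider weight sweeps 0..1."""
    rankings = {
        tuple(rank_upgrades(upgrades, i / (steps - 1))) for i in range(steps)
    }
    return len(rankings) > 1


# Hypothetical upgrades: (name, cost, gain vs insider, gain vs outsider).
UPGRADES = [
    ("portal monitor", 40.0, 0.20, 0.02),
    ("perimeter sensor", 50.0, 0.01, 0.25),
    ("two-person rule", 10.0, 0.06, 0.00),
]

print(rank_upgrades(UPGRADES, w_insider=0.5))
print(weight_sensitive(UPGRADES))
```

    When `weight_sensitive` returns True, the subjective threat weights matter for the priority list and deserve closer scrutiny; when it returns False, the ranking holds regardless of how the two threats are weighted.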

  12. A course on Borel sets

    CERN Document Server

    Srivastava, S M

    1998-01-01

    The roots of Borel sets go back to the work of Baire [8]. He was trying to come to grips with the abstract notion of a function introduced by Dirichlet and Riemann. According to them, a function was to be an arbitrary correspondence between objects without giving any method or procedure by which the correspondence could be established. Since all the specific functions that one studied were determined by simple analytic expressions, Baire delineated those functions that can be constructed starting from continuous functions and iterating the operation of pointwise limit on a sequence of functions. These functions are now known as Baire functions. Lebesgue [65] and Borel [19] continued this work. In [19], Borel sets were defined for the first time. In his paper, Lebesgue made a systematic study of Baire functions and introduced many tools and techniques that are used even today. Among other results, he showed that Borel functions coincide with Baire functions. The study of Borel sets got an impetus from...

  13. Introduction to axiomatic set theory

    CERN Document Server

    Takeuti, Gaisi

    1971-01-01

    In 1963, the first author introduced a course in set theory at the University of Illinois whose main objectives were to cover Gödel's work on the consistency of the axiom of choice (AC) and the generalized continuum hypothesis (GCH), and Cohen's work on the independence of AC and the GCH. Notes taken in 1963 by the second author were taught by him in 1966, revised extensively, and are presented here as an introduction to axiomatic set theory. Texts in set theory frequently develop the subject rapidly moving from key result to key result and suppressing many details. Advocates of the fast development claim at least two advantages. First, key results are highlighted, and second, the student who wishes to master the subject is compelled to develop the details on his own. However, an instructor using a "fast development" text must devote much class time to assisting his students in their efforts to bridge gaps in the text. We have chosen instead a development that is quite detailed and complete. F...

  14. Statistical Redundancy Testing for Improved Gene Selection in Cancer Classification Using Microarray Data

    Directory of Open Access Journals (Sweden)

    J. Sunil Rao

    2007-01-01

    Full Text Available In gene selection for cancer classification using microarray data, we define an eigenvalue-ratio statistic to measure a gene's contribution to the joint discriminability when this gene is included into a set of genes. Based on this eigenvalue-ratio statistic, we define a novel hypothesis test for gene statistical redundancy and propose two gene selection methods. Simulation studies illustrate the agreement between statistical redundancy testing and gene selection methods. Real data examples show that the proposed gene selection methods can select a compact gene subset which can not only be used to build high quality cancer classifiers but also show biological relevance.

  15. Gene doping in sports.

    Science.gov (United States)

    Unal, Mehmet; Ozer Unal, Durisehvar

    2004-01-01

    Gene or cell doping is defined by the World Anti-Doping Agency (WADA) as "the non-therapeutic use of genes, genetic elements and/or cells that have the capacity to enhance athletic performance". New research in genetics and genomics will be used not only to diagnose and treat disease, but also to attempt to enhance human performance. In recent years, gene therapy has shown progress and positive results that have highlighted the potential misuse of this technology and the debate of 'gene doping'. Gene therapies developed for the treatment of diseases such as anaemia (the gene for erythropoietin), muscular dystrophy (the gene for insulin-like growth factor-1) and peripheral vascular diseases (the gene for vascular endothelial growth factor) are potential doping methods. With progress in gene technology, many other genes with this potential will be discovered. For this reason, it is important to develop timely legal regulations and to research the field of gene doping in order to develop methods of detection. To protect the health of athletes and to ensure equal competitive conditions, the International Olympic Committee, WADA and International Sports Federations have accepted performance-enhancing substances and methods as being doping, and have forbidden them. Nevertheless, the desire to win causes athletes to misuse these drugs and methods. This paper reviews the current status of gene doping and candidate performance enhancement genes, and also the use of gene therapy in sports medicine and ethics of genetic enhancement. Copyright 2004 Adis Data Information BV

  16. The Drosophila melanogaster methuselah gene: a novel gene with ancient functions.

    Directory of Open Access Journals (Sweden)

    Ana Rita Araújo

    Full Text Available The Drosophila melanogaster G protein-coupled receptor gene, methuselah (mth), has been described as a novel gene that is less than 10 million years old. Nevertheless, it shows a highly specific expression pattern in embryos, larvae, and adults, and has been implicated in larval development, stress resistance, and in the setting of adult lifespan, among others. Although mth belongs to a gene subfamily with 16 members in D. melanogaster, there is no evidence for functional redundancy in this subfamily. Therefore, it is surprising that a novel gene influences so many traits. Here, we explore the alternative hypothesis that mth is an old gene. Under this hypothesis, in species distantly related to D. melanogaster, there should be a gene with features similar to those of mth. By performing detailed phylogenetic, synteny, protein structure, and gene expression analyses we show that the D. virilis GJ12490 gene is the ortholog of mth in species distantly related to D. melanogaster. We also show that, in D. americana (a species of the virilis group of Drosophila), a common amino acid polymorphism at the GJ12490 orthologous gene is significantly associated with developmental time, size, and lifespan differences. Our results imply that GJ12490 orthologous genes are candidates for developmental time and lifespan differences in Drosophila in general.

  17. Expression of the histone chaperone SET/TAF-Iβ during the strobilation process of Mesocestoides corti (Platyhelminthes, Cestoda).

    Science.gov (United States)

    Costa, Caroline B; Monteiro, Karina M; Teichmann, Aline; da Silva, Edileuza D; Lorenzatto, Karina R; Cancela, Martín; Paes, Jéssica A; Benitz, André de N D; Castillo, Estela; Margis, Rogério; Zaha, Arnaldo; Ferreira, Henrique B

    2015-08-01

    The histone chaperone SET/TAF-Iβ is implicated in processes of chromatin remodelling and gene expression regulation. It has been associated with the control of developmental processes, but little is known about its function in helminth parasites. In Mesocestoides corti, a partial cDNA sequence related to SET/TAF-Iβ was isolated in a screening for genes differentially expressed in larvae (tetrathyridia) and adult worms. Here, the full-length coding sequence of the M. corti SET/TAF-Iβ gene was analysed and the encoded protein (McSET/TAF) was compared with orthologous sequences, showing that McSET/TAF can be regarded as a SET/TAF-Iβ family member, with a typical nucleosome-assembly protein (NAP) domain and an acidic tail. The expression patterns of the McSET/TAF gene and protein were investigated during the strobilation process by RT-qPCR, using a set of five reference genes, and by immunoblot and immunofluorescence, using monospecific polyclonal antibodies. A gradual increase in McSET/TAF transcripts and McSET/TAF protein was observed upon development induction by trypsin, demonstrating McSET/TAF differential expression during strobilation. These results provided the first evidence for the involvement of a protein from the NAP family of epigenetic effectors in the regulation of cestode development.

  18. Expression profiling identifies genes involved in emphysema severity

    Directory of Open Access Journals (Sweden)

    Bowman Rayleen V

    2009-09-01

    Full Text Available Abstract Chronic obstructive pulmonary disease (COPD) is a major public health problem. The aim of this study was to identify genes involved in emphysema severity in COPD patients. Gene expression profiling was performed on total RNA extracted from non-tumor lung tissue from 30 smokers with emphysema. Class comparison analysis based on gas transfer measurement was performed to identify differentially expressed genes. Genes were then selected for technical validation by quantitative reverse transcriptase-PCR (qRT-PCR) if also represented on microarray platforms used in previously published emphysema studies. Genes technically validated advanced to tests of biological replication by qRT-PCR using an independent test set of 62 lung samples. Class comparison identified 98 differentially expressed genes (p p Gene expression profiling of lung from emphysema patients identified seven candidate genes associated with emphysema severity including COL6A3, SERPINF1, ZNHIT6, NEDD4, CDKN2A, NRN1 and GSTM3.

  19. Human Gene Therapy: Genes without Frontiers?

    Science.gov (United States)

    Simon, Eric J.

    2002-01-01

    Describes the latest advancements and setbacks in human gene therapy to provide reference material for biology teachers to use in their science classes. Focuses on basic concepts such as recombinant DNA technology, and provides examples of human gene therapy such as severe combined immunodeficiency syndrome, familial hypercholesterolemia, and…

  20. Gene mutations in hepatocellular adenomas

    DEFF Research Database (Denmark)

    Raft, Marie B; Jørgensen, Ernö N; Vainer, Ben

    2015-01-01

    is associated with bi-allelic mutations in the TCF1 gene and morphologically has marked steatosis. β-catenin activating HCA has increased activity of the Wnt/β-catenin pathway and is associated with possible malignant transformation. Inflammatory HCA is characterized by an oncogene-induced inflammation due to alterations in the Janus kinase/signal transducer and activator of transcription (JAK/STAT) pathway. In the diagnostic setting, subclassification of HCA is based primarily on immunohistochemical analyses, and has had an increasing impact on choice of treatment and individual prognostic assessment. This review offers an overview of the reported gene mutations associated with hepatocellular adenomas together with a discussion of the diagnostic and prognostic value.

  1. EcoGene 3.0.

    Science.gov (United States)

    Zhou, Jindan; Rudd, Kenneth E

    2013-01-01

    EcoGene (http://ecogene.org) is a database and website devoted to continuously improving the structural and functional annotation of Escherichia coli K-12, one of the most well understood model organisms, represented by the MG1655(Seq) genome sequence and annotations. Major improvements to EcoGene in the past decade include (i) graphic presentations of genome map features; (ii) ability to design Boolean queries and Venn diagrams from EcoArray, EcoTopics or user-provided GeneSets; (iii) the genome-wide clone and deletion primer design tool, PrimerPairs; (iv) sequence searches using a customized EcoBLAST; (v) a Cross Reference table of synonymous gene and protein identifiers; (vi) proteome-wide indexing with GO terms; (vii) EcoTools access to >2000 complete bacterial genomes in EcoGene-RefSeq; (viii) establishment of a MySql relational database; and (ix) use of web content management systems. The biomedical literature is surveyed daily to provide citation and gene function updates. As of September 2012, the review of 37 397 abstracts and articles led to creation of 98 425 PubMed-Gene links and 5415 PubMed-Topic links. Annotation updates to Genbank U00096 are transmitted from EcoGene to NCBI. Experimental verifications include confirmation of a CTG start codon, pseudogene restoration and quality assurance of the Keio strain collection.

  2. EcoGene 3.0

    Science.gov (United States)

    Zhou, Jindan; Rudd, Kenneth E.

    2013-01-01

    EcoGene (http://ecogene.org) is a database and website devoted to continuously improving the structural and functional annotation of Escherichia coli K-12, one of the most well understood model organisms, represented by the MG1655(Seq) genome sequence and annotations. Major improvements to EcoGene in the past decade include (i) graphic presentations of genome map features; (ii) ability to design Boolean queries and Venn diagrams from EcoArray, EcoTopics or user-provided GeneSets; (iii) the genome-wide clone and deletion primer design tool, PrimerPairs; (iv) sequence searches using a customized EcoBLAST; (v) a Cross Reference table of synonymous gene and protein identifiers; (vi) proteome-wide indexing with GO terms; (vii) EcoTools access to >2000 complete bacterial genomes in EcoGene-RefSeq; (viii) establishment of a MySql relational database; and (ix) use of web content management systems. The biomedical literature is surveyed daily to provide citation and gene function updates. As of September 2012, the review of 37 397 abstracts and articles led to creation of 98 425 PubMed-Gene links and 5415 PubMed-Topic links. Annotation updates to Genbank U00096 are transmitted from EcoGene to NCBI. Experimental verifications include confirmation of a CTG start codon, pseudogene restoration and quality assurance of the Keio strain collection. PMID:23197660

  3. Rough set soft computing cancer classification and network: one stone, two birds.

    Science.gov (United States)

    Zhang, Yue

    2010-07-15

    Gene expression profiling provides tremendous information to help unravel the complexity of cancer. The selection of the most informative genes from huge noise for cancer classification has taken centre stage, along with predicting the function of such identified genes and the construction of direct gene regulatory networks at different system levels with a tuneable parameter. A new study by Wang and Gotoh described a novel Variable Precision Rough Sets-rooted robust soft computing method to successfully address these problems and has yielded some new insights. The significance of this progress and its perspectives will be discussed in this article.

  4. Coexpression landscape in ATTED-II: usage of gene list and gene network for various types of pathways.

    Science.gov (United States)

    Obayashi, Takeshi; Kinoshita, Kengo

    2010-05-01

    Gene coexpression analyses are a powerful method to predict the function of genes and/or to identify genes that are functionally related to query genes. The basic idea of gene coexpression analyses is that genes with similar functions should have similar expression patterns under many different conditions. This approach is now widely used by many experimental researchers, especially in the field of plant biology. In this review, we will summarize recent successful examples obtained by using our gene coexpression database, ATTED-II. Specifically, the examples will describe the identification of new genes, such as the subunits of a complex protein, the enzymes in a metabolic pathway and transporters. In addition, we will discuss the discovery of a new intercellular signaling factor and new regulatory relationships between transcription factors and their target genes. In ATTED-II, we provide two basic views of gene coexpression, a gene list view and a gene network view, which can be used for a guide-gene approach and a narrow-down approach, respectively. In addition, we will discuss the coexpression effectiveness for various types of gene sets.
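
    The gene list view described above can be illustrated with a minimal sketch. It assumes a plain Pearson correlation as the coexpression measure and a made-up expression table; the function names (pearson, coexpressed_gene_list) and the toy data are hypothetical, and nothing here should be read as the measure or interface that ATTED-II itself uses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no coexpression signal
    return cov / (sd_x * sd_y)


def coexpressed_gene_list(expression, query, top_n=3):
    """Rank all other genes by correlation with the query (guide) gene."""
    query_profile = expression[query]
    scores = [
        (gene, pearson(query_profile, profile))
        for gene, profile in expression.items()
        if gene != query
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_n]


# Hypothetical expression values across four conditions.
toy_expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.1, 5.9, 8.0],  # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # falls as geneA rises
    "geneD": [1.0, 1.0, 1.0, 1.0],  # flat
}
ranked = coexpressed_gene_list(toy_expression, "geneA")
for gene, score in ranked:
    print(gene, round(score, 3))
```

    Genes at the top of such a list are the candidates for shared function with the guide gene; a network view would instead keep every pair whose score passes a chosen threshold.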

  5. Gene therapy in cystic fibrosis.

    Science.gov (United States)

    Flotte, T R; Laube, B L

    2001-09-01

    Theoretically, cystic fibrosis transmembrane conductance regulator (CFTR) gene replacement during the neonatal period can decrease morbidity and mortality from cystic fibrosis (CF). In vivo gene transfers have been accomplished in CF patients. Choice of vector, mode of delivery to airways, translocation of genetic information, and sufficient expression level of the normalized CFTR gene are issues that currently are being addressed in the field. The advantages and limitations of viral vectors are a function of the parent virus. Viral vectors used in this setting include adenovirus (Ad) and adeno-associated virus (AAV). Initial studies with Ad vectors resulted in a vector that was efficient for gene transfer with dose-limiting inflammatory effects due to the large amount of viral protein delivered. The next generation of Ad vectors, with more viral coding sequence deletions, has a longer duration of activity and elicits a lesser degree of cell-mediated immunity in mice. A more recent generation of Ad vectors has no viral genes remaining. Despite these changes, the problem of humoral immunity remains with Ad vectors. A variety of strategies such as vector systems requiring single, or widely spaced, administrations, pharmacologic immunosuppression at administration, creation of a stealth vector, modification of immunogenic epitopes, or tolerance induction are being considered to circumvent humoral immunity. AAV vectors have been studied in animal and human models. They do not appear to induce inflammatory changes over a wide range of doses. The level of CFTR messenger RNA expression is difficult to ascertain with AAV vectors since the small size of the vector relative to the CFTR gene leaves no space for vector-specific sequences on which to base assays to distinguish endogenous from vector-expressed messenger RNA. In general, AAV vectors appear to be safe and have superior duration profiles. Cationic liposomes are lipid-DNA complexes. These vectors generally have been…

  6. Amuse Restaurant Set Lunch 2017

    OpenAIRE

    Amuse Restaurant

    2017-01-01

    Since opening, Amuse Restaurant has garnered rave reviews from the country’s most trusted and renowned food critics. Praise has flowed for Conor’s individual style of cooking, which brings Asian flavours, namely Japanese, to modern French cuisine. The menu focuses mainly on tasting menus, as this is the best way to experience this kind of food; however, there is a three course set menu available Tuesday to Thursday for midweek dining. The lunch menu is a three course affair with the option of a ...

  7. Psychiatric diagnosis in legal settings

    Directory of Open Access Journals (Sweden)

    Alfred Allan

    2005-12-01

    When asked to give a diagnosis in legal settings practitioners should be mindful of the tentative nature of psychiatric diagnoses and that courts require that such a diagnosis must have scientific credibility. South African courts are not explicit about the test they will apply to determine whether a diagnosis is scientifically credible, but some guidance can be found in United States case law. This paper examines these criteria with reference to the disorders included in the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV-TR).

  8. Improved precision and accuracy for microarrays using updated probe set definitions

    Directory of Open Access Journals (Sweden)

    Larsson Ola

    2007-02-01

    Background: Microarrays enable high throughput detection of transcript expression levels. Different investigators have recently introduced updated probe set definitions to more accurately map probes to our current knowledge of genes and transcripts. Results: We demonstrate that updated probe set definitions provide both better precision and accuracy in probe set estimates compared to the original Affymetrix definitions. We show that the improved precision mainly depends on the increased number of probes that are integrated into each probe set, but we also demonstrate an improvement when the same number of probes is used. Conclusion: Updated probe set definitions not only offer expression levels that are more accurately associated with genes and transcripts but also improve the estimated transcript expression levels. These results support the use of updated probe set definitions for analysis and meta-analysis of microarray data.

  9. Automatic generation of gene finders for eukaryotic species

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Krogh, A.

    2006-01-01

    Background: The number of sequenced eukaryotic genomes is rapidly increasing. This means that over time it will be hard to keep supplying customised gene finders for each genome. This calls for procedures to automatically generate species-specific gene finders and to re-train them as the quantity and quality of reliable gene annotation grows. Results: We present a procedure, Agene, that automatically generates a species-specific gene predictor from a set of reliable mRNA sequences and a genome. We apply a Hidden Markov model (HMM) that implements explicit length distribution modelling for all gene structure blocks using acyclic discrete phase type distributions. The state structure of each HMM is generated dynamically from an array of sub-models to include only gene features represented in the training set. Conclusion: Acyclic discrete phase type distributions are well suited to model sequence...

  10. Gene expression profiles in stages II and III colon cancers

    DEFF Research Database (Denmark)

    Thorsteinsson, Morten; Kirkeby, Lene T; Hansen, Raino

    2012-01-01

    PURPOSE: A 128-gene signature has been proposed to predict outcome in patients with stages II and III colorectal cancers. In the present study, we aimed to reproduce and validate the 128-gene signature in external and independent material. METHODS: Gene expression data from the original material were retrieved from the Gene Expression Omnibus (GEO) (n = 111) in addition to a Danish data set (n = 37). All patients had stages II and III colon cancers. A Prediction Analysis of Microarray classifier, based on the 128-gene signature and the original training set of stage I (n = 65) and stage IV (n...... correctly predicted as stage IV-like, and the remaining patients were predicted as stage I-like and unclassifiable, respectively. Stage II patients could not be stratified. CONCLUSIONS: The 128-gene signature showed reproducibility in stage III colon cancer, but could not predict recurrence in stage II...

  11. Tumor targeted gene therapy

    International Nuclear Information System (INIS)

    Kang, Joo Hyun

    2006-01-01

    Knowledge of the molecular mechanisms governing malignant transformation brings new opportunities for therapeutic intervention against cancer using novel approaches. One of them is gene therapy, based on the transfer of genetic material to an organism with the aim of correcting a disease. The application of gene therapy to cancer treatment has led to the development of new experimental approaches such as suicidal gene therapy, inhibition of oncogenes and restoration of tumor-suppressor genes. Suicidal gene therapy is based on the expression in tumor cells of a gene encoding an enzyme that converts a prodrug into a toxic product. Representative suicidal genes are Herpes simplex virus type 1 thymidine kinase (HSV1-tk) and cytosine deaminase (CD). Physicians and scientists in the nuclear medicine field take a particular interest in suicidal gene therapy because they can monitor the location, magnitude, and duration of expression of HSV1-tk and CD with a PET scanner.

  12. Variations in CCL3L gene cluster sequence and non-specific gene copy numbers

    Directory of Open Access Journals (Sweden)

    Edberg Jeffrey C

    2010-03-01

    Background: Copy number variations (CNVs) of the gene CC chemokine ligand 3-like 1 (CCL3L1) have been implicated in HIV-1 susceptibility, but the association has been inconsistent. CCL3L1 shares homology with a cluster of genes localized to chromosome 17q12, namely CCL3, CCL3L2, and CCL3L3. These genes are involved in host defense and inflammatory processes. Several CNV assays have been developed for the CCL3L1 gene. Findings: Through pairwise and multiple alignments of these genes, we have shown that the homology between these genes ranges from 50% to 99% in complete gene sequences and from 70% to 100% in the exonic regions, with CCL3L1 and CCL3L3 being identical. By use of MEGA 4 and BioEdit, we aligned sense primers, anti-sense primers, and probes used in several previously described assays against pre-multiple alignments of all four chemokine genes. Each set of probes and primers aligned and matched with overlapping sequences in at least two of the four genes, indicating that previously utilized RT-PCR based CNV assays are not specific for only CCL3L1. The four available assays measured median copies of 2 and 3-4 in European and African American individuals, respectively. The concordance between the assays ranged from 0.44 to 0.83, suggesting individual discordant calls and inconsistencies with the assays from the expected gene coverage from the known sequence. Conclusions: This indicates that some of the inconsistencies in the association studies could be due to assays that provide heterogeneous results. Sequence information to determine the CNV of the three genes separately would make it possible to test whether their association with the pathogenesis of a human disease or phenotype is affected by an individual gene or by a combination of these genes.

  13. IMSF: Infinite Methodology Set Framework

    Science.gov (United States)

    Ota, Martin; Jelínek, Ivan

    Software development is usually an integration task in an enterprise environment - few software applications work autonomously now. It is usually a collaboration of heterogeneous and unstable teams. One serious problem is lack of resources, a popular result being outsourcing, ‘body shopping’, and indirectly team and team member fluctuation. Outsourced sub-deliveries easily become black boxes with no clear development method used, which has a negative impact on supportability. Such environments then often face the problems of quality assurance and enterprise know-how management. The methodology used is one of the key factors. Each methodology was created as a generalization of a number of solved projects, and each methodology is thus more or less connected with a set of task types. When the task type is not suitable, it causes problems that usually result in an undocumented ad-hoc solution. This was the motivation behind formalizing a simple process for collaborative software engineering. Infinite Methodology Set Framework (IMSF) defines the ICT business process of adaptive use of methods for classified types of tasks. The article introduces IMSF and briefly comments on its meta-model.

  14. Setting up crowd science projects.

    Science.gov (United States)

    Scheliga, Kaja; Friesike, Sascha; Puschmann, Cornelius; Fecher, Benedikt

    2016-11-29

    Crowd science is scientific research that is conducted with the participation of volunteers who are not professional scientists. Thanks to the Internet and online platforms, project initiators can draw on a potentially large number of volunteers. This crowd can be involved to support data-rich or labour-intensive projects that would otherwise be unfeasible. So far, research on crowd science has mainly focused on analysing individual crowd science projects. In our research, we focus on the perspective of project initiators and explore how crowd science projects are set up. Based on multiple case study research, we discuss the objectives of crowd science projects and the strategies of their initiators for accessing volunteers. We also categorise the tasks allocated to volunteers and reflect on the issue of quality assurance as well as feedback mechanisms. With this article, we contribute to a better understanding of how crowd science projects are set up and how volunteers can contribute to science. We suggest that our findings are of practical relevance for initiators of crowd science projects, for science communication as well as for informed science policy making. © The Author(s) 2016.

  15. Gaspe hole sets depth record

    Energy Technology Data Exchange (ETDEWEB)

    1970-03-09

    The deepest diamond-cored hole in the Western Hemisphere, Gulf Sunnybank No. 1 on the Gaspe Peninsula of Quebec, has been completed at a depth of 11,600 ft. This is the deepest cored hole to be drilled anywhere in search of oil and gas production, and the deepest to be drilled using a wire-line core recovery technique. The well was completed in 183 days, and was cored continuously below the surface casing which was set and cemented at 1,004 ft. After underreaming a portion of the bottom of the hole, intermediate casing was set and cemented at 8,000 ft as a safety precaution against possible high oil or gas-fluid pressure. Actual coring time, after deducting time for underreaming and casing operations, was 152 days. Because of the cost of transporting a conventional oil-drilling rig to the E. location, the 89-ft mining rig was modified for the project. The contractor was Heath and Sherwood Drilling (Western) Ltd.

  16. Enteral Feeding Set Handling Techniques.

    Science.gov (United States)

    Lyman, Beth; Williams, Maria; Sollazzo, Janet; Hayden, Ashley; Hensley, Pam; Dai, Hongying; Roberts, Cristine

    2017-04-01

    Enteral nutrition therapy is common practice in pediatric clinical settings. Often patients will receive a pump-assisted bolus feeding over 30 minutes several times per day using the same enteral feeding set (EFS). This study aims to determine the safest and most efficacious way to handle the EFS between feedings. Three EFS handling techniques were compared through simulation for bacterial growth, nursing time, and supply costs: (1) rinsing the EFS with sterile water after each feeding, (2) refrigerating the EFS between feedings, and (3) using a ready-to-hang (RTH) product maintained at room temperature. Cultures were obtained at baseline, hour 12, and hour 21 of the 24-hour cycle. A time-in-motion analysis was conducted and reported in average number of seconds to complete each procedure. Supply costs were inventoried for 1 month comparing the actual usage to our estimated usage. Of 1080 cultures obtained, the overall bacterial growth rate was 8.7%. The rinse and refrigeration techniques displayed similar bacterial growth (11.4% vs 10.3%, P = .63). The RTH technique displayed the least bacterial growth of any method (4.4%, P = .002). The time analysis in minutes showed the rinse method was the most time-consuming (44.8 ± 2.7) vs refrigeration (35.8 ± 2.6) and RTH (31.08 ± 0.6) (P […] refrigerating the EFS between uses is the next most efficacious method for handling the EFS between bolus feeds.

  17. Disease candidate gene identification and prioritization using protein interaction networks

    Directory of Open Access Journals (Sweden)

    Aronow Bruce J

    2009-02-01

    Background: Although most of the current disease candidate gene identification and prioritization methods depend on functional annotations, the coverage of the gene functional annotations is a limiting factor. In the current study, we describe a candidate gene prioritization method that is entirely based on protein-protein interaction network (PPIN) analyses. Results: For the first time, extended versions of the PageRank and HITS algorithms, and the K-Step Markov method are applied to prioritize disease candidate genes in a training-test schema. Using a list of known disease-related genes from our earlier study as a training set ("seeds"), and the rest of the known genes as a test list, we perform large-scale cross validation to rank the candidate genes and also evaluate and compare the performance of our approach. Under appropriate settings – for example, a back probability of 0.3 for PageRank with Priors and HITS with Priors, and step size 6 for K-Step Markov method – the three methods achieved a comparable AUC value, suggesting a similar performance. Conclusion: Even though network-based methods are generally not as effective as integrated functional annotation-based methods for disease candidate gene prioritization, in a one-to-one comparison, PPIN-based candidate gene prioritization performs better than all other gene features or annotations. Additionally, we demonstrate that methods used for studying both social and Web networks can be successfully used for disease candidate gene prioritization.
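
    The idea of ranking candidates by a random walk that keeps returning to the seed genes can be sketched as follows. This is a generic PageRank-with-priors iteration under stated assumptions, not the extended version used in the study: the graph is undirected and unweighted, the prior is uniform over the seeds, and the function name and the toy network are hypothetical. Only the back probability of 0.3 is taken from the abstract.

```python
def pagerank_with_priors(adjacency, seeds, back_prob=0.3, iterations=100):
    """Score nodes by a random walk that jumps back to the seeds.

    adjacency: dict mapping each node to a list of its neighbours.
    seeds:     set of known disease genes (the training set).
    back_prob: probability of returning to a seed at each step.
    """
    nodes = sorted(adjacency)
    prior = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(prior)
    for _ in range(iterations):
        # Every node first receives its share of the "jump back" mass.
        updated = {n: back_prob * prior[n] for n in nodes}
        for node in nodes:
            neighbours = adjacency[node]
            walk_mass = (1.0 - back_prob) * rank[node]
            if not neighbours:
                # Isolated node: hand its mass back to the seeds.
                for n in nodes:
                    updated[n] += walk_mass * prior[n]
                continue
            for neighbour in neighbours:
                updated[neighbour] += walk_mass / len(neighbours)
        rank = updated
    return rank


# Hypothetical interaction chain: SEED - A - B - C.
toy_network = {
    "SEED": ["A"],
    "A": ["SEED", "B"],
    "B": ["A", "C"],
    "C": ["B"],
}
scores = pagerank_with_priors(toy_network, seeds={"SEED"})
candidates = sorted(
    (n for n in toy_network if n != "SEED"),
    key=lambda n: scores[n],
    reverse=True,
)
print(candidates)
```

    In this toy chain, candidates closer to the seed receive higher scores, which is the behaviour a cross-validation over held-out disease genes would then evaluate, for example by AUC.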

  18. Inferring Phylogenetic Networks from Gene Order Data

    Directory of Open Access Journals (Sweden)

    Alexey Anatolievich Morozov

    2013-01-01

    Existing algorithms allow us to infer phylogenetic networks from sequences (DNA, protein or binary), sets of trees, and distance matrices, but there are no methods to build them using the gene order data as an input. Here we describe several methods to build split networks from the gene order data, perform simulation studies, and use our methods for analyzing and interpreting different real gene order datasets. All proposed methods are based on intermediate data, which can be generated from genome structures under study and used as an input for network construction algorithms. Three intermediates are used: set of jackknife trees, distance matrix, and binary encoding. According to simulations and case studies, the best intermediates are jackknife trees and distance matrix (when used with the Neighbor-Net algorithm). Binary encoding can also be useful, but only when the methods mentioned above cannot be used.
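
    Two of the intermediates named above, the distance matrix and the binary encoding, can be illustrated with a small sketch. The abstract does not say how either is computed, so the choices here are assumptions: gene orders are treated as linear and unsigned, every genome contains the same genes exactly once, the binary characters are presence or absence of gene adjacencies, and the distance is the count of adjacencies lost between two genomes. All function names and genomes are hypothetical.

```python
from itertools import combinations


def adjacencies(order):
    """Unordered pairs of neighbouring genes in one linear gene order."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def binary_encoding(genomes):
    """One 0/1 row per genome, one column per adjacency seen anywhere."""
    columns = sorted(
        {adj for order in genomes.values() for adj in adjacencies(order)},
        key=lambda adj: tuple(sorted(adj)),
    )
    rows = {
        name: [1 if col in adjacencies(order) else 0 for col in columns]
        for name, order in genomes.items()
    }
    return columns, rows


def distance_matrix(genomes):
    """Pairwise count of adjacencies present in one genome but not the other.

    Symmetric only under the equal-gene-content assumption stated above.
    """
    names = sorted(genomes)
    dist = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = len(adjacencies(genomes[a]) - adjacencies(genomes[b]))
        dist[a][b] = dist[b][a] = d
    return dist


# Hypothetical genomes over the same four genes.
toy_genomes = {
    "g1": ["a", "b", "c", "d"],
    "g2": ["a", "c", "b", "d"],
    "g3": ["d", "c", "b", "a"],  # g1 read in the opposite direction
}
columns, rows = binary_encoding(toy_genomes)
dist = distance_matrix(toy_genomes)
print(rows)
print(dist)
```

    Either output could then be passed to a network construction method that accepts binary characters or distances; the choice of such a method is outside this sketch.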

  19. ON NANO Λg-CLOSED SETS

    OpenAIRE

    Rajasekaran, Ilangovan; Nethaji, Ochanan

    2017-01-01

    Abstract: In this paper, we introduce nano ∧g-closed sets in nano topological spaces. Some properties of nano ∧g-closed sets and nano ∧g-open sets are weaker forms of nano closed sets and nano open sets.

  20. Expression of SET Protein in the Ovaries of Patients with Polycystic Ovary Syndrome

    OpenAIRE

    Xu Boqun; Dai Xiaonan; Cui YuGui; Gao Lingling; Dai Xue; Chao Gao; Diao Feiyang; Liu Jiayin; Li Gao; Mei Li; Yuan Zhang; Xiang Ma

    2013-01-01

    Background. We previously found that expression of SET gene was up-regulated in polycystic ovaries by using microarray. It suggested that SET may be an attractive candidate regulator involved in the pathophysiology of polycystic ovary syndrome (PCOS). In this study, expression and cellular localization of SET protein were investigated in human polycystic and normal ovaries. Method. Ovarian tissues, six normal ovaries and six polycystic ovaries, were collected during transsexual operation and ...

  1. Identification and Analysis of the SET-Domain Family in Silkworm, Bombyx mori

    Directory of Open Access Journals (Sweden)

    Hailong Zhao

    2015-01-01

    As an important economic insect, Bombyx mori is also a useful model organism for lepidopteran insects. SET-domain-containing proteins belong to a group of enzymes named after a common domain that utilizes the cofactor S-adenosyl-L-methionine (SAM) to achieve methylation of its substrates. Many SET-domain-containing proteins have been shown to display catalytic activity towards particular lysine residues on histones, but emerging evidence also indicates that various nonhistone proteins are specifically targeted by this clade of enzymes. To explore the diverse functions of the SET-domain superfamily in insects, we identified, cloned, and analyzed the SET-domain proteins in silkworm, Bombyx mori. Firstly, 24 genes containing a SET domain from the silkworm genome were characterized, and 17 of them belonged to six subfamilies: SUV39, SET1, SET2, SUV4-20, EZ, and SMYD. Secondly, SET domains of the silkworm SET-domain family were intraspecifically and interspecifically conserved, especially for the catalytic core “NHSC” motif, substrate binding site, and catalytic site in the SET domain. Lastly, further analyses indicated that the silkworm SET-domain gene BmSu(var)3-9 showed different characteristics and expression profiles compared to other invertebrates. Overall, our results provide new insight into the functional and evolutionary features of the SET-domain family.

  2. In-silico human genomics with GeneCards

    Directory of Open Access Journals (Sweden)

    Stelzer Gil

    2011-10-01

    Since 1998, the bioinformatics, systems biology, genomics and medical communities have enjoyed a synergistic relationship with the GeneCards database of human genes (http://www.genecards.org). This human gene compendium was created to help to introduce order into the increasing chaos of information flow. As a consequence of viewing details and deep links related to specific genes, users have often requested enhanced capabilities, such that, over time, GeneCards has blossomed into a suite of tools (including GeneDecks, GeneALaCart, GeneLoc, GeneNote and GeneAnnot) for a variety of analyses of both single human genes and sets thereof. In this paper, we focus on in-house and external research activities which have been enabled, enhanced, complemented and, in some cases, motivated by GeneCards. In turn, such interactions have often inspired and propelled improvements in GeneCards. We describe here the evolution and architecture of this project, including examples of synergistic applications in diverse areas such as synthetic lethality in cancer, the annotation of genetic variations in disease, omics integration in a systems biology approach to kidney disease, and bioinformatics tools.

  3. Radiotechnologies and gene therapy

    International Nuclear Information System (INIS)

    Xia Jinsong

    2001-01-01

    Gene therapy is an exciting frontier in medicine today. Radiologists will make a unique contribution to these exciting new technologies at every level by choosing sites for targeting therapy, perfecting and establishing routes of delivery, developing imaging strategies to monitor therapy and assess gene expression, and developing radiotherapeutic uses of gene therapy.

  4. Genes misregulated in C. elegans deficient in Dicer, RDE-4, or RDE-1 are enriched for innate immunity genes.

    Science.gov (United States)

    Welker, Noah C; Habig, Jeffrey W; Bass, Brenda L

    2007-07-01

    We describe the first microarray analysis of a whole animal containing a mutation in the Dicer gene. We used adult Caenorhabditis elegans and, to distinguish among different roles of Dicer, we also performed microarray analyses of animals with mutations in rde-4 and rde-1, which are involved in silencing by siRNA, but not miRNA. Surprisingly, we find that the X chromosome is greatly enriched for genes regulated by Dicer. Comparison of all three microarray data sets indicates the majority of Dicer-regulated genes are not dependent on RDE-4 or RDE-1, including the X-linked genes. However, all three data sets are enriched in genes important for innate immunity and, specifically, show increased expression of innate immunity genes.

  5. Functionally enigmatic genes: a case study of the brain ignorome.

    Directory of Open Access Journals (Sweden)

    Ashutosh K Pandey

    What proportion of genes with intense and selective expression in specific tissues, cells, or systems are still almost completely uncharacterized with respect to biological function? In what ways do these functionally enigmatic genes differ from well-studied genes? To address these two questions, we devised a computational approach that defines so-called ignoromes. As proof of principle, we extracted and analyzed a large subset of genes with intense and selective expression in brain. We find that publications associated with this set are highly skewed--the top 5% of genes absorb 70% of the relevant literature. In contrast, approximately 20% of genes have essentially no neuroscience literature. Analysis of the ignorome over the past decade demonstrates that it is stubbornly persistent, and the rapid expansion of the neuroscience literature has not had the expected effect on numbers of these genes. Surprisingly, ignorome genes do not differ from well-studied genes in terms of connectivity in coexpression networks. Nor do they differ with respect to numbers of orthologs, paralogs, or protein domains. The major distinguishing characteristic between these sets of genes is date of discovery, early discovery being associated with greater research momentum--a genomic bandwagon effect. Finally we ask to what extent massive genomic, imaging, and phenotype data sets can be used to provide high-throughput functional annotation for an entire ignorome. In a majority of cases we have been able to extract and add significant information for these neglected genes. In several cases--ELMOD1, TMEM88B, and DZANK1--we have exploited sequence polymorphisms, large phenome data sets, and reverse genetic methods to evaluate the function of ignorome genes.

  6. Functionally enigmatic genes: a case study of the brain ignorome.

    Science.gov (United States)

    Pandey, Ashutosh K; Lu, Lu; Wang, Xusheng; Homayouni, Ramin; Williams, Robert W

    2014-01-01

    What proportion of genes with intense and selective expression in specific tissues, cells, or systems are still almost completely uncharacterized with respect to biological function? In what ways do these functionally enigmatic genes differ from well-studied genes? To address these two questions, we devised a computational approach that defines so-called ignoromes. As proof of principle, we extracted and analyzed a large subset of genes with intense and selective expression in brain. We find that publications associated with this set are highly skewed--the top 5% of genes absorb 70% of the relevant literature. In contrast, approximately 20% of genes have essentially no neuroscience literature. Analysis of the ignorome over the past decade demonstrates that it is stubbornly persistent, and the rapid expansion of the neuroscience literature has not had the expected effect on numbers of these genes. Surprisingly, ignorome genes do not differ from well-studied genes in terms of connectivity in coexpression networks. Nor do they differ with respect to numbers of orthologs, paralogs, or protein domains. The major distinguishing characteristic between these sets of genes is date of discovery, early discovery being associated with greater research momentum--a gen