WorldWideScience

Sample records for niche-specific gene set

  1. Bacterial niche-specific genome expansion is coupled with highly frequent gene disruptions in deep-sea sediments

    KAUST Repository

    Wang, Yong

    2011-12-21

    The complexity and dynamics of microbial metagenomes may be evaluated by genome size, gene duplication and the disruption rate between lineages. In this study, we pyrosequenced the metagenomes of microbes obtained from the brine and sediment of a deep-sea brine pool in the Red Sea to explore the possible genomic adaptations of the microbes in response to environmental changes. The microbes from the brine and sediments (both surface and deep layers) of the Atlantis II Deep brine pool had similar communities whereas the effective genome size varied from 7.4 Mb in the brine to more than 9 Mb in the sediment. This genome expansion in the sediment samples was due to gene duplication as evidenced by enrichment of the homologs. The duplicated genes were highly disrupted, on average by 47.6% and 70% for the surface and deep layers of the Atlantis II Deep sediment samples, respectively. The disruptive effects appeared to be mainly due to point mutations and frameshifts. In contrast, the homologs from the Atlantis II Deep brine sample were highly conserved and they maintained relatively small copy numbers. Likely, the adaptation of the microbes in the sediments was coupled with pseudogenizations and possibly functional diversifications of the paralogs in the expanded genomes. The maintenance of the pseudogenes in the large genomes is discussed. © 2011 Wang et al.

  2. Comparative genomic analysis of Lactobacillus mucosae LM1 identifies potential niche-specific genes and pathways for gastrointestinal adaptation.

    Science.gov (United States)

    Valeriano, Valerie Diane V; Oh, Ju Kyoung; Bagon, Bernadette B; Kim, Heebal; Kang, Dae-Kyung

    2017-12-23

    Lactobacillus mucosae is currently of interest as a putative probiotic because of its metabolic capabilities and its ability to colonize host mucosal niches. L. mucosae LM1 has been studied for its functions in cell adhesion and pathogen inhibition, among others. It demonstrated unique abilities to use energy from carbohydrate and non-carbohydrate sources. Because of these functions, we report the first complete genome sequence of an L. mucosae strain, L. mucosae LM1. Analysis of the pan-genome in comparison with closely related Lactobacillus species identified a complete glycogen metabolism pathway, as well as folate biosynthesis, complementing previous proteomic data on the LM1 strain. It also revealed common and unique niche-adaptation genes among the various L. mucosae strains. The aim of this study was to derive genomic information that would reveal the probable mechanisms underlying the probiotic effect of L. mucosae LM1, and provide a better understanding of the nature of L. mucosae sp. Copyright © 2017 Elsevier Inc. All rights reserved.

  3. Niche-specific cognitive strategies

    DEFF Research Database (Denmark)

    Hulgard, K.; Ratcliffe, J. M.

    2014-01-01

    Related species with different diets are predicted to rely on different cognitive strategies: those best suited for locating available and appropriate foods. Here we tested two predictions of the niche-specific cognitive strategies hypothesis in bats, which suggests that predatory species should rely more on object memory than on spatial memory for finding food and that the opposite is true of frugivorous and nectivorous species. Specifically, we predicted that: (1) predatory bats would readily learn to associate shapes with palatable prey and (2) once bats had made such associations... the niche-specific cognitive strategies hypothesis and suggest that for gleaning and clutter-resistant aerial hawking bats, learning to associate shape with food interferes with subsequent spatial memory learning.

  4. Comparative genomics and functional analysis of niche-specific adaptation in Pseudomonas putida

    Energy Technology Data Exchange (ETDEWEB)

    Wu X.; van der Lelie D.; Monchy, S.; Taghavi, S.; Zhu, W.; Ramos, J.

    2011-03-01

    Pseudomonas putida is a gram-negative rod-shaped gammaproteobacterium that is found throughout various environments. Members of the species P. putida show a diverse spectrum of metabolic activities, which is indicative of their adaptation to various niches, which includes the ability to live in soils and sediments contaminated with high concentrations of heavy metals and organic contaminants. Pseudomonas putida strains are also found as plant growth-promoting rhizospheric and endophytic bacteria. The genome sequences of several P. putida species have become available and provide a unique tool to study the specific niche adaptation of the various P. putida strains. In this review, we compare the genomes of four P. putida strains: the rhizospheric strain KT2440, the endophytic strain W619, the aromatic hydrocarbon-degrading strain F1 and the manganese-oxidizing strain GB-1. Comparative genomics provided a powerful tool to gain new insights into the adaptation of P. putida to specific lifestyles and environmental niches, and clearly demonstrated that horizontal gene transfer played a key role in this adaptation process, as many of the niche-specific functions were found to be encoded on clearly defined genomic islands.

  5. Comparative genomics and functional analysis of niche-specific adaptation in Pseudomonas putida.

    Science.gov (United States)

    Wu, Xiao; Monchy, Sébastien; Taghavi, Safiyh; Zhu, Wei; Ramos, Juan; van der Lelie, Daniel

    2011-03-01

    Pseudomonas putida is a gram-negative rod-shaped gammaproteobacterium that is found throughout various environments. Members of the species P. putida show a diverse spectrum of metabolic activities, which is indicative of their adaptation to various niches, which includes the ability to live in soils and sediments contaminated with high concentrations of heavy metals and organic contaminants. Pseudomonas putida strains are also found as plant growth-promoting rhizospheric and endophytic bacteria. The genome sequences of several P. putida species have become available and provide a unique tool to study the specific niche adaptation of the various P. putida strains. In this review, we compare the genomes of four P. putida strains: the rhizospheric strain KT2440, the endophytic strain W619, the aromatic hydrocarbon-degrading strain F1 and the manganese-oxidizing strain GB-1. Comparative genomics provided a powerful tool to gain new insights into the adaptation of P. putida to specific lifestyles and environmental niches, and clearly demonstrated that horizontal gene transfer played a key role in this adaptation process, as many of the niche-specific functions were found to be encoded on clearly defined genomic islands. Journal compilation © 2010 Federation of European Microbiological Societies. Published by Blackwell Publishing Ltd. No claim to original US government works.

  6. Gene set analysis of the EADGENE chicken data-set

    DEFF Research Database (Denmark)

    Skarman, Axel; Jiang, Li; Hornshøj, Henrik

    2009-01-01

    Abstract Background: Gene set analysis is considered to be a way of improving our biological interpretation of the observed expression patterns. This paper describes different methods applied to analyse expression data from a chicken DNA microarray dataset. Results: Applying different gene set...

  7. Third party annotation gene data set of eutherian lysozyme genes

    Directory of Open Access Journals (Sweden)

    Marko Premzl

    2014-12-01

    The eutherian comparative genomic analysis protocol annotated the most comprehensive eutherian lysozyme gene data set. Among 209 potential coding sequences, the third party annotation gene data set of eutherian lysozyme genes included 116 complete coding sequences that first described seven major gene clusters. As a new framework for future experiments, the present integrated gene annotations, phylogenetic analysis and protein molecular evolution analysis proposed a new classification and nomenclature of eutherian lysozyme genes.

  8. Synchronized dynamics of bacterial niche-specific functions during biofilm development in a cold seep brine pool

    KAUST Repository

    Zhang, Weipeng

    2015-07-14

    The biology of biofilms in deep-sea environments has barely been explored. Here, biofilms were developed at the brine pool (characterized by limited carbon sources) and the normal bottom water adjacent to Thuwal cold seeps. Comparative metagenomics based on 50 Gb datasets identified polysaccharide degradation, nitrate reduction, and proteolysis as enriched functional categories for brine biofilms. The genomes of two dominant species in the brine biofilms, a novel deltaproteobacterium and a novel epsilonproteobacterium, were reconstructed. Despite rather small genome sizes, the deltaproteobacterium possessed enhanced polysaccharide fermentation pathways, whereas the epsilonproteobacterium was a versatile nitrogen reactor possessing nar, nap and nif gene clusters. These metabolic functions, together with specific regulatory and hypersaline-tolerant genes, made the two bacteria unique compared with their close relatives including those from hydrothermal vents. Moreover, these functions were regulated by biofilm development, as both the abundance and the expression level of key functional genes were higher in later-stage biofilms, and co-occurrences between the two dominant bacteria were demonstrated. Collectively, unique mechanisms were revealed: i) polysaccharide fermentation and proteolysis interacted with nitrogen cycling to form a complex chain for energy generation; ii) remarkably, exploiting and organizing niche-specific functions would be an important strategy for biofilm-dependent adaptation to the extreme conditions.

  9. Third party annotation gene data set of eutherian lysozyme genes

    OpenAIRE

    Premzl, Marko

    2014-01-01

    The eutherian comparative genomic analysis protocol annotated the most comprehensive eutherian lysozyme gene data set. Among 209 potential coding sequences, the third party annotation gene data set of eutherian lysozyme genes included 116 complete coding sequences that first described seven major gene clusters. As a new framework for future experiments, the present integrated gene annotations, phylogenetic analysis and protein molecular evolution analysis proposed a new classification and nomencla...

  10. Novel gene sets improve set-level classification of prokaryotic gene expression data.

    Science.gov (United States)

    Holec, Matěj; Kuželka, Ondřej; Železný, Filip

    2015-10-28

    Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets defined typically on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will enable the learning of more accurate classifiers. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.
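
    The conversion from gene-level to set-level features described in this abstract can be illustrated with a minimal Python sketch. Averaging member-gene expression is an assumption made here for illustration; the abstract does not say which aggregation the authors use. The function name set_level_features and the gene and regulon names are hypothetical.

```python
from statistics import mean


def set_level_features(expression, gene_sets):
    """Collapse a gene-level profile into one feature per gene set.

    expression: dict mapping gene name -> expression value for one sample.
    gene_sets:  dict mapping set name -> iterable of member gene names.
    Genes missing from the profile are skipped; a set with no measured
    member is left out of the result.
    Aggregation by the mean is an illustrative assumption.
    """
    features = {}
    for set_name, genes in gene_sets.items():
        values = [expression[g] for g in genes if g in expression]
        if values:
            features[set_name] = mean(values)
    return features


# Hypothetical sample with four measured genes and two regulon-style sets.
sample = {"geneA": 2.0, "geneB": 4.0, "geneC": 1.0, "geneD": 3.0}
regulons = {
    "regulon1": ["geneA", "geneB"],
    "regulon2": ["geneC", "geneD", "geneX"],  # geneX is not measured
}
features = set_level_features(sample, regulons)
print(features)
```

    The resulting dictionary has one entry per gene set instead of one per gene, which is the lower-dimensional representation a set-level classifier would be trained on.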

  11. Gene set analysis for longitudinal gene expression data

    Directory of Open Access Journals (Sweden)

    Piepho Hans-Peter

    2011-07-01

    Abstract Background: Gene set analysis (GSA) has become a successful tool to interpret gene expression profiles in terms of biological functions, molecular pathways, or genomic locations. GSA performs statistical tests for independent microarray samples at the level of gene sets rather than individual genes. Nowadays, an increasing number of microarray studies are conducted to explore the dynamic changes of gene expression in a variety of species and biological scenarios. In these longitudinal studies, gene expression is repeatedly measured over time such that a GSA needs to take into account the within-gene correlations in addition to possible between-gene correlations. Results: We provide a robust nonparametric approach to compare the expressions of longitudinally measured sets of genes under multiple treatments or experimental conditions. The limiting distributions of our statistics are derived when the number of genes goes to infinity while the number of replications can be small. When the number of genes in a gene set is small, we recommend permutation tests based on our nonparametric test statistics to achieve reliable type I error and better power while incorporating unknown correlations between and within genes. Simulation results demonstrate that the proposed method has a greater power than other methods for various data distributions and heteroscedastic correlation structures. This method was used for an IL-2 stimulation study and significantly altered gene sets were identified. Conclusions: The simulation study and the real data application showed that the proposed gene set analysis provides a promising tool for longitudinal microarray analysis. R scripts for simulating longitudinal data and calculating the nonparametric statistics are posted on the North Dakota INBRE website http://ndinbre.org/programs/bioinformatics.php. Raw microarray data is available in Gene Expression Omnibus (National Center for Biotechnology Information with...

  12. Gene set analysis for interpreting genetic studies

    DEFF Research Database (Denmark)

    Pers, Tune H

    2016-01-01

    Interpretation of genome-wide association study (GWAS) results is lagging behind the discovery of new genetic associations. Consequently, there is an urgent need for data-driven methods for interpreting genetic association studies. Gene set analysis (GSA) can identify aetiologic pathways...

  13. Delimiting Coalescence Genes (C-Genes) in Phylogenomic Data Sets.

    Science.gov (United States)

    Springer, Mark S; Gatesy, John

    2018-02-26

    Summary coalescence methods have emerged as a popular alternative for inferring species trees with large genomic datasets, because these methods explicitly account for incomplete lineage sorting. However, statistical consistency of summary coalescence methods is not guaranteed unless several model assumptions are true, including the critical assumption that recombination occurs freely among but not within coalescence genes (c-genes), which are the fundamental units of analysis for these methods. Each c-gene has a single branching history, and large sets of these independent gene histories should be the input for genome-scale coalescence estimates of phylogeny. By contrast, numerous studies have reported the results of coalescence analyses in which complete protein-coding sequences are treated as c-genes even though exons for these loci can span more than a megabase of DNA. Empirical estimates of recombination breakpoints suggest that c-genes may be much shorter, especially when large clades with many species are the focus of analysis. Although this idea has been challenged recently in the literature, the inverse relationship between c-gene size and increased taxon sampling in a dataset, the 'recombination ratchet', is a fundamental property of c-genes. For taxonomic groups characterized by genes with long intron sequences, complete protein-coding sequences are likely not valid c-genes and are inappropriate units of analysis for summary coalescence methods unless they occur in recombination deserts that are devoid of incomplete lineage sorting (ILS). Finally, it has been argued that coalescence methods are robust when the no-recombination within loci assumption is violated, but recombination must matter at some scale because ILS, a by-product of recombination, is the raison d'etre for coalescence methods. That is, extensive recombination is required to yield the large number of independently segregating c-genes used to infer a species tree. If coalescent methods are powerful...

  14. Gene set selection via LASSO penalized regression (SLPR).

    Science.gov (United States)

    Frost, H Robert; Amos, Christopher I

    2017-07-07

    Gene set testing is an important bioinformatics technique that addresses the challenges of power, interpretation and replication. To better support the analysis of large and highly overlapping gene set collections, researchers have recently developed a number of multiset methods that jointly evaluate all gene sets in a collection to identify a parsimonious group of functionally independent sets. Unfortunately, current multiset methods all use binary indicators for gene and gene set activity and assume that a gene is active if any containing gene set is active. This simplistic model limits performance on many types of genomic data. To address this limitation, we developed gene set Selection via LASSO Penalized Regression (SLPR), a novel mapping of multiset gene set testing to penalized multiple linear regression. The SLPR method assumes a linear relationship between continuous measures of gene activity and the activity of all gene sets in the collection. As we demonstrate via simulation studies and the analysis of TCGA data using MSigDB gene sets, the SLPR method outperforms existing multiset methods when the true biological process is well approximated by continuous activity measures and a linear association between genes and gene sets. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  15. Studying the Complex Expression Dependences between Sets of Coexpressed Genes

    Directory of Open Access Journals (Sweden)

    Mario Huerta

    2014-01-01

    Organisms simplify the orchestration of gene expression by coregulating genes whose products function together in the cell. The use of clustering methods to obtain sets of coexpressed genes from expression arrays is very common; nevertheless, there are no appropriate tools to study the expression networks among these sets of coexpressed genes. The aim of the developed tools is to allow studying the complex expression dependences that exist between sets of coexpressed genes. For this purpose, we start by detecting the nonlinear expression relationships between pairs of genes, plus the coexpressed genes. Next, we form networks among sets of coexpressed genes that maintain nonlinear expression dependences between all of them. The expression relationship between the sets of coexpressed genes is defined by the expression relationship between the skeletons of these sets, where this skeleton represents the coexpressed genes with a well-defined nonlinear expression relationship with the skeleton of the other sets. As a result, we can study the nonlinear expression relationships between a target gene and other sets of coexpressed genes, or start the study from the skeleton of the sets, to study the complex relationships of activation and deactivation between the sets of coexpressed genes that carry out the different cellular processes present in the expression experiments.

  16. Influence of Niche-Specific Nutrients on Secondary Metabolism in Vibrionaceae

    DEFF Research Database (Denmark)

    Giubergia, Sonia; Phippen, Christopher; Gotfredsen, Charlotte Held

    2016-01-01

    A challenge in microbial natural product discovery is the elicitation of the biosynthetic gene clusters that are silent when microorganisms are grown under standard laboratory conditions. We hypothesized that, since the clusters are not lost during proliferation in the natural niche of the microorganisms...

  17. GARNET – gene set analysis with exploration of annotation relations

    Directory of Open Access Journals (Sweden)

    Seo Jihae

    2011-02-01

    Abstract Background: Gene set analysis is a powerful method of deducing biological meaning for an a priori defined set of genes. Numerous tools have been developed to test statistical enrichment or depletion in specific pathways or gene ontology (GO) terms. Major difficulties towards biological interpretation are integrating diverse types of annotation categories and exploring the relationships between annotation terms of similar information. Results: GARNET (Gene Annotation Relationship NEtwork Tools) is an integrative platform for gene set analysis with many novel features. It includes tools for retrieval of genes from annotation databases, statistical analysis & visualization of annotation relationships, and managing gene sets. In an effort to allow access to a full spectrum of amassed biological knowledge, we have integrated a variety of annotation data that include the GO, domain, disease, drug, chromosomal location, and custom-defined annotations. Diverse types of molecular networks (pathways, transcription and microRNA regulations, protein-protein interaction) are also included. The pair-wise relationship between annotation gene sets was calculated using kappa statistics. GARNET consists of three modules (gene set manager, gene set analysis and gene set retrieval), which are tightly integrated to provide virtually automatic analysis for gene sets. A dedicated viewer for annotation network has been developed to facilitate exploration of the related annotations. Conclusions: GARNET (gene annotation relationship network tools) is an integrative platform for diverse types of gene set analysis, where complex relationships among gene annotations can be easily explored with an intuitive network visualization tool (http://garnet.isysbio.org/ or http://ercsb.ewha.ac.kr/garnet/).
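
    The pair-wise kappa statistic mentioned in this abstract can be sketched as follows, assuming it is Cohen's kappa computed on gene membership in two annotation gene sets over a shared gene universe. The abstract does not give GARNET's exact formulation, so this is an illustration only; the function name cohen_kappa and the gene labels are hypothetical.

```python
def cohen_kappa(set_a, set_b, universe):
    """Cohen's kappa for membership agreement between two gene sets.

    Each gene in `universe` is labelled member/non-member for both sets;
    kappa compares the observed agreement with the agreement expected
    by chance from the two set sizes.
    """
    universe = set(universe)
    a = set(set_a) & universe
    b = set(set_b) & universe
    n = len(universe)
    if n == 0:
        raise ValueError("universe must contain at least one gene")

    both = len(a & b)              # genes in both sets
    neither = n - len(a | b)       # genes in neither set
    observed = (both + neither) / n

    p_a = len(a) / n
    p_b = len(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)

    if expected == 1.0:
        # Both sets are empty or both cover the universe: kappa is undefined.
        return float("nan")
    return (observed - expected) / (1 - expected)


# Hypothetical universe of ten genes and two overlapping annotation sets.
genes = [f"g{i}" for i in range(10)]
kappa = cohen_kappa({"g0", "g1", "g2", "g3"}, {"g2", "g3", "g4", "g5"}, genes)
print(round(kappa, 4))
```

    A value near 1 indicates two annotation terms that label almost the same genes, while a value near 0 indicates overlap no greater than chance.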

  18. Niche-specific cognitive strategies: object memory interferes with spatial memory in the predatory bat Myotis nattereri.

    Science.gov (United States)

    Hulgard, Katrine; Ratcliffe, John M

    2014-09-15

    Related species with different diets are predicted to rely on different cognitive strategies: those best suited for locating available and appropriate foods. Here we tested two predictions of the niche-specific cognitive strategies hypothesis in bats, which suggests that predatory species should rely more on object memory than on spatial memory for finding food and that the opposite is true of frugivorous and nectivorous species. Specifically, we predicted that: (1) predatory bats would readily learn to associate shapes with palatable prey and (2) once bats had made such associations, these would interfere with their subsequent learning of a spatial memory task. We trained free-flying Myotis nattereri to approach palatable and unpalatable insect prey suspended below polystyrene objects. Experimentally naïve bats learned to associate different objects with palatable and unpalatable prey but performed no better than chance in a subsequent spatial memory experiment. Because experimental sequence was predicted to be of consequence, we introduced a second group of bats first to the spatial memory experiment. These bats learned to associate prey position with palatability. Control trials indicated that bats made their decisions based on information acquired through echolocation. Previous studies have shown that bat species that eat mainly nectar and fruit rely heavily on spatial memory, reflecting the relative consistency of distribution of fruit and nectar compared with insects. Our results support the niche-specific cognitive strategies hypothesis and suggest that for gleaning and clutter-resistant aerial hawking bats, learning to associate shape with food interferes with subsequent spatial memory learning. © 2014. Published by The Company of Biologists Ltd.

  19. Principles for the organization of gene-sets.

    Science.gov (United States)

    Li, Wentian; Freudenberg, Jan; Oswald, Michaela

    2015-12-01

    A gene-set, an important concept in microarray expression analysis and systems biology, is a collection of genes and/or their products (i.e. proteins) that have some features in common. There are many different ways to construct gene-sets, but a systematic organization of these ways is lacking. Gene-sets are mainly organized ad hoc in current public-domain databases, with group header names often determined by practical reasons (such as the types of technology in obtaining the gene-sets or a balanced number of gene-sets under a header). Here we aim at providing a gene-set organization principle according to the level at which genes are connected: homology, physical map proximity, chemical interaction, biological, and phenotypic-medical levels. We also distinguish two types of connections between genes: actual connection versus sharing of a label. Actual connections denote direct biological interactions, whereas shared label connection denotes shared membership in a group. Some extensions of the framework are also addressed such as overlapping of gene-sets, modules, and the incorporation of other non-protein-coding entities such as microRNAs. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. Influence of Niche-Specific Nutrients on Secondary Metabolism in Vibrionaceae.

    Science.gov (United States)

    Giubergia, Sonia; Phippen, Christopher; Gotfredsen, Charlotte H; Nielsen, Kristian Fog; Gram, Lone

    2016-07-01

    Many factors, such as the substrate and the growth phase, influence biosynthesis of secondary metabolites in microorganisms. Therefore, it is crucial to consider these factors when establishing a bioprospecting strategy. Mimicking the conditions of the natural environment has been suggested as a means of inducing or influencing microbial secondary metabolite production. The purpose of the present study was to determine how the bioactivity of Vibrionaceae was influenced by carbon sources typical of their natural environment. We determined how mannose and chitin, compared to glucose, influenced the antibacterial activity of a collection of Vibrionaceae strains isolated because of their ability to produce antibacterial compounds but that in subsequent screenings seemed to have lost this ability. The numbers of bioactive isolates were 2- and 3.5-fold higher when strains were grown on mannose and chitin, respectively, than on glucose. As secondary metabolites are typically produced during late growth, potential producers were also allowed 1 to 2 days of growth before exposure to the pathogen. This strategy led to a 3-fold increase in the number of bioactive strains on glucose and an 8-fold increase on both chitin and mannose. We selected two bioactive strains belonging to species for which antibacterial activity had not previously been identified. Using ultrahigh-performance liquid chromatography-high-resolution mass spectrometry and bioassay-guided fractionation, we found that the siderophore fluvibactin was responsible for the antibacterial activity of Vibrio furnissii and Vibrio fluvialis. These results suggest a role of chitin in the regulation of secondary metabolism in vibrios and demonstrate that considering bacterial ecophysiology during development of screening strategies will facilitate bioprospecting. A challenge in microbial natural product discovery is the elicitation of the biosynthetic gene clusters that are silent when microorganisms are grown under...

  1. Comparing the Healthy Nose and Nasopharynx Microbiota Reveals Continuity As Well As Niche-Specificity

    Directory of Open Access Journals (Sweden)

    Ilke De Boeck

    2017-11-01

    To improve our understanding of upper respiratory tract (URT) diseases and the underlying microbial pathogenesis, a better characterization of the healthy URT microbiome is crucial. In this first large-scale study, we obtained more insight into the URT microbiome of healthy adults. To this end, we collected paired nasal and nasopharyngeal swabs from 100 healthy participants in a citizen-science project. High-throughput 16S rRNA gene V4 amplicon sequencing was performed and samples were processed using the Divisive Amplicon Denoising Algorithm 2 (DADA2). This allowed us to identify the bacterial richness and diversity of the samples in terms of amplicon sequence variants (ASVs), with special attention to intragenus variation. We found both niches to have a low overall species richness and uneven distribution. Moreover, based on hierarchical clustering, nasopharyngeal samples could be grouped into some bacterial community types at genus level, of which four were supported to some extent by prediction strength evaluation: one intermixed type with a higher bacterial diversity where Staphylococcus, Corynebacterium, and Dolosigranulum appeared as the main bacterial members in different relative abundances, and three types dominated by either Moraxella, Streptococcus, or Fusobacterium. Some of these bacterial community types such as Streptococcus and Fusobacterium were nasopharynx-specific and never occurred in the nose. No clear association between the nasopharyngeal bacterial profiles at genus level and the variables age, gender, blood type, season of sampling, or common respiratory allergies was found in this study population, except for smoking showing a positive association with Corynebacterium and Staphylococcus. Based on the fine-scale resolution of the ASVs, both known commensal and potential pathogenic bacteria were found within several genera, particularly in Streptococcus and Moraxella, in our healthy study population. Of interest, the...

  2. IGSA: Individual Gene Sets Analysis, including Enrichment and Clustering.

    Science.gov (United States)

    Wu, Lingxiang; Chen, Xiujie; Zhang, Denan; Zhang, Wubing; Liu, Lei; Ma, Hongzhe; Yang, Jingbo; Xie, Hongbo; Liu, Bo; Jin, Qing

    2016-01-01

    Analysis of gene sets has been widely applied in various high-throughput biological studies. One weakness of traditional methods is that they neglect the heterogeneity of gene expression across samples, which may lead to the omission of some specific and important gene sets. It is also difficult for them to reflect the severity of disease and to provide expression profiles of gene sets for individuals. We developed an application called IGSA that combines gene set enrichment with sample clustering. IGSA calculates gene set expression scores for each sample and uses an accumulating clustering strategy that groups samples according to the progression of disease from mild to severe. We evaluated the performance of IGSA on gastric, pancreatic and ovarian cancer data sets. We also compared the results of IGSA in KEGG pathway enrichment with those of DAVID, GSEA, SPIA and ssGSEA, and analyzed the results of IGSA clustering under different similarity measurement methods. Notably, IGSA proved to be more sensitive and specific in finding significant pathways, and can indicate changes in pathways related to the severity of disease. In addition, IGSA provides a profile of significant gene sets for each sample.

  3. Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data

    Directory of Open Access Journals (Sweden)

    Tintle Nathan L

    2012-08-01

    Background: Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. Results: We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Conclusions: Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.

  4. Zebrafish Expression Ontology of Gene Sets (ZEOGS): a tool to analyze enrichment of zebrafish anatomical terms in large gene sets.

    Science.gov (United States)

    Prykhozhij, Sergey V; Marsico, Annalisa; Meijsing, Sebastiaan H

    2013-09-01

    The zebrafish (Danio rerio) is an established model organism for developmental and biomedical research. It is frequently used for high-throughput functional genomics experiments, such as genome-wide gene expression measurements, to systematically analyze molecular mechanisms. However, the use of whole embryos or larvae in such experiments leads to a loss of the spatial information. To address this problem, we have developed a tool called Zebrafish Expression Ontology of Gene Sets (ZEOGS) to assess the enrichment of anatomical terms in large gene sets. ZEOGS uses gene expression pattern data from several sources: first, in situ hybridization experiments from the Zebrafish Model Organism Database (ZFIN); second, it uses the Zebrafish Anatomical Ontology, a controlled vocabulary that describes connected anatomical structures; and third, the available connections between expression patterns and anatomical terms contained in ZFIN. Upon input of a gene set, ZEOGS determines which anatomical structures are overrepresented in the input gene set. ZEOGS allows one for the first time to look at groups of genes and to describe them in terms of shared anatomical structures. To establish ZEOGS, we first tested it on random gene selections and on two public microarray datasets with known tissue-specific gene expression changes. These tests showed that ZEOGS could reliably identify the tissues affected, whereas only very few enriched terms to none were found in the random gene sets. Next we applied ZEOGS to microarray datasets of 24 and 72 h postfertilization zebrafish embryos treated with beclomethasone, a potent glucocorticoid. This analysis resulted in the identification of several anatomical terms related to glucocorticoid-responsive tissues, some of which were stage-specific. Our studies highlight the ability of ZEOGS to extract spatial information from datasets derived from whole embryos, indicating that ZEOGS could be a useful tool to automatically analyze gene expression

  5. Zebrafish Expression Ontology of Gene Sets (ZEOGS): A Tool to Analyze Enrichment of Zebrafish Anatomical Terms in Large Gene Sets

    Science.gov (United States)

    Marsico, Annalisa

    2013-01-01

    The zebrafish (Danio rerio) is an established model organism for developmental and biomedical research. It is frequently used for high-throughput functional genomics experiments, such as genome-wide gene expression measurements, to systematically analyze molecular mechanisms. However, the use of whole embryos or larvae in such experiments leads to a loss of the spatial information. To address this problem, we have developed a tool called Zebrafish Expression Ontology of Gene Sets (ZEOGS) to assess the enrichment of anatomical terms in large gene sets. ZEOGS uses gene expression pattern data from several sources: first, in situ hybridization experiments from the Zebrafish Model Organism Database (ZFIN); second, it uses the Zebrafish Anatomical Ontology, a controlled vocabulary that describes connected anatomical structures; and third, the available connections between expression patterns and anatomical terms contained in ZFIN. Upon input of a gene set, ZEOGS determines which anatomical structures are overrepresented in the input gene set. ZEOGS allows one for the first time to look at groups of genes and to describe them in terms of shared anatomical structures. To establish ZEOGS, we first tested it on random gene selections and on two public microarray datasets with known tissue-specific gene expression changes. These tests showed that ZEOGS could reliably identify the tissues affected, whereas only very few enriched terms to none were found in the random gene sets. Next we applied ZEOGS to microarray datasets of 24 and 72 h postfertilization zebrafish embryos treated with beclomethasone, a potent glucocorticoid. This analysis resulted in the identification of several anatomical terms related to glucocorticoid-responsive tissues, some of which were stage-specific. Our studies highlight the ability of ZEOGS to extract spatial information from datasets derived from whole embryos, indicating that ZEOGS could be a useful tool to automatically analyze gene

  6. A hybrid approach of gene sets and single genes for the prediction of survival risks with gene expression data.

    Science.gov (United States)

    Seok, Junhee; Davis, Ronald W; Xiao, Wenzhong

    2015-01-01

    Accumulated biological knowledge is often encoded as gene sets, collections of genes associated with similar biological functions or pathways. The use of gene sets in the analyses of high-throughput gene expression data has been intensively studied and applied in clinical research. However, the main interest remains in finding modules of biological knowledge, or corresponding gene sets, significantly associated with disease conditions. Risk prediction from censored survival times using gene sets hasn't been well studied. In this work, we propose a hybrid method that uses both single gene and gene set information together to predict patient survival risks from gene expression profiles. In the proposed method, gene sets provide context-level information that is poorly reflected by single genes. Complementarily, single genes help to supplement incomplete information of gene sets due to our imperfect biomedical knowledge. Through the tests over multiple data sets of cancer and trauma injury, the proposed method showed robust and improved performance compared with the conventional approaches with only single genes or gene sets solely. Additionally, we examined the prediction result in the trauma injury data, and showed that the modules of biological knowledge used in the prediction by the proposed method were highly interpretable in biology. A wide range of survival prediction problems in clinical genomics is expected to benefit from the use of biological knowledge.

  7. Discovery of cancer common and specific driver gene sets

    Science.gov (United States)

    2017-01-01

    Cancer is known as a disease mainly caused by gene alterations. Discovery of mutated driver pathways or gene sets is becoming an important step to understand molecular mechanisms of carcinogenesis. However, systematically investigating commonalities and specificities of driver gene sets among multiple cancer types is still a great challenge, but this investigation will undoubtedly benefit deciphering cancers and will be helpful for personalized therapy and precision medicine in cancer treatment. In this study, we propose two optimization models to de novo discover common driver gene sets among multiple cancer types (ComMDP) and specific driver gene sets of one certain or multiple cancer types to other cancers (SpeMDP), respectively. We first apply ComMDP and SpeMDP to simulated data to validate their efficiency. Then, we further apply these methods to 12 cancer types from The Cancer Genome Atlas (TCGA) and obtain several biologically meaningful driver pathways. As examples, we construct a common cancer pathway model for BRCA and OV, infer a complex driver pathway model for BRCA carcinogenesis based on common driver gene sets of BRCA with eight cancer types, and investigate specific driver pathways of the liquid cancer lymphoblastic acute myeloid leukemia (LAML) versus other solid cancer types. In these processes more candidate cancer genes are also found. PMID:28168295

  8. Time-Course Gene Set Analysis for Longitudinal Gene Expression Data.

    Directory of Open Access Journals (Sweden)

    Boris P Hejblum

    2015-06-01

    Gene set analysis methods, which consider predefined groups of genes in the analysis of genomic data, have been successfully applied for analyzing gene expression data in cross-sectional studies. The time-course gene set analysis (TcGSA) introduced here is an extension of gene set analysis to longitudinal data. The proposed method relies on random effects modeling with maximum likelihood estimates. It allows the use of all available repeated measurements while dealing with unbalanced data due to missing at random (MAR) measurements. TcGSA is a hypothesis-driven method that identifies a priori defined gene sets with significant expression variations over time, taking into account the potential heterogeneity of expression within gene sets. When biological conditions are compared, the method indicates whether the time patterns of gene sets differ significantly according to these conditions. The interest of the method is illustrated by its application to two real-life datasets: an HIV therapeutic vaccine trial (DALIA-1 trial), and data from a recent study on influenza and pneumococcal vaccines. In the DALIA-1 trial, TcGSA revealed a significant change in gene expression over time within 69 gene sets during vaccination, while a standard univariate individual gene analysis corrected for multiple testing as well as a standard Gene Set Enrichment Analysis (GSEA) for time series both failed to detect any significant pattern change over time. When applied to the second illustrative data set, TcGSA allowed the identification of 4 gene sets that were ultimately found to be linked with the influenza vaccine too, although they had been associated with the pneumococcal vaccine only in previous analyses. In our simulation study, TcGSA exhibits good statistical properties and increased power compared to other approaches for analyzing time-course expression patterns of gene sets. The method is made available to the community through an R package.

  9. A pursuit of lineage-specific and niche-specific proteome features in the world of archaea

    Directory of Open Access Journals (Sweden)

    Roy Chowdhury Anindya

    2012-06-01

    Background: Archaea evoke interest among researchers for two enigmatic characteristics: a combination of bacterial and eukaryotic components in their molecular architectures and an enormous diversity in their life-style and metabolic capabilities. Despite considerable research efforts, lineage-specific/niche-specific molecular features of the whole archaeal world are yet to be fully unveiled. The study offers the first large-scale in silico proteome analysis of all archaeal species of known genome sequences with a special emphasis on methanogenic and sulphur-metabolising archaea. Results: Overall amino acid usage in archaea is dominated by GC-bias. But environmental factors like oxygen requirement or thermal adaptation seem to play important roles in selection of residues with no GC-bias at the codon level. All methanogens, irrespective of their thermal/salt adaptation, show higher usage of Cys and have relatively acidic proteomes, while the proteomes of sulphur-metabolisers have higher aromaticity and more positive charges. Despite exhibiting a thermophilic life-style, korarchaeota possesses an acidic proteome. Among the distinct trends prevailing in COGs (Cluster of Orthologous Groups of proteins) distribution profiles, crenarchaeal organisms display higher intra-order variations in COGs repertoire, especially in the metabolic ones, as compared to euryarchaea. All methanogens are characterised by a presence of 22 exclusive COGs. Conclusions: Divergences in amino acid usage, aromaticity/charge profiles and COG repertoire among methanogens and sulphur-metabolisers, aerobic and anaerobic archaea or korarchaeota and nanoarchaeota, as elucidated in the present study, point towards the presence of distinct molecular strategies for niche specialization in the archaeal world.

  10. A pursuit of lineage-specific and niche-specific proteome features in the world of archaea.

    Science.gov (United States)

    Roy Chowdhury, Anindya; Dutta, Chitra

    2012-06-12

    Archaea evoke interest among researchers for two enigmatic characteristics: a combination of bacterial and eukaryotic components in their molecular architectures and an enormous diversity in their life-style and metabolic capabilities. Despite considerable research efforts, lineage-specific/niche-specific molecular features of the whole archaeal world are yet to be fully unveiled. The study offers the first large-scale in silico proteome analysis of all archaeal species of known genome sequences with a special emphasis on methanogenic and sulphur-metabolising archaea. Overall amino acid usage in archaea is dominated by GC-bias. But environmental factors like oxygen requirement or thermal adaptation seem to play important roles in selection of residues with no GC-bias at the codon level. All methanogens, irrespective of their thermal/salt adaptation, show higher usage of Cys and have relatively acidic proteomes, while the proteomes of sulphur-metabolisers have higher aromaticity and more positive charges. Despite exhibiting a thermophilic life-style, korarchaeota possesses an acidic proteome. Among the distinct trends prevailing in COGs (Cluster of Orthologous Groups of proteins) distribution profiles, crenarchaeal organisms display higher intra-order variations in COGs repertoire, especially in the metabolic ones, as compared to euryarchaea. All methanogens are characterised by a presence of 22 exclusive COGs. Divergences in amino acid usage, aromaticity/charge profiles and COG repertoire among methanogens and sulphur-metabolisers, aerobic and anaerobic archaea or korarchaeota and nanoarchaeota, as elucidated in the present study, point towards the presence of distinct molecular strategies for niche specialization in the archaeal world.

  11. Analysis of gene set using shrinkage covariance matrix approach

    Science.gov (United States)

    Karjanto, Suryaefiza; Aripin, Rasimah

    2013-09-01

    Microarray methodology has been exploited for different applications such as gene discovery and disease diagnosis. This technology is also used for quantitative and highly parallel measurements of gene expression. Recently, microarrays have become one of the main interests of statisticians because they provide a perfect example of the paradigms of modern statistics. In this study, an alternative approach to estimating the covariance matrix is proposed to solve the high dimensionality problem in microarrays. An extension of the traditional Hotelling's T² statistic is constructed using a shrinkage approach to determine the significant gene sets across experimental conditions. Real data sets were used as illustrations to compare the performance of the proposed methods with other methods. The results across the methods are consistent, implying that this approach provides an alternative to existing techniques.
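
    For orientation, one common way to write a shrinkage-regularized Hotelling's T² statistic is sketched below. The abstract does not state the authors' exact estimator, so the two-sample setup, the shrinkage target $\mathbf{T}$, and the intensity $\lambda$ are assumptions chosen for illustration only.

```latex
% Assumed setup: two conditions with n_1 and n_2 samples and p genes in the set.
% S is the pooled sample covariance matrix (singular when p > n_1 + n_2 - 2).
% T is an assumed shrinkage target, e.g. the diagonal matrix of gene variances.
\[
  \mathbf{S}^{*} = \lambda\,\mathbf{T} + (1-\lambda)\,\mathbf{S},
  \qquad 0 \le \lambda \le 1,
\]
\[
  T^{2} = \frac{n_1 n_2}{n_1 + n_2}\,
          (\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2)^{\top}
          (\mathbf{S}^{*})^{-1}
          (\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2).
\]
```

    With a positive-definite target and $\lambda > 0$, $\mathbf{S}^{*}$ is invertible even when the gene set contains more genes than there are samples, which is the high-dimensionality problem the abstract refers to.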

  12. Network enrichment analysis: extension of gene-set enrichment analysis to gene networks

    Directory of Open Access Journals (Sweden)

    Alexeyenko Andrey

    2012-09-01

    Background: Gene-set enrichment analyses (GEA or GSEA) are commonly used for biological characterization of an experimental gene-set. This is done by finding known functional categories, such as pathways or Gene Ontology terms, that are over-represented in the experimental set; the assessment is based on an overlap statistic. Rich biological information in terms of gene interaction networks is now widely available, but this topological information is not used by GEA, so there is a need for methods that exploit this type of information in high-throughput data analysis. Results: We developed a method of network enrichment analysis (NEA) that extends the overlap statistic in GEA to network links between genes in the experimental set and those in the functional categories. For the crucial step in statistical inference, we developed a fast network randomization algorithm in order to obtain the distribution of any network statistic under the null hypothesis of no association between an experimental gene-set and a functional category. We illustrate the NEA method using gene and protein expression data from a lung cancer study. Conclusions: The results indicate that the NEA method is more powerful than the traditional GEA, primarily because the relationships between gene sets were more strongly captured by network connectivity than by simple overlaps.
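
    The difference between the overlap statistic and its network extension can be illustrated with a minimal Python sketch. The function names, the toy network, and the gene identifiers are hypothetical, and the sketch covers only the two counts; it does not reproduce the network randomization step that the abstract describes for statistical inference.

```python
def overlap_count(experimental, category):
    """GEA-style statistic: number of genes shared by the two sets."""
    return len(set(experimental) & set(category))


def link_count(experimental, category, edges):
    """NEA-style statistic: number of network edges that join a gene in the
    experimental set to a gene in the functional category."""
    exp, cat = set(experimental), set(category)
    count = 0
    for a, b in edges:
        # Edges are treated as undirected, so both orientations are checked.
        if (a in exp and b in cat) or (b in exp and a in cat):
            count += 1
    return count


# Hypothetical toy network (undirected edges) and gene sets.
edges = [("g1", "g4"), ("g2", "g5"), ("g3", "g6"), ("g1", "g2"), ("g6", "g7")]
experimental = ["g1", "g2", "g3"]
category = ["g4", "g5", "g7"]

print(overlap_count(experimental, category))      # 0: the sets share no genes
print(link_count(experimental, category, edges))  # 2: edges g1-g4 and g2-g5
```

    In this toy case the overlap statistic sees no relationship at all, while the link count picks up two connections, which is the kind of signal the abstract attributes to network connectivity.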

  13. GSMA: Gene Set Matrix Analysis, An Automated Method for Rapid Hypothesis Testing of Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Chris Cheadle

    2007-01-01

    Background: Microarray technology has become highly valuable for identifying complex global changes in gene expression patterns. The assignment of functional information to these complex patterns remains a challenging task in effectively interpreting data and correlating results from across experiments, projects and laboratories. Methods which allow the rapid and robust evaluation of multiple functional hypotheses increase the power of individual researchers to data mine gene expression data more efficiently. Results: We have developed gene set matrix analysis (GSMA) as a useful method for the rapid testing of group-wise up- or downregulation of gene expression simultaneously for multiple lists of genes (gene sets) against entire distributions of gene expression changes (datasets) for single or multiple experiments. The utility of GSMA lies in its flexibility to rapidly poll gene sets related by known biological function or as designated solely by the end-user against large numbers of datasets simultaneously. Conclusions: GSMA provides a simple and straightforward method for hypothesis testing in which genes are tested by groups across multiple datasets for patterns of expression enrichment.

  14. Assessing the functional coherence of gene sets with metrics based on the Gene Ontology graph.

    Science.gov (United States)

    Richards, Adam J; Muller, Brian; Shotwell, Matthew; Cowart, L Ashley; Rohrer, Bäerbel; Lu, Xinghua

    2010-06-15

    The results of initial analyses for many high-throughput technologies commonly take the form of gene or protein sets, and one of the ensuing tasks is to evaluate the functional coherence of these sets. The study of gene set function most commonly makes use of controlled vocabulary in the form of ontology annotations. For a given gene set, the statistical significance of observing these annotations or 'enrichment' may be tested using a number of methods. Instead of testing for significance of individual terms, this study is concerned with the task of assessing the global functional coherence of gene sets, for which novel metrics and statistical methods have been devised. The metrics of this study are based on the topological properties of graphs comprised of genes and their Gene Ontology annotations. A novel aspect of these methods is that both the enrichment of annotations and the relationships among annotations are considered when determining the significance of functional coherence. We applied our methods to perform analyses on an existing database and on microarray experimental results. Here, we demonstrated that our approach is highly discriminative in terms of differentiating coherent gene sets from random ones and that it provides biologically sensible evaluations in microarray analysis. We further used examples to show the utility of graph visualization as a tool for studying the functional coherence of gene sets. The implementation is provided as a freely accessible web application at: http://projects.dbbe.musc.edu/gosteiner. Additionally, the source code written in the Python programming language, is available under the General Public License of the Free Software Foundation. Supplementary data are available at Bioinformatics online.

  15. Beyond main effects of gene-sets: harsh parenting moderates the association between a dopamine gene-set and child externalizing behavior

    OpenAIRE

    Windhorst, Dafna A.; Mileva-Seitz, Viara R.; Rippe, Ralph C. A.; Tiemeier, Henning; Jaddoe, Vincent W. V.; Verhulst, Frank C.; van IJzendoorn, Marinus H.; Bakermans-Kranenburg, Marian J.

    2016-01-01

    Background: In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene-based and gene-set approaches in tests of Gene by Environment (G × E) effects on complex behavior. This approach can offer an important alternative or complement to candidate gene and genome-wide environmental inte...

  16. Core set approach to reduce uncertainty of gene trees

    Directory of Open Access Journals (Sweden)

    Okuhara Yoshiyasu

    2006-05-01

    Background: A genealogy based on gene sequences within a species plays an essential role in the estimation of the character, structure, and evolutionary history of that species. Because intraspecific sequences are more closely related than interspecific ones, detailed information on the evolutionary process may be obtained by determining all the node sequences of trees, providing insight into functional constraints and adaptations. However, strong evolutionary correlations on a few lineages make this determination difficult as a whole, and the maximum parsimony (MP) method frequently allows a number of topologies with the same total branching length. Results: Kitazoe et al. developed a multidimensional vector-space representation of phylogeny. It converts additivity of evolutionary distances to orthogonality among the vectors expressing branches, and provides a unified index to measure deviations from the orthogonality. In this paper, this index is used to detect and exclude sequences with large deviations from orthogonality, and then to select a maximum subset ("core set") of sequences for which MP generates a single solution. Once the core set tree, all of whose node sequences are given, is formed, the excluded sequences are found to have basically two phylogenetic positions on this tree, respectively. Fortunately, since multiple substitutions are rare in intra-species sequences, the variance of nucleotide transitions is confined to a small range. By applying the core set approach to 38 partial env sequences of HIV-1 in a single patient and also 198 mitochondrial COI and COII DNA sequences of Anopheles dirus, we demonstrate how consistently this approach constructs the tree. Conclusion: In the HIV dataset, we confirmed that the obtained core set tree is the unique maximum set for which MP proposes a single tree. In the mosquito data set, the fluctuation of nucleotide transitions caused by the sequences excluded from the core set was very small

  17. Integrative analysis of survival-associated gene sets in breast cancer.

    Science.gov (United States)

    Varn, Frederick S; Ung, Matthew H; Lou, Shao Ke; Cheng, Chao

    2015-03-12

    Patient gene expression information has recently become a clinical feature used to evaluate breast cancer prognosis. The emergence of prognostic gene sets that take advantage of these data has led to a rich library of information that can be used to characterize the molecular nature of a patient's cancer. Identifying robust gene sets that are consistently predictive of a patient's clinical outcome has become one of the main challenges in the field. We supplied our previously established BASE algorithm with patient gene expression data and gene sets from MSigDB to develop the gene set activity score (GSAS), a metric that quantitatively assesses a gene set's activity level in a given patient. We utilized this metric, along with patient time-to-event data, to perform survival analyses to identify the gene sets that were significantly correlated with patient survival. We then performed cross-dataset analyses to identify robust prognostic gene sets and to classify patients by metastasis status. Additionally, we created a gene set network based on component gene overlap to explore the relationship between gene sets derived from MSigDB. We developed a novel gene set based on this network's topology and applied the GSAS metric to characterize its role in patient survival. Using the GSAS metric, we identified 120 gene sets that were significantly associated with patient survival in all datasets tested. The gene overlap network analysis yielded a novel gene set enriched in genes shared by the robustly predictive gene sets. This gene set was highly correlated to patient survival when used alone. Most interestingly, removal of the genes in this gene set from the gene pool on MSigDB resulted in a large reduction in the number of predictive gene sets, suggesting a prominent role for these genes in breast cancer progression. The GSAS metric provided a useful medium by which we systematically investigated how gene sets from MSigDB relate to breast cancer patient survival. We used

  18. Locally linear embedding and neighborhood rough set-based gene selection for gene expression data classification.

    Science.gov (United States)

    Sun, L; Xu, J-C; Wang, W; Yin, Y

    2016-08-30

    Cancer subtype recognition and feature selection are important problems in the diagnosis and treatment of tumors. Here, we propose a novel gene selection approach applied to gene expression data classification. First, two classical feature reduction methods including locally linear embedding (LLE) and rough set (RS) are summarized. The advantages and disadvantages of these algorithms were analyzed and an optimized model for tumor gene selection was developed based on LLE and neighborhood RS (NRS). Bhattacharyya distance was introduced to delete irrelevant genes, pair-wise redundant analysis was performed to remove strongly correlated genes, and the wavelet soft threshold was determined to eliminate noise in the gene datasets. Next, prior optimized search processing was carried out. A new approach combining dimension reduction of LLE and feature reduction of NRS (LLE-NRS) was developed for selecting gene subsets, and then an open source software Weka was applied to distinguish different tumor types and verify the cross-validation classification accuracy of our proposed method. The experimental results demonstrated that the classification performance of the proposed LLE-NRS for selecting gene subset outperforms those of other related models in terms of accuracy, and our proposed approach is feasible and effective in the field of high-dimensional tumor classification.
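
    As an illustration of how a Bhattacharyya distance can be used to screen out irrelevant genes, the sketch below scores one gene by the distance between its expression distributions in two tumor classes. The abstract does not say how the distance was computed, so the per-class univariate Gaussian model, the function name, and the toy expression values are all assumptions.

```python
import math
from statistics import mean, variance


def bhattacharyya_gaussian(x, y):
    """Bhattacharyya distance between two univariate normal distributions
    fitted to samples x and y (assumed model, sample variances must be > 0)."""
    m1, m2 = mean(x), mean(y)
    v1, v2 = variance(x), variance(y)
    # Term driven by the separation of the class means.
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    # Term driven by the mismatch of the class variances.
    var_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + var_term


# Hypothetical expression values for two genes in two tumor classes.
informative = bhattacharyya_gaussian([1.0, 2.0, 3.0], [7.0, 8.0, 9.0])
irrelevant = bhattacharyya_gaussian([4.0, 5.0, 6.0], [4.0, 5.0, 6.0])

print(informative)  # 4.5: class means differ by 6 with unit variances
print(irrelevant)   # 0.0: identical distributions in both classes
```

    Under this assumed model, genes with a distance near zero carry little class information and would be candidates for deletion, while a threshold on the distance is a design choice the abstract does not specify.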

  19. The histone methyltransferases Set5 and Set1 have overlapping functions in gene silencing and telomere maintenance.

    Science.gov (United States)

    Jezek, Meagan; Gast, Alison; Choi, Grace; Kulkarni, Rushmie; Quijote, Jeremiah; Graham-Yooll, Andrew; Park, DoHwan; Green, Erin M

    2017-02-01

    Genes adjacent to telomeres are subject to transcriptional repression mediated by an integrated set of chromatin modifying and remodeling factors. The telomeres of Saccharomyces cerevisiae have served as a model for dissecting the function of diverse chromatin proteins in gene silencing, and their study has revealed overlapping roles for many chromatin proteins in either promoting or antagonizing gene repression. The H3K4 methyltransferase Set1, which is commonly linked to transcriptional activation, has been implicated in telomere silencing. Set5 is an H4 K5, K8, and K12 methyltransferase that functions with Set1 to promote repression at telomeres. Here, we analyzed the combined role for Set1 and Set5 in gene expression control at native yeast telomeres. Our data reveal that Set1 and Set5 promote a Sir protein-independent mechanism of repression that may primarily rely on regulation of H4K5ac and H4K8ac at telomeric regions. Furthermore, cells lacking both Set1 and Set5 have highly correlated transcriptomes to mutants in telomere maintenance pathways and display defects in telomere stability, linking their roles in silencing to protection of telomeres. Our data therefore provide insight into and clarify potential mechanisms by which Set1 contributes to telomere silencing and shed light on the function of Set5 at telomeres.

  20. Three gene expression vector sets for concurrently expressing multiple genes in Saccharomyces cerevisiae.

    Science.gov (United States)

    Ishii, Jun; Kondo, Takashi; Makino, Harumi; Ogura, Akira; Matsuda, Fumio; Kondo, Akihiko

    2014-05-01

    Yeast has the potential to be used in bulk-scale fermentative production of fuels and chemicals due to its tolerance for low pH and robustness for autolysis. However, expression of multiple external genes in one host yeast strain is considerably labor-intensive due to the lack of polycistronic transcription. To promote the metabolic engineering of yeast, we generated systematic and convenient genetic engineering tools to express multiple genes in Saccharomyces cerevisiae. We constructed a series of multi-copy and integration vector sets for concurrently expressing two or three genes in S. cerevisiae by embedding three classical promoters. The comparative expression capabilities of the constructed vectors were monitored with green fluorescent protein, and the concurrent expression of genes was monitored with three different fluorescent proteins. Our multiple gene expression tool will be helpful to the advanced construction of genetically engineered yeast strains in a variety of research fields other than metabolic engineering. © 2014 Federation of European Microbiological Societies. Published by John Wiley & Sons Ltd. All rights reserved.

  1. Speed reading for genes: bookmarks set the pace.

    Science.gov (United States)

    Follmer, Nicole E; Francis, Nicole J

    2011-11-15

    During mitosis, most transcription ceases. Mitotic gene bookmarking marks genes for reactivation to ensure reestablishment of transcription states and cell-cycle progression. In a recent issue of Nature Cell Biology, Zhao et al. (2011) investigate how gene bookmarking leads to accelerated kinetics of transcriptional reactivation after mitosis. Copyright © 2011 Elsevier Inc. All rights reserved.

  2. Functional cohesion of gene sets determined by latent semantic indexing of PubMed abstracts.

    Directory of Open Access Journals (Sweden)

    Lijing Xu

    2011-04-01

    High-throughput genomic technologies enable researchers to identify genes that are co-regulated with respect to specific experimental conditions. Numerous statistical approaches have been developed to identify differentially expressed genes. Because each approach can produce distinct gene sets, it is difficult for biologists to determine which statistical approach yields biologically relevant gene sets and is appropriate for their study. To address this issue, we implemented Latent Semantic Indexing (LSI) to determine the functional coherence of gene sets. An LSI model was built using over 1 million Medline abstracts for over 20,000 mouse and human genes annotated in Entrez Gene. The gene-to-gene LSI-derived similarities were used to calculate a literature cohesion p-value (LPv) for a given gene set using a Fisher's exact test. We tested this method against genes in more than 6,000 functional pathways annotated in Gene Ontology (GO) and found that approximately 75% of gene sets in GO biological process category and 90% of the gene sets in GO molecular function and cellular component categories were functionally cohesive (LPv<0.05). These results indicate that the LPv methodology is both robust and accurate. Application of this method to previously published microarray datasets demonstrated that LPv can be helpful in selecting the appropriate feature extraction methods. To enable real-time calculation of LPv for mouse or human gene sets, we developed a web tool called Gene-set Cohesion Analysis Tool (GCAT). GCAT can complement other gene set enrichment approaches by determining the overall functional cohesion of data sets, taking into account both explicit and implicit gene interactions reported in the biomedical literature. GCAT is freely available at http://binf1.memphis.edu/gcat.

  3. Functional cohesion of gene sets determined by latent semantic indexing of PubMed abstracts.

    Science.gov (United States)

    Xu, Lijing; Furlotte, Nicholas; Lin, Yunyue; Heinrich, Kevin; Berry, Michael W; George, Ebenezer O; Homayouni, Ramin

    2011-04-14

    High-throughput genomic technologies enable researchers to identify genes that are co-regulated with respect to specific experimental conditions. Numerous statistical approaches have been developed to identify differentially expressed genes. Because each approach can produce distinct gene sets, it is difficult for biologists to determine which statistical approach yields biologically relevant gene sets and is appropriate for their study. To address this issue, we implemented Latent Semantic Indexing (LSI) to determine the functional coherence of gene sets. An LSI model was built using over 1 million Medline abstracts for over 20,000 mouse and human genes annotated in Entrez Gene. The gene-to-gene LSI-derived similarities were used to calculate a literature cohesion p-value (LPv) for a given gene set using a Fisher's exact test. We tested this method against genes in more than 6,000 functional pathways annotated in Gene Ontology (GO) and found that approximately 75% of gene sets in GO biological process category and 90% of the gene sets in GO molecular function and cellular component categories were functionally cohesive (LPv<0.05). These results indicate that the LPv methodology is both robust and accurate. Application of this method to previously published microarray datasets demonstrated that LPv can be helpful in selecting the appropriate feature extraction methods. To enable real-time calculation of LPv for mouse or human gene sets, we developed a web tool called Gene-set Cohesion Analysis Tool (GCAT). GCAT can complement other gene set enrichment approaches by determining the overall functional cohesion of data sets, taking into account both explicit and implicit gene interactions reported in the biomedical literature. GCAT is freely available at http://binf1.memphis.edu/gcat.

  4. The null hypothesis of GSEA, and a novel statistical model for competitive gene set analysis

    DEFF Research Database (Denmark)

    Debrabant, Birgit

    2017-01-01

    MOTIVATION: Competitive gene set analysis intends to assess whether a specific set of genes is more associated with a trait than the remaining genes. However, the statistical models assumed to date to underlie these methods do not enable a clear-cut formulation of the competitive null hypothesis. This is a major handicap to the interpretation of results obtained from a gene set analysis. RESULTS: This work presents a hierarchical statistical model based on the notion of dependence measures, which overcomes this problem. The two levels of the model naturally reflect the modular structure of many gene set...

  5. Dual-label flow cytometry-based host cell adhesion assay to ascertain the prospect of probiotic Lactobacillus plantarum in niche-specific antibacterial therapy.

    Science.gov (United States)

    Mukherjee, Sandipan; Ramesh, Aiyagari

    2017-12-01

    Host cell adhesion assays that provide quantitative insight on the potential of lactic acid bacteria (LAB) to inhibit adhesion of intestinal pathogens can be leveraged for the development of niche-specific anti-adhesion therapy. Herein, we report a dual-colour flow cytometry (FCM) analysis to assess the ability of probiotic Lactobacillus plantarum strains to impede adhesion of Enterococcus faecalis, Listeria monocytogenes and Staphylococcus aureus onto HT-29 cells. FCM in conjunction with a hierarchical cluster analysis could discern the anti-adhesion potential of L. plantarum strains, wherein the efficacy of L. plantarum DF9 was on a par with the probiotic L. rhamnosus GG. Combination of FCM with principal component analysis illustrated the relative influence of LAB strains on adhesion parameters kd and em of the pathogen and identified probiotic LAB suitable for anti-adhesion intervention. The analytical merit of the FCM analysis was captured in host cell adhesion assays that measured relative elimination of adhered LAB vis-à-vis pathogens, on exposure to either LAB bacteriocins or therapeutic antibiotics. It is envisaged that the dual-colour FCM-based adhesion assay described herein would enable a fundamental understanding of the host cell adhesion process and stimulate interest in probiotic LAB as safe anti-adhesion therapeutic agents against gastrointestinal pathogens.

  6. A novel fuzzy set based multifactor dimensionality reduction method for detecting gene-gene interaction.

    Science.gov (United States)

    Jung, Hye-Young; Leem, Sangseob; Lee, Sungyoung; Park, Taesung

    2016-12-01

    Gene-gene interaction (GGI) is one of the most popular approaches for finding the missing heritability of common complex traits in genetic association studies. The multifactor dimensionality reduction (MDR) method has been widely studied for detecting GGIs. In order to identify the best interaction model associated with disease susceptibility, MDR compares all possible genotype combinations in terms of their predictability of disease status from a simple binary high(H) and low(L) risk classification. However, this simple binary classification does not reflect the uncertainty of H/L classification. We regard classifying H/L as equivalent to defining the degree of membership of two risk groups H/L. By adopting the fuzzy set theory, we propose Fuzzy MDR which takes into account the uncertainty of H/L classification. Fuzzy MDR allows the possibility of partial membership of H/L through a membership function which transforms the degree of uncertainty into a [0,1] scale. The best genotype combinations can be selected which maximizes a new fuzzy set based accuracy measure. Two simulation studies are conducted to compare the power of the proposed Fuzzy MDR with that of MDR. Our results show that Fuzzy MDR has higher power than MDR. We illustrate the proposed Fuzzy MDR by analysing bipolar disorder (BD) trait of the WTCCC dataset to detect GGI associated with BD. We propose a novel Fuzzy MDR method to detect gene-gene interaction by taking into account the uncertainly of H/L classification and show that it has higher power than MDR. Fuzzy MDR can be easily extended to handle continuous phenotypes as well. The program written in R for the proposed Fuzzy MDR is available at https://statgen.snu.ac.kr/software/FuzzyMDR. Copyright © 2016 Elsevier Ltd. All rights reserved.

  7. Identifying the genetic variation of gene expression using gene sets: application of novel gene Set eQTL approach to PharmGKB and KEGG.

    Directory of Open Access Journals (Sweden)

    Ryan Abo

    Genetic variation underlying the regulation of mRNA gene expression in humans may provide key insights into the molecular mechanisms of human traits and complex diseases. Current statistical methods to map genetic variation associated with mRNA gene expression have typically applied standard linkage and/or association methods; however, when genome-wide SNP and mRNA expression data are available, performing all pairwise comparisons is computationally burdensome and may not provide optimal power to detect associations. Consideration of different approaches to account for the high dimensionality and multiple testing issues may provide increased efficiency and statistical power. Here we present a novel approach to model and test the association between genetic variation and mRNA gene expression levels in the context of gene sets (GSs) and pathways, referred to as gene set - expression quantitative trait loci analysis (GS-eQTL). The method uses GSs to initially group SNPs and mRNA expression, followed by the application of principal components analysis (PCA) to collapse the variation and reduce the dimensionality within the GSs. We applied GS-eQTL to assess the association between SNP and mRNA expression level data collected from a cell-based model system using PharmGKB and KEGG defined GSs. We observed a large number of significant GS-eQTL associations, in which the most significant associations arose between genetic variation and mRNA expression from the same GS. However, a number of associations involving genetic variation and mRNA expression from different GSs were also identified. Our proposed GS-eQTL method effectively addresses the multiple testing limitations in eQTL studies and provides biological context for SNP-expression associations.

  8. Beyond main effects of gene-sets: harsh parenting moderates the association between a dopamine gene-set and child externalizing behavior.

    Science.gov (United States)

    Windhorst, Dafna A; Mileva-Seitz, Viara R; Rippe, Ralph C A; Tiemeier, Henning; Jaddoe, Vincent W V; Verhulst, Frank C; van IJzendoorn, Marinus H; Bakermans-Kranenburg, Marian J

    2016-08-01

    In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene-based and gene-set approaches in tests of Gene by Environment (G × E) effects on complex behavior. This approach can offer an important alternative or complement to candidate gene and genome-wide environmental interaction (GWEI) studies in the search for genetic variation underlying individual differences in behavior. Genetic variants in 12 autosomal dopaminergic genes were available in an ethnically homogenous part of a population-based cohort. Harsh parenting was assessed with maternal (n = 1881) and paternal (n = 1710) reports at age 3. Externalizing behavior was assessed with the Child Behavior Checklist (CBCL) at age 5 (71 ± 3.7 months). We conducted gene-set analyses of the association between variation in dopaminergic genes and externalizing behavior, stratified for harsh parenting. The association was statistically significant or approached significance for children without harsh parenting experiences, but was absent in the group with harsh parenting. Similarly, significant associations between single genes and externalizing behavior were only found in the group without harsh parenting. Effect sizes in the groups with and without harsh parenting did not differ significantly. Gene-environment interaction tests were conducted for individual genetic variants, resulting in two significant interaction effects (rs1497023 and rs4922132) after correction for multiple testing. Our findings are suggestive of G × E interplay, with associations between dopamine genes and externalizing behavior present in children without harsh parenting, but not in children with harsh parenting experiences. Harsh parenting may overrule the role of genetic factors in externalizing behavior. Gene-based and gene-set

  9. Identifying the optimal gene and gene set in hepatocellular carcinoma based on differential expression and differential co-expression algorithm.

    Science.gov (United States)

    Dong, Li-Yang; Zhou, Wei-Zhong; Ni, Jun-Wei; Xiang, Wei; Hu, Wen-Hao; Yu, Chang; Li, Hai-Yan

    2017-02-01

    The objective of this study was to identify the optimal gene and gene set for hepatocellular carcinoma (HCC) utilizing differential expression and differential co-expression (DEDC) algorithm. The DEDC algorithm consisted of four parts: calculating differential expression (DE) by absolute t-value in t-statistics; computing differential co-expression (DC) based on Z-test; determining optimal thresholds on the basis of Chi-squared (χ2) maximization and the corresponding gene was the optimal gene; and evaluating functional relevance of genes categorized into different partitions to determine the optimal gene set with highest mean minimum functional information (FI) gain (Δ*G). The optimal thresholds divided genes into four partitions, high DE and high DC (HDE-HDC), high DE and low DC (HDE-LDC), low DE and high DC (LDE‑HDC), and low DE and low DC (LDE-LDC). In addition, the optimal gene was validated by conducting reverse transcription-polymerase chain reaction (RT-PCR) assay. The optimal thresholds for DC and DE were 1.032 and 1.911, respectively. Using the optimal gene, the genes were divided into four partitions including: HDE-HDC (2,053 genes), HDE-LDC (2,822 genes), LDE-HDC (2,622 genes), and LDE-LDC (6,169 genes). The optimal gene was microtubule‑associated protein RP/EB family member 1 (MAPRE1), and RT-PCR assay validated the significant difference between the HCC and normal state. The optimal gene set was nucleoside metabolic process (GO:0009116) with Δ*G = 18.681 and 24 HDE-HDC partitions in total. In conclusion, we successfully investigated the optimal gene, MAPRE1, and gene set, nucleoside metabolic process, which may be potential biomarkers for targeted therapy and provide significant insight for revealing the pathological mechanism underlying HCC.

  10. Beyond main effects of gene-sets: harsh parenting moderates the association between a dopamine gene-set and child externalizing behavior

    NARCIS (Netherlands)

    J. Windhorst (Judith); V. Mileva-Seitz; R.C.A. Rippe (Ralph C.A.); H.W. Tiemeier (Henning); V.W.V. Jaddoe (Vincent); F.C. Verhulst (Frank); M.H. van IJzendoorn (Rien); M.J. Bakermans-Kranenburg (Marian)

    2016-01-01

    Background: In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene-based and

  11. Comparative study on gene set and pathway topology-based enrichment methods.

    Science.gov (United States)

    Bayerlová, Michaela; Jung, Klaus; Kramer, Frank; Klemm, Florian; Bleckmann, Annalen; Beißbarth, Tim

    2015-10-22

    Enrichment analysis is a popular approach to identify pathways or sets of genes which are significantly enriched in the context of differentially expressed genes. The traditional gene set enrichment approach considers a pathway as a simple gene list disregarding any knowledge of gene or protein interactions. In contrast, the new group of so called pathway topology-based methods integrates the topological structure of a pathway into the analysis. We comparatively investigated gene set and pathway topology-based enrichment approaches, considering three gene set and four topological methods. These methods were compared in two extensive simulation studies and on a benchmark of 36 real datasets, providing the same pathway input data for all methods. In the benchmark data analysis both types of methods showed a comparable ability to detect enriched pathways. The first simulation study was conducted with KEGG pathways, which showed considerable gene overlaps between each other. In this study with original KEGG pathways, none of the topology-based methods outperformed the gene set approach. Therefore, a second simulation study was performed on non-overlapping pathways created by unique gene IDs. Here, methods accounting for pathway topology reached higher accuracy than the gene set methods, however their sensitivity was lower. We conducted one of the first comprehensive comparative works on evaluating gene set against pathway topology-based enrichment methods. The topological methods showed better performance in the simulation scenarios with non-overlapping pathways, however, they were not conclusively better in the other scenarios. This suggests that simple gene set approach might be sufficient to detect an enriched pathway under realistic circumstances. Nevertheless, more extensive studies and further benchmark data are needed to systematically evaluate these methods and to assess what gain and cost pathway topology information introduces into enrichment analysis. 
Both

  12. An Independent Filter for Gene Set Testing Based on Spectral Enrichment

    NARCIS (Netherlands)

    Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H

    2015-01-01

    Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in

  13. Annotating gene sets by mining large literature collections with protein networks.

    Science.gov (United States)

    Wang, Sheng; Ma, Jianzhu; Yu, Michael Ku; Zheng, Fan; Huang, Edward W; Han, Jiawei; Peng, Jian; Ideker, Trey

    2018-01-01

    Analysis of patient genomes and transcriptomes routinely recognizes new gene sets associated with human disease. Here we present an integrative natural language processing system which infers common functions for a gene set through automatic mining of the scientific literature with biological networks. This system links genes with associated literature phrases and combines these links with protein interactions in a single heterogeneous network. Multiscale functional annotations are inferred based on network distances between phrases and genes and then visualized as an ontology of biological concepts. To evaluate this system, we predict functions for gene sets representing known pathways and find that our approach achieves substantial improvement over the conventional text-mining baseline method. Moreover, our system discovers novel annotations for gene sets or pathways without previously known functions. Two case studies demonstrate how the system is used in discovery of new cancer-related pathways with ontological annotations.

  14. Combining multiple tools outperforms individual methods in gene set enrichment analyses

    Science.gov (United States)

    Ng, Milica; Wilson, Nicholas J.; Sheridan, Julie M.; Huynh, Huy; Wilson, Michael J.

    2017-01-01

    Motivation: Gene set enrichment (GSE) analysis allows researchers to efficiently extract biological insight from long lists of differentially expressed genes by interrogating them at a systems level. In recent years, there has been a proliferation of GSE analysis methods and hence it has become increasingly difficult for researchers to select an optimal GSE tool based on their particular dataset. Moreover, the majority of GSE analysis methods do not allow researchers to simultaneously compare gene set level results between multiple experimental conditions. Results: The ensemble of genes set enrichment analyses (EGSEA) is a method developed for RNA-sequencing data that combines results from twelve algorithms and calculates collective gene set scores to improve the biological relevance of the highest ranked gene sets. EGSEA’s gene set database contains around 25 000 gene sets from sixteen collections. It has multiple visualization capabilities that allow researchers to view gene sets at various levels of granularity. EGSEA has been tested on simulated data and on a number of human and mouse datasets and, based on biologists’ feedback, consistently outperforms the individual tools that have been combined. Our evaluation demonstrates the superiority of the ensemble approach for GSE analysis, and its utility to effectively and efficiently extrapolate biological functions and potential involvement in disease processes from lists of differentially regulated genes. Availability and Implementation: EGSEA is available as an R package at http://www.bioconductor.org/packages/EGSEA/. The gene sets collections are available in the R package EGSEAdata from http://www.bioconductor.org/packages/EGSEAdata/. Contacts: monther.alhamdoosh@csl.com.au or mritchie@wehi.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27694195

  15. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    NARCIS (Netherlands)

    Hettne, K.M.; Boorsma, A.; Dartel, D.A. van; Goeman, J.J.; Jong, E. de; Piersma, A.H.; Stierum, R.H.; Kleinjans, J.C.; Kors, J.A.

    2013-01-01

    BACKGROUND: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set

  16. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    NARCIS (Netherlands)

    Hettne, K.M.; Boorsma, A.; Dartel, van D.A.M.; Goeman, J.J.; Jong, de E.; Piersma, A.H.; Stierum, R.H.; Kleinjans, J.C.; Kors, J.A.

    2013-01-01

    Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set

  17. Application of biclustering of gene expression data and gene set enrichment analysis methods to identify potentially disease causing nanomaterials

    Directory of Open Access Journals (Sweden)

    Andrew Williams

    2015-12-01

    Background: The presence of diverse types of nanomaterials (NMs) in commerce is growing at an exponential pace. As a result, human exposure to these materials in the environment is inevitable, necessitating rapid and reliable toxicity testing methods to accurately assess the potential hazards associated with NMs. In this study, we applied biclustering and gene set enrichment analysis methods to derive essential features of altered lung transcriptome following exposure to NMs that are associated with lung-specific diseases. Several datasets from public microarray repositories describing pulmonary diseases in mouse models following exposure to a variety of substances were examined and functionally related biclusters of genes showing similar expression profiles were identified. The identified biclusters were then used to conduct a gene set enrichment analysis on pulmonary gene expression profiles derived from mice exposed to nano-titanium dioxide (nano-TiO2), carbon black (CB) or carbon nanotubes (CNTs) to determine the disease significance of these data-driven gene sets. Results: Biclusters representing inflammation (chemokine activity), DNA binding, cell cycle, apoptosis, reactive oxygen species (ROS) and fibrosis processes were identified. All of the NM studies were significant with respect to the bicluster related to chemokine activity (DAVID; FDR p-value = 0.032). The bicluster related to pulmonary fibrosis was enriched in studies where toxicity induced by CNT and CB studies was investigated, suggesting the potential for these materials to induce lung fibrosis. The pro-fibrogenic potential of CNTs is well established. Although CB has not been shown to induce fibrosis, it induces stronger inflammatory, oxidative stress and DNA damage responses than nano-TiO2 particles. Conclusion: The results of the analysis correctly identified all NMs to be inflammogenic and only CB and CNTs as potentially fibrogenic. In addition to identifying several

  18. FunGeneNet: a web tool to estimate enrichment of functional interactions in experimental gene sets.

    Science.gov (United States)

    Tiys, Evgeny S; Ivanisenko, Timofey V; Demenkov, Pavel S; Ivanisenko, Vladimir A

    2018-02-09

    Estimation of functional connectivity in gene sets derived from genome-wide or other biological experiments is one of the essential tasks of bioinformatics. A promising approach for solving this problem is to compare gene networks built using experimental gene sets with random networks. One of the resources that make such an analysis possible is CrossTalkZ, which uses the FunCoup database. However, existing methods, including CrossTalkZ, do not take into account individual types of interactions, such as protein/protein interactions, expression regulation, transport regulation, catalytic reactions, etc., but rather work with generalized types characterizing the existence of any connection between network members. We developed the online tool FunGeneNet, which utilizes the ANDSystem and STRING to reconstruct gene networks using experimental gene sets and to estimate their difference from random networks. To compare the reconstructed networks with random ones, the node permutation algorithm implemented in CrossTalkZ was taken as a basis. To study the FunGeneNet applicability, the functional connectivity analysis of networks constructed for gene sets involved in the Gene Ontology biological processes was conducted. We showed that the method sensitivity exceeds 0.8 at a specificity of 0.95. We found that the significance level of the difference between gene networks of biological processes and random networks is determined by the type of connections considered between objects. At the same time, the highest reliability is achieved for the generalized form of connections that takes into account all the individual types of connections. By taking examples of the thyroid cancer networks and the apoptosis network, it is demonstrated that key participants in these processes are involved in the interactions of those types by which these networks differ from random ones. FunGeneNet is a web tool aimed at proving the functionality of networks in a wide range of sizes of

  19. Identification of a core set of genes that signifies pathways underlying cardiac hypertrophy

    DEFF Research Database (Denmark)

    Strom, C.C.; Kruhoffer, M.; Knudsen, S.

    2004-01-01

    Although the molecular signals underlying cardiac hypertrophy have been the subject of intense investigation, the extent of common and distinct gene regulation between different forms of cardiac hypertrophy remains unclear. We hypothesized that a general and comparative analysis of hypertrophic gene expression, using microarray technology in multiple models of cardiac hypertrophy, including aortic banding, myocardial infarction, an arteriovenous shunt and pharmacologically induced hypertrophy, would uncover networks of conserved hypertrophy-specific genes and identify novel genes involved in hypertrophic signalling. From gene expression analyses (8740 probe sets, n = 46) of rat ventricular RNA, we identified a core set of 139 genes with consistent differential expression in all hypertrophy models as compared to their controls, including 78 genes not previously associated with hypertrophy and 61...

  20. iBBiG: iterative binary bi-clustering of gene sets.

    Science.gov (United States)

    Gusenleitner, Daniel; Howe, Eleanor A; Bentink, Stefan; Quackenbush, John; Culhane, Aedín C

    2012-10-01

    Meta-analysis of genomics data seeks to identify genes associated with a biological phenotype across multiple datasets; however, merging data from different platforms by their features (genes) is challenging. Meta-analysis using functionally or biologically characterized gene sets simplifies data integration, is biologically intuitive and is seen as having great potential, but is an emerging field with few established statistical methods. We transform gene expression profiles into binary gene set profiles by discretizing results of gene set enrichment analyses and apply a new iterative bi-clustering algorithm (iBBiG) to identify groups of gene sets that are coordinately associated with groups of phenotypes across multiple studies. iBBiG is optimized for meta-analysis of large numbers of diverse genomics data that may have unmatched samples. It does not require prior knowledge of the number or size of clusters. When applied to simulated data, it outperforms commonly used clustering methods, discovers overlapping clusters of diverse sizes and is robust in the presence of noise. We apply it to meta-analysis of breast cancer studies, where iBBiG extracted novel gene set-phenotype association that predicted tumor metastases within tumor subtypes. Implemented in the Bioconductor package iBBiG. Contact: aedin@jimmy.harvard.edu.

  1. CAsubtype: An R Package to Identify Gene Sets Predictive of Cancer Subtypes and Clinical Outcomes.

    Science.gov (United States)

    Kong, Hualei; Tong, Pan; Zhao, Xiaodong; Sun, Jielin; Li, Hua

    2018-03-01

    In the past decade, molecular classification of cancer has gained high popularity owing to its high predictive power on clinical outcomes as compared with traditional methods commonly used in clinical practice. In particular, using gene expression profiles, recent studies have successfully identified a number of gene sets for the delineation of cancer subtypes that are associated with distinct prognosis. However, identification of such gene sets remains a laborious task due to the lack of tools with flexibility, integration and ease of use. To reduce the burden, we have developed an R package, CAsubtype, to efficiently identify gene sets predictive of cancer subtypes and clinical outcomes. By integrating more than 13,000 annotated gene sets, CAsubtype provides a comprehensive repertoire of candidates for new cancer subtype identification. For easy data access, CAsubtype further includes the gene expression and clinical data of more than 2000 cancer patients from TCGA. CAsubtype first employs principal component analysis to identify gene sets (from user-provided or package-integrated ones) with robust principal components representing significantly large variation between cancer samples. Based on these principal components, CAsubtype visualizes the sample distribution in low-dimensional space for better understanding of the distinction between samples and classifies samples into subgroups with prevalent clustering algorithms. Finally, CAsubtype performs survival analysis to compare the clinical outcomes between the identified subgroups, assessing their clinical value as potentially novel cancer subtypes. In conclusion, CAsubtype is a flexible and well-integrated tool in the R environment to identify gene sets for cancer subtype identification and clinical outcome prediction. Its simple R commands and comprehensive data sets enable efficient examination of the clinical value of any given gene set, thus facilitating hypothesis generating and testing in biological and

  2. Gene set analysis: limitations in popular existing methods and proposed improvements.

    Science.gov (United States)

    Mishra, Pashupati; Törönen, Petri; Leino, Yrjö; Holm, Liisa

    2014-10-01

    Gene set analysis is the analysis of a set of genes that collectively contribute to a biological process. Most popular gene set analysis methods are based on empirical P-value that requires large number of permutations. Despite numerous gene set analysis methods developed in the past decade, the most popular methods still suffer from serious limitations. We present a gene set analysis method (mGSZ) based on Gene Set Z-scoring function (GSZ) and asymptotic P-values. Asymptotic P-value calculation requires fewer permutations, and thus speeds up the gene set analysis process. We compare the GSZ-scoring function with seven popular gene set scoring functions and show that GSZ stands out as the best scoring function. In addition, we show improved performance of the GSA method when the max-mean statistics is replaced by the GSZ scoring function. We demonstrate the importance of both gene and sample permutations by showing the consequences in the absence of one or the other. A comparison of asymptotic and empirical methods of P-value estimation demonstrates a clear advantage of asymptotic P-value over empirical P-value. We show that mGSZ outperforms the state-of-the-art methods based on two different evaluations. We compared mGSZ results with permutation and rotation tests and show that rotation does not improve our asymptotic P-values. We also propose well-known asymptotic distribution models for three of the compared methods. mGSZ is available as R package from cran.r-project.org. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  3. Comparison of gene sets for expression profiling: prediction of metastasis from low-malignant breast cancer

    DEFF Research Database (Denmark)

    Thomassen, Mads; Tan, Qihua; Eiriksdottir, Freyja

    2007-01-01

    PURPOSE: In the low-risk group of breast cancer patients, a subgroup experiences metastatic recurrence of the disease. The aim of this study was to examine the performance of gene sets, developed mainly from high-risk tumors, in a group of low-malignant tumors. EXPERIMENTAL DESIGN: Twenty-six tumors from low-risk patients and 34 low-malignant T2 tumors from patients with slightly higher risk have been examined by genome-wide gene expression analysis. Nine prognostic gene sets were tested in this data set. RESULTS: A 32-gene profile (HUMAC32) that accurately predicts metastasis has previously... sets, mainly developed in high-risk cancers, predict metastasis from low-malignant cancer.

  4. Efficient RNAi-based gene family knockdown via set cover optimization.

    Science.gov (United States)

    Zhao, Wenzhong; Fanning, M Leigh; Lane, Terran

    2005-01-01

    We address the problem of selecting an efficient set of initiator molecules (siRNAs) for RNA interference (RNAi)-based gene family knockdown experiments. Our goal is to select a minimal set of siRNAs that (a) cover a targeted gene family or a specified subset of it, (b) do not cover any untargeted genes, and (c) are individually highly effective at inducing knockdown. We give two formalizations of the gene family knockdown problem. First, we show that the problem, minimizing the number of siRNAs required to knock down a family of genes, is NP-Hard via a reduction to the set cover problem. Second, we generalize the basic problem to incorporate additional biological constraints and optimality criteria. To solve the resulting set-cover variants, we modify the classical branch-and-bound algorithm to include some of these biological criteria. We find that in many typical cases these constraints reduce the search space enough that we are able to compute exact minimal siRNA covers within reasonable time. For larger cases, we propose a probabilistic greedy algorithm for finding minimal siRNA covers efficiently. We apply our methods to two different gene families, the FREP genes from Biomphalaria glabrata and the olfactory genes from Caenorhabditis elegans. Our computational results on real biological data show that the probabilistic greedy algorithm produces siRNA covers as good as the branch-and-bound algorithm in most cases. Both algorithms return minimal siRNA covers with high predicted probability that the selected siRNAs will be effective at inducing knockdown. We also examine the role of "off-target" interactions: the constraint of avoiding covering untargeted genes can, in some cases, substantially increase the complexity of the resulting solution. Overall, we find that in many common cases our approach significantly reduces the number of siRNAs required in gene family knockdown experiments, as compared to knocking down genes independently.
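The greedy idea described in the abstract above can be illustrated with a minimal sketch. It shows the classical deterministic greedy set-cover heuristic, not the authors' probabilistic variant or their branch-and-bound method. The function name `greedy_cover` and the siRNA and gene identifiers are hypothetical. Candidates that hit any untargeted gene are discarded first, which mirrors constraint (b).

```python
def greedy_cover(candidates, targets, untargeted=frozenset()):
    """Pick siRNAs greedily until every target gene is covered.

    candidates: dict mapping a (hypothetical) siRNA name to the set of genes
    it would knock down. Returns the chosen names in selection order.
    """
    targets = set(targets)
    untargeted = set(untargeted)
    # Constraint (b): discard any siRNA that hits an untargeted gene.
    usable = {name: set(genes) & targets
              for name, genes in candidates.items()
              if not (set(genes) & untargeted)}
    uncovered = set(targets)
    chosen = []
    while uncovered:
        # Take the siRNA covering the most still-uncovered targets;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(usable),
                   key=lambda n: len(usable[n] & uncovered),
                   default=None)
        if best is None or not (usable[best] & uncovered):
            raise ValueError("remaining targets cannot be covered")
        chosen.append(best)
        uncovered -= usable[best]
    return chosen


# Hypothetical example data: si3 is excluded because it hits x1.
example_candidates = {
    "si1": {"g1", "g2"},
    "si2": {"g2", "g3"},
    "si3": {"g3", "x1"},
    "si4": {"g3"},
}
example_cover = greedy_cover(example_candidates, {"g1", "g2", "g3"}, {"x1"})
```

Greedy selection is fast but does not guarantee a minimum cover in general, which is presumably why the abstract pairs it with an exact branch-and-bound search on smaller cases.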

  5. SiBIC: a web server for generating gene set networks based on biclusters obtained by maximal frequent itemset mining.

    Science.gov (United States)

    Takahashi, Kei-ichiro; Takigawa, Ichigaku; Mamitsuka, Hiroshi

    2013-01-01

    Detecting biclusters from expression data is useful, since biclusters are coexpressed genes under only part of all given experimental conditions. We present a software called SiBIC, which from a given expression dataset, first exhaustively enumerates biclusters, which are then merged into rather independent biclusters, which finally are used to generate gene set networks, in which a gene set assigned to one node has coexpressed genes. We evaluated each step of this procedure: 1) significance of the generated biclusters biologically and statistically, 2) biological quality of merged biclusters, and 3) biological significance of gene set networks. We emphasize that gene set networks, in which nodes are not genes but gene sets, can be more compact than usual gene networks, meaning that gene set networks are more comprehensible. SiBIC is available at http://utrecht.kuicr.kyoto-u.ac.jp:8080/miami/faces/index.jsp.

  6. Identification of a conserved set of upregulated genes in mouse skeletal muscle hypertrophy and regrowth

    Science.gov (United States)

    Chaillou, Thomas; Jackson, Janna R.; England, Jonathan H.; Kirby, Tyler J.; Richards-White, Jena; Esser, Karyn A.; Dupont-Versteegden, Esther E.

    2014-01-01

    The purpose of this study was to compare the gene expression profile of mouse skeletal muscle undergoing two forms of growth (hypertrophy and regrowth) with the goal of identifying a conserved set of differentially expressed genes. Expression profiling by microarray was performed on the plantaris muscle subjected to 1, 3, 5, 7, 10, and 14 days of hypertrophy or regrowth following 2 wk of hind-limb suspension. We identified 97 differentially expressed genes (≥2-fold increase or ≥50% decrease compared with control muscle) that were conserved during the two forms of muscle growth. The vast majority (∼90%) of the differentially expressed genes was upregulated and occurred at a single time point (64 out of 86 genes), which most often was on the first day of the time course. Microarray analysis from the conserved upregulated genes showed a set of genes related to contractile apparatus and stress response at day 1, including three genes involved in mechanotransduction and four genes encoding heat shock proteins. Our analysis further identified three cell cycle-related genes at day and several genes associated with extracellular matrix (ECM) at both days 3 and 10. In conclusion, we have identified a core set of genes commonly upregulated in two forms of muscle growth that could play a role in the maintenance of sarcomere stability, ECM remodeling, cell proliferation, fast-to-slow fiber type transition, and the regulation of skeletal muscle growth. These findings suggest conserved regulatory mechanisms involved in the adaptation of skeletal muscle to increased mechanical loading. PMID:25554798

  7. Identification of a conserved set of upregulated genes in mouse skeletal muscle hypertrophy and regrowth.

    Science.gov (United States)

    Chaillou, Thomas; Jackson, Janna R; England, Jonathan H; Kirby, Tyler J; Richards-White, Jena; Esser, Karyn A; Dupont-Versteegden, Esther E; McCarthy, John J

    2015-01-01

    The purpose of this study was to compare the gene expression profile of mouse skeletal muscle undergoing two forms of growth (hypertrophy and regrowth) with the goal of identifying a conserved set of differentially expressed genes. Expression profiling by microarray was performed on the plantaris muscle subjected to 1, 3, 5, 7, 10, and 14 days of hypertrophy or regrowth following 2 wk of hind-limb suspension. We identified 97 differentially expressed genes (≥2-fold increase or ≥50% decrease compared with control muscle) that were conserved during the two forms of muscle growth. The vast majority (∼90%) of the differentially expressed genes was upregulated and occurred at a single time point (64 out of 86 genes), which most often was on the first day of the time course. Microarray analysis from the conserved upregulated genes showed a set of genes related to contractile apparatus and stress response at day 1, including three genes involved in mechanotransduction and four genes encoding heat shock proteins. Our analysis further identified three cell cycle-related genes at day and several genes associated with extracellular matrix (ECM) at both days 3 and 10. In conclusion, we have identified a core set of genes commonly upregulated in two forms of muscle growth that could play a role in the maintenance of sarcomere stability, ECM remodeling, cell proliferation, fast-to-slow fiber type transition, and the regulation of skeletal muscle growth. These findings suggest conserved regulatory mechanisms involved in the adaptation of skeletal muscle to increased mechanical loading. Copyright © 2015 the American Physiological Society.

  8. Mechanism-based biomarker gene sets for glutathione depletion-related hepatotoxicity in rats

    International Nuclear Information System (INIS)

    Gao Weihua; Mizukawa, Yumiko; Nakatsu, Noriyuki; Minowa, Yosuke; Yamada, Hiroshi; Ohno, Yasuo; Urushidani, Tetsuro

    2010-01-01

    Chemical-induced glutathione depletion is thought to be caused by two types of toxicological mechanisms: PHO-type glutathione depletion [glutathione conjugated with chemicals such as phorone (PHO) or diethyl maleate (DEM)], and BSO-type glutathione depletion [i.e., glutathione synthesis inhibited by chemicals such as L-buthionine-sulfoximine (BSO)]. In order to identify mechanism-based biomarker gene sets for glutathione depletion in rat liver, male SD rats were treated with various chemicals including PHO (40, 120 and 400 mg/kg), DEM (80, 240 and 800 mg/kg), BSO (150, 450 and 1500 mg/kg), and bromobenzene (BBZ, 10, 100 and 300 mg/kg). Liver samples were taken 3, 6, 9 and 24 h after administration and examined for hepatic glutathione content, physiological and pathological changes, and gene expression changes using Affymetrix GeneChip Arrays. To identify differentially expressed probe sets in response to glutathione depletion, we focused on the following two courses of events for the two types of mechanisms of glutathione depletion: a) gene expression changes occurring simultaneously in response to glutathione depletion, and b) gene expression changes after glutathione was depleted. The gene expression profiles of the identified probe sets for the two types of glutathione depletion differed markedly at times during and after glutathione depletion, whereas Srxn1 was markedly increased for both types as glutathione was depleted, suggesting that Srxn1 is a key molecule in oxidative stress related to glutathione. The extracted probe sets were refined and verified using various compounds including 13 additional positive or negative compounds, and they established two useful marker sets. One contained three probe sets (Akr7a3, Trib3 and Gstp1) that could detect conjugation-type glutathione depletors any time within 24 h after dosing, and the other contained 14 probe sets that could detect glutathione depletors by any mechanism. These two sets, with appropriate scoring

  9. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    Science.gov (United States)

    2013-01-01

    Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. Methods: We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate -corrected p-values ... sets and the CTD-derived gene sets in gene expression (GE) data sets of five chemicals (from experimental models). We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared to results from the Connectivity Map. We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals. Results: Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 21 of these toxicants had a similar toxicity pattern as the

  10. Gene-Based Analysis of Regionally Enriched Cortical Genes in GWAS Data Sets of Cognitive Traits and Psychiatric Disorders

    DEFF Research Database (Denmark)

    Ersland, Kari M; Christoforou, Andrea; Stansberg, Christine

    2012-01-01

    the regionally enriched cortical genes to mine a genome-wide association study (GWAS) of the Norwegian Cognitive NeuroGenetics (NCNG) sample of healthy adults for association to nine psychometric tests measures. In addition, we explored GWAS data sets for the serious psychiatric disorders schizophrenia (SCZ) (n...

  11. Methods for evaluating clustering algorithms for gene expression data using a reference set of functional classes

    Directory of Open Access Journals (Sweden)

    Datta Somnath

    2006-08-01

    Background: A cluster analysis is the most commonly performed procedure (often regarded as a first step) on a set of gene expression profiles. In most cases, a post hoc analysis is done to see if the genes in the same clusters can be functionally correlated. While past successes of such analyses have often been reported in a number of microarray studies (most of which used the standard hierarchical clustering, UPGMA, with one minus the Pearson's correlation coefficient as a measure of dissimilarity), often times such groupings could be misleading. More importantly, a systematic evaluation of the entire set of clusters produced by such unsupervised procedures is necessary since they also contain genes that are seemingly unrelated or may have more than one common function. Here we quantify the performance of a given unsupervised clustering algorithm applied to a given microarray study in terms of its ability to produce biologically meaningful clusters using a reference set of functional classes. Such a reference set may come from prior biological knowledge specific to a microarray study or may be formed using the growing databases of gene ontologies (GO) for the annotated genes of the relevant species. Results: In this paper, we introduce two performance measures for evaluating the results of a clustering algorithm in its ability to produce biologically meaningful clusters. The first measure is a biological homogeneity index (BHI). As the name suggests, it is a measure of how biologically homogeneous the clusters are. This can be used to quantify the performance of a given clustering algorithm such as UPGMA in grouping genes for a particular data set and also for comparing the performance of a number of competing clustering algorithms applied to the same data set. The second performance measure is called a biological stability index (BSI). For a given clustering algorithm and an expression data set, it measures the consistency of the clustering
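A homogeneity measure of the kind described above can be sketched as follows. The abstract does not give the formula, so this is one plausible reading rather than the authors' definition: for each cluster, take the fraction of annotated gene pairs that share at least one functional class, then average over clusters. The function name `homogeneity_index`, the gene labels, and the class labels are hypothetical.

```python
from itertools import combinations


def homogeneity_index(clusters, annotations):
    """Average, over clusters, of the fraction of annotated gene pairs
    that share at least one functional class.

    clusters: list of lists of gene names.
    annotations: dict mapping gene name -> set of functional classes.
    """
    scores = []
    for genes in clusters:
        # Only genes with at least one functional class are compared.
        annotated = [g for g in genes if annotations.get(g)]
        pairs = list(combinations(annotated, 2))
        if not pairs:
            continue  # assumption: skip clusters with < 2 annotated genes
        shared = sum(1 for a, b in pairs if annotations[a] & annotations[b])
        scores.append(shared / len(pairs))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: the first cluster is fairly homogeneous, the second is not.
example_clusters = [["a", "b", "c"], ["d", "e"]]
example_annotations = {
    "a": {"F1"}, "b": {"F1", "F2"}, "c": {"F2"},
    "d": {"F3"}, "e": {"F4"},
}
example_bhi = homogeneity_index(example_clusters, example_annotations)
```

Under this reading a value near 1 means most co-clustered genes share a functional class, so competing clustering algorithms can be compared on the same data set by their scores.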

  12. Accurate Gene Expression-Based Biodosimetry Using a Minimal Set of Human Gene Transcripts

    Energy Technology Data Exchange (ETDEWEB)

    Tucker, James D., E-mail: jtucker@biology.biosci.wayne.edu [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Joiner, Michael C. [Department of Radiation Oncology, Wayne State University, Detroit, Michigan (United States); Thomas, Robert A.; Grever, William E.; Bakhmutsky, Marina V. [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Chinkhota, Chantelle N.; Smolinski, Joseph M. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States); Divine, George W. [Department of Public Health Sciences, Henry Ford Hospital, Detroit, Michigan (United States); Auner, Gregory W. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States)

    2014-03-15

    Purpose: Rapid and reliable methods for conducting biological dosimetry are a necessity in the event of a large-scale nuclear event. Conventional biodosimetry methods lack the speed, portability, ease of use, and low cost required for triaging numerous victims. Here we address this need by showing that polymerase chain reaction (PCR) on a small number of gene transcripts can provide accurate and rapid dosimetry. The low cost and relative ease of PCR compared with existing dosimetry methods suggest that this approach may be useful in mass-casualty triage situations. Methods and Materials: Human peripheral blood from 60 adult donors was acutely exposed to cobalt-60 gamma rays at doses of 0 (control) to 10 Gy. mRNA expression levels of 121 selected genes were obtained 0.5, 1, and 2 days after exposure by reverse-transcriptase real-time PCR. Optimal dosimetry at each time point was obtained by stepwise regression of dose received against individual gene transcript expression levels. Results: Only 3 to 4 different gene transcripts, ASTN2, CDKN1A, GDF15, and ATM, are needed to explain ≥0.87 of the variance (R²). Receiver-operator characteristics, a measure of sensitivity and specificity, of 0.98 for these statistical models were achieved at each time point. Conclusions: The actual and predicted radiation doses agree very closely up to 6 Gy. Dosimetry at 8 and 10 Gy shows some effect of saturation, thereby slightly diminishing the ability to quantify higher exposures. Analyses of these gene transcripts may be advantageous for use in a field-portable device designed to assess exposures in mass casualty situations or in clinical radiation emergencies.

  13. Investigating the effect of paralogs on microarray gene-set analysis

    Directory of Open Access Journals (Sweden)

    Seoighe Cathal

    2011-01-01

    Background: In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group genes into sets and then aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. We suspect that the presence of paralogs affects the ability of GSA methods to accurately identify the most important sets of genes for subsequent research. Results: We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. We investigate this correlation as a potential confounding factor common to current GSA methods using Indygene http://www.cbio.uct.ac.za/indygene, a web tool that reduces a supplied list of genes so that it includes no pairwise paralogy relationships above a specified sequence similarity threshold. We use the tool to reanalyse previously published microarray datasets and determine the potential utility of accounting for the presence of paralogs. Conclusions: The Indygene tool efficiently removes paralogy relationships from a given dataset and we found that such a reduction, performed prior to GSA, has the ability to generate significantly different results that often represent novel and plausible biological hypotheses. This was demonstrated for three different GSA approaches when applied to the reanalysis of previously published microarray datasets and suggests that the redundancy and non-independence of paralogs is an important consideration when dealing with GSA methodologies.

  14. Investigating the effect of paralogs on microarray gene-set analysis

    LENUS (Irish Health Repository)

    Faure, Andre J

    2011-01-24

    Background: In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group genes into sets and then aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. We suspect that the presence of paralogs affects the ability of GSA methods to accurately identify the most important sets of genes for subsequent research. Results: We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. We investigate this correlation as a potential confounding factor common to current GSA methods using Indygene http://www.cbio.uct.ac.za/indygene, a web tool that reduces a supplied list of genes so that it includes no pairwise paralogy relationships above a specified sequence similarity threshold. We use the tool to reanalyse previously published microarray datasets and determine the potential utility of accounting for the presence of paralogs. Conclusions: The Indygene tool efficiently removes paralogy relationships from a given dataset and we found that such a reduction, performed prior to GSA, has the ability to generate significantly different results that often represent novel and plausible biological hypotheses. This was demonstrated for three different GSA approaches when applied to the reanalysis of previously published microarray datasets and suggests that the redundancy and non-independence of paralogs is an important consideration when dealing with GSA methodologies.
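The list reduction described above, leaving no pair of genes above a similarity threshold, can be sketched in a few lines. The abstract does not describe how Indygene selects which paralog to keep, so the first-come greedy rule here is an assumption. The function name `drop_paralogs`, the similarity table layout, and the gene names are hypothetical.

```python
def drop_paralogs(genes, similarity, threshold):
    """Keep genes in input order, skipping any gene whose similarity to an
    already kept gene exceeds the threshold.

    similarity: dict mapping frozenset({gene_a, gene_b}) -> similarity score;
    pairs missing from the table are treated as unrelated (score 0.0).
    """
    kept = []
    for gene in genes:
        if all(similarity.get(frozenset((gene, k)), 0.0) <= threshold
               for k in kept):
            kept.append(gene)
    return kept


# Hypothetical example: A1 and A2 are close paralogs, B1 is unrelated.
example_similarity = {
    frozenset(("A1", "A2")): 0.9,
    frozenset(("A1", "B1")): 0.2,
}
example_reduced = drop_paralogs(["A1", "A2", "B1"], example_similarity, 0.7)
```

The reduced list would then be passed to a gene-set analysis method in place of the full list.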

  15. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets.

    Science.gov (United States)

    Khan, Aziz; Mathelier, Anthony

    2017-05-31

    A common task for scientists relies on comparing lists of genes or genomic regions derived from high-throughput sequencing experiments. While several tools exist to intersect and visualize sets of genes, similar tools dedicated to the visualization of genomic region sets are currently limited. To address this gap, we have developed the Intervene tool, which provides an easy and automated interface for the effective intersection and visualization of genomic region or list sets, thus facilitating their analysis and interpretation. Intervene contains three modules: venn to generate Venn diagrams of up to six sets, upset to generate UpSet plots of multiple sets, and pairwise to compute and visualize intersections of multiple sets as clustered heat maps. Intervene, and its interactive web ShinyApp companion, generate publication-quality figures for the interpretation of genomic region and list sets. Intervene and its web application companion provide an easy command line and an interactive web interface to compute intersections of multiple genomic and list sets. They have the capacity to plot intersections using easy-to-interpret visual approaches. Intervene is developed and designed to meet the needs of both computer scientists and biologists. The source code is freely available at https://bitbucket.org/CBGR/intervene , with the web application available at https://asntech.shinyapps.io/intervene .
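The pairwise comparison that the abstract describes can be illustrated with plain Python sets. This sketch does not use Intervene or reproduce its command-line interface. It only counts shared items for every pair of named lists, which is the kind of matrix a clustered heat map would display. The function name `pairwise_intersections` and the list names are hypothetical.

```python
from itertools import combinations


def pairwise_intersections(named_sets):
    """Return {(name_a, name_b): number of shared items} for every pair,
    with names ordered alphabetically inside each key."""
    return {(a, b): len(named_sets[a] & named_sets[b])
            for a, b in combinations(sorted(named_sets), 2)}


# Hypothetical gene lists from three experiments.
example_lists = {
    "x": {"TP53", "MYC", "KRAS"},
    "y": {"MYC", "KRAS", "EGFR"},
    "z": {"EGFR"},
}
example_overlaps = pairwise_intersections(example_lists)
```

Genomic regions would need interval-overlap logic rather than exact set membership, so this applies to gene name lists only.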

  16. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data.

    Science.gov (United States)

    Hettne, Kristina M; Boorsma, André; van Dartel, Dorien A M; Goeman, Jelle J; de Jong, Esther; Piersma, Aldert H; Stierum, Rob H; Kleinjans, Jos C; Kors, Jan A

    2013-01-29

    Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate -corrected p-values ... data sets of five chemicals (from experimental models). We tested for SDE of gene sets for six fibrates in a peroxisome proliferator-activated receptor alpha (PPARA) knock-out GE dataset and compared to results from the Connectivity Map. We tested for SDE of 319 next-gen TM-derived gene sets for environmental toxicants in three GE data sets of triazoles, and tested for SDE of 442 gene sets associated with embryonic structures. We compared the gene sets to triazole effects seen in the Whole Embryo Culture (WEC), and used principal component analysis (PCA) to discriminate triazoles from other chemicals. Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 21 of these toxicants had a similar toxicity pattern as the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other

  17. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    Directory of Open Access Journals (Sweden)

    Hettne Kristina M

    2013-01-01

    Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. Methods: We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate -corrected p-values ... Results: Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 21 of these toxicants had a similar toxicity pattern as the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals. Conclusions: Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect.

  18. Optimal structural inference of signaling pathways from unordered and overlapping gene sets.

    Science.gov (United States)

    Acharya, Lipi R; Judeh, Thair; Wang, Guangdi; Zhu, Dongxiao

    2012-02-15

    A plethora of bioinformatics analysis has led to the discovery of numerous gene sets, which can be interpreted as discrete measurements emitted from latent signaling pathways. Their potential to infer signaling pathway structures, however, has not been sufficiently exploited. Existing methods accommodating discrete data do not explicitly consider signal cascading mechanisms that characterize a signaling pathway. Novel computational methods are thus needed to fully utilize gene sets and broaden the scope from focusing only on pairwise interactions to the more general cascading events in the inference of signaling pathway structures. We propose a gene set based simulated annealing (SA) algorithm for the reconstruction of signaling pathway structures. A signaling pathway structure is a directed graph containing up to a few hundred nodes and many overlapping signal cascades, where each cascade represents a chain of molecular interactions from the cell surface to the nucleus. Gene sets in our context refer to discrete sets of genes participating in signal cascades, the basic building blocks of a signaling pathway, with no prior information about gene orderings in the cascades. From a compendium of gene sets related to a pathway, SA aims to search for signal cascades that characterize the optimal signaling pathway structure. In the search process, the extent of overlap among signal cascades is used to measure the optimality of a structure. Throughout, we treat gene sets as random samples from a first-order Markov chain model. We evaluated the performance of SA in three case studies. In the first study conducted on 83 KEGG pathways, SA demonstrated a significantly better performance than Bayesian network methods. Since both SA and Bayesian network methods accommodate discrete data, use a 'search and score' network learning strategy and output a directed network, they can be compared in terms of performance and computational time. In the second study, we compared SA and

  19. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    NARCIS (Netherlands)

    K.M. Hettne (Kristina); J. Boorsma (Jeffrey); D.A.M. van Dartel (Dorien A M); J.J. Goeman (Jelle); E.C. de Jong (Esther); A.H. Piersma (Aldert); R.H. Stierum (Rob); J. Kleinjans (Jos); J.A. Kors (Jan)

    2013-01-01

    Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with

  20. Meta-analysis of differentiating mouse embryonic stem cell gene expression kinetics reveals early change of a small gene set.

    Directory of Open Access Journals (Sweden)

    Clive H Glover

    2006-11-01

    Stem cell differentiation involves critical changes in gene expression. Identification of these should provide endpoints useful for optimizing stem cell propagation as well as potential clues about mechanisms governing stem cell maintenance. Here we describe the results of a new meta-analysis methodology applied to multiple gene expression datasets from three mouse embryonic stem cell (ESC) lines obtained at specific time points during the course of their differentiation into various lineages. We developed methods to identify genes with expression changes that correlated with the altered frequency of functionally defined, undifferentiated ESC in culture. In each dataset, we computed a novel statistical confidence measure for every gene which captured the certainty that a particular gene exhibited an expression pattern of interest within that dataset. This permitted a joint analysis of the datasets, despite the different experimental designs. Using a ranking scheme that favored genes exhibiting patterns of interest, we focused on the top 88 genes whose expression was consistently changed when ESC were induced to differentiate. Seven of these (103728_at, 8430410A17Rik, Klf2, Nr0b1, Sox2, Tcl1, and Zfp42) showed a rapid decrease in expression concurrent with a decrease in frequency of undifferentiated cells and remained predictive when evaluated in additional maintenance and differentiating protocols. Through a novel meta-analysis, this study identifies a small set of genes whose expression is useful for identifying changes in stem cell frequencies in cultures of mouse ESC. The methods and findings have broader applicability to understanding the regulation of self-renewal of other stem cell types.

  1. FUNC: a package for detecting significant associations between gene sets and ontological annotations

    Directory of Open Access Journals (Sweden)

    Rahm Erhard

    2007-02-01

    Background: Genome-wide expression, sequence and association studies typically yield large sets of gene candidates, which must then be further analysed and interpreted. Information about these genes is increasingly being captured and organized in ontologies, such as the Gene Ontology. Relationships between the gene sets identified by experimental methods and biological knowledge can be made explicit and used in the interpretation of results. However, it is often difficult to assess the statistical significance of such analyses since many inter-dependent categories are tested simultaneously. Results: We developed the program package FUNC that includes and expands on currently available methods to identify significant associations between gene sets and ontological annotations. It implements several tests that are particularly well suited to genome-wide sequence comparisons, estimates of the family-wise error rate and the false discovery rate, a sensitive estimator of the global significance of the results, and an algorithm to reduce the complexity of the results. Conclusion: FUNC is a versatile and useful tool for the analysis of genome-wide data. It is freely available under the GPL license and also accessible via a web service.
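
    To make the idea of testing a gene set against an ontology category concrete, here is a minimal sketch of a generic over-representation test followed by a false discovery rate adjustment. It is not FUNC's implementation: the abstract does not specify its tests, so the hypergeometric upper tail, the Benjamini-Hochberg procedure, and all function and parameter names are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, n_universe, n_category, n_selected):
    """P(X >= k) when n_selected genes are drawn without replacement from
    n_universe genes, of which n_category are annotated to the category."""
    upper = min(n_category, n_selected)
    total = comb(n_universe, n_selected)
    favourable = sum(
        comb(n_category, i) * comb(n_universe - n_category, n_selected - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

    One p-value per category would be computed with `hypergeom_upper_tail` and the whole list passed to `benjamini_hochberg`. Because ontology categories are nested and therefore dependent, which is the difficulty the abstract points out, a resampling-based estimate may be more appropriate than this simple adjustment.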

  2. Transcriptomic sequencing reveals a set of unique genes activated by butyrate-induced histone modification

    Science.gov (United States)

    Butyrate is a nutritional element with strong epigenetic regulatory activity as an inhibitor of histone deacetylases (HDACs). Based on the analysis of differentially expressed genes induced by butyrate in the bovine epithelial cell using deep RNA-sequencing technology (RNA-seq), a set of unique gen...

  3. Glutamatergic and GABAergic gene sets in attention-deficit/hyperactivity disorder

    DEFF Research Database (Denmark)

    Naaijen, Jill; Bralten, Janita; Poelmans, Geert

    2017-01-01

    Attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorders (ASD) often co-occur. Both are highly heritable; however, it has been difficult to discover genetic risk variants. Glutamate and GABA are main excitatory and inhibitory neurotransmitters in the brain; their balance...... is essential for proper brain development and functioning. In this study we investigated the role of glutamate and GABA genetics in ADHD severity, autism symptom severity and inhibitory performance, based on gene set analysis, an approach to investigate multiple genetic variants simultaneously. Common variants......, autism symptom severity and inhibition were performed using principal component regression analyses. Subsequently, gene-wide association analyses were performed. The glutamate gene set showed an association with severity of hyperactivity/impulsivity (P=0.009), which was robust to correcting for genome...

  4. Can survival prediction be improved by merging gene expression data sets?

    Directory of Open Access Journals (Sweden)

    Haleh Yasrebi

    BACKGROUND: High-throughput gene expression profiling technologies, which generate a wealth of data, are increasingly used for characterization of tumor biopsies for clinical trials. By applying machine learning algorithms to such clinically documented data sets, one hopes to improve tumor diagnosis, prognosis, as well as prediction of treatment response. However, the limited number of patients enrolled in a single trial study limits the power of machine learning approaches due to over-fitting. One could partially overcome this limitation by merging data from different studies. Nevertheless, such data sets differ from each other with regard to technical biases, patient selection criteria and follow-up treatment. It is therefore not clear at all whether the advantage of increased sample size outweighs the disadvantage of higher heterogeneity of merged data sets. Here, we present a systematic study to answer this question specifically for breast cancer data sets. We use survival prediction based on Cox regression as an assay to measure the added value of merged data sets. RESULTS: Using time-dependent Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) and hazard ratio as performance measures, we see overall no significant improvement or deterioration of survival prediction with merged data sets as compared to individual data sets. This apparently was due to the fact that a few genes with strong prognostic power were not available on all microarray platforms and thus were not retained in the merged data sets. Surprisingly, we found that the overall best performance was achieved with a single-gene predictor consisting of CYB5D1. CONCLUSIONS: Merging did not deteriorate performance on average despite (a) the diversity of microarray platforms used, (b) the heterogeneity of patient cohorts, (c) the heterogeneity of breast cancer disease, (d) substantial variation of time to death or relapse, (e) the reduced number of genes in the merged data

  5. Inferring phylogenies with incomplete data sets: a 5-gene, 567-taxon analysis of angiosperms

    Directory of Open Access Journals (Sweden)

    Hilu Khidir W

    2009-03-01

    Background: Phylogenetic analyses of angiosperm relationships have used only a small percentage of available sequence data, but phylogenetic data matrices often can be augmented with existing data, especially if one allows missing characters. We explore the effects on phylogenetic analyses of adding 378 matK sequences and 240 26S rDNA sequences to the complete 3-gene, 567-taxon angiosperm phylogenetic matrix of Soltis et al. Results: We performed maximum likelihood bootstrap analyses of the complete, 3-gene 567-taxon data matrix and the incomplete, 5-gene 567-taxon data matrix. Although the 5-gene matrix has more missing data (27.5%) than the 3-gene data matrix (2.9%), the 5-gene analysis resulted in higher levels of bootstrap support. Within the 567-taxon tree, the increase in support is most evident for relationships among the 170 taxa for which both matK and 26S rDNA sequences were added, and there is little gain in support for relationships among the 119 taxa having neither matK nor 26S rDNA sequences. The 5-gene analysis also places the enigmatic Hydrostachys in Lamiales (BS = 97%) rather than in Cornales (BS = 100% in 3-gene analysis). The placement of Hydrostachys in Lamiales is unprecedented in molecular analyses, but it is consistent with embryological and morphological data. Conclusion: Adding available, and often incomplete, sets of sequences to existing data sets can be a fast and inexpensive way to increase support for phylogenetic relationships and produce novel and credible new phylogenetic hypotheses.

  6. Evidence for intron length conservation in a set of mammalian genes associated with embryonic development

    LENUS (Irish Health Repository)

    2011-10-05

    Background: We carried out an analysis of intron length conservation across a diverse group of nineteen mammalian species. Motivated by recent research suggesting a role for time delays associated with intron transcription in gene expression oscillations required for early embryonic patterning, we searched for examples of genes that showed the most extreme conservation of total intron content in mammals. Results: Gene sets annotated as being involved in pattern specification in the early embryo, or containing the homeobox DNA-binding domain, were significantly enriched among genes with highly conserved intron content. We used ancestral sequences reconstructed with probabilistic models that account for insertion and deletion mutations to distinguish insertion and deletion events on lineages leading to human and mouse from their last common ancestor. Using a randomization procedure, we show that genes containing the homeobox domain show less change in intron content than expected, given the number of insertion and deletion events within their introns. Conclusions: Our results suggest selection for gene expression precision or the existence of additional development-associated genes for which transcriptional delay is functionally significant.

  7. Using OWL reasoning to support the generation of novel gene sets for enrichment analysis.

    Science.gov (United States)

    Osumi-Sutherland, David J; Ponta, Enrico; Courtot, Melanie; Parkinson, Helen; Badi, Laura

    2018-02-14

    The Gene Ontology (GO) consists of over 40,000 terms for biological processes, cell components and gene product activities linked into a graph structure by over 90,000 relationships. It has been used to annotate the functions and cellular locations of several million gene products. The graph structure is used by a variety of tools to group annotated genes into sets whose products share function or location. These gene sets are widely used to interpret the results of genomics experiments by assessing which sets are significantly over- or under-represented in results lists. F Hoffmann-La Roche Ltd. has developed a bespoke, manually maintained controlled vocabulary (RCV) for use in over-representation analysis. Many terms in this vocabulary group GO terms in novel ways that cannot easily be derived using the graph structure of the GO. For example, some RCV terms group GO terms by the cell, chemical or tissue type they refer to. Recent improvements in the content and formal structure of the GO make it possible to use logical queries in Web Ontology Language (OWL) to automatically map these cross-cutting classifications to sets of GO terms. We used this approach to automate mapping between RCV and GO, largely replacing the increasingly unsustainable manual mapping process. We then tested the utility of the resulting groupings for over-representation analysis. We successfully mapped 85% of RCV terms to logical OWL definitions and showed that these could be used to recapitulate and extend manual mappings between RCV terms and the sets of GO terms subsumed by them. We also show that gene sets derived from the resulting GO terms sets can be used to detect the signatures of cell and tissue types in whole genome expression data. The rich formal structure of the GO makes it possible to use reasoning to dynamically generate novel, biologically relevant groupings of GO terms. GO term groupings generated with this approach can be used in over-representation analysis to detect

  8. Shrinkage covariance matrix approach based on robust trimmed mean in gene sets detection

    Science.gov (United States)

    Karjanto, Suryaefiza; Ramli, Norazan Mohamed; Ghani, Nor Azura Md; Aripin, Rasimah; Yusop, Noorezatty Mohd

    2015-02-01

    Microarray technology involves placing an orderly arrangement of thousands of gene sequences in a grid on a suitable surface. The technology has enabled novel discoveries since its development and has attracted increasing attention among researchers. Its widespread use is largely due to its ability to analyse thousands of genes simultaneously, in a massively parallel manner, in one experiment. Hence, it provides valuable knowledge on gene interaction and function. A microarray data set typically consists of tens of thousands of genes (variables) from just dozens of samples due to various constraints. Therefore, the sample covariance matrix in Hotelling's T² statistic is not positive definite and becomes singular, so it cannot be inverted. In this research, Hotelling's T² statistic is combined with a shrinkage approach as an alternative estimator of the covariance matrix to detect significant gene sets. The shrinkage covariance matrix overcomes the singularity problem by converting an unbiased estimator of the covariance matrix into an improved biased one. A robust trimmed mean is integrated into the shrinkage matrix to reduce the influence of outliers and consequently increase its efficiency. The performance of the proposed method is measured using several simulation designs. The results are expected to outperform existing techniques in many tested conditions.
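
    The singularity problem and its shrinkage remedy can be written out as follows. The abstract gives no formulas, so the notation is an assumption: a two-sample setting with group sizes n_1 and n_2, group mean vectors \bar{x}_1 and \bar{x}_2 over the p genes of a set, pooled sample covariance S, a positive definite shrinkage target T_target (for example the diagonal of S), and shrinkage intensity \lambda.

```latex
T^2 = \frac{n_1 n_2}{n_1 + n_2}\,
      (\bar{x}_1 - \bar{x}_2)^{\top} S^{-1} (\bar{x}_1 - \bar{x}_2),
\qquad
S^{*} = \lambda\, T_{\mathrm{target}} + (1 - \lambda)\, S,
\quad 0 < \lambda \le 1 .
```

    When p exceeds n_1 + n_2 - 2, S has rank below p and S^{-1} does not exist. S^{*} is the sum of a positive definite matrix and a positive semidefinite one, so it is positive definite and can be inverted; the shrinkage version of the statistic replaces S^{-1} with (S^{*})^{-1}. In the method described above, the means entering these quantities would additionally be replaced by robust trimmed means.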

  9. Mining tissue specificity, gene connectivity and disease association to reveal a set of genes that modify the action of disease causing genes

    Directory of Open Access Journals (Sweden)

    Reverter Antonio

    2008-09-01

    Background: The tissue specificity of gene expression has been linked to a number of significant outcomes including level of expression, and differential rates of polymorphism, evolution and disease association. Recent studies have also shown the importance of exploring differential gene connectivity and sequence conservation in the identification of disease-associated genes. However, no study relates gene interactions with tissue specificity and disease association. Methods: We adopted an a priori approach making as few assumptions as possible to analyse the interplay among gene-gene interactions with tissue specificity and its subsequent likelihood of association with disease. We mined three large datasets comprising expression data drawn from massively parallel signature sequencing across 32 tissues, describing a set of 55,606 true positive interactions for 7,197 genes, and microarray expression results generated during the profiling of systemic inflammation, from which 126,543 interactions among 7,090 genes were reported. Results: Amongst the myriad of complex relationships identified between expression, disease, connectivity and tissue specificity, some interesting patterns emerged. These include elevated rates of expression and network connectivity in housekeeping and disease-associated tissue-specific genes. We found that disease-associated genes are more likely to show tissue specific expression and most frequently interact with other disease genes. Using the thresholds defined in these observations, we develop a guilt-by-association algorithm and discover a group of 112 non-disease annotated genes that predominantly interact with disease-associated genes, impacting on disease outcomes. Conclusion: We conclude that parameters such as tissue specificity and network connectivity can be used in combination to identify a group of genes, not previously confirmed as disease causing, that are involved in interactions with disease causing

  10. Interrogating differences in expression of targeted gene sets to predict breast cancer outcome.

    Science.gov (United States)

    Andres, Sarah A; Brock, Guy N; Wittliff, James L

    2013-07-02

    Genomics provides opportunities to develop precise tests for diagnostics, therapy selection and monitoring. From analyses of our studies and those of published results, 32 candidate genes were identified, whose expression appears related to clinical outcome of breast cancer. Expression of these genes was validated by qPCR and correlated with clinical follow-up to identify a gene subset for development of a prognostic test. RNA was isolated from 225 frozen invasive ductal carcinomas, and qRT-PCR was performed. Univariate hazard ratios and 95% confidence intervals for breast cancer mortality and recurrence were calculated for each of the 32 candidate genes. A multivariable gene expression model for predicting each outcome was determined using the LASSO, with 1000 splits of the data into training and testing sets to determine predictive accuracy based on the C-index. Models with gene expression data were compared to models with standard clinical covariates and models with both gene expression and clinical covariates. Univariate analyses revealed over-expression of RABEP1, PGR, NAT1, PTP4A2, SLC39A6, ESR1, EVL, TBC1D9, FUT8, and SCUBE2 were all associated with reduced time to disease-related mortality (HR between 0.8 and 0.91, adjusted p < 0.05), while RABEP1, PGR, SLC39A6, and FUT8 were also associated with reduced recurrence times. Multivariable analyses using the LASSO revealed PGR, ESR1, NAT1, GABRP, TBC1D9, SLC39A6, and LRBA to be the most important predictors for both disease mortality and recurrence. Median C-indexes on test data sets for the gene expression, clinical, and combined models were 0.65, 0.63, and 0.65 for disease mortality and 0.64, 0.63, and 0.66 for disease recurrence, respectively. Molecular signatures consisting of five genes (PGR, GABRP, TBC1D9, SLC39A6 and LRBA) for disease mortality and of six genes (PGR, ESR1, GABRP, TBC1D9, SLC39A6 and LRBA) for disease recurrence were identified. These signatures were as effective as standard clinical parameters in predicting recurrence/mortality, and when combined, offered some improvement relative to clinical information alone for disease recurrence (median difference in C-values of 0.03, 95% CI of -0.08 to 0.13). Collectively, results suggest that these genes form the basis for a clinical
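
    Since predictive accuracy here is reported as a C-index, a minimal sketch of how such a value can be computed may help in reading the 0.63 to 0.66 range. This is a plain version of Harrell's concordance index, not the study's code: pairs with tied times are ignored, tied risk scores count one half, and the function and argument names are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs in which the subject
    with the shorter observed time has the higher predicted risk.

    times       -- observed follow-up times
    events      -- 1/True if the event occurred, 0/False if censored
    risk_scores -- model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so C-indexes near 0.65 indicate modest discrimination for all three model types.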

  11. Interrogating differences in expression of targeted gene sets to predict breast cancer outcome

    International Nuclear Information System (INIS)

    Andres, Sarah A; Brock, Guy N; Wittliff, James L

    2013-01-01

    Genomics provides opportunities to develop precise tests for diagnostics, therapy selection and monitoring. From analyses of our studies and those of published results, 32 candidate genes were identified, whose expression appears related to clinical outcome of breast cancer. Expression of these genes was validated by qPCR and correlated with clinical follow-up to identify a gene subset for development of a prognostic test. RNA was isolated from 225 frozen invasive ductal carcinomas, and qRT-PCR was performed. Univariate hazard ratios and 95% confidence intervals for breast cancer mortality and recurrence were calculated for each of the 32 candidate genes. A multivariable gene expression model for predicting each outcome was determined using the LASSO, with 1000 splits of the data into training and testing sets to determine predictive accuracy based on the C-index. Models with gene expression data were compared to models with standard clinical covariates and models with both gene expression and clinical covariates. Univariate analyses revealed over-expression of RABEP1, PGR, NAT1, PTP4A2, SLC39A6, ESR1, EVL, TBC1D9, FUT8, and SCUBE2 were all associated with reduced time to disease-related mortality (HR between 0.8 and 0.91, adjusted p < 0.05), while RABEP1, PGR, SLC39A6, and FUT8 were also associated with reduced recurrence times. Multivariable analyses using the LASSO revealed PGR, ESR1, NAT1, GABRP, TBC1D9, SLC39A6, and LRBA to be the most important predictors for both disease mortality and recurrence. Median C-indexes on test data sets for the gene expression, clinical, and combined models were 0.65, 0.63, and 0.65 for disease mortality and 0.64, 0.63, and 0.66 for disease recurrence, respectively. Molecular signatures consisting of five genes (PGR, GABRP, TBC1D9, SLC39A6 and LRBA) for disease mortality and of six genes (PGR, ESR1, GABRP, TBC1D9, SLC39A6 and LRBA) for disease recurrence were identified. These signatures were as effective as standard clinical

  12. Coverage and characteristics of the Affymetrix GeneChip Human Mapping 100K SNP set.

    Directory of Open Access Journals (Sweden)

    2006-05-01

    Improvements in technology have made it possible to conduct genome-wide association mapping at costs within reach of academic investigators, and experiments are currently being conducted with a variety of high-throughput platforms. To provide an appropriate context for interpreting results of such studies, we summarize here results of an investigation of one of the first of these technologies to be publicly available, the Affymetrix GeneChip Human Mapping 100K set of single nucleotide polymorphisms (SNPs). In a systematic analysis of the pattern and distribution of SNPs in the Mapping 100K set, we find that SNPs in this set are undersampled from coding regions (both nonsynonymous and synonymous) and oversampled from regions outside genes, relative to SNPs in the overall HapMap database. In addition, we utilize a novel multilocus linkage disequilibrium (LD) coefficient based on information content (analogous to the information content scores commonly used for linkage mapping) that is equivalent to the familiar measure r² in the special case of two loci. Using this approach, we are able to summarize for any subset of markers, such as the Affymetrix Mapping 100K set, the information available for association mapping in that subset, relative to the information available in the full set of markers included in the HapMap, and highlight circumstances in which this multilocus measure of LD provides substantial additional insight about the haplotype structure in a region over pairwise measures of LD.
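
    For reference, the familiar two-locus measure that the multilocus coefficient reduces to can be computed as in the sketch below. It covers only the standard pairwise r² from haplotype and allele frequencies; the paper's multilocus, information-based coefficient is not reproduced, and the function and argument names are assumptions.

```python
def pairwise_r2(p_ab, p_a, p_b):
    """Squared correlation between alleles at two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A and allele B
    p_a  -- frequency of allele A at the first locus
    p_b  -- frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom
```

    With `p_a = p_b = 0.5`, a haplotype frequency of 0.25 gives r² = 0 (no association) and 0.5 gives r² = 1 (one SNP fully predicts the other), which is why a genotyped marker in high r² with an untyped variant carries most of the information needed for association mapping.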

  13. Genome-wide identification, phylogenetic and co-expression analysis of OsSET gene family in rice.

    Directory of Open Access Journals (Sweden)

    Zhanhua Lu

    BACKGROUND: The SET domain is responsible for the catalytic activity of histone lysine methyltransferases (HKMTs) during developmental processes. Histone lysine methylation plays a crucial and diverse regulatory function in chromatin organization and genome function. Although several SET genes have been identified and characterized in plants, the understanding of the OsSET gene family in rice is still very limited. METHODOLOGY/PRINCIPAL FINDINGS: In this study, a systematic analysis was performed and revealed the presence of at least 43 SET genes in the rice genome. Phylogenetic and structural analysis grouped SET proteins into five classes, and suggested that the domains outside the SET domain are important for the specificity of histone lysine methylation, as well as for the recognition of methylated histone lysine. Based on global microarray data, gene expression profiles revealed that the transcripts of OsSET genes accumulated differentially during vegetative and reproductive developmental stages and were preferentially up- or down-regulated in different tissues. Cis-element identification, co-expression analysis and GO analysis of expression correlation of 12 OsSET genes suggested that OsSET genes might be involved in cell cycle regulation and feedback. CONCLUSIONS/SIGNIFICANCE: This study will facilitate further studies on the OsSET family and provide useful clues for functional validation of OsSETs.

  14. ADAGE signature analysis: differential expression analysis with data-defined gene sets.

    Science.gov (United States)

    Tan, Jie; Huyck, Matthew; Hu, Dongbo; Zelaya, René A; Hogan, Deborah A; Greene, Casey S

    2017-11-22

    Gene set enrichment analysis and overrepresentation analyses are commonly used methods to determine the biological processes affected by a differential expression experiment. This approach requires biologically relevant gene sets, which are currently curated manually, limiting their availability and accuracy in many organisms without extensively curated resources. New feature learning approaches can now be paired with existing data collections to directly extract functional gene sets from big data. Here we introduce a method to identify perturbed processes. In contrast with methods that use curated gene sets, this approach uses signatures extracted from public expression data. We first extract expression signatures from public data using ADAGE, a neural network-based feature extraction approach. We next identify signatures that are differentially active under a given treatment. Our results demonstrate that these signatures represent biological processes that are perturbed by the experiment. Because these signatures are directly learned from data without supervision, they can identify uncurated or novel biological processes. We implemented ADAGE signature analysis for the bacterial pathogen Pseudomonas aeruginosa. For the convenience of different user groups, we implemented both an R package (ADAGEpath) and a web server ( http://adage.greenelab.com ) to run these analyses. Both are open-source to allow easy expansion to other organisms or signature generation methods. We applied ADAGE signature analysis to an example dataset in which wild-type and ∆anr mutant cells were grown as biofilms on the Cystic Fibrosis genotype bronchial epithelial cells. We mapped active signatures in the dataset to KEGG pathways and compared with pathways identified using GSEA. The two approaches generally return consistent results; however, ADAGE signature analysis also identified a signature that revealed the molecularly supported link between the MexT regulon and Anr. We designed

  15. Gene set-based module discovery in the breast cancer transcriptome

    Directory of Open Access Journals (Sweden)

    Zhang Michael Q

    2009-02-01

    Background: Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results: In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated with the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression modules in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion: These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.

  16. A cross-study gene set enrichment analysis identifies critical pathways in endometriosis

    Directory of Open Access Journals (Sweden)

    Bai Chunyan

    2009-09-01

    Background: Endometriosis is an enigmatic disease. Gene expression profiling of endometriosis has been used in several studies, but few studies went further to classify subtypes of endometriosis based on expression patterns and to identify possible pathways involved in endometriosis. Some of the observed pathways are more inconsistent between the studies, and these candidate pathways presumably only represent a fraction of the pathways involved in endometriosis. Methods: We applied a standardised microarray preprocessing and gene set enrichment analysis to six independent studies, and demonstrated increased concordance between these gene datasets. Results: We find 16 up-regulated and 19 down-regulated pathways common in ovarian endometriosis data sets, and 22 up-regulated and one down-regulated pathway common in peritoneal endometriosis data sets. Among them, 12 up-regulated and 1 down-regulated were found consistent between ovarian and peritoneal endometriosis. The main canonical pathways identified are related to immunological and inflammatory disease. Early secretory phase has the most over-represented pathways in the three uterine cycle phases. There are no overlapping significant pathways between the dataset from human endometrial endothelial cells and the datasets from ovarian endometriosis which used whole tissues. Conclusion: The study of complex diseases through pathway analysis is able to highlight genes weakly connected to the phenotype which may be difficult to detect by using classical univariate statistics. By standardised microarray preprocessing and GSEA, we have increased the concordance in identifying many biological mechanisms involved in endometriosis. The identified gene pathways will shed light on the understanding of endometriosis and promote the development of novel therapies.

  17. MEGA-V: detection of variant gene sets in patient cohorts.

    Science.gov (United States)

    Gambardella, Gennaro; Cereda, Matteo; Benedetti, Lorena; Ciccarelli, Francesca D

    2017-04-15

    Detecting significant associations between genetic variants and disease may prove particularly challenging when the variants are rare in the population and/or act together with other variants to cause the disease. We have developed a statistical framework named Mutation Enrichment Gene set Analysis of Variants (MEGA-V) that specifically detects the enrichments of genetic alterations within a process in a cohort of interest. By focusing on the mutations of several genes contributing to the same function rather than on those affecting a single gene, MEGA-V increases the power to detect statistically significant associations. MEGA-V is available at https://github.com/ciccalab/MEGA. Contact: francesca.ciccarelli@kcl.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  18. Gene expression risk signatures maintain prognostic power in multiple myeloma despite microarray probe set translation

    DEFF Research Database (Denmark)

    Hermansen, N E U; Borup, R; Andersen, M K

    2016-01-01

    INTRODUCTION: Gene expression profiling (GEP) risk models in multiple myeloma are based on 3'-end microarrays. We hypothesized that GEP risk signatures could retain prognostic power despite being translated and applied to whole-transcript microarray data. METHODS: We studied CD138-positive bone...... signatures maintain significant prognostic power in HDT myeloma patients. We suggest probe set matching for GEP risk signature translation as part of the efforts towards a microarray-independent GEP risk standard. (ClinicalTrials.gov identifier: NCT00639054)....

  19. Identification of a core set of rhizobial infection genes using data from single cell-types

    Directory of Open Access Journals (Sweden)

    Da-Song Chen

    2015-07-01

    Genome-wide expression studies on nodulation have varied in their scale from entire root systems to dissected nodules or root sections containing nodule primordia. More recently efforts have focused on developing methods for isolation of root hairs from infected plants and the application of laser-capture microdissection technology to nodules. Here we analyze two published data sets to identify a core set of infection genes that are expressed in the nodule and in root hairs during infection. Among the genes identified were those encoding phenylpropanoid biosynthesis enzymes including Chalcone-O-Methyltransferase which is required for the production of the potent Nod gene inducer 4’,4-dihydroxy-2-methoxychalcone. A promoter-GUS analysis in transgenic hairy roots for two genes encoding Chalcone-O-Methyltransferase isoforms revealed their expression in rhizobially infected root hairs and the nodule infection zone but not in the nitrogen fixation zone. We also describe a group of Rhizobially Induced Peroxidases whose expression overlaps with the production of superoxide in rhizobially infected root hairs and in nodules and roots. Finally, we identify a cohort of co-regulated transcription factors as candidate regulators of these processes.

  20. Normalization of oligonucleotide arrays based on the least-variant set of genes

    Science.gov (United States)

    Calza, Stefano; Valentini, Davide; Pawitan, Yudi

    2008-01-01

    Background: It is well known that the normalization step of microarray data makes a difference in the downstream analysis. All normalization methods rely on certain assumptions, so differences in results can be traced to different sensitivities to violation of the assumptions. Illustrating the lack of robustness, in a striking spike-in experiment all existing normalization methods fail because of an imbalance between up- and down-regulated genes. This means it is still important to develop a normalization method that is robust against violation of the standard assumptions. Results: We develop a new algorithm based on identification of the least-variant set (LVS) of genes across the arrays. The array-to-array variation is evaluated in the robust linear model fit of pre-normalized probe-level data. The genes are then used as a reference set for a non-linear normalization. The method is applicable to any existing expression summaries, such as MAS5 or RMA. Conclusion: We show that LVS normalization outperforms other normalization methods when the standard assumptions are not satisfied. In the complex spike-in study, LVS performs similarly to the ideal (in practice unknown) housekeeping-gene normalization. An R package called lvs is available in . PMID:18318917

  1. Identification of the Core Set of Carbon-Associated Genes in a Bioenergy Grassland Soil.

    Directory of Open Access Journals (Sweden)

    Adina Howe

    Despite the central role of soil microbial communities in global carbon (C) cycling, little is known about soil microbial community structure and even less about their metabolic pathways. Efforts to characterize soil communities often focus on identifying differences in gene content across environmental gradients, but an alternative question is what genes are similar in soils. These genes may indicate critical species or potential functions that are required in all soils. Here we identified the "core" set of C cycling sequences widely present in multiple soil metagenomes from a fertilized prairie (FP). Of 226,887 sequences associated with known enzymes involved in the synthesis, metabolism, and transport of carbohydrates, 843 were identified to be consistently prevalent across four replicate soil metagenomes. This core metagenome was functionally and taxonomically diverse, representing five enzyme classes and 99 enzyme families within the CAZy database. Though it only comprised 0.4% of all CAZy-associated genes identified in FP metagenomes, the core was found to be comprised of functions similar to those within cumulative soils. The FP CAZy-associated core sequences were present in multiple publicly available soil metagenomes and most similar to soils sharing geographic proximity. In soil ecosystems, where high diversity remains a key challenge for metagenomic investigations, these core genes represent a subset of critical functions necessary for carbohydrate metabolism, which can be targeted to evaluate important C fluxes in these and other similar soils.
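
    The core-set idea in this abstract can be sketched as a set intersection across replicates. The sketch below is only an illustration: the replicate names and sequence IDs are hypothetical toy data, and it assumes "core" means strict presence in every replicate, whereas the abstract says only "consistently prevalent" and does not state the exact criterion. The last lines check the reported share, 843 of 226,887 sequences, against the quoted 0.4%.

```python
# Toy data (hypothetical): each replicate metagenome maps to the set of
# sequence IDs detected in it. None of these IDs come from the study.
replicates = {
    "rep1": {"seqA", "seqB", "seqC"},
    "rep2": {"seqA", "seqC", "seqD"},
    "rep3": {"seqA", "seqC"},
    "rep4": {"seqA", "seqB", "seqC", "seqE"},
}


def core_set(samples):
    """Return the items present in every sample (strict intersection)."""
    sets = list(samples.values())
    if not sets:
        return set()
    return set.intersection(*sets)


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


core = core_set(replicates)

# Counts taken from the abstract: 843 core sequences out of 226,887.
core_share = percent(843, 226_887)

print(sorted(core))
print(round(core_share, 2))
```

    With the counts from the abstract the share comes to roughly 0.37%, which agrees with the rounded figure of 0.4%.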

  2. Gene-environment interaction in the etiology of mathematical ability using SNP sets.

    Science.gov (United States)

    Docherty, Sophia J; Kovas, Yulia; Plomin, Robert

    2011-01-01

    Mathematics ability and disability is as heritable as other cognitive abilities and disabilities; however, its genetic etiology has received relatively little attention. In our recent genome-wide association study of mathematical ability in 10-year-old children, 10 SNP associations were nominated from scans of pooled DNA and validated in an individually genotyped sample. In this paper, we use a 'SNP set' composite of these 10 SNPs to investigate gene-environment (GE) interaction, examining whether the association between the 10-SNP set and mathematical ability differs as a function of ten environmental measures in the home and school in a sample of 1888 children with complete data. We found two significant GE interactions for environmental measures in the home and the school, both in the direction of the diathesis-stress type of GE interaction: the 10-SNP set was more strongly associated with mathematical ability in chaotic homes and when parents are negative.

  3. Microarray analysis identifies a common set of cellular genes modulated by different HCV replicon clones

    Directory of Open Access Journals (Sweden)

    Gerosolimo Germano

    2008-06-01

    Background: Hepatitis C virus (HCV) RNA synthesis and protein expression affect cell homeostasis by modulation of gene expression. The impact of HCV replication on global cell transcription has not been fully evaluated. Thus, we analysed the expression profiles of different clones of human hepatoma-derived Huh-7 cells carrying a self-replicating HCV RNA which express all viral proteins (HCV replicon system). Results: First, we compared the expression profile of HCV replicon clone 21-5 with both the Huh-7 parental cells and the 21-5 cured (21-5c) cells. In the latter, the HCV RNA has been eliminated by IFN-α treatment. To confirm the data, we also analyzed microarray results from both the 21-5 and two other HCV replicon clones, 22-6 and 21-7, compared to the Huh-7 cells. The study was carried out by using the Applied Biosystems (AB) Human Genome Survey Microarray v1.0 which provides 31,700 probes that correspond to 27,868 human genes. Microarray analysis revealed a specific transcriptional program induced by HCV in replicon cells with respect to both IFN-α-cured and Huh-7 cells. From the original datasets of differentially expressed genes, we selected by Venn diagrams a final list of 38 genes modulated by HCV in all clones. Most of the 38 genes have never been described before and showed high fold-change associated with significant p-value, strongly supporting data reliability. Classification of the 38 genes by Panther System identified functional categories that were significantly enriched in this gene set, such as histones and ribosomal proteins as well as extracellular matrix and intracellular protein traffic. The dataset also included new genes involved in lipid metabolism, extracellular matrix and cytoskeletal network, which may be critical for HCV replication and pathogenesis. Conclusion: Our data provide a comprehensive analysis of alterations in gene expression induced by HCV replication and reveal modulation of new genes potentially useful

  4. Genome-Wide Gene Set Analysis for Identification of Pathways Associated with Alcohol Dependence

    Science.gov (United States)

    Biernacka, Joanna M.; Geske, Jennifer; Jenkins, Gregory D.; Colby, Colin; Rider, David N.; Karpyak, Victor M.; Choi, Doo-Sup; Fridley, Brooke L.

    2013-01-01

    It is believed that multiple genetic variants with small individual effects contribute to the risk of alcohol dependence. Such polygenic effects are difficult to detect in genome-wide association studies that test for association of the phenotype with each single nucleotide polymorphism (SNP) individually. To overcome this challenge, gene set analysis (GSA) methods that jointly test for the effects of pre-defined groups of genes have been proposed. Rather than testing for association between the phenotype and individual SNPs, these analyses evaluate the global evidence of association with a set of related genes enabling the identification of cellular or molecular pathways or biological processes that play a role in development of the disease. It is hoped that by aggregating the evidence of association for all available SNPs in a group of related genes, these approaches will have enhanced power to detect genetic associations with complex traits. We performed GSA using data from a genome-wide study of 1165 alcohol dependent cases and 1379 controls from the Study of Addiction: Genetics and Environment (SAGE), for all 200 pathways listed in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Results demonstrated a potential role of the “Synthesis and Degradation of Ketone Bodies” pathway. Our results also support the potential involvement of the “Neuroactive Ligand Receptor Interaction” pathway, which has previously been implicated in addictive disorders. These findings demonstrate the utility of GSA in the study of complex disease, and suggest specific directions for further research into the genetic architecture of alcohol dependence. PMID:22717047
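
    The aggregation step described above, pooling the evidence from many SNPs into one test per gene set, can be illustrated with Fisher's combined probability test. This is a swapped-in textbook technique used purely as a sketch: the abstract does not say which GSA statistic the authors used, and Fisher's method assumes independent p-values, which SNPs in linkage disequilibrium violate. The function name and the example p-values are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the survival function has the closed form
        exp(-X/2) * sum_{i=0}^{k-1} (X/2)**i / i!
    so no statistics library is needed.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    k = len(p_values)

    term = 1.0   # (X/2)**i / i!, starting at i = 0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical per-SNP p-values for one gene set (not from the study).
snp_p_values = [0.01, 0.02, 0.03]
set_p = fisher_combined_p(snp_p_values)
print(set_p)
```

    Three individually modest p-values combine into a much smaller set-level p-value here, which is the gain in power the abstract hopes for. In real data, correlation between SNPs would have to be handled, for example by permutation.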

  5. Genome-wide gene-set analysis for identification of pathways associated with alcohol dependence.

    Science.gov (United States)

    Biernacka, Joanna M; Geske, Jennifer; Jenkins, Gregory D; Colby, Colin; Rider, David N; Karpyak, Victor M; Choi, Doo-Sup; Fridley, Brooke L

    2013-03-01

    It is believed that multiple genetic variants with small individual effects contribute to the risk of alcohol dependence. Such polygenic effects are difficult to detect in genome-wide association studies that test for association of the phenotype with each single nucleotide polymorphism (SNP) individually. To overcome this challenge, gene-set analysis (GSA) methods that jointly test for the effects of pre-defined groups of genes have been proposed. Rather than testing for association between the phenotype and individual SNPs, these analyses evaluate the global evidence of association with a set of related genes enabling the identification of cellular or molecular pathways or biological processes that play a role in development of the disease. It is hoped that by aggregating the evidence of association for all available SNPs in a group of related genes, these approaches will have enhanced power to detect genetic associations with complex traits. We performed GSA using data from a genome-wide study of 1165 alcohol-dependent cases and 1379 controls from the Study of Addiction: Genetics and Environment (SAGE), for all 200 pathways listed in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Results demonstrated a potential role of the 'synthesis and degradation of ketone bodies' pathway. Our results also support the potential involvement of the 'neuroactive ligand-receptor interaction' pathway, which has previously been implicated in addictive disorders. These findings demonstrate the utility of GSA in the study of complex disease, and suggest specific directions for further research into the genetic architecture of alcohol dependence.

  6. Functional classification of genes using semantic distance and fuzzy clustering approach: evaluation with reference sets and overlap analysis.

    Science.gov (United States)

    Devignes, Marie-Dominique; Benabderrahmane, Sidahmed; Smaïl-Tabbone, Malika; Napoli, Amedeo; Poch, Olivier

    2012-01-01

    Functional classification aims at grouping genes according to their molecular function or the biological process they participate in. Evaluating the validity of such unsupervised gene classification remains a challenge given the variety of distance measures and classification algorithms that can be used. We evaluate here functional classification of genes with the help of reference sets: KEGG (Kyoto Encyclopaedia of Genes and Genomes) pathways and Pfam clans. These sets represent ground truth for any distance based on GO (Gene Ontology) biological process and molecular function annotations respectively. Overlaps between clusters and reference sets are estimated by the F-score method. We test our previously described IntelliGO semantic distance with hierarchical and fuzzy C-means clustering and we compare results with the state-of-the-art DAVID (Database for Annotation Visualisation and Integrated Discovery) functional classification method. Finally, study of best matching clusters to reference sets leads us to propose a set-difference method for discovering missing information.
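
    The overlap measure mentioned above can be sketched as follows. The abstract says only that overlaps are "estimated by the F-score method", so the sketch assumes the usual definition, the harmonic mean of precision and recall between one cluster and one reference set. The gene names are hypothetical.

```python
def f_score(cluster, reference):
    """Harmonic mean of precision and recall between two gene sets.

    precision = |cluster & reference| / |cluster|
    recall    = |cluster & reference| / |reference|
    """
    cluster, reference = set(cluster), set(reference)
    overlap = len(cluster & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(cluster)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical cluster and reference pathway (toy gene names).
cluster = {"geneA", "geneB", "geneC", "geneD"}
pathway = {"geneB", "geneC", "geneD", "geneE", "geneF"}

score = f_score(cluster, pathway)
print(round(score, 3))
```

    Here three of four cluster genes fall in the five-gene reference set, so precision is 0.75, recall is 0.6 and the F-score is 2/3. A best-matching cluster for a reference set would be the one that maximises this value.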

  7. A rank-based statistical test for measuring synergistic effects between two gene sets.

    Science.gov (United States)

    Shiraishi, Yuichi; Okada-Hatakeyama, Mariko; Miyano, Satoru

    2011-09-01

    Due to recent advances in high-throughput technologies, data on various types of genomic annotation have accumulated. These data will be crucially helpful for elucidating the combinatorial logic of transcription. Although several approaches have been proposed for inferring cooperativity among multiple factors, most approaches are haunted by the issues of normalization and threshold values. In this article, we propose a rank-based non-parametric statistical test for measuring the effects between two gene sets. This method is free from the issues of normalization and threshold value determination for gene expression values. Furthermore, we have proposed an efficient Markov chain Monte Carlo method for calculating an approximate significance value of synergy. We have applied this approach for detecting synergistic combinations of transcription factor binding motifs and histone modifications. Availability: A C implementation of the method is available from http://www.hgc.jp/~yshira/software/rankSynergy.zip. Contact: yshira@hgc.jp. Supplementary data are available at Bioinformatics online.

  8. In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer

    International Nuclear Information System (INIS)

    Pandi, Narayanan Sathiya; Suganya, Sivagurunathan; Rajendran, Suriliyandi

    2013-01-01

    Highlights: •Identified stomach lineage specific gene set (SLSGS) was found to be under expressed in gastric tumors. •Elevated expression of SLSGS in gastric tumor is a molecular predictor of metabolic type gastric cancer. •In silico pathway scanning identified estrogen-α signaling is a putative regulator of SLSGS in gastric cancer. •Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. -- Abstract: Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with other organ tissue. The obtained SLSGS was found to be under expressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the under expression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and prognostic factor in GC

  9. In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer

    Energy Technology Data Exchange (ETDEWEB)

    Pandi, Narayanan Sathiya, E-mail: sathiyapandi@gmail.com; Suganya, Sivagurunathan; Rajendran, Suriliyandi

    2013-10-04

    Highlights: •Identified stomach lineage specific gene set (SLSGS) was found to be under expressed in gastric tumors. •Elevated expression of SLSGS in gastric tumor is a molecular predictor of metabolic type gastric cancer. •In silico pathway scanning identified estrogen-α signaling is a putative regulator of SLSGS in gastric cancer. •Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. -- Abstract: Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with other organ tissue. The obtained SLSGS was found to be under expressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the under expression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and prognostic factor in GC.

  10. Prediction potential of candidate biomarker sets identified and validated on gene expression data from multiple datasets

    Directory of Open Access Journals (Sweden)

    Karacali Bilge

    2007-10-01

    Background: Independently derived expression profiles of the same biological condition often have few genes in common. In this study, we created populations of expression profiles from publicly available microarray datasets of cancer (breast, lymphoma and renal) samples linked to clinical information with an iterative machine learning algorithm. ROC curves were used to assess the prediction error of each profile for classification. We compared the prediction error of profiles correlated with molecular phenotype against profiles correlated with relapse-free status. Prediction error of profiles identified with supervised univariate feature selection algorithms was compared to profiles selected randomly from (a) all genes on the microarray platform and (b) a list of known disease-related genes (a priori selection). We also determined the relevance of expression profiles on test arrays from independent datasets, measured on either the same or different microarray platforms. Results: Highly discriminative expression profiles were produced on both simulated gene expression data and expression data from breast cancer and lymphoma datasets on the basis of ER and BCL-6 expression, respectively. Use of relapse-free status to identify profiles for prognosis prediction resulted in poorly discriminative decision rules. Supervised feature selection resulted in more accurate classifications than random or a priori selection; however, the difference in prediction error decreased as the number of features increased. These results held when decision rules were applied across datasets to samples profiled on the same microarray platform. Conclusion: Our results show that many gene sets predict molecular phenotypes accurately. Given this, expression profiles identified using different training datasets should be expected to show little agreement. In addition, we demonstrate the difficulty in predicting relapse directly from microarray data using supervised machine

  11. MADS goes genomic in conifers: towards determining the ancestral set of MADS-box genes in seed plants.

    Science.gov (United States)

    Gramzow, Lydia; Weilandt, Lisa; Theißen, Günter

    2014-11-01

    MADS-box genes comprise a gene family coding for transcription factors. This gene family expanded greatly during land plant evolution such that the number of MADS-box genes ranges from one or two in green algae to around 100 in angiosperms. Given the crucial functions of MADS-box genes for nearly all aspects of plant development, the expansion of this gene family probably contributed to the increasing complexity of plants. However, the expansion of MADS-box genes during one important step of land plant evolution, namely the origin of seed plants, remains poorly understood due to the previous lack of whole-genome data for gymnosperms. The newly available genome sequences of Picea abies, Picea glauca and Pinus taeda were used to identify the complete set of MADS-box genes in these conifers. In addition, MADS-box genes were identified in the growing number of transcriptomes available for gymnosperms. With these datasets, phylogenies were constructed to determine the ancestral set of MADS-box genes of seed plants and to infer the ancestral functions of these genes. Type I MADS-box genes are under-represented in gymnosperms and only a minimum of two Type I MADS-box genes have been present in the most recent common ancestor (MRCA) of seed plants. In contrast, a large number of Type II MADS-box genes were found in gymnosperms. The MRCA of extant seed plants probably possessed at least 11-14 Type II MADS-box genes. In gymnosperms two duplications of Type II MADS-box genes were found, such that the MRCA of extant gymnosperms had at least 14-16 Type II MADS-box genes. The implied ancestral set of MADS-box genes for seed plants shows simplicity for Type I MADS-box genes and remarkable complexity for Type II MADS-box genes in terms of phylogeny and putative functions. The analysis of transcriptome data reveals that gymnosperm MADS-box genes are expressed in a great variety of tissues, indicating diverse roles of MADS-box genes for the development of gymnosperms. 
This study is

  12. Branch-and-bound approach for parsimonious inference of a species tree from a set of gene family trees.

    Science.gov (United States)

    Doyon, Jean-Philippe; Chauve, Cedric

    2011-01-01

    We describe a Branch-and-Bound algorithm for computing a parsimonious species tree, given a set of gene family trees. Our algorithm can consider three cost measures: number of gene duplications, number of gene losses, and both combined. Moreover, to cope with intrinsic limitations of Branch-and-Bound algorithms for species trees inference regarding the number of taxa that can be considered, our algorithm can naturally take into account predefined relationships between sets of taxa. We test our algorithm on a dataset of eukaryotic gene families spanning 29 taxa.

  13. Nonoptical massive parallel DNA sequencing of BRCA1 and BRCA2 genes in a diagnostic setting.

    Science.gov (United States)

    Costa, José Luis; Sousa, Sónia; Justino, Ana; Kay, Teresa; Fernandes, Susana; Cirnes, Luis; Schmitt, Fernando; Machado, José Carlos

    2013-04-01

    The introduction of the benchtop massive parallel sequencers made it possible for the majority of clinical diagnostic laboratories to gain access to this fast evolving technology. In this study, using the Ion Torrent Personal Genome Machine, we present a strategy for the molecular diagnosis of hereditary breast and ovarian cancer and respective analytical validation. The methodology relies on a multiplex PCR amplification of the BRCA1 and BRCA2 genes combined with a variant prioritization pipeline, designed to minimize the number of false-positive calls without the introduction of false-negative results. A training set of samples was used to optimize the entire process, and a second set was used to validate and independently evaluate the performance of the workflow. Performing the study in a blind manner relative to the variants in the samples and using conventional Sanger sequencing as standard, the workflow resulted in a strategy with a maximum analytical sensitivity ≥98.6% with a confidence of 95% and a specificity of 96.9%. Importantly, no true variant was missed. This study presents a comprehensive massive parallel sequencing-Sanger sequencing based strategy, which results in a high analytical sensitivity assay that provides a time- and cost-effective strategy for the identification of mutations in the BRCA1 and BRCA2 genes. © 2013 Wiley Periodicals, Inc.
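
    A figure of the form "sensitivity ≥98.6% with a confidence of 95%" is what an exact one-sided binomial bound yields when every true variant in a validation set is detected. The sketch below shows that calculation under stated assumptions: the abstract reports that no true variant was missed but does not say how the bound was computed or how many variants were tested, so the function name and the count of 213 are hypothetical and chosen only to illustrate the arithmetic.

```python
def sensitivity_lower_bound(n_true_variants, confidence=0.95):
    """One-sided exact lower bound on sensitivity with zero misses.

    If all n true variants are detected, the probability of that outcome
    at sensitivity s is s**n. The lower bound is the s for which this
    probability equals alpha = 1 - confidence, i.e. s = alpha**(1/n).
    """
    if n_true_variants <= 0:
        raise ValueError("n_true_variants must be positive")
    if not (0.0 < confidence < 1.0):
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1.0 - confidence
    return alpha ** (1.0 / n_true_variants)


# Hypothetical validation-set size (the abstract does not report it).
bound = sensitivity_lower_bound(213)
print(round(bound, 3))
```

    The bound tightens as the number of correctly detected variants grows, which is why a claim of this kind depends as much on the size of the validation set as on the absence of misses.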

  14. In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer.

    Science.gov (United States)

    Pandi, Narayanan Sathiya; Suganya, Sivagurunathan; Rajendran, Suriliyandi

    2013-10-04

    Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with other organ tissue. The obtained SLSGS was found to be under expressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the under expression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and prognostic factor in GC. Copyright © 2013 Elsevier Inc. All rights reserved.

  15. Glutamatergic and GABAergic gene sets in attention-deficit/hyperactivity disorder: association to overlapping traits in ADHD and autism.

    Science.gov (United States)

    Naaijen, J; Bralten, J; Poelmans, G; Glennon, J C; Franke, B; Buitelaar, J K

    2017-01-10

    Attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorders (ASD) often co-occur. Both are highly heritable; however, it has been difficult to discover genetic risk variants. Glutamate and GABA are main excitatory and inhibitory neurotransmitters in the brain; their balance is essential for proper brain development and functioning. In this study we investigated the role of glutamate and GABA genetics in ADHD severity, autism symptom severity and inhibitory performance, based on gene set analysis, an approach to investigate multiple genetic variants simultaneously. Common variants within glutamatergic and GABAergic genes were investigated using the MAGMA software in an ADHD case-only sample (n=931), in which we assessed ASD symptoms and response inhibition on a Stop task. Gene set analysis for ADHD symptom severity, divided into inattention and hyperactivity/impulsivity symptoms, autism symptom severity and inhibition were performed using principal component regression analyses. Subsequently, gene-wide association analyses were performed. The glutamate gene set showed an association with severity of hyperactivity/impulsivity (P=0.009), which was robust to correcting for genome-wide association levels. The GABA gene set showed nominally significant association with inhibition (P=0.04), but this did not survive correction for multiple comparisons. None of the single-gene or single-variant associations was significant on its own. By analyzing multiple genetic variants within candidate gene sets together, we were able to find genetic associations supporting the involvement of excitatory and inhibitory neurotransmitter systems in ADHD and ASD symptom severity in ADHD.

  16. An ancient dental gene set governs development and continuous regeneration of teeth in sharks.

    Science.gov (United States)

    Rasch, Liam J; Martin, Kyle J; Cooper, Rory L; Metscher, Brian D; Underwood, Charlie J; Fraser, Gareth J

    2016-07-15

    The evolution of oral teeth is considered a major contributor to the overall success of jawed vertebrates. This is especially apparent in cartilaginous fishes including sharks and rays, which develop elaborate arrays of highly specialized teeth, organized in rows and retain the capacity for life-long regeneration. Perpetual regeneration of oral teeth has been either lost or highly reduced in many other lineages including important developmental model species, so cartilaginous fishes are uniquely suited for deep comparative analyses of tooth development and regeneration. Additionally, sharks and rays can offer crucial insights into the characters of the dentition in the ancestor of all jawed vertebrates. Despite this, tooth development and regeneration in chondrichthyans is poorly understood and remains virtually uncharacterized from a developmental genetic standpoint. Using the emerging chondrichthyan model, the catshark (Scyliorhinus spp.), we characterized the expression of genes homologous to those known to be expressed during stages of early dental competence, tooth initiation, morphogenesis, and regeneration in bony vertebrates. We have found that expression patterns of several genes from Hh, Wnt/β-catenin, Bmp and Fgf signalling pathways indicate deep conservation over ~450 million years of tooth development and regeneration. We describe how these genes participate in the initial emergence of the shark dentition and how they are redeployed during regeneration of successive tooth generations. We suggest that at the dawn of the vertebrate lineage, teeth (i) were most likely continuously regenerative structures, and (ii) utilised a core set of genes from members of key developmental signalling pathways that were instrumental in creating a dental legacy redeployed throughout vertebrate evolution. These data lay the foundation for further experimental investigations utilizing the unique regenerative capacity of chondrichthyan models to answer evolutionary

  17. DNMT1 is associated with cell cycle and DNA replication gene sets in diffuse large B-cell lymphoma.

    Science.gov (United States)

    Loo, Suet Kee; Ab Hamid, Suzina Sheikh; Musa, Mustaffa; Wong, Kah Keng

    2018-01-01

    Dysregulation of DNA (cytosine-5)-methyltransferase 1 (DNMT1) is associated with the pathogenesis of various types of cancer. It has been previously shown that DNMT1 is frequently expressed in diffuse large B-cell lymphoma (DLBCL), however its functions remain to be elucidated in the disease. In this study, we gene expression profiled (GEP) shRNA targeting DNMT1(shDNMT1)-treated germinal center B-cell-like DLBCL (GCB-DLBCL)-derived cell line (i.e. HT) compared with non-silencing shRNA (control shRNA)-treated HT cells. Independent gene set enrichment analysis (GSEA) performed using GEPs of shRNA-treated HT cells and primary GCB-DLBCL cases derived from two publicly-available datasets (i.e. GSE10846 and GSE31312) produced three separate lists of enriched gene sets for each gene sets collection from Molecular Signatures Database (MSigDB). Subsequent Venn analysis identified 268, 145 and six consensus gene sets from analyzing gene sets in C2 collection (curated gene sets), C5 sub-collection [gene sets from gene ontology (GO) biological process ontology] and Hallmark collection, respectively to be enriched in positive correlation with DNMT1 expression profiles in shRNA-treated HT cells, GSE10846 and GSE31312 datasets [false discovery rate (FDR) 0.8) with DNMT1 expression and significantly downregulated (log fold-change <-1.35; p<0.05) following DNMT1 silencing in HT cells. These results suggest the involvement of DNMT1 in the activation of cell cycle and DNA replication in DLBCL cells. Copyright © 2017 Elsevier GmbH. All rights reserved.
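
    The Venn step described in this abstract amounts to intersecting the enriched gene-set lists from the three analyses. The following is only a minimal sketch of that idea; the function name and the gene-set names are hypothetical placeholders, not results or code from the study.

```python
def consensus_gene_sets(*enriched_lists):
    """Return the gene-set names that occur in every input list, sorted."""
    if not enriched_lists:
        return []
    common = set(enriched_lists[0])
    for names in enriched_lists[1:]:
        common &= set(names)  # keep only names shared with this list too
    return sorted(common)


# Hypothetical enriched gene-set names for the three analyses (illustration only).
ht_cells = ["CELL_CYCLE", "DNA_REPLICATION", "E2F_TARGETS", "MYC_TARGETS"]
gse10846 = ["DNA_REPLICATION", "CELL_CYCLE", "G2M_CHECKPOINT"]
gse31312 = ["CELL_CYCLE", "DNA_REPLICATION", "E2F_TARGETS"]

consensus = consensus_gene_sets(ht_cells, gse10846, gse31312)
print(consensus)
```

    Only names present in all three lists survive, which is the "consensus" region of a three-way Venn diagram.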

  18. A novel CpG island set identifies tissue-specific methylation at developmental gene loci.

    Directory of Open Access Journals (Sweden)

    Robert Illingworth

    2008-01-01

    Full Text Available CpG islands (CGIs) are dense clusters of CpG sequences that punctuate the CpG-deficient human genome and associate with many gene promoters. As CGIs also differ from bulk chromosomal DNA by their frequent lack of cytosine methylation, we devised a CGI enrichment method based on nonmethylated CpG affinity chromatography. The resulting library was sequenced to define a novel human blood CGI set that includes many that are not detected by current algorithms. Approximately half of CGIs were associated with annotated gene transcription start sites, the remainder being intra- or intergenic. Using an array representing over 17,000 CGIs, we established that 6%-8% of CGIs are methylated in genomic DNA of human blood, brain, muscle, and spleen. Inter- and intragenic CGIs are preferentially susceptible to methylation. CGIs showing tissue-specific methylation were overrepresented at numerous genetic loci that are essential for development, including HOX and PAX family members. The findings enable a comprehensive analysis of the roles played by CGI methylation in normal and diseased human tissues.

  19. Mechanical Unloading of Mouse Bone in Microgravity Significantly Alters Cell Cycle Gene Set Expression

    Science.gov (United States)

    Blaber, Elizabeth; Dvorochkin, Natalya; Almeida, Eduardo; Kaplan, Warren; Burns, Brendan

    2012-07-01

    unloading in spaceflight, we conducted genome-wide microarray analysis of total RNA isolated from the mouse pelvis. Specifically, 16-week-old mice were subjected to 15 days of spaceflight onboard NASA's STS-131 space shuttle mission. The pelvis of the mice was dissected, the bone marrow was flushed and the bones were briefly stored in RNAlater. The pelves were then homogenized, and RNA was isolated using TRIzol. RNA concentration and quality were measured using a Nanodrop spectrometer and 0.8% agarose gel electrophoresis. Samples of cDNA were analyzed using an Affymetrix GeneChip Gene 1.0 ST (Sense Target) Array System for Mouse and GenePattern Software. We normalized the ST gene arrays using Robust Multichip Average (RMA) normalization, which summarizes perfectly matched spots on the array through the median polish algorithm, rather than normalizing according to mismatched spots. We also used Limma for statistical analysis, using the BioConductor Limma Library by Gordon Smyth, and differential expression analysis to identify genes with significant changes in expression between the two experimental conditions. Finally we used GSEApreRanked for Gene Set Enrichment Analysis (GSEA), with Kolmogorov-Smirnov style statistics to identify groups of genes that are regulated together using the t-statistics derived from Limma. Preliminary results show that 6,603 genes expressed in pelvic bone had statistically significant alterations in spaceflight compared to ground controls. These prominently included cell cycle arrest molecules p21 and p18, cell survival molecule Crbp1, and cell cycle molecules cyclin D1 and Cdk1. Additionally, GSEA results indicated alterations in molecular targets of cyclin D1 and Cdk4, senescence pathways resulting from abnormal laminin maturation, cell-cell contacts via E-cadherin, and several pathways relating to protein translation and metabolism. In total 111 gene sets out of 2,488, about 4%, showed statistically significant set alterations.
These
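
    The "Kolmogorov-Smirnov style statistics" mentioned in this abstract can be illustrated with the classic unweighted running-sum score over a pre-ranked gene list. This is a rough sketch under stated assumptions, not the GSEApreRanked implementation: the function name is hypothetical, the gene identifiers are placeholders, and real GSEA runs typically weight each step by the ranking statistic and assess significance by permutation.

```python
def ks_enrichment_score(ranked_genes, gene_set):
    """Unweighted KS-style enrichment score for a pre-ranked gene list.

    Walk down the ranking, stepping up by 1/n_hit at each gene-set member
    and down by 1/n_miss otherwise; return the largest deviation from zero.
    """
    members = set(gene_set)
    n_hit = sum(1 for g in ranked_genes if g in members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Placeholder ranking, e.g. genes sorted by decreasing t-statistic.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
print(ks_enrichment_score(ranked, {"g1", "g2"}))  # set concentrated at the top
print(ks_enrichment_score(ranked, {"g5", "g6"}))  # set concentrated at the bottom
```

    A positive score indicates that the set's members cluster toward the top of the ranking, a negative score toward the bottom.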

  20. ICGA-PSO-ELM approach for accurate multiclass cancer classification resulting in reduced gene sets in which genes encoding secreted proteins are highly represented.

    Science.gov (United States)

    Saraswathi, Saras; Sundaram, Suresh; Sundararajan, Narasimhan; Zimmermann, Michael; Nilsen-Hamilton, Marit

    2011-01-01

    A combination of Integer-Coded Genetic Algorithm (ICGA) and Particle Swarm Optimization (PSO), coupled with the neural-network-based Extreme Learning Machine (ELM), is used for gene selection and cancer classification. ICGA is used with PSO-ELM to select an optimal set of genes, which is then used to build a classifier to develop an algorithm (ICGA_PSO_ELM) that can handle sparse data and sample imbalance. We evaluate the performance of ICGA-PSO-ELM and compare our results with existing methods in the literature. An investigation into the functions of the selected genes, using a systems biology approach, revealed that many of the identified genes are involved in cell signaling and proliferation. An analysis of these gene sets shows a larger representation of genes that encode secreted proteins than found in randomly selected gene sets. Secreted proteins constitute a major means by which cells interact with their surroundings. Mounting biological evidence has identified the tumor microenvironment as a critical factor that determines tumor survival and growth. Thus, the genes identified by this study that encode secreted proteins might provide important insights into the nature of the critical biological features in the microenvironment of each tumor type that allow these cells to thrive and proliferate.

  1. A Meta-Analysis of Multiple Matched Copy Number and Transcriptomics Data Sets for Inferring Gene Regulatory Relationships

    Science.gov (United States)

    Newton, Richard; Wernisch, Lorenz

    2014-01-01

    Inferring gene regulatory relationships from observational data is challenging. Manipulation and intervention is often required to unravel causal relationships unambiguously. However, gene copy number changes, as they frequently occur in cancer cells, might be considered natural manipulation experiments on gene expression. An increasing number of data sets on matched array comparative genomic hybridisation and transcriptomics experiments from a variety of cancer pathologies are becoming publicly available. Here we explore the potential of a meta-analysis of thirty such data sets. The aim of our analysis was to assess the potential of in silico inference of trans-acting gene regulatory relationships from this type of data. We found sufficient correlation signal in the data to infer gene regulatory relationships, with interesting similarities between data sets. A number of genes had highly correlated copy number and expression changes in many of the data sets and we present predicted potential trans-acted regulatory relationships for each of these genes. The study also investigates to what extent heterogeneity between cell types and between pathologies determines the number of statistically significant predictions available from a meta-analysis of experiments. PMID:25148247

  2. HIGEDA: a hierarchical gene-set genetics based algorithm for finding subtle motifs in biological sequences.

    Science.gov (United States)

    Le, Thanh; Altman, Tom; Gardiner, Katheleen

    2010-02-01

    Identification of motifs in biological sequences is a challenging problem because such motifs are often short, degenerate, and may contain gaps. Most algorithms that have been developed for motif-finding use the expectation-maximization (EM) algorithm iteratively. Although EM algorithms can converge quickly, they depend strongly on initialization parameters and can converge to local sub-optimal solutions. In addition, they cannot generate gapped motifs. The effectiveness of EM algorithms in motif finding can be improved by incorporating methods that choose different sets of initial parameters to enable escape from local optima, and that allow gapped alignments within motif models. We have developed HIGEDA, an algorithm that uses the hierarchical gene-set genetic algorithm (HGA) with EM to initiate and search for the best parameters for the motif model. In addition, HIGEDA can identify gapped motifs using a position weight matrix and dynamic programming to generate an optimal gapped alignment of the motif model with sequences from the dataset. We show that HIGEDA outperforms MEME and other motif-finding algorithms on both DNA and protein sequences. Source code and test datasets are available for download at http://ouray.cudenver.edu/~tnle/, implemented in C++ and supported on Linux and MS Windows.

  3. A novel proposal of a simplified bacterial gene set and the neo-construction of a general minimized metabolic network.

    Science.gov (United States)

    Ye, Yuan-Nong; Ma, Bin-Guang; Dong, Chuan; Zhang, Hong; Chen, Ling-Ling; Guo, Feng-Biao

    2016-10-07

    A minimal gene set (MGS) is critical for the assembly of a minimal artificial cell. We propose a simplified bacterial gene set that approximates a bacterial MGS, derived by the following procedure. First, we based our simplified bacterial gene set (SBGS) on experimentally determined essential genes to ensure that the genes included in the SBGS are critical. Second, we introduced a half-retaining strategy to extract persistent essential genes to ensure stability. Third, we constructed a viable metabolic network to supplement the SBGS. The proposed SBGS includes 327 genes and requires 431 reactions. This report describes an SBGS that preserves both self-replication and self-maintenance systems. In the minimized metabolic network, we identified five novel hub metabolites and confirmed 20 known hubs. Highly essential genes were found to distribute the connecting metabolites into more reactions. Based on our SBGS, we expanded the pool of targets for designing broad-spectrum antibacterial drugs to reduce pathogen resistance. We also suggested a rough semi-de novo strategy to synthesize an artificial cell, with potential applications in industry.

  4. Gene Set Analyses of Genome-Wide Association Studies on 49 Quantitative Traits Measured in a Single Genetic Epidemiology Dataset

    Directory of Open Access Journals (Sweden)

    Jihye Kim

    2013-09-01

    Full Text Available Gene set analysis is a powerful tool for interpreting a genome-wide association study result and is gaining popularity these days. Comparison of the gene sets obtained for a variety of traits measured from a single genetic epidemiology dataset may give insights into the biological mechanisms underlying these traits. Based on the previously published single nucleotide polymorphism (SNP) genotype data on 8,842 individuals enrolled in the Korea Association Resource project, we performed a series of systematic genome-wide association analyses for 49 quantitative traits of basic epidemiological, anthropometric, or blood chemistry parameters. Each analysis result was subjected to subsequent gene set analyses based on Gene Ontology (GO) terms using gene set analysis software, GSA-SNP, identifying a set of GO terms significantly associated to each trait (pcorr < 0.05). Pairwise comparison of the traits in terms of the semantic similarity in their GO sets revealed surprising cases where phenotypically uncorrelated traits showed high similarity in terms of biological pathways. For example, the pH level was related to 7 other traits that showed low phenotypic correlations with it. A literature survey implies that these traits may be regulated partly by common pathways that involve neuronal or nerve systems.

  5. A dual origin of the Xist gene from a protein-coding gene and a set of transposable elements.

    Directory of Open Access Journals (Sweden)

    Eugeny A Elisaphenko

    2008-06-01

    Full Text Available X-chromosome inactivation, which occurs in female eutherian mammals, is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons, including those with simple tandem repeats detectable in their structure, have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA.

  6. FUNNEL-GSEA: FUNctioNal ELastic-net regression in time-course gene set enrichment analysis.

    Science.gov (United States)

    Zhang, Yun; Topham, David J; Thakar, Juilee; Qiu, Xing

    2017-07-01

    Gene set enrichment analyses (GSEAs) are widely used in genomic research to identify underlying biological mechanisms (defined by the gene sets), such as Gene Ontology terms and molecular pathways. There are two caveats in the currently available methods: (i) they are typically designed for group comparisons or regression analyses, which do not utilize temporal information efficiently in time-series of transcriptomics measurements; and (ii) genes overlapping in multiple molecular pathways are considered multiple times in hypothesis testing. We propose an inferential framework for GSEA based on functional data analysis, which utilizes the temporal information based on functional principal component analysis, and disentangles the effects of overlapping genes by a functional extension of the elastic-net regression. Furthermore, the hypothesis testing for the gene sets is performed by an extension of the Mann-Whitney U test which is based on weighted rank sums computed from correlated observations. By using both simulated datasets and a large-scale time-course gene expression dataset on human influenza infection, we demonstrate that our method has uniformly better receiver operating characteristic curves, and identifies more pathways relevant to immune-response to human influenza infection than the competing approaches. The methods are implemented in R package FUNNEL, freely and publicly available at: https://github.com/yunzhang813/FUNNEL-GSEA-R-Package . Contact: xing_qiu@urmc.rochester.edu or juilee_thakar@urmc.rochester.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.
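
    For orientation, the standard (unweighted) Mann-Whitney U statistic that this abstract says the method extends can be computed from rank sums as sketched below. This is not the FUNNEL-GSEA test, which uses weighted rank sums and accounts for correlated observations; the function names and the numbers are hypothetical and only illustrate the baseline statistic.

```python
def average_ranks(values):
    """1-based ranks of the values, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(in_set, background):
    """U statistic for the in-set scores against the background scores."""
    combined = list(in_set) + list(background)
    ranks = average_ranks(combined)
    n1 = len(in_set)
    rank_sum = sum(ranks[:n1])
    return rank_sum - n1 * (n1 + 1) / 2


# Hypothetical per-gene scores: genes inside a set vs. all remaining genes.
u = mann_whitney_u([2.1, 3.4, 1.9], [0.2, 0.5, 1.0, 2.0])
print(u)
```

    U counts the (in-set, background) pairs in which the in-set score is larger, with ties counted as one half, so values near the maximum of len(in_set) * len(background) suggest enrichment.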

  7. Studying gene and gene-environment effects of uncommon and common variants on continuous traits: a marker-set approach using gene-trait similarity regression.

    Science.gov (United States)

    Tzeng, Jung-Ying; Zhang, Daowen; Pongpanich, Monnat; Smith, Chris; McCarthy, Mark I; Sale, Michèle M; Worrall, Bradford B; Hsu, Fang-Chi; Thomas, Duncan C; Sullivan, Patrick F

    2011-08-12

    Genomic association analyses of complex traits demand statistical tools that are capable of detecting small effects of common and rare variants and modeling complex interaction effects and yet are computationally feasible. In this work, we introduce a similarity-based regression method for assessing the main genetic and interaction effects of a group of markers on quantitative traits. The method uses genetic similarity to aggregate information from multiple polymorphic sites and integrates adaptive weights that depend on allele frequencies to accommodate common and uncommon variants. Collapsing information at the similarity level instead of the genotype level avoids canceling signals that have the opposite etiological effects and is applicable to any class of genetic variants without the need for dichotomizing the allele types. To assess gene-trait associations, we regress trait similarities for pairs of unrelated individuals on their genetic similarities and assess association by using a score test whose limiting distribution is derived in this work. The proposed regression framework allows for covariates, has the capacity to model both main and interaction effects, can be applied to a mixture of different polymorphism types, and is computationally efficient. These features make it an ideal tool for evaluating associations between phenotype and marker sets defined by linkage disequilibrium (LD) blocks, genes, or pathways in whole-genome analysis. Copyright © 2011 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  8. Performance of single and concatenated sets of mitochondrial genes at inferring metazoan relationships relative to full mitogenome data.

    Directory of Open Access Journals (Sweden)

    Justin C Havird

    Full Text Available Mitochondrial (mt) genes are some of the most popular and widely-utilized genetic loci in phylogenetic studies of metazoan taxa. However, their linked nature has raised questions on whether using the entire mitogenome for phylogenetics is overkill (at best) or pseudoreplication (at worst). Moreover, no studies have addressed the comparative phylogenetic utility of mitochondrial genes across individual lineages within the entire Metazoa. To comment on the phylogenetic utility of individual mt genes as well as concatenated subsets of genes, we analyzed mitogenomic data from 1865 metazoan taxa in 372 separate lineages spanning genera to subphyla. Specifically, phylogenies inferred from these datasets were statistically compared to ones generated from all 13 mt protein-coding (PC) genes (i.e., the "supergene" set) to determine which single genes performed "best" at, and the minimum number of genes required to, recover the "supergene" topology. Surprisingly, the popular marker COX1 performed poorest, while ND5, ND4, and ND2 were most likely to reproduce the "supergene" topology. Averaged across all lineages, the longest ∼2 mt PC genes were sufficient to recreate the "supergene" topology, although this average increased to ∼5 genes for datasets with 40 or more taxa. Furthermore, concatenation of the three "best" performing mt PC genes outperformed that of the three longest mt PC genes (i.e., ND5, COX1, and ND4). Taken together, while not all mt PC genes are equally interchangeable in phylogenetic studies of the metazoans, some subset can serve as a proxy for the 13 mt PC genes. However, the exact number and identity of these genes is specific to the lineage in question and cannot be applied indiscriminately across the Metazoa.

  9. Quantitative modeling of gene networks of biological systems using fuzzy Petri nets and fuzzy sets

    Directory of Open Access Journals (Sweden)

    Raed I. Hamed

    2018-01-01

    Full Text Available Quantitative modeling of biological systems has become an essential computational approach in the design of novel biological systems and the analysis of existing ones. However, kinetic data describing the system's dynamics must be known in order to obtain relevant results with conventional modeling techniques. These data are often hard or even impossible to obtain. Here, we present a quantitative fuzzy logic modeling approach that can cope with unknown kinetic data and thus produce relevant results even when the kinetic data are incomplete or only vaguely defined. Moreover, the approach can be combined with existing state-of-the-art quantitative modeling techniques and applied only to specific parts of the system, i.e., where the data are missing. The case study of the approach proposed in this paper is performed on a model of nine genes. We propose a fuzzy Petri net (FPN) model based on fuzzy sets for the quantitative modeling of biological systems. Tests of our model show that it is practical and quite effective for knowledge representation and reasoning in fuzzy expert systems.

  10. Application of bi-clustering of gene expression data and gene set enrichment analysis methods to identify potentially disease causing nanomaterials

    Directory of Open Access Journals (Sweden)

    Andrew Williams

    2017-12-01

    Full Text Available This article contains data related to the research article ‘Application of bi-clustering of gene expression data and gene set enrichment analysis methods to identify potentially disease causing nanomaterials’ (Williams and Halappanavar, 2015) [1]. The presence of diverse types of nanomaterials (NMs) in commerce has grown significantly in the past decade and as a result, human exposure to these materials in the environment is inevitable. The traditional toxicity testing approaches that are reliant on animals are both time- and cost-intensive; with these approaches, it is not possible to complete the challenging task of safety assessment of NMs currently on the market in a timely manner. Thus, there is an urgent need for comprehensive understanding of the biological behavior of NMs, and efficient toxicity screening tools that will enable the development of predictive toxicology paradigms suited to rapidly assessing the human health impacts of exposure to NMs. In an effort to predict the long term health impacts of acute exposure to NMs, in Williams and Halappanavar (2015) [1], we applied bi-clustering and gene set enrichment analysis methods to derive essential features of altered lung transcriptome following exposure to NMs that are associated with lung-specific diseases. Several datasets from public microarray repositories describing pulmonary diseases in mouse models following exposure to a variety of substances were examined and functionally related bi-clusters showing similar gene expression profiles were identified. The identified bi-clusters were then used to conduct a gene set enrichment analysis on lung gene expression profiles derived from mice exposed to nano-titanium dioxide, carbon black or carbon nanotubes (nano-TiO2, CB and CNTs) to determine the disease significance of these data-driven gene sets. The results of the analysis correctly identified all NMs to be inflammogenic, and only CB and CNTs as potentially fibrogenic. Here, we

  11. Using Variable Precision Rough Set for Selection and Classification of Biological Knowledge Integrated in DNA Gene Expression

    Directory of Open Access Journals (Sweden)

    Calvo-Dmgz D.

    2012-12-01

    Full Text Available DNA microarrays have contributed to the exponential growth of genomic and experimental data in the last decade. This large amount of gene expression data has been used by researchers seeking diagnosis of diseases like cancer using machine learning methods. In turn, explicit biological knowledge about gene functions has also grown tremendously over the last decade. This work integrates explicit biological knowledge, provided as gene sets, into the classification process by means of Variable Precision Rough Set Theory (VPRS). The proposed model is able to highlight which part of the provided biological knowledge has been important for classification. This paper presents a novel model for microarray data classification which is able to incorporate prior biological knowledge in the form of gene sets. Based on this knowledge, we transform the input microarray data into supergenes, and then we apply rough set theory to select the most promising supergenes and to derive a set of easily interpretable classification rules. The proposed model is evaluated over three breast cancer microarray datasets, obtaining successful results compared to classical classification techniques. The experimental results show that there are no significant differences between our model and classical techniques, but it is able to provide a biologically interpretable explanation of how it classifies new samples.

  12. Enriching the gene set analysis of genome-wide data by incorporating directionality of gene expression and combining statistical hypotheses and methods

    Science.gov (United States)

    Väremo, Leif; Nielsen, Jens; Nookaew, Intawat

    2013-01-01

    Gene set analysis (GSA) is used to elucidate genome-wide data, in particular transcriptome data. A multitude of methods have been proposed for this step of the analysis, and many of them have been compared and evaluated. Unfortunately, there is no consolidated opinion regarding what methods should be preferred, and the variety of available GSA software and implementations poses a difficulty for the end-user who wants to try out different methods. To address this, we have developed the R package Piano that collects a range of GSA methods into the same system, for the benefit of the end-user. Further on we refine the GSA workflow by using modifications of the gene-level statistics. This enables us to divide the resulting gene set P-values into three classes, describing different aspects of gene expression directionality at gene set level. We use our fully implemented workflow to investigate the impact of the individual components of GSA by using microarray and RNA-seq data. The results show that the evaluated methods are globally similar and the major separation correlates well with our defined directionality classes. As a consequence of this, we suggest using a consensus scoring approach, based on multiple GSA runs. In combination with the directionality classes, this constitutes a more thorough basis for an enriched biological interpretation. PMID:23444143
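
    One plausible reading of "a consensus scoring approach, based on multiple GSA runs" is rank aggregation: rank the gene sets within each run by P-value, then average the ranks across runs. The sketch below illustrates that idea in Python under this assumption. It is not the Piano package's API (Piano is an R package), and the function names, gene-set names and P-values are hypothetical.

```python
def rank_by_pvalue(pvalues):
    """Map gene-set name -> rank (1 = smallest P-value); ties broken by name."""
    ordered = sorted(pvalues, key=lambda name: (pvalues[name], name))
    return {name: position + 1 for position, name in enumerate(ordered)}


def consensus_ranking(runs):
    """Average each gene set's rank over all runs and sort by mean rank.

    Assumes every run reports the same gene sets.
    """
    per_run = [rank_by_pvalue(run) for run in runs]
    names = per_run[0].keys()
    mean_rank = {n: sum(r[n] for r in per_run) / len(per_run) for n in names}
    return sorted(mean_rank.items(), key=lambda item: (item[1], item[0]))


# Hypothetical P-values from three different GSA methods (illustration only).
run_a = {"SET_A": 0.001, "SET_B": 0.04, "SET_C": 0.20}
run_b = {"SET_A": 0.010, "SET_B": 0.30, "SET_C": 0.05}
run_c = {"SET_A": 0.002, "SET_B": 0.03, "SET_C": 0.50}

ranking = consensus_ranking([run_a, run_b, run_c])
for name, score in ranking:
    print(name, round(score, 2))
```

    Averaging ranks rather than raw P-values keeps methods with differently calibrated P-values from dominating the consensus; using the median rank instead would be an equally reasonable choice.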

  13. Complete auxotrophy for unsaturated fatty acids requires deletion of two sets of genes in Mycobacterium smegmatis.

    Science.gov (United States)

    Di Capua, Cecilia B; Doprado, Mariana; Belardinelli, Juan Manuel; Morbidoni, Héctor R

    2017-10-01

    The synthesis of unsaturated fatty acids in Mycobacterium smegmatis is poorly characterized. Bioinformatic analysis revealed four putative fatty acid desaturases in its genome, one of which, MSMEG_1886, is highly homologous to desA3, the only palmitoyl/stearoyl desaturase present in the Mycobacterium tuberculosis genome. A MSMEG_1886 deletion mutant was partially auxotrophic for oleic acid and viable at 37°C and 25°C, although with a long lag phase in liquid medium. Fatty acid analysis suggested that MSMEG_1886 is a palmitoyl/stearoyl desaturase, as the synthesis of palmitoleic acid was abrogated, while oleic acid contents dropped by half in the mutant. Deletion of the operon MSMEG_1741-1743 (highly homologous to a Pseudomonas aeruginosa acyl-CoA desaturase) had little effect on growth of the parental strain; however the double mutant MSMEG_1886-MSMEG_1741-1743 strictly required oleic acid for growth. The ΔMSMEG_1886-ΔMSMEG_1741 double mutant was able to grow (poorly but better than the ΔMSMEG_1886 single mutant) in solid and liquid media devoid of oleic acid, suggesting a repressor role for ΔMSMEG_1741. Fatty acid analysis of the described mutants suggested that MSMEG_1742-43 desaturates C18:0 and C24:0 fatty acids. Thus, although the M. smegmatis desA3 homologue is the major player in unsaturated fatty acid synthesis, a second set of genes is also involved. © 2017 John Wiley & Sons Ltd.

  14. Identification of self-consistent modulons from bacterial microarray expression data with the help of structured regulon gene sets

    KAUST Repository

    Permina, Elizaveta A.

    2013-01-01

    Identification of bacterial modulons from series of gene expression measurements on microarrays is a principal problem, especially relevant for inadequately studied but practically important species. Usage of a priori information on regulatory interactions helps to evaluate parameters for regulatory subnetwork inference. We suggest a procedure for modulon construction where a seed regulon is iteratively updated with genes having expression patterns similar to those for regulon member genes. A set of genes essential for a regulon is used to control modulon updating. Essential genes for a regulon were selected as a subset of regulon genes highly related by different measures to each other. Using Escherichia coli as a model, we studied how modulon identification depends on the data, including the microarray experiments set, the adopted relevance measure and the regulon itself. We have found that results of modulon identification are highly dependent on all parameters studied and thus the resulting modulon varies substantially depending on the identification procedure. Yet, modulons that were identified correctly displayed higher stability during iterations, which allows developing a procedure for reliable modulon identification in the case of less studied species where the known regulatory interactions are sparse. Copyright © 2013 Taylor & Francis.
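
    The iterative procedure described in this abstract (a seed regulon repeatedly updated with genes whose expression resembles that of current members) might be sketched as follows. Everything specific here is an assumption made for illustration: Pearson correlation against the members' mean profile as the relevance measure, the 0.9 threshold, always retaining the seed genes as a stand-in for the "essential genes" control, and the toy expression values.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def grow_modulon(expression, seed, threshold=0.9, max_iter=20):
    """Iteratively update a seed gene set with genes correlated to its mean profile."""
    seed_set = set(seed)
    modulon = set(seed_set)
    n_cond = len(next(iter(expression.values())))
    for _ in range(max_iter):
        # Mean expression profile of the current members.
        centroid = [
            sum(expression[g][i] for g in modulon) / len(modulon)
            for i in range(n_cond)
        ]
        updated = {
            gene for gene, profile in expression.items()
            if pearson(profile, centroid) >= threshold
        }
        updated |= seed_set  # assumption: seed genes are never dropped
        if updated == modulon:
            break  # membership is stable
        modulon = updated
    return modulon


# Toy expression profiles over four conditions (illustration only).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [1.1, 2.0, 3.2, 3.9],
    "geneD": [4.0, 3.0, 2.0, 1.0],
    "geneE": [1.0, 3.0, 2.0, 4.0],
}
modulon = grow_modulon(expression, seed={"geneA", "geneB"})
print(sorted(modulon))
```

    The abstract reports that the outcome depends strongly on the relevance measure and the data, so the correlation measure and threshold chosen here should be read as placeholders rather than recommendations.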

  15. The Schizophrenia-Associated BRD1 Gene Regulates Behavior, Neurotransmission, and Expression of Schizophrenia Risk Enriched Gene Sets in Mice.

    Science.gov (United States)

    Qvist, Per; Christensen, Jane Hvarregaard; Vardya, Irina; Rajkumar, Anto Praveen; Mørk, Arne; Paternoster, Veerle; Füchtbauer, Ernst-Martin; Pallesen, Jonatan; Fryland, Tue; Dyrvig, Mads; Hauberg, Mads Engel; Lundsberg, Birgitte; Fejgin, Kim; Nyegaard, Mette; Jensen, Kimmo; Nyengaard, Jens Randel; Mors, Ole; Didriksen, Michael; Børglum, Anders Dupont

    2017-07-01

    The schizophrenia-associated BRD1 gene encodes a transcriptional regulator whose comprehensive chromatin interactome is enriched with schizophrenia risk genes. However, the biology underlying the disease association of BRD1 remains speculative. This study assessed the transcriptional drive of a schizophrenia-associated BRD1 risk variant in vitro. Accordingly, to examine the effects of reduced Brd1 expression, we generated a genetically modified Brd1 +/- mouse and subjected it to behavioral, electrophysiological, molecular, and integrative genomic analyses with focus on schizophrenia-relevant parameters. Brd1 +/- mice displayed cerebral histone H3K14 hypoacetylation and a broad range of behavioral changes with translational relevance to schizophrenia. These behaviors were accompanied by striatal dopamine/serotonin abnormalities and cortical excitation-inhibition imbalances involving loss of parvalbumin immunoreactive interneurons. RNA-sequencing analyses of cortical and striatal micropunches from Brd1 +/- and wild-type mice revealed differential expression of genes enriched for schizophrenia risk, including several schizophrenia genome-wide association study risk genes (e.g., calcium channel subunits [Cacna1c and Cacnb2], cholinergic muscarinic receptor 4 [Chrm4], dopamine receptor D 2 [Drd2], and transcription factor 4 [Tcf4]). Integrative analyses further found differentially expressed genes to cluster in functional networks and canonical pathways associated with mental illness and molecular signaling processes (e.g., glutamatergic, monoaminergic, calcium, cyclic adenosine monophosphate [cAMP], dopamine- and cAMP-regulated neuronal phosphoprotein 32 kDa [DARPP-32], and cAMP responsive element binding protein signaling [CREB]). Our study bridges the gap between genetic association and pathogenic effects and yields novel insights into the unfolding molecular changes in the brain of a new schizophrenia model that incorporates genetic risk at three levels: allelic

  16. Bioinformatic Analysis of Gene Sets Regulated by Ligand-Activated and Dominant-Negative PPARγ in Mouse Aorta

    Science.gov (United States)

    Keen, Henry L.; Halabi, Carmen M.; Beyer, Andreas M.; de Lange, Willem J.; Liu, Xuebo; Maeda, Nobuyo; Faraci, Frank M.; Casavant, Thomas L.; Sigmund, Curt D.

    2010-01-01

    Objective: Drugs that activate PPARγ improve glucose sensitivity and lower blood pressure, whereas dominant negative mutations in PPARγ cause severe insulin resistance and hypertension. We hypothesize that these PPARγ mutants regulate target genes opposite to that of ligand-mediated activation and tested this hypothesis on a genome-wide scale. Methods and Results: We integrated gene expression data in aorta from mice treated with the PPARγ ligand rosiglitazone with data from mice containing a globally expressed knockin of the PPARγ P465L dominant negative mutation. We also integrated our data with publicly available datasets containing 1) gene expression profiles in many human tissues, 2) PPARγ target genes in 3T3-L1 adipocytes, and 3) experimentally validated PPARγ binding sites throughout the genome. Many classical PPARγ target genes were induced by rosiglitazone and repressed by dominant-negative PPARγ. A similar pattern was observed for about 90% of the gene sets regulated both by rosiglitazone and dominant-negative PPARγ. Genes exhibiting this pattern of contrasting regulation were significantly enriched for nearby PPARγ binding sites. Conclusions: These results provide convincing evidence that the PPARγ P465L mutation causes transcriptional effects that are opposite to those mediated by PPARγ ligand, thus validating mice carrying the mutation as a model of PPARγ interference. PMID:20018933

  17. A tool set to allow rapid screening of dog families with PRA for association with candidate genes.

    Science.gov (United States)

    Winkler, Paige A; Davis, Jennifer A; Petersen-Jones, Simon M; Venta, Patrick J; Bartoe, Joshua T

    2017-07-01

    To develop a method to rapidly screen candidate genes for association with recessively inherited progressive retinal atrophy (PRA) in dog pedigrees in which a causative mutation has not been identified. Thirteen PRA-affected dogs were used in this study. Two microsatellite markers (MS) were designed flanking each of 45 candidate genes. MS markers were analyzed for heterozygosity and allelic richness. Two dog breeds in which the causative mutation has been identified (Entlebucher Sennenhunds [ES] and PDE6A-mutant dogs [PDE6A]) were used to validate the MS marker panel. One breed in which the causative mutation is currently unknown (Old English Sheepdog [OES]) was investigated in this study utilizing the MS panel. Marker heterozygosity excluded 38 of 45 and 41 of 45 candidate genes (ES and PDE6A, respectively), with each true culprit gene remaining on the list of nonexcluded candidate genes. Additionally, 41 of 45 genes were excluded for OES. This tool set was used quickly and efficiently to narrow down 45 candidate genes for recessively inherited PRA in two types of dogs with known mutations and one type of dog with an unknown mutation. © 2016 American College of Veterinary Ophthalmologists.

  18. Genetic investigation of 100 heart genes in sudden unexplained death victims in a forensic setting

    DEFF Research Database (Denmark)

    Christiansen, Sofie Lindgren; Hertz, Christin Løth; Ferrero, Laura

    2016-01-01

    %) individuals carried variants with a likely functional effect. Ten (40%) of these variants were located in genes associated with cardiomyopathies and 15 (60%) of the variants in genes associated with cardiac channelopathies. Nineteen individuals carried variants with unknown functional effect. Our findings...... indicate that broad genetic investigation of SUD victims increases the diagnostic outcome, and the investigation should comprise genes involved in both cardiomyopathies and cardiac channelopathies. European Journal of Human Genetics advance online publication, 21 September 2016; doi:10.1038/ejhg.2016.118....

  19. Expressed genes for plant-type ribulose 1,5-bisphosphate carboxylase/oxygenase in the photosynthetic bacterium Chromatium vinosum, which possesses two complete sets of the genes.

    OpenAIRE

    Viale, A M; Kobayashi, H; Akazawa, T

    1989-01-01

    Two sets of genes for the large and small subunits of ribulose 1,5-bisphosphate carboxylase/oxygenase (RuBisCO) were detected in the photosynthetic purple sulfur bacterium Chromatium vinosum by hybridization analysis with RuBisCO gene probes, cloned by using the lambda Fix vector, and designated rbcL-rbcS and rbcA-rbcB. rbcL and rbcA encode the large subunits, and rbcS and rbcB encode the small subunits. rbcL-rbcS was the same as that reported previously (A. M. Viale, H. Kobayashi, T. Takabe,...

  20. Identification and Validation of a New Set of Five Genes for Prediction of Risk in Early Breast Cancer

    Directory of Open Access Journals (Sweden)

    Giorgio Mustacchi

    2013-05-01

    Molecular tests predicting the outcome of breast cancer patients based on gene expression levels can be used to assist in making treatment decisions after consideration of conventional markers. In this study we identified a subset of 20 mRNAs differentially regulated in breast cancer by analyzing several publicly available array gene expression datasets using R/Bioconductor packages. Using RT-qPCR we evaluated 261 consecutive invasive breast cancer cases, not selected for age, adjuvant treatment, nodal or estrogen receptor status, from paraffin-embedded sections. The biological samples dataset was split into a training set (137 cases) and a validation set (124 cases). The gene signature was developed on the training set, and a multivariate stepwise Cox analysis selected five genes independently associated with disease-free survival (DFS): FGF18 (HR = 1.13, p = 0.05), BCL2 (HR = 0.57, p = 0.001), PRC1 (HR = 1.51, p = 0.001), MMP9 (HR = 1.11, p = 0.08), and SERF1A (HR = 0.83, p = 0.007). These five genes were combined into a linear score (signature) weighted according to the coefficients of the Cox model, as: 0.125 × FGF18 − 0.560 × BCL2 + 0.409 × PRC1 + 0.104 × MMP9 − 0.188 × SERF1A (HR = 2.7, 95% CI = 1.9–4.0, p < 0.001). The signature was then evaluated on the validation set, assessing the discrimination ability by a Kaplan-Meier analysis, using the same cut-offs classifying patients at low, intermediate or high risk of disease relapse as defined on the training set (p < 0.001). Our signature, after further clinical validation, could be proposed as a prognostic signature for disease-free survival in breast cancer patients where the indication for adjuvant chemotherapy added to endocrine treatment is uncertain.
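
    The linear score described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coefficients are the ones quoted in the abstract, but the function name `signature_score`, the dictionary input layout, and the example expression values are hypothetical. The risk cut-offs are not given in the abstract, so none are applied here.

```python
# Coefficients of the five-gene linear score, as quoted in the abstract.
COEFFICIENTS = {
    "FGF18": 0.125,
    "BCL2": -0.560,
    "PRC1": 0.409,
    "MMP9": 0.104,
    "SERF1A": -0.188,
}


def signature_score(expression):
    """Return the weighted sum of expression values for the five genes.

    `expression` maps gene symbol -> normalized expression level
    (hypothetical input format; the abstract does not specify units).
    """
    missing = set(COEFFICIENTS) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Hypothetical patient with made-up expression values.
example = {"FGF18": 1.0, "BCL2": 2.0, "PRC1": 0.5, "MMP9": 1.0, "SERF1A": 1.0}
score = signature_score(example)
```

    Negative coefficients (BCL2, SERF1A) lower the score as expression rises, which matches their hazard ratios below 1 in the abstract.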

  1. A set of vectors for introduction of antibiotic resistance genes by in vitro Cre-mediated recombination

    Directory of Open Access Journals (Sweden)

    Vassetzky Yegor S

    2008-12-01

    Background: Introduction of new antibiotic resistance genes into plasmids of interest is a frequent task in molecular cloning practice. Classical approaches involving digestion with restriction endonucleases and ligation are time-consuming. Findings: We have created a set of insertion vectors (pINS) carrying genes that provide resistance to various antibiotics (puromycin, blasticidin and G418) and containing a loxP site. Each vector (pINS-Puro, pINS-Blast or pINS-Neo) contains either a chloramphenicol or a kanamycin resistance gene and is unable to replicate in most E. coli strains as it contains a conditional R6Kγ replication origin. Introduction of the antibiotic resistance genes into the vector of interest is achieved by Cre-mediated recombination between the replication-incompetent pINS and a replication-competent target vector. The recombination mix is then transformed into E. coli and selected by the resistance marker (kanamycin or chloramphenicol) present in pINS, which allows recovery of the recombinant plasmids with 100% efficiency. Conclusion: Here we propose a simple strategy that allows the introduction of various antibiotic-resistance genes into any plasmid containing a replication origin, an ampicillin resistance gene and a loxP site.

  2. Identification of a set of endogenous reference genes for miRNA expression studies in Parkinson's disease blood samples.

    Science.gov (United States)

    Serafin, Alice; Foco, Luisa; Blankenburg, Hagen; Picard, Anne; Zanigni, Stefano; Zanon, Alessandra; Pramstaller, Peter P; Hicks, Andrew A; Schwienbacher, Christine

    2014-10-10

    Research on microRNAs (miRNAs) is becoming an increasingly attractive field, as these small RNA molecules are involved in several physiological functions and diseases. To date, only a few studies have assessed the expression of blood miRNAs related to Parkinson's disease (PD) using microarray and quantitative real-time PCR (qRT-PCR). Measuring miRNA expression involves normalization of qRT-PCR data using endogenous reference genes for calibration, but their choice remains a delicate problem with serious impact on the resulting expression levels. The aim of the present study was to evaluate the suitability of a set of commonly used small RNAs as normalizers and to identify which of these might be considered reliable reference genes in qRT-PCR expression analyses on PD blood samples. Commonly used reference genes snoRNA RNU24, snRNA RNU6B, snoRNA Z30 and miR-103a-3p were selected from the literature. We then analyzed the effect of using these genes as reference, alone or in any possible combination, on the measured expression levels of the target genes miR-30b-5p and miR-29a-3p, which have been previously reported to be deregulated in PD blood samples. We identified RNU24 and Z30 as a reliable and stable pair of reference genes in PD blood samples.
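
    To make the role of the reference genes concrete, the following Python sketch normalizes a target miRNA to a set of reference genes using the widely used 2^-ΔΔCt calculation. The abstract does not state which formula the authors applied, so this is an assumption; the function names and all Ct values are hypothetical, and the calculation assumes roughly 100% amplification efficiency.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """Ct of the target miRNA minus the mean Ct of the reference genes."""
    return target_ct - mean(reference_cts)


def relative_expression(sample_target_ct, sample_ref_cts,
                        control_target_ct, control_ref_cts):
    """Fold change of sample vs. control by 2^-ddCt.

    Assumes about 100% PCR efficiency (one doubling per cycle).
    """
    ddct = (delta_ct(sample_target_ct, sample_ref_cts)
            - delta_ct(control_target_ct, control_ref_cts))
    return 2.0 ** (-ddct)


# Made-up Ct values: one target miRNA normalized to a pair of reference
# genes (for instance the RNU24/Z30 pair named in the abstract).
fold = relative_expression(25.0, [20.0, 22.0], 27.0, [20.0, 22.0])
```

    Because the reference Ct values enter every ΔCt, an unstable reference gene shifts all reported fold changes, which is why the choice of normalizers matters.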

  3. Identification of a novel set of genes reflecting different in vivo invasive patterns of human GBM cells

    Directory of Open Access Journals (Sweden)

    Monticone Massimiliano

    2012-08-01

    Background: Most patients affected by Glioblastoma multiforme (GBM, grade IV glioma) experience a recurrence of the disease because of the spreading of tumor cells beyond surgical boundaries. Unveiling mechanisms causing this process is a logical goal to impair the killing capacity of GBM cells by molecular targeting. We noticed that our long-term GBM cultures, established from different patients, may display two categories/types of growth behavior in an orthotopic xenograft model: expansion of the tumor mass and formation of tumor branches/nodules (nodular like, NL-type) or highly diffuse single tumor cell infiltration (HD-type). Methods: We determined by DNA microarrays the gene expression profiles of three NL-type and three HD-type long-term GBM cultures. Subsequently, individual genes with different expression levels between the two groups were identified using Significance Analysis of Microarrays (SAM). Real time RT-PCR, immunofluorescence and immunoblot analyses were performed for a selected subgroup of regulated gene products to confirm the results obtained by the expression analysis. Results: Here, we report the identification of a set of 34 differentially expressed genes in the two types of GBM cultures. Twenty-three of these genes encode proteins localized to the plasma membrane and 9 of these encode proteins involved in the process of cell adhesion. Conclusions: This study suggests the participation in the diffuse infiltrative/invasive process of GBM cells within the CNS of a novel set of genes coding for membrane-associated proteins, which should thus be susceptible to an inhibition strategy by specific targeting. Massimiliano Monticone and Antonio Daga contributed equally to this work.

  4. Identification of a novel set of genes reflecting different in vivo invasive patterns of human GBM cells.

    Science.gov (United States)

    Monticone, Massimiliano; Daga, Antonio; Candiani, Simona; Romeo, Francesco; Mirisola, Valentina; Viaggi, Silvia; Melloni, Ilaria; Pedemonte, Simona; Zona, Gianluigi; Giaretti, Walter; Pfeffer, Ulrich; Castagnola, Patrizio

    2012-08-17

    Most patients affected by Glioblastoma multiforme (GBM, grade IV glioma) experience a recurrence of the disease because of the spreading of tumor cells beyond surgical boundaries. Unveiling mechanisms causing this process is a logical goal to impair the killing capacity of GBM cells by molecular targeting. We noticed that our long-term GBM cultures, established from different patients, may display two categories/types of growth behavior in an orthotopic xenograft model: expansion of the tumor mass and formation of tumor branches/nodules (nodular like, NL-type) or highly diffuse single tumor cell infiltration (HD-type). We determined by DNA microarrays the gene expression profiles of three NL-type and three HD-type long-term GBM cultures. Subsequently, individual genes with different expression levels between the two groups were identified using Significance Analysis of Microarrays (SAM). Real time RT-PCR, immunofluorescence and immunoblot analyses were performed for a selected subgroup of regulated gene products to confirm the results obtained by the expression analysis. Here, we report the identification of a set of 34 differentially expressed genes in the two types of GBM cultures. Twenty-three of these genes encode proteins localized to the plasma membrane and 9 of these encode proteins involved in the process of cell adhesion. This study suggests the participation in the diffuse infiltrative/invasive process of GBM cells within the CNS of a novel set of genes coding for membrane-associated proteins, which should thus be susceptible to an inhibition strategy by specific targeting. Massimiliano Monticone and Antonio Daga contributed equally to this work.

  5. Meta-Analysis of the Transcriptome Reveals a Core Set of Shade-Avoidance Genes in Arabidopsis.

    Science.gov (United States)

    Sellaro, Romina; Pacín, Manuel; Casal, Jorge J

    2017-05-01

    The presence of neighboring vegetation modifies the light input perceived by photo-sensory receptors, initiating a signaling cascade that adjusts plant growth and physiology. Thousands of genes can change their expression during this process, but the structure of the transcriptional circuit is poorly understood. Here we present a meta-analysis of transcriptome data from Arabidopsis thaliana exposed to neighbor signals in different contexts, including organs where growth is promoted or inhibited by these signals. We identified a small set of genes that consistently and dynamically respond to neighbor light signals. This group is also affected by light during de-etiolation and day/night cycles. Among these genes, many of those with positive response to neighbor signals are binding targets of PHYTOCHROME-INTERACTING FACTORS (PIFs) and function as transcriptional regulators themselves, but none of these features is observed among those with negative response to neighbor signals. Changes in neighbor signals can mimic the transcriptional signature of auxin, gibberellins, brassinosteroid, abscisic acid, ethylene, jasmonic acid and cytokinin, but in a context-dependent manner. We propose the existence of a small core set of genes involved in downstream communication of PIF signaling status and in the control of light sensitivity and chloroplast metabolism. © 2017 The American Society of Photobiology.

  6. Different gene sets contribute to different symptom dimensions of depression and anxiety

    NARCIS (Netherlands)

    van Veen, Tineke; Goeman, Jelle J.; Monajemi, Ramin; Wardenaar, Klaas J.; Hartman, Catharina A.; Snieder, Harold; Nolte, Ilja M.; Penninx, Brenda W. J. H.; Zitman, Frans G.

    Although many genetic association studies have been carried out, it remains unclear which genes contribute to depression. This may be due to heterogeneity of the DSM-IV category of depression. Specific symptom-dimensions provide a more homogenous phenotype. Furthermore, as effects of individual

  7. Different gene sets contribute to different symptom dimensions of depression and anxiety

    NARCIS (Netherlands)

    van Veen, T.; Goeman, J.J.; Monajemi, R.; Wardenaar, K.J.; Hartman, C.A.; Snieder, H.; Nolte, I.M.; Penninx, B.W.J.H.; Zitman, F.G.

    2012-01-01

    Although many genetic association studies have been carried out, it remains unclear which genes contribute to depression. This may be due to heterogeneity of the DSM-IV category of depression. Specific symptom-dimensions provide a more homogenous phenotype. Furthermore, as effects of individual

  8. Genetic diversity of the conserved motifs of six bacterial leaf blight resistance genes in a set of rice landraces.

    Science.gov (United States)

    Das, Basabdatta; Sengupta, Samik; Prasad, Manoj; Ghose, Tapas Kumar

    2014-07-12

    Bacterial leaf blight (BLB) caused by the vascular pathogen Xanthomonas oryzae pv. oryzae (Xoo) is one of the most serious diseases leading to crop failure in rice growing countries. A total of 37 resistance genes against Xoo have been identified in rice. Of these, ten BLB resistance genes have been mapped on rice chromosomes, while 6 have been cloned, sequenced and characterized. Diversity analysis at the resistance gene level of this disease is scanty, and the landraces from West Bengal and North Eastern states of India have received little attention so far. The objective of this study was to assess the genetic diversity at conserved domains of 6 BLB resistance genes in a set of 22 rice accessions including landraces and check genotypes collected from the states of Assam, Nagaland, Mizoram and West Bengal. In this study 34 pairs of primers were designed from conserved domains of 6 BLB resistance genes: Xa1, xa5, Xa21, Xa21(A1), Xa26 and Xa27. The designed primer pairs were used to generate PCR-based polymorphic DNA profiles to detect and elucidate the genetic diversity of the six genes in the 22 diverse rice accessions of known disease phenotype. A total of 140 alleles were identified including 41 rare and 26 null alleles. The average polymorphism information content (PIC) value was 0.56/primer pair. The DNA profiles identified each of the rice landraces unequivocally. The amplified polymorphic DNA bands were used to calculate genetic similarity of the rice landraces in all possible pair combinations. The similarity among the rice accessions ranged from 18% to 89% and the dendrogram produced from the similarity values was divided into 2 major clusters. The conserved domains identified within the sequenced rare alleles include Leucine-Rich Repeat, BED-type zinc finger domain, sugar transferase domain and the domain of the carbohydrate esterase 4 superfamily. This study revealed high genetic diversity at conserved domains of six BLB resistance genes in a set of 22

  9. Comparative genomic analysis of SET domain family reveals the origin, expansion, and putative function of the arthropod-specific SmydA genes as histone modifiers in insects

    Science.gov (United States)

    Jiang, Feng; Liu, Qing; Wang, Yanli; Zhang, Jie; Wang, Huimin; Song, Tianqi; Yang, Meiling

    2017-01-01

    The SET domain is an evolutionarily conserved motif present in histone lysine methyltransferases, which are important in the regulation of chromatin and gene expression in animals. In this study, we searched for SET domain–containing genes (SET genes) in all of the 147 arthropod genomes sequenced at the time of carrying out this experiment to understand the evolutionary history by which SET domains have evolved in insects. Phylogenetic and ancestral state reconstruction analysis revealed an arthropod-specific SET gene family, named SmydA, that is ancestral to arthropod animals and specifically diversified during insect evolution. Considering that pseudogenization is the most probable fate of the new emerging gene copies, we provided experimental and evolutionary evidence to demonstrate their essential functions. Fluorescence in situ hybridization analysis and in vitro methyltransferase activity assays showed that the SmydA-2 gene was transcriptionally active and retained the original histone methylation activity. Expression knockdown by RNA interference significantly increased mortality, implying that the SmydA genes may be essential for insect survival. We further showed predominantly strong purifying selection on the SmydA gene family and a potential association between the regulation of gene expression and insect phenotypic plasticity by transcriptome analysis. Overall, these data suggest that the SmydA gene family retains essential functions that may possibly define novel regulatory pathways in insects. This work provides insights into the roles of lineage-specific domain duplication in insect evolution. PMID:28444351

  10. Signature Evaluation Tool (SET): a Java-based tool to evaluate and visualize the sample discrimination abilities of gene expression signatures

    Directory of Open Access Journals (Sweden)

    Lin Chi-Hung

    2008-01-01

    Background: The identification of specific gene expression signatures for distinguishing sample groups is a dominant field in cancer research. Although a number of tools have been developed to identify optimal gene expression signatures, the number of signature genes obtained is often too large to be applied clinically. Furthermore, experimental verification is sometimes limited by the availability of wet-lab materials such as antibodies and reagents. A tool to evaluate the discrimination power of candidate genes is therefore in high demand by clinical researchers. Results: Signature Evaluation Tool (SET) is a Java-based tool adopting Golub's weighted voting algorithm as well as incorporating the visual presentation of prediction strength for each array sample. SET provides a flexible and easy-to-follow platform to evaluate the discrimination power of a gene signature. Here, we demonstrated the application of SET for several purposes: (1) for signatures consisting of a large number of genes, SET offers the ability to rapidly narrow down the number of genes; (2) for a given signature (from third-party analyses or user-defined), SET can re-evaluate and re-adjust its discrimination power by selecting/de-selecting genes repeatedly; (3) for multiple microarray datasets, SET can evaluate the classification capability of a signature among datasets; and (4) by providing a module to visualize the prediction strength for each sample, SET allows users to re-evaluate the discrimination power on mis-grouped or less-certain samples. Information obtained from the above applications could be useful in prognostic analyses or clinical management decisions. Conclusion: Here we present SET to evaluate and visualize the sample-discrimination ability of a given gene expression signature. This tool provides a filtration function for signature identification and lies between clinical analyses and class prediction (or feature selection) tools. The simplicity
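
    The weighted voting scheme and prediction strength named in this abstract can be sketched as follows. This is a minimal Python illustration of the algorithm as commonly described for Golub et al., not the SET implementation (which is Java-based). The function names, the dictionary data layout, and the use of the sample standard deviation are assumptions, and the sketch assumes at least two samples per class with non-zero spread.

```python
from statistics import mean, stdev


def train_weighted_voting(class1, class2):
    """Compute a (weight, boundary) pair for every signature gene.

    class1 / class2: dict gene -> list of expression values in that class.
    Weight is the signal-to-noise ratio (mu1 - mu2) / (sd1 + sd2);
    the boundary is the midpoint of the two class means.
    """
    model = {}
    for gene in class1:
        mu1, mu2 = mean(class1[gene]), mean(class2[gene])
        sd1, sd2 = stdev(class1[gene]), stdev(class2[gene])
        model[gene] = ((mu1 - mu2) / (sd1 + sd2), (mu1 + mu2) / 2.0)
    return model


def predict(model, sample):
    """Return (predicted class, prediction strength) for one sample."""
    votes_1 = votes_2 = 0.0
    for gene, (weight, boundary) in model.items():
        vote = weight * (sample[gene] - boundary)
        if vote > 0:
            votes_1 += vote   # positive vote supports class 1
        else:
            votes_2 -= vote   # negative vote supports class 2
    winner = "class1" if votes_1 >= votes_2 else "class2"
    total = votes_1 + votes_2
    strength = abs(votes_1 - votes_2) / total if total else 0.0
    return winner, strength


# Made-up training data for a two-gene signature.
class1 = {"A": [10.0, 12.0], "B": [1.0, 3.0]}
class2 = {"A": [2.0, 4.0], "B": [9.0, 11.0]}
model = train_weighted_voting(class1, class2)
label, strength = predict(model, {"A": 11.0, "B": 8.0})
```

    A prediction strength near 1 means the genes vote almost unanimously, while a value near 0 marks the less-certain samples that the abstract says SET lets users re-examine.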

  11. Repression of Middle Sporulation Genes in Saccharomyces cerevisiae by the Sum1-Rfm1-Hst1 Complex Is Maintained by Set1 and H3K4 Methylation.

    Science.gov (United States)

    Jaiswal, Deepika; Jezek, Meagan; Quijote, Jeremiah; Lum, Joanna; Choi, Grace; Kulkarni, Rushmie; Park, DoHwan; Green, Erin M

    2017-12-04

    The conserved yeast histone methyltransferase Set1 targets H3 lysine 4 (H3K4) for mono, di, and trimethylation and is linked to active transcription due to the euchromatic distribution of these methyl marks and the recruitment of Set1 during transcription. However, loss of Set1 results in increased expression of multiple classes of genes, including genes adjacent to telomeres and middle sporulation genes, which are repressed under normal growth conditions because they function in meiotic progression and spore formation. The mechanisms underlying Set1-mediated gene repression are varied, and still unclear in some cases, although repression has been linked to both direct and indirect action of Set1, associated with noncoding transcription, and is often dependent on the H3K4me2 mark. We show that Set1, and particularly the H3K4me2 mark, are implicated in repression of a subset of middle sporulation genes during vegetative growth. In the absence of Set1, there is loss of the DNA-binding transcriptional regulator Sum1 and the associated histone deacetylase Hst1 from chromatin in a locus-specific manner. This is linked to increased H4K5ac at these loci and aberrant middle gene expression. These data indicate that, in addition to DNA sequence, histone modification status also contributes to proper localization of Sum1. Our results also show that the role for Set1 in middle gene expression control diverges as cells receive signals to undergo meiosis. Overall, this work dissects an unexplored role for Set1 in gene-specific repression, and provides important insights into a new mechanism associated with the control of gene expression linked to meiotic differentiation. Copyright © 2017 Jaiswal et al.

  12. A new sequence data set of SSU rRNA gene for Scleractinia and its phylogenetic and ecological applications

    KAUST Repository

    Arrigoni, Roberto

    2016-11-27

    Scleractinian corals (i.e. hard corals) play a fundamental role in building and maintaining coral reefs, one of the most diverse ecosystems on Earth. Nevertheless, their phylogenies remain largely unresolved and little is known about dispersal and survival of their planktonic larval phase. The small subunit ribosomal RNA (SSU rRNA) is a commonly used gene for DNA barcoding in several metazoans, and small variable regions of SSU rRNA are widely adopted as barcode marker to investigate marine plankton community structure worldwide. Here, we provide a large sequence data set of the complete SSU rRNA gene from 298 specimens, representing all known extant reef coral families and a total of 106 genera. The secondary structure was extremely conserved within the order with few exceptions due to insertions or deletions occurring in the variable regions. Remarkable differences in SSU rRNA length and base composition were detected between and within acroporids (Acropora, Montipora, Isopora and Alveopora) compared to other corals. The V4 and V9 regions seem to be promising barcode loci because variation at commonly used barcode primer binding sites was extremely low, while their levels of divergence allowed families and genera to be distinguished. A time-calibrated phylogeny of Scleractinia is provided, and mutation rate heterogeneity is demonstrated across main lineages. The use of this data set as a valuable reference for investigating aspects of ecology, biology, molecular taxonomy and evolution of scleractinian corals is discussed.

  13. Association between expression of random gene sets and survival is evident in multiple cancer types and may be explained by sub-classification

    Science.gov (United States)

    2018-01-01

    One of the goals of cancer research is to identify a set of genes that cause or control disease progression. However, although multiple such gene sets have been published, these are usually in very poor agreement with each other, and very few of the genes have proved to be functional therapeutic targets. Furthermore, recent findings from a breast cancer gene-expression cohort showed that sets of genes selected randomly can be used to predict survival with a much higher probability than expected. These results imply that many of the genes identified in breast cancer gene expression analysis may not be causal of cancer progression, even though they can still be highly predictive of prognosis. We performed a similar analysis on all the cancer types available in The Cancer Genome Atlas (TCGA), namely, estimating the predictive power of random gene sets for survival. Our work shows that most cancer types exhibit the property that random selections of genes are more predictive of survival than expected. In contrast to previous work, this property is not removed by using a proliferation signature, which implies that proliferation may not always be the confounder that drives this property. We suggest one possible solution in the form of data-driven sub-classification to reduce this property significantly. Our results suggest that the predictive power of random gene sets may be used to identify the existence of sub-classes in the data, and thus may allow better understanding of patient stratification. Furthermore, by reducing the observed bias this may allow more direct identification of biologically relevant, and potentially causal, genes. PMID:29470520

  14. Expressed genes for plant-type ribulose 1,5-bisphosphate carboxylase/oxygenase in the photosynthetic bacterium Chromatium vinosum, which possesses two complete sets of the genes.

    Science.gov (United States)

    Viale, A M; Kobayashi, H; Akazawa, T

    1989-05-01

    Two sets of genes for the large and small subunits of ribulose 1,5-bisphosphate carboxylase/oxygenase (RuBisCO) were detected in the photosynthetic purple sulfur bacterium Chromatium vinosum by hybridization analysis with RuBisCO gene probes, cloned by using the lambda Fix vector, and designated rbcL-rbcS and rbcA-rbcB. rbcL and rbcA encode the large subunits, and rbcS and rbcB encode the small subunits. rbcL-rbcS was the same as that reported previously (A. M. Viale, H. Kobayashi, T. Takabe, and T. Akazawa, FEBS Lett. 192:283-288, 1985). A DNA fragment bearing rbcA-rbcB was subcloned in plasmid vectors and sequenced. We found that rbcB was located 177 base pairs downstream of the rbcA coding region, and both genes were preceded by plausible procaryotic ribosome-binding sites. rbcA and rbcB encoded polypeptides of 472 and 118 amino acids, respectively. Edman degradation analysis of the subunits of RuBisCO isolated from C. vinosum showed that rbcA-rbcB encoded the enzyme present in this bacterium. The large- and small-subunit polypeptides were posttranslationally processed to remove 2 and 1 amino acid residues from their N-termini, respectively. Among hetero-oligomeric RuBisCOs, the C. vinosum large subunit exhibited higher homology to that from cyanobacteria, eucaryotic algae, and higher plants (71.6 to 74.2%) than to that from the chemolithotrophic bacterium Alcaligenes eutrophus (56.6%). A similar situation has been observed for the C. vinosum small subunit, although the homology among small subunits from different organisms was lower than that among the large subunits.

  15. Gene Sets for Utilization of Primary and Secondary Nutrition Supplies in the Distal Gut of Endangered Iberian Lynx

    Science.gov (United States)

    Alcaide, María; Messina, Enzo; Richter, Michael; Bargiela, Rafael; Peplies, Jörg; Huws, Sharon A.; Newbold, Charles J.; Golyshin, Peter N.; Simón, Miguel A.; López, Guillermo; Yakimov, Michail M.; Ferrer, Manuel

    2012-01-01

    Recent studies have indicated the existence of an extensive trans-genomic trans-mural co-metabolism between gut microbes and animal hosts that is diet-, host phylogeny- and provenance-influenced. Here, we analyzed the biodiversity at the level of small subunit rRNA gene sequence and the metabolic composition of 18 Mbp of consensus metagenome sequences and activity characteristics of bacterial intra-cellular extracts, in wild Iberian lynx (Lynx pardinus) fecal samples. Bacterial signatures (14.43% of all of the Firmicutes reads and 6.36% of total reads) related to the uncultured anaerobic commensals Anaeroplasma spp., which are typically found in ovine and bovine rumen, were first identified. The lynx gut was further characterized by an over-representation of ‘presumptive’ aquaporin aqpZ genes and genes encoding ‘active’ lysosomal-like digestive enzymes that are possibly needed to acquire glycerol, sugars and amino acids from glycoproteins, glyco(amino)lipids, glyco(amino)glycans and nucleoside diphosphate sugars. Lynx gut was highly enriched (28% of the total glycosidases) in genes encoding α-amylase and related enzymes, although it exhibited a low rate of enzymatic activity indicative of starch degradation. The preponderance of β-xylosidase activity in protein extracts further suggests that lynx gut microbes are most active in the metabolism of β-xylose containing plant N-glycans, although β-xylosidase sequences constituted only 1.5% of total glycosidases. These collective and unique bacterial, genetic and enzymatic activity signatures suggest that the wild lynx gut microbiota harbors gene sets underpinning not only sugar uptake from primary animal tissues (with the monotypic dietary profile of the wild lynx consisting of 80–100% wild rabbits) but also the hydrolysis of prey-derived plant biomass. Although the present investigation corresponds to a single sample and some of the statements should be considered qualitative, the data most likely

  16. Gene sets for utilization of primary and secondary nutrition supplies in the distal gut of endangered Iberian lynx.

    Directory of Open Access Journals (Sweden)

    María Alcaide

    Full Text Available Recent studies have indicated the existence of an extensive trans-genomic trans-mural co-metabolism between gut microbes and animal hosts that is diet-, host phylogeny- and provenance-influenced. Here, we analyzed the biodiversity at the level of small subunit rRNA gene sequence and the metabolic composition of 18 Mbp of consensus metagenome sequences and activity characteristics of bacterial intra-cellular extracts, in wild Iberian lynx (Lynx pardinus) fecal samples. Bacterial signatures (14.43% of all of the Firmicutes reads and 6.36% of total reads) related to the uncultured anaerobic commensals Anaeroplasma spp., which are typically found in ovine and bovine rumen, were first identified. The lynx gut was further characterized by an over-representation of 'presumptive' aquaporin aqpZ genes and genes encoding 'active' lysosomal-like digestive enzymes that are possibly needed to acquire glycerol, sugars and amino acids from glycoproteins, glyco(amino)lipids, glyco(amino)glycans and nucleoside diphosphate sugars. Lynx gut was highly enriched (28% of the total glycosidases) in genes encoding α-amylase and related enzymes, although it exhibited low rate of enzymatic activity indicative of starch degradation. The preponderance of β-xylosidase activity in protein extracts further suggests lynx gut microbes being most active for the metabolism of β-xylose containing plant N-glycans, although β-xylosidases sequences constituted only 1.5% of total glycosidases. These collective and unique bacterial, genetic and enzymatic activity signatures suggest that the wild lynx gut microbiota not only harbors gene sets underpinning sugar uptake from primary animal tissues (with the monotypic dietary profile of the wild lynx consisting of 80-100% wild rabbits) but also for the hydrolysis of prey-derived plant biomass. Although, the present investigation corresponds to a single sample and some of the statements should be considered qualitative, the data most likely

  17. The impact of primer sets on detection of the gene encoding biofilm-associated protein (Bap) in Acinetobacter baumannii: in silico and in vitro analysis.

    Science.gov (United States)

    Kodori, M; Douraghi, M; Yaseri, M; Rahbar, M

    2017-04-01

    The Acinetobacter baumannii virulence protein Bap is encoded by a large gene and contains both variable sequence and repetitive modules. To date, four primer sets targeting different regions of bap have been designed, but no study has evaluated all these primers simultaneously for detection of bap. Here, we assessed the effect of primer sets Bap I-IV on detection of bap both in silico and in vitro. Using the primer set Bap II, all 143 tested strains yielded an amplicon corresponding to the bap gene. This primer set showed the highest sensitivity (100%; 95% CI: 97.9-100%) compared to the other primer sets. This study demonstrates that primer set Bap II performs with optimal efficiency for detection of the bap gene among different strains. This study investigated the effect of nucleotide variation on PCR detection of the bap gene in various Acinetobacter baumannii strains. Since bap is the target gene for many detection assays, this variation can affect the detection efficiency. Here we present a primer set Bap II with optimal detection efficiency amongst 143 different strains, as shown by in silico and in vitro evidence. © 2017 The Society for Applied Microbiology.
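    The abstract reports a sensitivity of 100% with a lower confidence limit of 97.9% for 143 positive results out of 143 strains, but does not say how the interval was computed. The sketch below assumes an exact (Clopper-Pearson) binomial bound, which has a closed form when every sample is positive; the function name and the choice of method are illustrative assumptions, not the authors' stated procedure.

```python
def exact_lower_bound_all_positive(n: int, alpha: float) -> float:
    """Exact binomial lower bound for a proportion when all n of n samples are positive.

    With n successes in n trials the Clopper-Pearson lower limit solves
    p**n = alpha, so p = alpha ** (1 / n).
    """
    if n <= 0:
        raise ValueError("n must be a positive sample count")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha ** (1.0 / n)


n_strains = 143  # number of strains tested with primer set Bap II (from the abstract)

# Assumption: a one-sided 95% bound puts the full 5% in the lower tail.
one_sided = exact_lower_bound_all_positive(n_strains, alpha=0.05)

# Assumption: a two-sided 95% interval puts 2.5% in the lower tail.
two_sided = exact_lower_bound_all_positive(n_strains, alpha=0.025)

print(f"one-sided 95% lower bound: {one_sided * 100:.1f}%")  # 97.9%
print(f"two-sided 95% lower bound: {two_sided * 100:.1f}%")  # 97.5%
```

    Under these assumptions the one-sided bound rounds to 97.9%, which agrees with the reported figure; this is consistent with, but does not confirm, a one-sided exact interval having been used.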

  18. Integrative analysis of C. elegans modENCODE ChIP-seq data sets to infer gene regulatory interactions

    Science.gov (United States)

    Van Nostrand, Eric L.; Kim, Stuart K.

    2013-01-01

    The C. elegans modENCODE Consortium has defined in vivo binding sites for a large array of transcription factors by ChIP-seq. In this article, we present examples that illustrate how this compendium of ChIP-seq data can drive biological insights not possible with analysis of individual factors. First, we analyze the number of independent factors bound to the same locus, termed transcription factor complexity, and find that low-complexity sites are more likely to respond to altered expression of a single bound transcription factor. Next, we show that comparison of binding sites for the same factor across developmental stages can reveal insight into the regulatory network of that factor, as we find that the transcription factor UNC-62 has distinct binding profiles at different stages due to distinct cofactor co-association as well as tissue-specific alternative splicing. Finally, we describe an approach to infer potential regulators of gene expression changes found in profiling experiments (such as DNA microarrays) by screening these altered genes to identify significant enrichment for targets of a transcription factor identified in ChIP-seq data sets. After confirming that this approach can correctly identify the upstream regulator on expression data sets for which the regulator was previously known, we applied this approach to identify novel candidate regulators of transcriptional changes with age. The analysis revealed nine candidate aging regulators, of which three were previously known to have a role in longevity. We experimentally showed that two of the new candidate aging regulators can extend lifespan when overexpressed, indicating that this approach can identify novel functional regulators of complex processes. PMID:23531767
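    The screening step described above, testing whether genes altered in an expression experiment are enriched for ChIP-seq targets of a transcription factor, is commonly framed as a hypergeometric (one-sided Fisher) test. The following is a minimal sketch under that assumption; the abstract does not name the statistic the authors used, and the function name and gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, targets: int, altered: int, overlap: int) -> float:
    """Upper-tail hypergeometric p-value P(X >= overlap).

    universe: total number of genes considered
    targets:  genes bound by the transcription factor (ChIP-seq targets)
    altered:  genes whose expression changed in the profiling experiment
    overlap:  altered genes that are also targets
    """
    if not (0 <= targets <= universe and 0 <= altered <= universe):
        raise ValueError("targets and altered must lie between 0 and universe")
    if not 0 <= overlap <= min(targets, altered):
        raise ValueError("overlap cannot exceed targets or altered")
    total = comb(universe, altered)
    tail = sum(
        comb(targets, k) * comb(universe - targets, altered - k)
        for k in range(overlap, min(targets, altered) + 1)
    )
    return tail / total


# Hypothetical toy numbers: 20 genes, 5 of them targets, 4 altered, 3 in both sets.
p_value = hypergeom_enrichment_p(universe=20, targets=5, altered=4, overlap=3)
print(f"enrichment p-value: {p_value:.4f}")
```

    In a screen across many transcription factors, one such p-value would be computed per factor and then corrected for multiple testing before candidate regulators are ranked.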

  19. Evidence for diversifying selection in a set of Mycobacterium tuberculosis genes in response to antibiotic- and nonantibiotic-related pressure.

    Science.gov (United States)

    Osório, Nuno S; Rodrigues, Fernando; Gagneux, Sebastien; Pedrosa, Jorge; Pinto-Carbó, Marta; Castro, António G; Young, Douglas; Comas, Iñaki; Saraiva, Margarida

    2013-06-01

    Tuberculosis (TB) is a global health problem estimated to kill 1.4 million people per year. Recent advances in the genomics of the causative agents of TB, bacteria known as the Mycobacterium tuberculosis complex (MTBC), have allowed a better comprehension of its population structure and provided the foundation for molecular evolution analyses. These studies are crucial for a better understanding of TB, including the variation of vaccine efficacy and disease outcome, together with the emergence of drug resistance. Starting from the analysis of 73 publicly available genomes from all the main MTBC lineages, we have screened for evidences of positive selection, a set of 576 genes previously associated with drug resistance or encoding membrane proteins. As expected, because antibiotics constitute strong selective pressure, some of the codons identified correspond to the position of confirmed drug-resistance-associated substitutions in the genes embB, rpoB, and katG. Furthermore, we identified diversifying selection in specific codons of the genes Rv0176 and Rv1872c coding for MCE1-associated transmembrane protein and a putative l-lactate dehydrogenase, respectively. Amino acid sequence analyses showed that in Rv0176, sites undergoing diversifying selection were in a predicted antigen region that varies between "modern" lineages and "ancient" MTBC/BCG strains. In Rv1872c, some of the sites under selection are predicted to impact protein function and thus might result from metabolic adaptation. These results illustrate that diversifying selection in MTBC is happening as a consequence of both antibiotic treatment and other evolutionary pressures.

  20. Genomics in cereals: from genome-wide conserved orthologous set (COS) sequences to candidate genes for trait dissection.

    Science.gov (United States)

    Quraishi, Umar Masood; Abrouk, Michael; Bolot, Stéphanie; Pont, Caroline; Throude, Mickael; Guilhot, Nicolas; Confolent, Carole; Bortolini, Fernanda; Praud, Sébastien; Murigneux, Alain; Charmet, Gilles; Salse, Jerome

    2009-11-01

    Recent updates in comparative genomics among cereals have provided the opportunity to identify conserved orthologous set (COS) DNA sequences for cross-genome map-based cloning of candidate genes underpinning quantitative traits. New tools are described that are applicable to any cereal genome of interest, namely, alignment criterion for orthologous couples identification, as well as the Intron Spanning Marker software to automatically select intron-spanning primer pairs. In order to test the software, it was applied to the bread wheat genome, and 695 COS markers were assigned to 1,535 wheat loci (on average one marker/2.6 cM) based on 827 robust rice-wheat orthologs. Furthermore, 31 of the 695 COS markers were selected to fine map a pentosan viscosity quantitative trait loci (QTL) on wheat chromosome 7A. Among the 31 COS markers, 14 (45%) were polymorphic between the parental lines and 12 were mapped within the QTL confidence interval with one marker every 0.6 cM defining candidate genes among the rice orthologous region.

  1. Comparative genomic analysis of Brucella abortus vaccine strain 104M reveals a set of candidate genes associated with its virulence attenuation.

    Science.gov (United States)

    Yu, Dong; Hui, Yiming; Zai, Xiaodong; Xu, Junjie; Liang, Long; Wang, Bingxiang; Yue, Junjie; Li, Shanhu

    2015-01-01

    The Brucella abortus strain 104M, a spontaneously attenuated strain, has been used as a vaccine strain in humans against brucellosis for 6 decades in China. Despite many studies, the molecular mechanisms that cause the attenuation are still unclear. Here, we determined the whole-genome sequence of 104M and conducted a comprehensive comparative analysis against the whole-genome sequences of the virulent strain, A13334, and other reference strains. This analysis revealed a highly similar genome structure between 104M and A13334. Further comparative genomic analysis between 104M and A13334 revealed a set of genes missing in 104M. Some of these genes were identified to be directly or indirectly associated with virulence. Similarly, a set of mutations in the virulence-related genes was also identified, which may be related to virulence alteration. This study provides a set of candidate genes associated with virulence attenuation in B. abortus vaccine strain 104M.

  2. Human longevity and variation in DNA damage response and repair: Study of the contribution of sub-processes using competitive gene-set analysis

    DEFF Research Database (Denmark)

    Debrabant, Birgit; Sørensen, Mette; Flachsbart, Friederike

    2014-01-01

    the competitive gene-set analysis by Wang et al indicated that BER, HRR and RECQ associated stronger with longevity than the respective remaining genes of the pathway (P-values=0.004-0.048). For HRR and RECQ, only one gene contributed to the significance, whereas for BER several genes contributed....... These associations did, however, generally not pass correction for multiple testing. Still, these findings indicate that, of the entire pathway, variation in BER might influence longevity the most. These modest sized P-values were not replicated in a German sample. This might, though, be due to differences...

  3. Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.).

    Science.gov (United States)

    Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil

    2015-02-01

    The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  4. Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies

    Science.gov (United States)

    Medina, Ignacio; Montaner, David; Bonifaci, Nuria; Pujana, Miguel Angel; Carbonell, José; Tarraga, Joaquin; Al-Shahrour, Fatima; Dopazo, Joaquin

    2009-01-01

    Genome-wide association studies have become a popular strategy to find associations of genes to traits of interest. Despite the high-resolution available today to carry out genotyping studies, the success of its application in real studies has been limited by the testing strategy used. As an alternative to brute force solutions involving the use of very large cohorts, we propose the use of the Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), which is a simple implementation of the GSA strategy for the analysis of genome-wide association studies, provides a significant increase in the power testing for this type of studies. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap/ PMID:19502494

  5. A MultiSite Gateway™ vector set for the functional analysis of genes in the model Saccharomyces cerevisiae

    Directory of Open Access Journals (Sweden)

    Nagels Durand Astrid

    2012-09-01

    Full Text Available Background: Recombinatorial cloning using the Gateway™ technology has been the method of choice for high-throughput omics projects, resulting in the availability of entire ORFeomes in Gateway™ compatible vectors. The MultiSite Gateway™ system allows combining multiple genetic fragments such as promoter, ORF and epitope tag in one single reaction. To date, this technology has not been accessible in the yeast Saccharomyces cerevisiae, one of the most widely used experimental systems in molecular biology, due to the lack of appropriate destination vectors. Results: Here, we present a set of three-fragment MultiSite Gateway™ destination vectors that have been developed for gene expression in S. cerevisiae and that allow the assembly of any promoter, open reading frame, epitope tag arrangement in combination with any of four auxotrophic markers and three distinct replication mechanisms. As an example of its applicability, we used yeast three-hybrid to provide evidence for the assembly of a ternary complex of plant proteins involved in jasmonate signalling and consisting of the JAZ, NINJA and TOPLESS proteins. Conclusion: Our vectors make MultiSite Gateway™ cloning accessible in S. cerevisiae and implement a fast and versatile cloning method for the high-throughput functional analysis of (heterologous) proteins in one of the most widely used model organisms for molecular biology research.

  6. A random set scoring model for prioritization of disease candidate genes using protein complexes and data-mining of GeneRIF, OMIM and PubMed records

    DEFF Research Database (Denmark)

    Jiang, Li; Edwards, Stefan M.; Thomsen, Bo

    2014-01-01

    Background: Prioritizing genetic variants is a challenge because disease susceptibility loci are often located in genes of unknown function or the relationship with the corresponding phenotype is unclear. A global data-mining exercise on the biomedical literature can establish the phenotypic...... from PubMed abstracts, OMIM, and GeneRIF records. We also investigated the validity of several vocabulary filters and different likelihood thresholds for predicted protein-protein interactions in terms of their effect on the network-based gene-prioritization approach, which relies on text-mining...... causal genes supported the reliability of our approach. Moreover, these data suggest many promising novel candidate genes for human disorders that have a complex mode of inheritance. Conclusion: We have implemented and validated a network-based approach to prioritize genes for human diseases based...

  7. The map-1 gene family in root-knot nematodes, Meloidogyne spp.: a set of taxonomically restricted genes specific to clonal species.

    Directory of Open Access Journals (Sweden)

    Iva Tomalova

    Full Text Available Taxonomically restricted genes (TRGs), i.e., genes that are restricted to a limited subset of phylogenetically related organisms, may be important in adaptation. In parasitic organisms, TRG-encoded proteins are possible determinants of the specificity of host-parasite interactions. In the root-knot nematode (RKN) Meloidogyne incognita, the map-1 gene family encodes expansin-like proteins that are secreted into plant tissues during parasitism, thought to act as effectors to promote successful root infection. MAP-1 proteins exhibit a modular architecture, with variable number and arrangement of 58- and 13-aa domains in their central part. Here, we address the evolutionary origins of this gene family using a combination of bioinformatics and molecular biology approaches. Map-1 genes were solely identified in one single member of the phylum Nematoda, i.e., the genus Meloidogyne, and not detected in any other nematode, thus indicating that the map-1 gene family is indeed a TRG family. A phylogenetic analysis of the distribution of map-1 genes in RKNs further showed that these genes are specifically present in species that reproduce by mitotic parthenogenesis, with the exception of M. floridensis, and could not be detected in RKNs reproducing by either meiotic parthenogenesis or amphimixis. These results highlight the divergence between mitotic and meiotic RKN species as a critical transition in the evolutionary history of these parasites. Analysis of the sequence conservation and organization of repeated domains in map-1 genes suggests that gene duplication(s) together with domain loss/duplication have contributed to the evolution of the map-1 family, and that some strong selection mechanism may be acting upon these genes to maintain their functional role(s) in the specificity of the plant-RKN interactions.

  8. Identification of the sigmaB regulon of Bacillus cereus and conservation of sigmaB-regulated genes in low-GC-content gram-positive bacteria.

    Science.gov (United States)

    van Schaik, Willem; van der Voort, Menno; Molenaar, Douwe; Moezelaar, Roy; de Vos, Willem M; Abee, Tjakko

    2007-06-01

    The alternative sigma factor sigma(B) has an important role in the acquisition of stress resistance in many gram-positive bacteria, including the food-borne pathogen Bacillus cereus. Here, we describe the identification of the set of sigma(B)-regulated genes in B. cereus by DNA microarray analysis of the transcriptome upon a mild heat shock. Twenty-four genes could be identified as being sigma(B) dependent as witnessed by (i) significantly lower expression levels of these genes in mutants with a deletion of sigB and rsbY (which encode the alternative sigma factor sigma(B) and a crucial positive regulator of sigma(B) activity, respectively) than in the parental strain B. cereus ATCC 14579 and (ii) increased expression of these genes upon a heat shock. Newly identified sigma(B)-dependent genes in B. cereus include a histidine kinase and two genes that have predicted functions in spore germination. This study shows that the sigma(B) regulon of B. cereus is considerably smaller than that of other gram-positive bacteria. This appears to be in line with phylogenetic analyses where sigma(B) of the B. cereus group was placed close to the ancestral form of sigma(B) in gram-positive bacteria. The data described in this study and previous studies in which the complete sigma(B) regulon of the gram-positive bacteria Bacillus subtilis, Listeria monocytogenes, and Staphylococcus aureus were determined enabled a comparison of the sets of sigma(B)-regulated genes in the different gram-positive bacteria. This showed that only three genes (rsbV, rsbW, and sigB) are conserved in their sigma(B) dependency in all four bacteria, suggesting that the sigma(B) regulon of the different gram-positive bacteria has evolved to perform niche-specific functions.

  9. Interaction between Social/Psychosocial Factors and Genetic Variants on Body Mass Index: A Gene-Environment Interaction Analysis in a Longitudinal Setting

    Directory of Open Access Journals (Sweden)

    Wei Zhao

    2017-09-01

    Full Text Available Obesity, which develops over time, is one of the leading causes of chronic diseases such as cardiovascular disease. However, hundreds of BMI (body mass index)-associated genetic loci identified through large-scale genome-wide association studies (GWAS) only explain about 2.7% of BMI variation. Most common human traits are believed to be influenced by both genetic and environmental factors. Past studies suggest a variety of environmental features that are associated with obesity, including socioeconomic status and psychosocial factors. This study combines both gene/regions and environmental factors to explore whether social/psychosocial factors (childhood and adult socioeconomic status, social support, anger, chronic burden, stressful life events, and depressive symptoms) modify the effect of sets of genetic variants on BMI in European American and African American participants in the Health and Retirement Study (HRS). In order to incorporate longitudinal phenotype data collected in the HRS and investigate entire sets of single nucleotide polymorphisms (SNPs) within gene/region simultaneously, we applied a novel set-based test for gene-environment interaction in longitudinal studies (LGEWIS). Childhood socioeconomic status (parental education) was found to modify the genetic effect in the gene/region around SNP rs9540493 on BMI in European Americans in the HRS. The most significant SNP (rs9540488) by childhood socioeconomic status interaction within the rs9540493 gene/region was suggestively replicated in the Multi-Ethnic Study of Atherosclerosis (MESA) (p = 0.07).

  10. Expression cloning screening of a unique and full-length set of cDNA clones is an efficient method for identifying genes involved in Xenopus neurogenesis.

    Science.gov (United States)

    Voigt, Jana; Chen, Jun-An; Gilchrist, Mike; Amaya, Enrique; Papalopulu, Nancy

    2005-03-01

    Functional screens, where large numbers of cDNA clones are assayed for certain biological activity, are a useful tool in elucidating gene function. In Xenopus, gain of function screens are performed by pool screening, whereby RNA transcribed in vitro from groups of cDNA clones, ranging from thousands to a hundred, is injected into early embryos. Once an activity is detected in a pool, the active clone is identified by sib-selection. Such screens are intrinsically biased towards potent genes, whose RNA is active at low quantities. To improve the sensitivity and efficiency of a gain of function screen we have bioinformatically processed an arrayed and EST sequenced set of 100,000 gastrula and neurula cDNA clones, to create a unique and full-length set of approximately 2500 clones. Reducing the redundancy and excluding truncated clones from the starting clone set reduced the total number of clones to be screened, in turn allowing us to reduce the pool size to just eight clones per pool. We report that the efficiency of screening this clone set is five-fold higher compared to a redundant set derived from the same libraries. We have screened 960 cDNA clones from this set, for genes that are involved in neurogenesis. We describe the overexpression phenotypes of 18 single clones, the majority of which show a previously uncharacterised phenotype and some of which are completely novel. In situ hybridisation analysis shows that a large number of these genes are specifically expressed in neural tissue. These results demonstrate the effectiveness of a unique full-length set of cDNA clones for uncovering players in a developmental pathway.

  11. Interaction between dopamine D2 receptor genotype and parental rule-setting in adolescent alcohol use: evidence for a gene-parenting interaction.

    NARCIS (Netherlands)

    Zwaluw, C.S. van der; Engels, R.C.E.M.; Vermulst, A.A.; Franke, B.; Buitelaar, J.K.; Verkes, R.J.; Scholte, R.H.

    2010-01-01

    Association studies investigating the link between the dopamine D2 receptor gene (DRD2) and alcohol (mis)use have shown inconsistent results. This may be due to lack of attention for environmental factors. High levels of parental rule-setting are associated with lower levels of adolescent alcohol

  12. Bridging cancer biology with the clinic: relative expression of a GRHL2-mediated gene-set pair predicts breast cancer metastasis.

    Directory of Open Access Journals (Sweden)

    Xinan Yang

    Full Text Available Identification and characterization of crucial gene target(s) that will allow focused therapeutics development remains a challenge. We have interrogated the putative therapeutic targets associated with the transcription factor Grainy head-like 2 (GRHL2), a critical epithelial regulatory factor. We demonstrate the possibility to define the molecular functions of critical genes in terms of their personalized expression profiles, allowing appropriate functional conclusions to be derived. A novel methodology, relative expression analysis with gene-set pairs (RXA-GSP), is designed to explore the potential clinical utility of cancer-biology discovery. Observing that Grhl2-overexpression leads to increased metastatic potential in vitro, we established a model assuming Grhl2-induced or -inhibited genes confer poor or favorable prognosis respectively for cancer metastasis. Training on public gene expression profiles of 995 breast cancer patients, this method prioritized one gene-set pair (GRHL2, CDH2, FN1, CITED2, MKI67 versus CTNNB1 and CTNNA3) from all 2717 possible gene-set pairs (GSPs). The identified GSP significantly dichotomized 295 independent patients for metastasis-free survival (log-rank tested p = 0.002; severe empirical p = 0.035). It also showed evidence of clinical prognostication in another independent 388 patients collected from three studies (log-rank tested p = 3.3e-6). This GSP is independent of most traditional prognostic indicators, and is only significantly associated with the histological grade of breast cancer (p = 0.0017), a GRHL2-associated clinical character (p = 6.8e-6, Spearman correlation), suggesting that this GSP is reflective of GRHL2-mediated events. Furthermore, a literature review indicates the therapeutic potential of the identified genes. This research demonstrates a novel strategy to integrate both biological experiments and clinical gene expression profiles for extracting and elucidating the genomic

  13. Bi-directional gene set enrichment and canonical correlation analysis identify key diet-sensitive pathways and biomarkers of metabolic syndrome

    Directory of Open Access Journals (Sweden)

    Gaora Peadar Ó

    2010-10-01

    Full Text Available Abstract Background Currently, a number of bioinformatics methods are available to generate appropriate lists of genes from a microarray experiment. While these lists represent an accurate primary analysis of the data, fewer options exist to contextualise those lists. The development and validation of such methods is crucial to the wider application of microarray technology in the clinical setting. Two key challenges in clinical bioinformatics involve appropriate statistical modelling of dynamic transcriptomic changes, and extraction of clinically relevant meaning from very large datasets. Results Here, we apply an approach to gene set enrichment analysis that allows for detection of bi-directional enrichment within a gene set. Furthermore, we apply canonical correlation analysis and Fisher's exact test, using plasma marker data with known clinical relevance to aid identification of the most important gene and pathway changes in our transcriptomic dataset. After a 28-day dietary intervention with high-CLA beef, a range of plasma markers indicated a marked improvement in the metabolic health of genetically obese mice. Tissue transcriptomic profiles indicated that the effects were most dramatic in liver (1270 genes significantly changed; p Conclusion Bi-directional gene set enrichment analysis more accurately reflects dynamic regulatory behaviour in biochemical pathways, and as such highlighted biologically relevant changes that were not detected using a traditional approach. In such cases where transcriptomic response to treatment is exceptionally large, canonical correlation analysis in conjunction with Fisher's exact test highlights the subset of pathways showing strongest correlation with the clinical markers of interest. In this case, we have identified selenoamino acid metabolism and steroid biosynthesis as key pathways mediating the observed relationship between metabolic health and high-CLA beef. These results indicate that this type of

  14. Selection and validation of a set of reliable reference genes for quantitative RT-PCR studies in the brain of the Cephalopod Mollusc Octopus vulgaris

    Directory of Open Access Journals (Sweden)

    Biffali Elio

    2009-07-01

    Full Text Available Background: Quantitative real-time polymerase chain reaction (RT-qPCR) is valuable for studying the molecular events underlying physiological and behavioral phenomena. Normalization of real-time PCR data is critical for a reliable mRNA quantification. Here we identify reference genes to be utilized in RT-qPCR experiments to normalize and monitor the expression of target genes in the brain of the cephalopod mollusc Octopus vulgaris, an invertebrate. Such an approach is novel for this taxon and of advantage in future experiments given the complexity of the behavioral repertoire of this species when compared with its relatively simple neural organization. Results: We chose 16S and 18S rRNA, actB, EEF1A, tubA and ubi as candidate reference genes (housekeeping genes, HKG). The expression of 16S and 18S was highly variable and did not meet the requirements of candidate HKG. The expression of the other genes was almost stable and uniform among samples. We analyzed the expression of HKG in two different sets of animals using tissues taken from the central nervous system (brain parts) and mantle (here considered as control tissue) by BestKeeper, geNorm and NormFinder. We found that HKG expressions differed considerably with respect to brain area and octopus samples in an HKG-specific manner. However, when the mantle is treated as control tissue and the entire central nervous system is considered, NormFinder revealed tubA and ubi as the most suitable HKG pair. These two genes were utilized to evaluate the relative expression of the genes FoxP, creb, dat and TH in O. vulgaris. Conclusion: We analyzed the expression profiles of some genes here identified for O. vulgaris by applying RT-qPCR analysis for the first time in cephalopods. We validated candidate reference genes and found the expression of ubi and tubA to be the most appropriate to evaluate the expression of target genes in the brain of different octopuses. Our results also underline the

  15. Distinguishing possible mechanisms for auxin-mediated developmental control in Arabidopsis: models with two Aux/IAA and ARF proteins, and two target gene-sets.

    Science.gov (United States)

    Bridge, L J; Mirams, G R; Kieffer, M L; King, J R; Kepinski, S

    2012-01-01

    New models of gene transcriptional responses to auxin signalling in Arabidopsis are presented. This work extends a previous model of auxin signalling to include networks of gene-sets which may control developmental responses along auxin gradients. Key elements of this new study include models of signalling pathways and networks involving two Aux-IAA proteins (IAAs), auxin response factors (ARFs) and gene targets. Hypotheses for the gene network topologies which may be involved in developmental responses have been tested against experimental observations for root hair growth in particular. In studying these models, we provide a framework for the analysis of auxin signalling with multiple IAAs and ARFs, and discuss the implications of bistability in such systems. Copyright © 2011 Elsevier Inc. All rights reserved.

  16. Identification and Construction of Combinatory Cancer Hallmark-Based Gene Signature Sets to Predict Recurrence and Chemotherapy Benefit in Stage II Colorectal Cancer.

    Science.gov (United States)

    Gao, Shanwu; Tibiche, Chabane; Zou, Jinfeng; Zaman, Naif; Trifiro, Mark; O'Connor-McCourt, Maureen; Wang, Edwin

    2016-01-01

    Decisions regarding adjuvant therapy in patients with stage II colorectal cancer (CRC) have been among the most challenging and controversial in oncology over the past 20 years. To develop robust combinatory cancer hallmark-based gene signature sets (CSS sets) that more accurately predict prognosis and identify a subset of patients with stage II CRC who could gain survival benefits from adjuvant chemotherapy. Thirteen retrospective studies of patients with stage II CRC who had clinical follow-up and adjuvant chemotherapy were analyzed. Respective totals of 162 and 843 patients from 2 and 11 independent cohorts were used as the discovery and validation cohorts, respectively. A total of 1005 patients with stage II CRC were included in the 13 cohorts. Among them, 84 of 416 patients in 3 independent cohorts received fluorouracil-based adjuvant chemotherapy. Identification of CSS sets to predict relapse-free survival and identify a subset of patients with stage II CRC who could gain substantial survival benefits from fluorouracil-based adjuvant chemotherapy. Eight cancer hallmark-based gene signatures (30 genes each) were identified and used to construct CSS sets for determining prognosis. The CSS sets were validated in 11 independent cohorts of 767 patients with stage II CRC who did not receive adjuvant chemotherapy. The CSS sets accurately stratified patients into low-, intermediate-, and high-risk groups. Five-year relapse-free survival rates were 94%, 78%, and 45%, respectively, representing 60%, 28%, and 12% of patients with stage II disease. The 416 patients with CSS set-defined high-risk stage II CRC who received fluorouracil-based adjuvant chemotherapy showed a substantial gain in survival benefits from the treatment (ie, recurrence reduced by 30%-40% in 5 years). The CSS sets substantially outperformed other prognostic predictors of stage 2 CRC. 
    They are more accurate and robust for prognostic predictions and facilitate the identification of patients with stage

  17. A random set scoring model for prioritization of disease candidate genes using protein complexes and data-mining of GeneRIF, OMIM and PubMed records

    DEFF Research Database (Denmark)

    Jiang, Li; Edwards, Stefan M.; Thomsen, Bo

    2014-01-01

    Background: Prioritizing genetic variants is a challenge because disease susceptibility loci are often located in genes of unknown function or the relationship with the corresponding phenotype is unclear. A global data-mining exercise on the biomedical literature can establish the phenotypic...... from PubMed abstracts, OMIM, and GeneRIF records. We also investigated the validity of several vocabulary filters and different likelihood thresholds for predicted protein-protein interactions in terms of their effect on the network-based gene-prioritization approach, which relies on text-mining...... of the phenotype data. Our method demonstrated good precision and sensitivity compared with those of two alternative complex-based prioritization approaches. We then conducted a global ranking of all human genes according to their relevance to a range of human diseases. The resulting accurate ranking of known...

  18. AnovArray: a set of SAS macros for the analysis of variance of gene expression data

    Directory of Open Access Journals (Sweden)

    Renard Jean-Paul

    2005-06-01

    Full Text Available Background: Analysis of variance is a powerful approach to identify differentially expressed genes in a complex experimental design for microarray and macroarray data. The advantage of the ANOVA model is the possibility to evaluate multiple sources of variation in an experiment. Results: AnovArray is a package implementing ANOVA for gene expression data using SAS® statistical software. The originality of the package is 1) to quantify the different sources of variation on all genes together, 2) to provide a quality control of the model, 3) to propose two models for a gene's variance estimation and to perform a correction for multiple comparisons. Conclusion: AnovArray is freely available at http://www-mig.jouy.inra.fr/stat/AnovArray and requires only SAS® statistical software.
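
    As a rough illustration of the two ingredients named in this record, per-gene analysis of variance and a multiple-comparison correction, the sketch below is written in Python rather than SAS and is not the AnovArray code. It assumes a one-way design and the Benjamini-Hochberg procedure; the abstract does not say which correction the package applies, and both function names are hypothetical.

```python
def one_way_anova_f(groups):
    """F statistic for one gene measured under several conditions.

    `groups` is a list of lists of expression values, one list per
    condition. Returns between-group mean square / within-group mean square.
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

    Converting the F statistic into a p-value requires the F distribution, which the Python standard library does not provide; a statistics package would supply that step.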

  19. Digital gene expression approach over multiple RNA-Seq data sets to detect neoblast transcriptional changes in Schmidtea mediterranea.

    Science.gov (United States)

    Rodríguez-Esteban, Gustavo; González-Sastre, Alejandro; Rojo-Laguna, José Ignacio; Saló, Emili; Abril, Josep F

    2015-05-08

    The freshwater planarian Schmidtea mediterranea is recognised as a valuable model for research into adult stem cells and regeneration. With the advent of the high-throughput sequencing technologies, it has become feasible to undertake detailed transcriptional analysis of its unique stem cell population, the neoblasts. Nonetheless, a reliable reference for this type of studies is still lacking. Taking advantage of digital gene expression (DGE) sequencing technology we compare all the available transcriptomes for S. mediterranea and improve their annotation. These results are accessible via web for the community of researchers. Using the quantitative nature of DGE, we describe the transcriptional profile of neoblasts and present 42 new neoblast genes, including several cancer-related genes and transcription factors. Furthermore, we describe in detail the Smed-meis-like gene and the three Nuclear Factor Y subunits Smed-nf-YA, Smed-nf-YB-2 and Smed-nf-YC. DGE is a valuable tool for gene discovery, quantification and annotation. The application of DGE in S. mediterranea confirms the planarian stem cells or neoblasts as a complex population of pluripotent and multipotent cells regulated by a mixture of transcription factors and cancer-related genes.

  20. Monitoring of gene expression in bacteria during infections using an adaptable set of bioluminescent, fluorescent and colorigenic fusion vectors.

    Directory of Open Access Journals (Sweden)

    Frank Uliczka

    Full Text Available A family of versatile promoter-probe plasmids for gene expression analysis was developed based on a modular expression plasmid system (pZ). The vectors contain different replicons with exchangeable antibiotic cassettes to allow compatibility and expression analysis on a low-, midi- and high-copy number basis. Suicide vector variants also permit chromosomal integration of the reporter fusion and stable vector derivatives can be used for in vivo or in situ expression studies under non-selective conditions. Transcriptional and translational fusions to the reporter genes gfp(mut3.1), amCyan, dsRed2, luxCDABE, phoA or lacZ can be constructed, and presence of identical multiple cloning sites in the vector system facilitates the interchange of promoters or reporter genes between the plasmids of the series. The promoter of the constitutively expressed gapA gene of Escherichia coli was included to obtain fluorescent and bioluminescent expression constructs. A combination of the plasmids allows simultaneous detection and gene expression analysis in individual bacteria, e.g. in bacterial communities or during mouse infections. To test our vector system, we analyzed and quantified expression of Yersinia pseudotuberculosis virulence genes under laboratory conditions, in association with cells and during the infection process.

  1. Monitoring of gene expression in bacteria during infections using an adaptable set of bioluminescent, fluorescent and colorigenic fusion vectors.

    Science.gov (United States)

    Uliczka, Frank; Pisano, Fabio; Kochut, Annika; Opitz, Wiebke; Herbst, Katharina; Stolz, Tatjana; Dersch, Petra

    2011-01-01

    A family of versatile promoter-probe plasmids for gene expression analysis was developed based on a modular expression plasmid system (pZ). The vectors contain different replicons with exchangeable antibiotic cassettes to allow compatibility and expression analysis on a low-, midi- and high-copy number basis. Suicide vector variants also permit chromosomal integration of the reporter fusion and stable vector derivatives can be used for in vivo or in situ expression studies under non-selective conditions. Transcriptional and translational fusions to the reporter genes gfp(mut3.1), amCyan, dsRed2, luxCDABE, phoA or lacZ can be constructed, and presence of identical multiple cloning sites in the vector system facilitates the interchange of promoters or reporter genes between the plasmids of the series. The promoter of the constitutively expressed gapA gene of Escherichia coli was included to obtain fluorescent and bioluminescent expression constructs. A combination of the plasmids allows simultaneous detection and gene expression analysis in individual bacteria, e.g. in bacterial communities or during mouse infections. To test our vector system, we analyzed and quantified expression of Yersinia pseudotuberculosis virulence genes under laboratory conditions, in association with cells and during the infection process.

  2. A connected set of genes associated with programmed cell death implicated in controlling the hypersensitive response in maize.

    Science.gov (United States)

    Olukolu, Bode A; Negeri, Adisu; Dhawan, Rahul; Venkata, Bala P; Sharma, Pankaj; Garg, Anshu; Gachomo, Emma; Marla, Sandeep; Chu, Kevin; Hasan, Anna; Ji, Jiabing; Chintamanani, Satya; Green, Jason; Shyu, Chi-Ren; Wisser, Randall; Holland, James; Johal, Guri; Balint-Kurti, Peter

    2013-02-01

    Rp1-D21 is a maize auto-active resistance gene conferring a spontaneous hypersensitive response (HR) of variable severity depending on genetic background. We report an association mapping strategy based on the Mutant Assisted Gene Identification and Characterization approach to identify naturally occurring allelic variants associated with phenotypic variation in HR. Each member of a collection of 231 diverse inbred lines of maize constituting a high-resolution association mapping panel were crossed to a parental stock heterozygous for Rp1-D21, and the segregating F(1) generation testcrosses were evaluated for phenotypes associated with lesion severity for 2 years at two locations. A genome-wide scan for associations with HR was conducted with 47,445 SNPs using a linear mixed model that controlled for spurious associations due to population structure. Since the ability to identify candidate genes and the resolution of association mapping are highly influenced by linkage disequilibrium (LD), we examined the extent of genome-wide LD. On average, marker pairs separated by >10 kbp had an r(2) value of HR traits were locally saturated with additional SNP markers to establish local LD structure and precisely identify candidate genes. Six significantly associated SNPs at five loci were detected. At each locus, the associated SNP was located within or immediately adjacent to candidate causative genes predicted to play significant roles in the control of programmed cell death and especially in ubiquitin pathway-related processes.
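
    The linkage disequilibrium measure mentioned in this record can be computed from phased haplotypes as in the sketch below. It assumes biallelic markers coded 0/1 and the standard definition r² = D² / (pA(1 − pA) pB(1 − pB)). The function name and the toy data are hypothetical, and this is not the authors' pipeline.

```python
def ld_r_squared(haplotypes):
    """Pairwise r^2 between two biallelic markers.

    `haplotypes` is a list of (allele_at_snp_a, allele_at_snp_b) pairs,
    each allele coded 0 or 1. Both markers must be polymorphic in the
    sample, otherwise the denominator is zero.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of disequilibrium
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Toy data: markers in complete LD versus markers that are independent.
complete_ld = ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)])
no_ld = ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)])
```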

  3. A set of regulatory genes co-expressed in embryonic human brain is implicated in disrupted speech development.

    Science.gov (United States)

    Eising, Else; Carrion-Castillo, Amaia; Vino, Arianna; Strand, Edythe A; Jakielski, Kathy J; Scerri, Thomas S; Hildebrand, Michael S; Webster, Richard; Ma, Alan; Mazoyer, Bernard; Francks, Clyde; Bahlo, Melanie; Scheffer, Ingrid E; Morgan, Angela T; Shriberg, Lawrence D; Fisher, Simon E

    2018-02-20

    Genetic investigations of people with impaired development of spoken language provide windows into key aspects of human biology. Over 15 years after FOXP2 was identified, most speech and language impairments remain unexplained at the molecular level. We sequenced whole genomes of nineteen unrelated individuals diagnosed with childhood apraxia of speech, a rare disorder enriched for causative mutations of large effect. Where DNA was available from unaffected parents, we discovered de novo mutations, implicating genes, including CHD3, SETD1A and WDR5. In other probands, we identified novel loss-of-function variants affecting KAT6A, SETBP1, ZFHX4, TNRC6B and MKL2, regulatory genes with links to neurodevelopment. Several of the new candidates interact with each other or with known speech-related genes. Moreover, they show significant clustering within a single co-expression module of genes highly expressed during early human brain development. This study highlights gene regulatory pathways in the developing brain that may contribute to acquisition of proficient speech.

  4. Can, a putative oncogene associated with myeloid leukemogenesis, may be activated by fusion of its 3' half to different genes: characterization of the set gene

    NARCIS (Netherlands)

    von Lindern, M.; van Baal, S.; Wiegant, J.; Raap, A.; Hagemeijer, A.; Grosveld, G.

    1992-01-01

    The translocation (6;9)(p23;q34) in acute nonlymphocytic leukemia results in the formation of a highly consistent dek-can fusion gene. Translocation breakpoints invariably occur in single introns of dek and can, which were named icb-6 and icb-9, respectively. In a case of acute undifferentiated

  5. Sexual and asexual oogenesis require the expression of unique and shared sets of genes in the insect Acyrthosiphon pisum.

    Science.gov (United States)

    Gallot, Aurore; Shigenobu, Shuji; Hashiyama, Tomomi; Jaubert-Possamai, Stéphanie; Tagu, Denis

    2012-02-15

    Although sexual reproduction is dominant within eukaryotes, asexual reproduction is widespread and has evolved independently as a derived trait in almost all major taxa. How asexuality evolved in sexual organisms is unclear. Aphids, such as Acyrthosiphon pisum, alternate between asexual and sexual reproductive means, as the production of parthenogenetic viviparous females or sexual oviparous females and males varies in response to seasonal photoperiodism. Consequently, sexual and asexual development in aphids can be analyzed simultaneously in genetically identical individuals. We compared the transcriptomes of aphid embryos in the stages of development during which the trajectory of oogenesis is determined for producing sexual or asexual gametes. This study design aimed at identifying genes involved in the onset of the divergent mechanisms that result in the sexual or asexual phenotype. We detected 33 genes that were differentially transcribed in sexual and asexual embryos. Functional annotation by gene ontology (GO) showed a biological signature of oogenesis, cell cycle regulation, epigenetic regulation and RNA maturation. In situ hybridizations demonstrated that 16 of the differentially-transcribed genes were specifically expressed in germ cells and/or oocytes of asexual and/or sexual ovaries, and therefore may contribute to aphid oogenesis. We categorized these 16 genes by their transcription patterns in the two types of ovaries; they were: i) expressed during sexual and asexual oogenesis; ii) expressed during sexual and asexual oogenesis but with different localizations; or iii) expressed only during sexual or asexual oogenesis. Our results show that asexual and sexual oogenesis in aphids share common genetic programs but diverge by adapting specificities in their respective gene expression profiles in germ cells and oocytes.

  6. Sexual and asexual oogenesis require the expression of unique and shared sets of genes in the insect Acyrthosiphon pisum

    Directory of Open Access Journals (Sweden)

    Gallot Aurore

    2012-02-01

    Full Text Available Background: Although sexual reproduction is dominant within eukaryotes, asexual reproduction is widespread and has evolved independently as a derived trait in almost all major taxa. How asexuality evolved in sexual organisms is unclear. Aphids, such as Acyrthosiphon pisum, alternate between asexual and sexual reproductive means, as the production of parthenogenetic viviparous females or sexual oviparous females and males varies in response to seasonal photoperiodism. Consequently, sexual and asexual development in aphids can be analyzed simultaneously in genetically identical individuals. Results: We compared the transcriptomes of aphid embryos in the stages of development during which the trajectory of oogenesis is determined for producing sexual or asexual gametes. This study design aimed at identifying genes involved in the onset of the divergent mechanisms that result in the sexual or asexual phenotype. We detected 33 genes that were differentially transcribed in sexual and asexual embryos. Functional annotation by gene ontology (GO) showed a biological signature of oogenesis, cell cycle regulation, epigenetic regulation and RNA maturation. In situ hybridizations demonstrated that 16 of the differentially-transcribed genes were specifically expressed in germ cells and/or oocytes of asexual and/or sexual ovaries, and therefore may contribute to aphid oogenesis. We categorized these 16 genes by their transcription patterns in the two types of ovaries; they were: i) expressed during sexual and asexual oogenesis; ii) expressed during sexual and asexual oogenesis but with different localizations; or iii) expressed only during sexual or asexual oogenesis. Conclusions: Our results show that asexual and sexual oogenesis in aphids share common genetic programs but diverge by adapting specificities in their respective gene expression profiles in germ cells and oocytes.

  7. Arabidopsis Histone Methyltransferase SET DOMAIN GROUP8 Mediates Induction of the Jasmonate/Ethylene Pathway Genes in Plant Defense Response to Necrotrophic Fungi

    Science.gov (United States)

    Berr, Alexandre; McCallum, Emily J.; Alioua, Abdelmalek; Heintz, Dimitri; Heitz, Thierry; Shen, Wen-Hui

    2010-01-01

    As sessile organisms, plants have to endure a wide variety of biotic and abiotic stresses, and accordingly they have evolved intricate and rapidly inducible defense strategies associated with the activation of a battery of genes. Among other mechanisms, changes in chromatin structure are thought to provide a flexible, global, and stable means for the regulation of gene transcription. In support of this idea, we demonstrate here that the Arabidopsis (Arabidopsis thaliana) histone methyltransferase SET DOMAIN GROUP8 (SDG8) plays a crucial role in plant defense against fungal pathogens by regulating a subset of genes within the jasmonic acid (JA) and/or ethylene signaling pathway. We show that the loss-of-function mutant sdg8-1 displays reduced resistance to the necrotrophic fungal pathogens Alternaria brassicicola and Botrytis cinerea. While levels of JA, a primary phytohormone involved in plant defense, and camalexin, a major phytoalexin against fungal pathogens, remain unchanged or even above normal in sdg8-1, induction of several defense genes within the JA/ethylene signaling pathway is severely compromised in response to fungal infection or JA treatment in mutant plants. Both downstream genes and, remarkably, also upstream mitogen-activated protein kinase kinase genes MKK3 and MKK5 are misregulated in sdg8-1. Accordingly, chromatin immunoprecipitation analysis shows that sdg8-1 impairs dynamic changes of histone H3 lysine 36 methylation at defense marker genes as well as at MKK3 and MKK5, which normally occurs upon infection with fungal pathogens or methyl JA treatment in wild-type plants. Our data indicate that SDG8-mediated histone H3 lysine 36 methylation may serve as a memory of permissive transcription for a subset of defense genes, allowing rapid establishment of transcriptional induction. PMID:20810545

  8. Redundancy control in pathway databases (ReCiPa): an application for improving gene-set enrichment analysis in Omics studies and "Big data" biology.

    Science.gov (United States)

    Vivar, Juan C; Pemu, Priscilla; McPherson, Ruth; Ghosh, Sujoy

    2013-08-01

    Unparalleled technological advances have fueled an explosive growth in the scope and scale of biological data and have propelled life sciences into the realm of "Big Data" that cannot be managed or analyzed by conventional approaches. Big Data in the life sciences are driven primarily via a diverse collection of 'omics'-based technologies, including genomics, proteomics, metabolomics, transcriptomics, metagenomics, and lipidomics. Gene-set enrichment analysis is a powerful approach for interrogating large 'omics' datasets, leading to the identification of biological mechanisms associated with observed outcomes. While several factors influence the results from such analysis, the impact from the contents of pathway databases is often under-appreciated. Pathway databases often contain variously named pathways that overlap with one another to varying degrees. Ignoring such redundancies during pathway analysis can lead to the designation of several pathways as being significant due to high content-similarity, rather than truly independent biological mechanisms. Statistically, such dependencies also result in correlated p values and overdispersion, leading to biased results. We investigated the level of redundancies in multiple pathway databases and observed large discrepancies in the nature and extent of pathway overlap. This prompted us to develop the application, ReCiPa (Redundancy Control in Pathway Databases), to control redundancies in pathway databases based on user-defined thresholds. Analysis of genomic and genetic datasets, using ReCiPa-generated overlap-controlled versions of KEGG and Reactome pathways, led to a reduction in redundancy among the top-scoring gene-sets and allowed for the inclusion of additional gene-sets representing possibly novel biological mechanisms. Using obesity as an example, bioinformatic analysis further demonstrated that gene-sets identified from overlap-controlled pathway databases show stronger evidence of prior association
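
    The idea of controlling redundancy with a user-defined threshold can be sketched as follows. This is not the ReCiPa algorithm itself. It assumes overlap is measured as the shared fraction of the smaller gene set and that any pair at or above the threshold is merged greedily; the function names and the pathway and gene names in the usage example are hypothetical.

```python
def overlap(a, b):
    """Fraction of the smaller gene set that is shared with the other set.

    Both sets are assumed to be non-empty.
    """
    return len(a & b) / min(len(a), len(b))


def merge_redundant(pathways, threshold):
    """Greedily merge pathways whose overlap is at or above `threshold`.

    `pathways` maps a pathway name to a set of gene identifiers. Merged
    pathways are named by joining the original names with '+'.
    """
    merged = {name: set(genes) for name, genes in pathways.items()}
    changed = True
    while changed:
        changed = False
        names = sorted(merged)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                if overlap(merged[a], merged[b]) >= threshold:
                    merged[a + "+" + b] = merged.pop(a) | merged.pop(b)
                    changed = True
                    break
            if changed:
                break  # restart the scan with the updated collection
    return merged


toy = {"A": {"g1", "g2", "g3"}, "B": {"g2", "g3"}, "C": {"g7", "g8"}}
controlled = merge_redundant(toy, threshold=0.8)
```

    In the toy input, pathway B is fully contained in A, so the two are merged, while C is left untouched. A lower threshold merges more aggressively and leaves fewer, broader gene sets for the enrichment step.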

  9. Cogena, a novel tool for co-expressed gene-set enrichment analysis, applied to drug repositioning and drug mode of action discovery.

    Science.gov (United States)

    Jia, Zhilong; Liu, Ying; Guan, Naiyang; Bo, Xiaochen; Luo, Zhigang; Barnes, Michael R

    2016-05-27

    Drug repositioning, finding new indications for existing drugs, has gained much recent attention as a potentially efficient and economical strategy for accelerating new therapies into the clinic. Although improvement in the sensitivity of computational drug repositioning methods has identified numerous credible repositioning opportunities, few have been progressed. Arguably the "black box" nature of drug action in a new indication is one of the main blocks to progression, highlighting the need for methods that inform on the broader target mechanism in the disease context. We demonstrate that the analysis of co-expressed genes may be a critical first step towards illumination of both disease pathology and mode of drug action. We achieve this using a novel framework, co-expressed gene-set enrichment analysis (cogena) for co-expression analysis of gene expression signatures and gene set enrichment analysis of co-expressed genes. The cogena framework enables simultaneous, pathway driven, disease and drug repositioning analysis. Cogena can be used to illuminate coordinated changes within disease transcriptomes and identify drugs acting mechanistically within this framework. We illustrate this using a psoriatic skin transcriptome, as an exemplar, and recover two widely used Psoriasis drugs (Methotrexate and Ciclosporin) with distinct modes of action. Cogena out-performs the results of Connectivity Map and NFFinder webservers in similar disease transcriptome analyses. Furthermore, we investigated the literature support for the other top-ranked compounds to treat psoriasis and showed how the outputs of cogena analysis can contribute new insight to support the progression of drugs into the clinic. We have made cogena freely available within Bioconductor or https://github.com/zhilongjia/cogena . In conclusion, by targeting co-expressed genes within disease transcriptomes, cogena offers novel biological insight, which can be effectively harnessed for drug discovery and
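
    The enrichment step applied to a cluster of co-expressed genes can be illustrated with a one-sided hypergeometric test, sketched below. The abstract does not state which test cogena applies, so the choice of test is an assumption here, and the function name and the numbers in the usage example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, cluster_size, overlap):
    """Probability of seeing at least `overlap` gene-set members in a cluster.

    Models the cluster as `cluster_size` genes drawn without replacement
    from `universe_size` genes, of which `set_size` belong to the gene set.
    """
    total = comb(universe_size, cluster_size)
    upper = min(set_size, cluster_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, cluster_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Invented example: 10 genes in total, a 4-gene pathway, and a 3-gene
# co-expression cluster containing 2 pathway members.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
```

    In practice one p-value is computed per cluster and gene set, so a multiple-testing correction would normally follow.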

  10. Root Exudates of Various Host Plants of Rhizobium leguminosarum Contain Different Sets of Inducers of Rhizobium Nodulation Genes

    NARCIS (Netherlands)

    Zaat, Sebastian A. J.; Wijffelman, Carel A.; Mulders, Ine H. M.; van Brussel, Anton A. N.; Lugtenberg, Ben J. J.

    1988-01-01

    Rhizobium promoters involved in the formation of root nodules on leguminous plants are activated by flavonoids in plant root exudate. A series of Rhizobium strains which all contain the inducible Rhizobium leguminosarum nodA promoter fused to the Escherichia coli lacZ gene, and which differ only in

  11. Assessment of topoisomerase II-alpha gene status by dual color chromogenic in situ hybridization in a set of Iraqi patients with invasive breast carcinoma

    Directory of Open Access Journals (Sweden)

    Rasha Abd Alraouf Neama

    2017-01-01

    Full Text Available Background: The human epidermal growth factor receptor 2 (HER2) proto-oncogene is overexpressed or amplified in approximately 15%–25% of invasive breast cancers. Approximately 35% of HER2-amplified breast cancers have coamplification of the topoisomerase II-alpha (TOP2A) gene encoding an enzyme that is a major target of anthracyclines. Hence, the determination of genetic alteration (amplification or deletion) of both genes is considered an important predictive factor that determines the response of breast cancer patients to treatment. The aims of this study are to determine TOP2A gene amplification status in a set of Iraqi patients with breast cancer that have had an equivocal (2+) and positive HER2/neu by immunohistochemistry (IHC) and to compare the results with estrogen receptor (ER) and progesterone receptor (PR) and HER2/neu status. Patients and Methods: A cross-sectional prospective study was done on 53 patients with invasive breast carcinoma. Twenty-six of the 53 cases were HER2/neu positive (3+); the remaining 27 equivocal HER2-IHC (2+) cases were reanalyzed using dual-color chromogenic in situ hybridization (ZytoVision probe kit) for further identification of HER2/neu gene amplification. Using chromogenic in situ hybridization (CISH), TOP2A gene status determination was done for all cases. Results: There is a direct significant correlation between TOP2A gene amplification and HER2/neu positivity, P < 0.05, in that 15 (39.4%) out of 38 positive HER2/neu cases were associated with topoisomerase gene amplification. Regarding the relation of the topoisomerase gene to hormone receptor status (ER and PR), there was a significant negative relationship between the gene and ER receptor status. The higher level of gene amplification was noticed in ER and PR negative cases in about 13 (43.3%) and 14 (48.2%) for ER and PR, respectively. Conclusion: TOP2A gene status has a significantly positive correlation with HER2/neu status while it has a significantly negative

  12. SET domain-containing protein 5 is required for expression of primordial germ cell specification-associated genes in murine embryonic stem cells.

    Science.gov (United States)

    Yu, Seung Eun; Kim, Min Seong; Park, Su Hyung; Yoo, Byong Chul; Kim, Kyung Hee; Jang, Yeun Kyu

    2017-07-01

    Primordial germ cell (PGC) specification is one of the most fundamental processes in developmental biology. Because PGCs are a common source of both gametes, generation of PGCs from embryonic stem cells (ESCs) is a useful model for analysing the germ line lineage. Although several studies focused on the role of epigenetic regulation on PGC differentiation from ESCs in vitro have been published, germ line commitment remains poorly understood. Here, we show that SET domain-containing protein (Setd5), which has a previously unknown function, is essential for regulating germ cell-associated genes in murine ESCs (mESCs). Even though Setd5 knockdown with 3 distinct shRNAs did not affect expression of pluripotency genes or levels of global histone methylation, all 3 shRNAs significantly diminished the expression of early and late-stage PGC-associated genes. Furthermore, our immunoprecipitation assay showed that Setd5 can interact with Tbl1xr1 and Ctr9, which are components of 2 different transcriptional regulatory complexes, namely, NcoR1 corepressor complex and Paf1 complex, respectively, in mESCs. Taken together, our data suggest that Setd5 is required for maintaining PGC-associated genes and Setd5-associated protein complexes containing Tbl1xr1 and Ctr9, which in turn are likely involved in regulating germ cell-related genes in mESCs. Copyright © 2017 John Wiley & Sons, Ltd.

  13. Poster: Observing change in crowded data sets in 3D space - Visualizing gene expression in human tissues

    KAUST Repository

    Rogowski, Marcin

    2013-03-01

    We have been confronted with a real-world problem of visualizing and observing change of gene expression between different human tissues. In this paper, we are presenting a universal representation space based on two-dimensional gel electrophoresis as opposed to force-directed layouts encountered most often in similar problems. We are discussing the methods we devised to make observing change more convenient in a 3D virtual reality environment. © 2013 IEEE.

  14. Monitoring of Gene Expression in Bacteria during Infections Using an Adaptable Set of Bioluminescent, Fluorescent and Colorigenic Fusion Vectors

    OpenAIRE

    Uliczka, Frank; Pisano, Fabio; Kochut, Annika; Opitz, Wiebke; Herbst, Katharina; Stolz, Tatjana; Dersch, Petra

    2011-01-01

    A family of versatile promoter-probe plasmids for gene expression analysis was developed based on a modular expression plasmid system (pZ). The vectors contain different replicons with exchangeable antibiotic cassettes to allow compatibility and expression analysis on a low-, midi- and high-copy number basis. Suicide vector variants also permit chromosomal integration of the reporter fusion and stable vector derivatives can be used for in vivo or in situ expression studies under non-selective...

  15. Alternative primer sets for PCR detection of genotypes involved in bacterial aerobic BTEX degradation : Distribution of the genes in BTEX degrading isolates and in subsurface soils of a BTEX contaminated industrial site

    NARCIS (Netherlands)

    Hendrickx, B; Junca, H; Vosahlova, J; Lindner, A; Ruegg; Bucheli-Witschel, M; Faber, F; Egli, T; Mau, M; Schlomann, M; Brennerova, M; Brenner; Pieper, DH; Top, EM; Dejonghe, W; Bastiaens, L; Springael, D

    Eight new primer sets were designed for PCR detection of (i) mono-oxygenase and dioxygenase gene sequences involved in initial attack of bacterial aerobic BTEX degradation and of (ii) catechol 2,3-dioxygenase gene sequences responsible for meta-cleavage of the aromatic ring. The new primer sets

  16. Independent evolution of the core and accessory gene sets in the genus Neisseria: insights gained from the genome of Neisseria lactamica isolate 020-06

    Directory of Open Access Journals (Sweden)

    White Brian

    2010-11-01

    Background: The genus Neisseria contains two important yet very different pathogens, N. meningitidis and N. gonorrhoeae, in addition to non-pathogenic species, of which N. lactamica is the best characterized. Genomic comparisons of these three bacteria will provide insights into the mechanisms and evolution of pathogenesis in this group of organisms, which are applicable to understanding these processes more generally. Results: Non-pathogenic N. lactamica exhibits very similar population structure and levels of diversity to the meningococcus, whilst gonococci are essentially recent descendants of a single clone. All three species share a common core gene set estimated to comprise around 1190 CDSs, corresponding to about 60% of the genome. However, some of the nucleotide sequence diversity within this core genome is particular to each group, indicating that cross-species recombination is rare in this shared core gene set. Other than the meningococcal cps region, which encodes the polysaccharide capsule, relatively few members of the large accessory gene pool are exclusive to one species group, and cross-species recombination within this accessory genome is frequent. Conclusion: The three Neisseria species groups represent coherent biological and genetic groupings which appear to be maintained by low rates of inter-species horizontal genetic exchange within the core genome. There is extensive evidence for exchange among positively selected genes and the accessory genome and some evidence of hitch-hiking of housekeeping genes with other loci. It is not possible to define a 'pathogenome' for this group of organisms and the disease causing phenotypes are therefore likely to be complex, polygenic, and different among the various disease-associated phenotypes observed.
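
    The core/accessory split described in this record can be illustrated with plain set operations. The following is only a minimal sketch: the function name `split_core_accessory` and the toy gene names are hypothetical placeholders, not the authors' pipeline or real annotations. A gene counts as core if it occurs in every genome and as accessory otherwise.

```python
def split_core_accessory(genomes):
    """Split gene content into core (present in every genome) and accessory (the rest)."""
    gene_sets = [set(genes) for genes in genomes.values()]
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan - core


# Hypothetical toy gene content; names are placeholders, not real annotations.
toy_genomes = {
    "N. meningitidis": {"geneA", "geneB", "geneC", "cps"},
    "N. gonorrhoeae": {"geneA", "geneB", "geneD"},
    "N. lactamica": {"geneA", "geneB", "geneC", "geneD"},
}

core, accessory = split_core_accessory(toy_genomes)

# Fraction of each genome's gene content that belongs to the shared core.
core_fraction = {name: len(core) / len(genes) for name, genes in toy_genomes.items()}

print(sorted(core))
print(sorted(accessory))
```

    In practice the hard part is deciding when two CDSs in different genomes are "the same gene" (orthology calling), which this sketch assumes has already been done.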

  17. Set of classical PCRs for detection of mutations in Candida glabrata FKS genes linked with echinocandin resistance.

    Science.gov (United States)

    Dudiuk, Catiana; Gamarra, Soledad; Leonardeli, Florencia; Jimenez-Ortigosa, Cristina; Vitale, Roxana G; Afeltra, Javier; Perlin, David S; Garcia-Effron, Guillermo

    2014-07-01

    Clinical echinocandin resistance among Candida glabrata strains is increasing, especially in the United States. Antifungal susceptibility testing is considered mandatory to guide therapeutic decisions. However, these methodologies are not routinely performed in the hospital setting due to their complexity and the time needed to obtain reliable results. Echinocandin failure in C. glabrata is linked exclusively to Fks1p and Fks2p amino acid substitutions, and detection of such substitutions would serve as a surrogate marker to identify resistant isolates. In this work, we report an inexpensive, simple, and quick classical PCR set able to objectively detect the most common mechanisms of echinocandin resistance in C. glabrata within 4 h. The usefulness of this assay was assessed using a blind collection of 50 C. glabrata strains, including 16 FKS1 and/or FKS2 mutants. Copyright © 2014, American Society for Microbiology. All Rights Reserved.

  18. Analysis of the real EADGENE data set: Comparison of methods and guidelines for data normalisation and selection of differentially expressed genes (Open Access publication)

    Directory of Open Access Journals (Sweden)

    Sørensen Peter

    2007-11-01

    A large variety of methods has been proposed in the literature for microarray data analysis. The aim of this paper was to present techniques used by the EADGENE (European Animal Disease Genomics Network of Excellence) WP1.4 participants for data quality control, normalisation and statistical methods for the detection of differentially expressed genes in order to provide some more general data analysis guidelines. All the workshop participants were given a real data set obtained in an EADGENE funded microarray study looking at the gene expression changes following artificial infection with two different mastitis causing bacteria: Escherichia coli and Staphylococcus aureus. It was reassuring to see that most of the teams found the same main biological results. In fact, most of the differentially expressed genes were found for infection by E. coli between uninfected and 24 h challenged udder quarters. Very little transcriptional variation was observed for the bacteria S. aureus. Lists of differentially expressed genes found by the different research teams were, however, quite dependent on the method used, especially concerning the data quality control step. These analyses also emphasised a biological problem of cross-talk between infected and uninfected quarters which will have to be dealt with for further microarray studies.

  19. Gene

    Data.gov (United States)

    U.S. Department of Health & Human Services — Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes,...

  20. A single visit multidisciplinary model for managing patients with mutations in moderate and high-risk genes in a community practice setting.

    Science.gov (United States)

    O'Leary, Michael P; Goldner, Bryan S; Abboy, Sridevi; Mercado, Philip D; Plurad, Hong Yoon

    2018-01-01

    The introduction of screening for multiple high and moderate risk mutations in genes has resulted in a complex approach to patient care involving multiple disciplines. We sought to describe the feasibility of a single visit multidisciplinary approach to the management of patients with an identified high/moderate risk gene mutation. Patients who presented to our community hospital over a 1-year period who were found to have a high/moderate risk genetic mutation on a screening panel were referred to the High Risk Genetic Clinic. Thirty-five patients were included. The majority were female [34 (97.1%)], Hispanic [22 (62.9%)], with a family history of cancer [21 (60%)]. Mean age was 40.3 years. Most of the participants had a BRCA1 gene mutation [10 (28.6%)]. Patients were seen at the High Risk Genetic Clinic within a mean of 41.9 days from the day of genetic mutation diagnosis. Four patients did not show and were significantly younger (19.3 vs. 39.6 years, p = 0.014). In this community setting, we provided coordinated care within multiple disciplines related to a genetic mutation in a single clinic visit. Increased efforts at coordinating early care should be directed towards patients diagnosed at a younger age.

  1. Exon Array Analysis using re-defined probe sets results in reliable identification of alternatively spliced genes in non-small cell lung cancer

    Directory of Open Access Journals (Sweden)

    Gröne Jörn

    2010-11-01

    Background: Treatment of non-small cell lung cancer with novel targeted therapies is a major unmet clinical need. Alternative splicing is a mechanism which generates diverse protein products and is of functional relevance in cancer. Results: In this study, a genome-wide analysis of the alteration of splicing patterns between lung cancer and normal lung tissue was performed. We generated an exon array data set derived from matched pairs of lung cancer and normal lung tissue including both the adenocarcinoma and the squamous cell carcinoma subtypes. An enhanced workflow was developed to reliably detect differential splicing in an exon array data set. In total, 330 genes were found to be differentially spliced in non-small cell lung cancer compared to normal lung tissue. Microarray findings were validated with independent laboratory methods for CLSTN1, FN1, KIAA1217, MYO18A, NCOR2, NUMB, SLK, SYNE2, TPM1 (in total, 10 events) and ADD3, which was analysed in depth. We achieved a high validation rate of 69%. Evidence was found that the activity of FOX2, the splicing factor shown to cause cancer-specific splicing patterns in breast and ovarian cancer, is not altered at the transcript level in several cancer types including lung cancer. Conclusions: This study demonstrates how alternatively spliced genes can reliably be identified in a cancer data set. Our findings underline that key processes of cancer progression in NSCLC are affected by alternative splicing, which can be exploited in the search for novel targeted therapies.

  2. Fine-scale linkage mapping reveals a small set of candidate genes influencing honey bee grooming behavior in response to Varroa mites.

    Directory of Open Access Journals (Sweden)

    Miguel E Arechavaleta-Velasco

    Populations of honey bees in North America have been experiencing high annual colony mortality for 15-20 years. Many apicultural researchers believe that introduced parasites called Varroa mites (V. destructor) are the most important factor in colony deaths. One important resistance mechanism that limits mite population growth in colonies is the ability of some lines of honey bees to groom mites from their bodies. To search for genes influencing this trait, we used an Illumina Bead Station genotyping array to determine the genotypes of several hundred worker bees at over a thousand single-nucleotide polymorphisms in a family that was apparently segregating for alleles influencing this behavior. Linkage analyses provided a genetic map with 1,313 markers anchored to genome sequence. Genotypes were analyzed for association with grooming behavior, measured as the time that individual bees took to initiate grooming after mites were placed on their thoraces. Quantitative-trait-locus interval mapping identified a single chromosomal region that was significant at the chromosome-wide level (p<0.05) on chromosome 5 with a LOD score of 2.72. The 95% confidence interval for quantitative trait locus location contained only 27 genes (honey bee official gene annotation set 2) including Atlastin, Ataxin and Neurexin-1 (AmNrx1), which have potential neurodevelopmental and behavioral effects. Atlastin and Ataxin homologs are associated with neurological diseases in humans. AmNrx1 codes for a presynaptic protein with many alternatively spliced isoforms. Neurexin-1 influences the growth, maintenance and maturation of synapses in the brain, as well as the type of receptors most prominent within synapses. Neurexin-1 has also been associated with autism spectrum disorder and schizophrenia in humans, and self-grooming behavior in mice.

  3. Thy1.2 driven expression of transgenic His₆-SUMO2 in the brain of mice alters a restricted set of genes.

    Science.gov (United States)

    Rossner, Moritz J; Tirard, Marilyn

    2014-08-05

    Protein SUMOylation is a post-translational protein modification with a key regulatory role in nerve cell development and function, but its function in mammals in vivo has only been studied cursorily. We generated two new transgenic mouse lines that express His6-tagged SUMO1 and SUMO2 driven by the Thy1.2 promoter. The brains of mice of the two lines express transgenic His6-SUMO peptides and conjugate them to substrates in vivo but cytoarchitecture and synaptic organization of adult transgenic mouse brains are indistinguishable from the wild-type situation. We investigated the impact of transgenic SUMO expression on gene transcription in the hippocampus by performing genome wide analyses using microarrays. Surprisingly, no changes were observed in Thy1.2::His6-SUMO1 transgenic mice and only a restricted set of genes were upregulated in Thy1.2::His6-SUMO2 mice. Among these, Penk1 (Preproenkephalin 1), which encodes Met-enkephalin neuropeptides, showed the highest degree of alteration. Accordingly, a significant increase in Met-enkephalin peptide levels in the hippocampus of Thy1.2::His6-SUMO2 was detected, but the expression levels and cellular localization of Met-enkephalin receptors were not changed. Thus, transgenic neuronal expression of His6-SUMO1 or His6-SUMO2 only induces very minor phenotypical changes in mice. Copyright © 2014 Elsevier B.V. All rights reserved.

  4. A set of genes previously implicated in the hypoxia response might be an important modulator in the rat ear tissue response to mechanical stretch

    Directory of Open Access Journals (Sweden)

    Orgill Dennis

    2007-11-01

    Background: Wounds are increasingly important in our aging societies. Pathologies such as diabetes predispose patients to chronic wounds that can cause pain, infection, and amputation. The vacuum assisted closure device shows remarkable outcomes in wound healing. Its mechanism of action is unclear despite several hypotheses advanced. We previously hypothesized that micromechanical forces can heal wounds. To understand better the biological response of soft tissue to forces, rat ears in vivo were stretched and their gene expression patterns over time obtained. The absolute enrichment (AE) algorithm that obtains a combined up and down regulated picture of the expression analysis was implemented. Results: With the use of AE, the hypoxia gene set was the most important at a highly significant level. A co-expression network analysis showed that important co-regulated members of the hypoxia pathway include a glucose transporter (slc2a8), heme oxygenase, and nitric oxide synthase 2 among others. Conclusion: It appears that the hypoxia pathway may be an important modulator of response of soft tissue to forces. This finding gives us insights not only into the underlying biology, but also into clinical interventions that could be designed to mimic within wounded tissue the effects of forces without all the negative effects that forces themselves create.

  5. High nasal carriage rate of Staphylococcus aureus containing Panton-Valentine leukocidin- and EDIN-encoding genes in community and hospital settings in Burkina Faso

    Directory of Open Access Journals (Sweden)

    Abdoul-Salam OUEDRAOGO

    2016-09-01

    The objectives of the present study were to investigate the rate of S. aureus nasal carriage and molecular characteristics in hospital and community settings in Bobo Dioulasso, Burkina Faso. Nasal samples (n=219) were collected from 116 healthy volunteers and 103 hospitalized patients in July and August 2014. Samples were first screened using CHROMagar Staph aureus chromogenic agar plates, and S. aureus strains were identified by mass spectrometry. Antibiotic susceptibility was tested using the disk diffusion method on Müller-Hinton agar. All S. aureus isolates were genotyped using DNA microarray. Overall, the rate of S. aureus nasal carriage was 32.9% (72/219), 29% in healthy volunteers and 37% in hospital patients. Among the S. aureus isolates, only four methicillin-resistant S. aureus (MRSA) strains were identified, all in hospital patients (3.9%). The 72 S. aureus isolates from nasal samples belonged to 16 different clonal complexes, particularly to CC152-MSSA (22 clones) and CC1-MSSA (nine clones). Two clones were significantly associated with community settings: CC1-MSSA and CC45-MSSA. The MRSA strains belonged to the ST88-MRSA-IV or the CC8-MRSA-V complex. A very high prevalence of toxinogenic strains (52.2%, 36/69), containing Panton-Valentine leucocidin- and EDIN-encoding genes, was identified among the S. aureus isolates in community and hospital settings. This study provides the first characterization of S. aureus clones and their genetic characteristics in Burkina Faso. Altogether, it highlights the low prevalence of antimicrobial resistance, high diversity of methicillin-sensitive S. aureus clones and high frequency of toxinogenic S. aureus strains.

  6. CELF family RNA-binding protein UNC-75 regulates two sets of mutually exclusive exons of the unc-32 gene in neuron-specific manners in Caenorhabditis elegans.

    Directory of Open Access Journals (Sweden)

    Hidehito Kuroyanagi

    An enormous number of alternative pre-mRNA splicing patterns in multicellular organisms are coordinately defined by a limited number of regulatory proteins and cis elements. Mutually exclusive alternative splicing should be strictly regulated and is a challenging model for elucidating regulation mechanisms. Here we provide models of the regulation of two sets of mutually exclusive exons, 4a-4c and 7a-7b, of the Caenorhabditis elegans uncoordinated (unc)-32 gene, encoding the a subunit of V0 complex of vacuolar-type H(+)-ATPases. We visualize selection patterns of exon 4 and exon 7 in vivo by utilizing a trio and a pair of symmetric fluorescence splicing reporter minigenes, respectively, to demonstrate that they are regulated in tissue-specific manners. Genetic analyses reveal that RBFOX family RNA-binding proteins ASD-1 and FOX-1 and a UGCAUG stretch in intron 7b are involved in the neuron-specific selection of exon 7a. Through further forward genetic screening, we identify UNC-75, a neuron-specific CELF family RNA-binding protein of unknown function, as an essential regulator for the exon 7a selection. Electrophoretic mobility shift assays specify a short fragment in intron 7a as the recognition site for UNC-75 and demonstrate that UNC-75 specifically binds via its three RNA recognition motifs to the element including a UUGUUGUGUUGU stretch. The UUGUUGUGUUGU stretch in the reporter minigenes is actually required for the selection of exon 7a in the nervous system. We compare the amounts of partially spliced RNAs in the wild-type and unc-75 mutant backgrounds and raise a model for the mutually exclusive selection of unc-32 exon 7 by the RBFOX family and UNC-75. The neuron-specific selection of unc-32 exon 4b is also regulated by UNC-75 and the unc-75 mutation suppresses the Unc phenotype of the exon-4b-specific allele of unc-32 mutants. Taken together, UNC-75 is the neuron-specific splicing factor and regulates both sets of the mutually exclusive

  7. The AHL- and BDSF-dependent quorum sensing systems control specific and overlapping sets of genes in Burkholderia cenocepacia H111.

    Directory of Open Access Journals (Sweden)

    Nadine Schmid

    Quorum sensing in Burkholderia cenocepacia H111 involves two signalling systems that depend on different signal molecules, namely N-acyl homoserine lactones (AHLs) and the diffusible signal factor cis-2-dodecenoic acid (BDSF). Previous studies have shown that AHLs and BDSF control similar phenotypic traits, including biofilm formation, proteolytic activity and pathogenicity. In this study we mapped the BDSF stimulon by RNA-Seq and shotgun proteomics analysis. We demonstrate that a set of the identified BDSF-regulated genes or proteins are also controlled by AHLs, suggesting that the two regulons partially overlap. The detailed analysis of two mutually regulated operons, one encoding three lectins and the other one encoding the large surface protein BapA and its type I secretion machinery, revealed that both AHLs and BDSF are required for full expression, suggesting that the two signalling systems operate in parallel. In accordance with this, we show that both AHLs and BDSF are required for biofilm formation and protease production.

  8. Polymorphisms in sodium-dependent vitamin C transporter genes and plasma, aqueous humor and lens nucleus ascorbate concentrations in an ascorbate depleted setting.

    Science.gov (United States)

    Senthilkumari, Srinivasan; Talwar, Badri; Dharmalingam, Kuppamuthu; Ravindran, Ravilla D; Jayanthi, Ramamurthy; Sundaresan, Periasamy; Saravanan, Charu; Young, Ian S; Dangour, Alan D; Fletcher, Astrid E

    2014-07-01

    We have previously reported low concentrations of plasma ascorbate and low dietary vitamin C intake in the older Indian population and a strong inverse association of these with cataract. Little is known about ascorbate levels in aqueous humor and lens in populations habitually depleted of ascorbate and no studies in any setting have investigated whether genetic polymorphisms influence ascorbate levels in ocular tissues. Our objectives were to investigate relationships between ascorbate concentrations in plasma, aqueous humor and lens and whether these relationships are influenced by Single Nucleotide Polymorphisms (SNPs) in sodium-dependent vitamin C transporter genes (SLC23A1 and SLC23A2). We enrolled sixty patients (equal numbers of men and women, mean age 63 years) undergoing small incision cataract surgery in southern India. We measured ascorbate concentrations in plasma, aqueous humor and lens nucleus using high performance liquid chromatography. SLC23A1 SNPs (rs4257763, rs6596473) and SLC23A2 SNPs (rs1279683 and rs12479919) were genotyped using a TaqMan assay. Patients were interviewed for lifestyle factors which might influence ascorbate. Plasma vitamin C was normalized by a log10 transformation. Statistical analysis used linear regression with the slope of the within-subject associations estimated using beta (β) coefficients. The ascorbate concentrations (μmol/L) were: plasma ascorbate, median and inter-quartile range (IQR), 15.2 (7.8, 34.5), mean (SD) of aqueous humor ascorbate, 1074 (545) and lens nucleus ascorbate, 0.42 (0.16) (μmol/g lens nucleus wet weight). Minimum allele frequencies were: rs1279683 (0.28), rs12479919 (0.30), rs659647 (0.48). Decreasing concentrations of ocular ascorbate from the common to the rare genotype were observed for rs6596473 and rs12479919. The per allele difference in aqueous humor ascorbate for rs6596473 was -217 μmol/L, p humor ascorbate were higher for the GG genotype of rs6596473: GG, β = 1460 compared to

  9. Using logistic regression to improve the prognostic value of microarray gene expression data sets: application to early-stage squamous cell carcinoma of the lung and triple negative breast carcinoma.

    Science.gov (United States)

    Mount, David W; Putnam, Charles W; Centouri, Sara M; Manziello, Ann M; Pandey, Ritu; Garland, Linda L; Martinez, Jesse D

    2014-06-10

    Numerous microarray-based prognostic gene expression signatures of primary neoplasms have been published but often with little concurrence between studies, thus limiting their clinical utility. We describe a methodology using logistic regression, which circumvents limitations of conventional Kaplan-Meier analysis. We applied this approach to a thrice-analyzed and published squamous cell carcinoma (SQCC) of the lung data set, with the objective of identifying gene expressions predictive of early death versus long survival in early-stage disease. A similar analysis was applied to a data set of triple negative breast carcinoma cases, which present similar clinical challenges. Important to our approach is the selection of homogeneous patient groups for comparison. In the lung study, we selected two groups (including only stages I and II), equal in size, of earliest deaths and longest survivors. Genes varying at least four-fold were tested by logistic regression for accuracy of prediction (area under a ROC plot). The gene list was refined by applying two sliding-window analyses and by validations using a leave-one-out approach and model building with validation subsets. In the breast study, a similar logistic regression analysis was used after selecting appropriate cases for comparison. A total of 8594 variable genes were tested for accuracy in predicting earliest deaths versus longest survivors in SQCC. After applying the two sliding window and the leave-one-out analyses, 24 prognostic genes were identified; most of them were B-cell related. When the same data set of stage I and II cases was analyzed using a conventional Kaplan-Meier (KM) approach, we identified fewer immune-related genes among the most statistically significant hits; when stage III cases were included, most of the prognostic genes were missed. Interestingly, logistic regression analysis of the breast cancer data set identified many immune-related genes predictive of clinical outcome. Stratification of
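
    The screening step in this record (keep genes that vary at least four-fold, then score each gene by the area under a ROC plot for early death versus long survival) can be sketched roughly as below. This is an illustration under stated assumptions, not the authors' code: "varying at least four-fold" is read here as the max/min ratio across the selected samples, and instead of fitting a logistic regression the sketch computes the rank-based AUC directly, since for a single predictor the ROC area depends only on how the values are ranked. All names and toy values are hypothetical.

```python
def roc_auc(pos, neg):
    """Rank-based AUC: chance that a random positive scores above a random negative."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def fold_range(values):
    """Ratio of the largest to the smallest expression value (values must be positive)."""
    values = list(values)
    return max(values) / min(values)


def screen_genes(expr, early_death, long_survival, min_fold=4.0):
    """Keep genes varying at least min_fold, then score each one by AUC."""
    results = {}
    for gene, by_sample in expr.items():
        if fold_range(by_sample.values()) < min_fold:
            continue  # too little variation to be tested
        pos = [by_sample[s] for s in early_death]
        neg = [by_sample[s] for s in long_survival]
        auc = roc_auc(pos, neg)
        results[gene] = max(auc, 1.0 - auc)  # ignore the direction of the effect
    return results


# Hypothetical toy expression values (linear scale), two samples per group.
expr = {
    "gene1": {"s1": 8.0, "s2": 10.0, "s3": 1.0, "s4": 2.0},
    "gene2": {"s1": 5.0, "s2": 6.0, "s3": 5.5, "s4": 6.5},
    "gene3": {"s1": 1.0, "s2": 8.0, "s3": 2.0, "s4": 4.0},
}
results = screen_genes(expr, early_death=["s1", "s2"], long_survival=["s3", "s4"])
print(results)
```

    The sliding-window refinement and leave-one-out validation mentioned in the abstract are not reproduced here.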

  10. RBiomirGS: an all-in-one miRNA gene set analysis solution featuring target mRNA mapping and expression profile integration

    Directory of Open Access Journals (Sweden)

    Jing Zhang

    2018-01-01

    Background: With the continuous discovery of microRNA’s (miRNA) association with a wide range of biological and cellular processes, expression profile-based functional characterization of such post-transcriptional regulation is crucial for revealing its significance behind particular phenotypes. Profound advancement in bioinformatics has been made to enable in depth investigation of miRNA’s role in regulating cellular and molecular events, resulting in a huge quantity of software packages covering different aspects of miRNA functional analysis. Therefore, an all-in-one software solution is in demand for a comprehensive yet highly efficient workflow. Here we present RBiomirGS, an R package for a miRNA gene set (GS) analysis. Methods: The package utilizes multiple databases for target mRNA mapping, estimates miRNA effect on the target mRNAs through miRNA expression profile and conducts a logistic regression-based GS enrichment. Additionally, human ortholog Entrez ID conversion functionality is included for target mRNAs. Results: By incorporating all the core steps into one package, RBiomirGS eliminates the need for switching between different software packages. The modular structure of RBiomirGS enables various access points to the analysis, with which users can choose the most relevant functionalities for their workflow. Conclusions: With RBiomirGS, users are able to assess the functional significance of the miRNA expression profile under the corresponding experimental condition by minimal input and intervention. Accordingly, RBiomirGS encompasses an all-in-one solution for miRNA GS analysis. RBiomirGS is available on GitHub (http://github.com/jzhangc/RBiomirGS). More information including instruction and examples can be found on website (http://kenstoreylab.com/?page_id=2865).

  11. Enumeration of minimal stoichiometric precursor sets in metabolic networks.

    Science.gov (United States)

    Andrade, Ricardo; Wannagat, Martin; Klein, Cecilia C; Acuña, Vicente; Marchetti-Spaccamela, Alberto; Milreu, Paulo V; Stougie, Leen; Sagot, Marie-France

    2016-01-01

    What an organism needs at least from its environment to produce a set of metabolites, e.g. target(s) of interest and/or biomass, has been called a minimal precursor set. Early approaches to enumerate all minimal precursor sets took into account only the topology of the metabolic network (topological precursor sets). Due to cycles and the stoichiometric values of the reactions, it is often not possible to produce the target(s) from a topological precursor set in the sense that there is no feasible flux. Although considering the stoichiometry makes the problem harder, it makes it possible to obtain biologically reasonable precursor sets that we call stoichiometric. Recently a method to enumerate all minimal stoichiometric precursor sets was proposed in the literature. The relationship between topological and stoichiometric precursor sets had, however, not yet been studied; here we highlight this relationship. We also present two algorithms that enumerate all minimal stoichiometric precursor sets. The first one is of theoretical interest only and is based on the above mentioned relationship. The second approach solves a series of mixed integer linear programming problems. We compared the computed minimal precursor sets to experimentally obtained growth media of several Escherichia coli strains using genome-scale metabolic networks. The results show that the second approach efficiently enumerates minimal precursor sets taking stoichiometry into account, and allows for broad in silico studies of strains or species interactions that may help to understand e.g. pathotype and niche-specific metabolic capabilities. sasita is written in Java, uses cplex as LP solver and can be downloaded together with all networks and input files used in this paper at http://www.sasita.gforge.inria.fr.
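
    To make the notion of a topological precursor set concrete, the sketch below enumerates inclusion-minimal source sets on a toy network by brute force. It is a simplified illustration only, not the sasita algorithm: it uses plain forward propagation (a reaction fires once all its substrates are available), ignores stoichiometry and the special treatment of cycles, and does not involve any linear programming. The function names and the toy network are hypothetical.

```python
from itertools import combinations


def reachable(reactions, seeds):
    """Forward propagation: fire every reaction whose substrates are all available."""
    available = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def minimal_precursor_sets(reactions, sources, target):
    """Enumerate inclusion-minimal subsets of sources from which target is reachable."""
    minimal = []
    # Candidates are visited by increasing size, so any found set is minimal.
    for size in range(1, len(sources) + 1):
        for candidate in combinations(sorted(sources), size):
            cand = frozenset(candidate)
            if any(found <= cand for found in minimal):
                continue  # contains a smaller solution, hence not minimal
            if target in reachable(reactions, cand):
                minimal.append(cand)
    return minimal


# Hypothetical toy network: A + B -> C, C -> T, D -> T.
toy_reactions = [
    ({"A", "B"}, {"C"}),
    ({"C"}, {"T"}),
    ({"D"}, {"T"}),
]
precursor_sets = minimal_precursor_sets(toy_reactions, {"A", "B", "D"}, "T")
print([sorted(s) for s in precursor_sets])
```

    Brute force is exponential in the number of sources, so it is only usable on toy examples; a stoichiometric version would additionally have to check that a feasible flux exists for each candidate.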

  12. Phylogenomics reveals surprising sets of essential and dispensable clades of MIKC(c)-group MADS-box genes in flowering plants.

    Science.gov (United States)

    Gramzow, Lydia; Theißen, Günter

    2015-06-01

    MIKC(C)-group MADS-box genes are involved in the control of many developmental processes in flowering plants. All of these genes are members of one of 17 clades that had already been established in the most recent common ancestor (MRCA) of extant angiosperms. These clades trace back to 11 seed plant-specific superclades that were present in the MRCA of extant seed plants. Due to their important role in plant development and evolution, the origin of the clades of MIKC(C)-group genes has been studied in great detail. In contrast, whether any of these ancestral clades has ever been lost completely in any species has not been investigated so far. Here, we determined the presence of these clades by BLAST, PSI-BLAST, and Hidden Markov Model searches and by phylogenetic methods in the whole genomes of 27 flowering plants. Our data suggest that there are only three superclades of which all members have been lost in at least one of the investigated flowering plant species, and only few additional losses of angiosperm-specific MIKC(C)-group gene clades could be identified. Remarkably, for one seed plant superclade (TM8-like genes) and one angiosperm clade (FLC-like genes), multiple losses were identified, suggesting that the function of these genes is dispensable or that gene loss might have even been adaptive. The clades of MIKC(C)-group genes that have never been wiped out in any of the investigated species comprise, in addition to the expected floral organ identity genes, also TM3-like (SOC1-like), StMADS11-like (SVP-like), AGL17-like and GGM13-like (Bsister) genes, suggesting that these genes are more important for angiosperm development and evolution than has previously been appreciated. © 2015 Wiley Periodicals, Inc.

  13. Characterization of the translocation breakpoint sequences of two DEK-CAN fusion genes present in t(6;9) acute myeloid leukemia and a SET-CAN fusion gene found in a case of acute undifferentiated leukemia

    NARCIS (Netherlands)

    von Lindern, M.; Breems, D.; van Baal, S.; Adriaansen, H.; Grosveld, G.

    1992-01-01

    The t(6;9) associated with a subtype of acute myeloid leukemia (AML) was shown to generate a fusion between the 3' part of the CAN gene on chromosome 9 and the 5' part of the DEK gene on chromosome 6. The same part of the CAN gene appeared to be involved in a case of acute undifferentiated leukemia

  14. Development of a set of multiplex PCRs for detection of genes encoding cell wall-associated proteins in Staphylococcus pseudintermedius isolates from dogs, humans and the environment.

    Science.gov (United States)

    Phumthanakorn, Nathita; Chanchaithong, Pattrarat; Prapasarakul, Nuvee

    2017-11-01

    Staphylococcus pseudintermedius commonly colonizes the skin of dogs, whilst nasal carriage may occur in humans who are in contact with dogs or the environment of veterinary hospitals. Genes encoding cell wall-associated (CWA) proteins have been described in Staphylococcus aureus but knowledge of their occurrence in S. pseudintermedius is still limited. The aim of the study was to develop a method to detect S. pseudintermedius surface protein genes (sps) encoding CWA proteins, and to examine the distribution of the genes in isolates from different sources. Four multiplex PCR assays (mPCR) were developed for detection of 18 sps genes, with 4-5 genes detected per mPCR. These were applied to 135 S. pseudintermedius isolates from carriage sites (n=35) and infected sites (n=35) in dogs, from the nasal cavity of humans (n=25), and from the environment of a veterinary hospital (n=40). The mPCRs were shown to detect all 18 known sps genes, and no discrepancies were found between uniplex and mPCR results. The mPCRs could detect at least 1pg/μl of DNA template. A total of 23 sps gene profiles were found among the 135 isolates, with diverse gene combinations. Only spsD, spsF, spsI, spsO, spsP, and spsQ were not detected in all isolates. spsP and spsQ were more frequently detected in the canine isolates from infected sites than from carriage sites. This finding suggests that these two genes may play a role in pathogenicity, whereas the presence of the 12 sps genes may contribute to adherence function at all surfaces where carriage occurs. Copyright © 2017 Elsevier B.V. All rights reserved.
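
    Once each isolate has a presence/absence call per gene, tallying "gene profiles" as in this record amounts to counting distinct gene sets. The sketch below shows one way this could be done; the function names and the toy isolate data are hypothetical and do not reproduce the study's results.

```python
from collections import Counter


def gene_profiles(isolates):
    """Count how many isolates share each distinct presence/absence profile."""
    return Counter(frozenset(genes) for genes in isolates.values())


def variable_genes(isolates, panel):
    """Genes from the panel that are missing in at least one isolate."""
    return sorted(
        gene for gene in panel
        if any(gene not in genes for genes in isolates.values())
    )


# Hypothetical PCR calls for four isolates on a four-gene panel.
panel = ["spsD", "spsF", "spsP", "spsQ"]
toy_isolates = {
    "iso1": {"spsD", "spsF", "spsP", "spsQ"},
    "iso2": {"spsD", "spsF"},
    "iso3": {"spsD", "spsF", "spsP", "spsQ"},
    "iso4": {"spsD", "spsP"},
}

profiles = gene_profiles(toy_isolates)
print(len(profiles), "distinct profiles")
print(variable_genes(toy_isolates, panel))
```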

  15. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    Science.gov (United States)

    Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A

    2009-01-01

    Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion The new EST collection denotes an

  16. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    Directory of Open Access Journals (Sweden)

    Alamar Santiago

    2009-09-01

    Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion The new

  17. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus.

    Science.gov (United States)

    Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A

    2009-09-11

    Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. The new EST collection denotes an important step towards the

  18. Transcriptome analysis of acetic-acid-treated yeast cells identifies a large set of genes whose overexpression or deletion enhances acetic acid tolerance.

    Science.gov (United States)

    Lee, Yeji; Nasution, Olviyani; Choi, Eunyong; Choi, In-Geol; Kim, Wankee; Choi, Wonja

    2015-08-01

    Acetic acid inhibits the metabolic activities of Saccharomyces cerevisiae. Therefore, a better understanding of how S. cerevisiae cells acquire the tolerance to acetic acid is of importance to develop robust yeast strains to be used in industry. To do this, we examined the transcriptional changes that occur at 12 h post-exposure to acetic acid, revealing that 56 and 58 genes were upregulated and downregulated, respectively. Functional categorization of them revealed that 22 protein synthesis genes and 14 stress response genes constituted the largest portion of the upregulated and downregulated genes, respectively. To evaluate the association of the regulated genes with acetic acid tolerance, 3 upregulated genes (DBP2, ASC1, and GND1) were selected among 34 non-protein synthesis genes, and 54 viable mutants individually deleted for the downregulated genes were retrieved from the non-essential haploid deletion library. Strains overexpressing ASC1 and GND1 displayed enhanced tolerance to acetic acid, whereas a strain overexpressing DBP2 was sensitive. Fifty of 54 deletion mutants displayed enhanced acetic acid tolerance. Three chosen deletion mutants (hsps82Δ, ato2Δ, and ssa3Δ) were also tolerant to benzoic acid but not propionic and sorbic acids. Moreover, all those five (two overexpressing and three deleted) strains were more efficient in proton efflux and lower in membrane permeability and internal hydrogen peroxide content than controls. Individually or in combination, those physiological changes are likely to contribute at least in part to enhanced acetic acid tolerance. Overall, information of our transcriptional profile was very useful to identify molecular factors associated with acetic acid tolerance.

  19. Different N-terminal isoforms of Oct-1 control expression of distinct sets of genes and their high levels in Namalwa Burkitt's lymphoma cells affect a wide range of cellular processes.

    Science.gov (United States)

    Pankratova, Elizaveta V; Stepchenko, Alexander G; Portseva, Tatiana; Mogila, Vladic A; Georgieva, Sofia G

    2016-11-02

    Oct-1 transcription factor has various functions in gene regulation. Its expression level is increased in several types of cancer and is associated with poor survival prognosis. Here we identified distinct Oct-1 protein isoforms in human cells and compared gene expression patterns and functions for Oct-1A, Oct-1L, and Oct-1X isoforms that differ by their N-terminal sequences. The longest isoform, Oct-1A, is abundantly expressed and is the main Oct-1 isoform in most of human tissues. The Oct-1L and the weakly expressed Oct-1X regulate the majority of Oct-1A targets as well as additional sets of genes. Oct-1X controls genes involved in DNA replication, DNA repair, RNA processing, and cellular response to stress. The high level of Oct-1 isoforms upregulates genes related to cell cycle progression and activates proliferation both in Namalwa Burkitt's lymphoma cells and primary human fibroblasts. It downregulates expression of genes related to antigen processing and presentation, cytokine-cytokine receptor interaction, oxidative metabolism, and cell adhesion, thus facilitating pro-oncogenic processes. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. Armillaria mellea induces a set of defense genes in grapevine roots and one of them codifies a protein with antifungal activity.

    Science.gov (United States)

    Perazzolli, Michele; Bampi, Federica; Faccin, Silvia; Moser, Mirko; De Luca, Federica; Ciccotti, Anna Maria; Velasco, Riccardo; Gessler, Cesare; Pertot, Ilaria; Moser, Claudio

    2010-04-01

    Grapevine root rot, caused by Armillaria mellea, is a serious disease in some grape-growing regions. Young grapevines start to show symptoms of Armillaria root rot from the second year after inoculation, suggesting a certain degree of resistance in young roots. We used a suppression subtractive hybridization approach to study grapevine's reactions to the first stages of A. mellea infection. We identified 24 genes that were upregulated in the roots of the rootstock Kober 5BB 24 h after A. mellea challenge. Real-time reverse-transcriptase polymerase chain reaction analysis confirmed the induction of genes encoding protease inhibitors, thaumatins, glutathione S-transferase, and aminocyclopropane carboxylate oxidase, as well as phase-change related, tumor-related, and proline-rich proteins, and gene markers of the ethylene and jasmonate signaling pathway. Gene modulation was generally stronger in Kober 5BB than in Pinot Noir plants, and in vitro inoculation induced higher modulation than in greenhouse Armillaria spp. treatments. The full-length coding sequences of seven of these genes were obtained and expressed as recombinant proteins. The grapevine homologue of the Quercus spp. phase-change-related protein inhibited the growth of A. mellea mycelia in vitro, suggesting that this protein may play an important role in the defense response against A. mellea.

  1. Development of a serogroup-specific multiplex PCR assay to detect a set of Escherichia coli serogroups based on the identification of their O-antigen gene clusters.

    Science.gov (United States)

    Wang, Quan; Ruan, Xiaojuan; Wei, Dongmei; Hu, Zhidong; Wu, Lixia; Yu, Ting; Feng, Lu; Wang, Lei

    2010-10-01

    The Escherichia coli serogroups O115, O126, O137, O158, O165, and O173 are pathogenic strains associated with diarrhea. Molecular approaches such as PCR have been proven to be rapid, inexpensive, and accurate. The sequences of the O-antigen-processing genes wzx and wzy are specific for different O antigens and are generally used as the target genes for the detection and identification of E. coli strains belonging to different O serogroups. In this report, the O-antigen gene clusters of these 6 O serogroups were sequenced, and genes were identified on the basis of homology. By screening these sequences against all 186 E. coli and Shigella strains, we found that the sequences of the wzx and wzy genes were serogroup-specific, and 2 specific primer pairs for each serogroup were screened out. A multiplex PCR assay targeting all 6 serogroups was developed. Twenty-nine strains were used to validate the specificity of the assay. The detection sensitivity was 1ng genomic DNA. As the assay was shown to be accurate and sensitive, it can be used for the identification and detection of strains belonging to these serogroups in stool and other environmental samples after being isolated by culture. Copyright (c) 2010 Elsevier Ltd. All rights reserved.

  2. Mixed-species RNAseq analysis of human lymphoma cells adhering to mouse stromal cells identifies a core gene set that is also differentially expressed in the lymph node microenvironment of mantle cell lymphoma and chronic lymphocytic leukemia patients.

    Science.gov (United States)

    Arvidsson, Gustav; Henriksson, Johan; Sander, Birgitta; Wright, Anthony P

    2018-04-01

    A subset of hematologic cancer patients is refractory to treatment or suffers relapse, due in part to minimal residual disease, whereby some cancer cells survive treatment. Cell-adhesion-mediated drug resistance is an important mechanism, whereby cancer cells receive survival signals via interaction with e.g. stromal cells. No genome-wide studies of in vitro systems have yet been performed to compare gene expression in different cell subsets within a co-culture and cells grown separately. Using RNA sequencing and species-specific read mapping, we compared transcript levels in human Jeko-1 mantle cell lymphoma cells stably adhered to mouse MS-5 stromal cells or in suspension within a co-culture or cultured separately as well as in stromal cells in co-culture or in separate culture. From 1050 differentially expressed transcripts in adherent mantle cell lymphoma cells, we identified 24 functional categories that together represent four main functional themes, anti-apoptosis, B-cell signaling, cell adhesion/migration and early mitosis. A comparison with previous mantle cell lymphoma and chronic lymphocytic leukemia studies, of gene expression differences between lymph node and blood, identified 116 genes that are differentially expressed in all three studies. From these genes, we suggest a core set of genes ( CCL3, CCL4, DUSP4, ETV5, ICAM1, IL15RA, IL21R, IL4I1, MFSD2A, NFKB1, NFKBIE, SEMA7A, TMEM2 ) characteristic of cells undergoing cell-adhesion-mediated microenvironment signaling in mantle cell lymphoma/chronic lymphocytic leukemia. The model system developed and characterized here together with the core gene set will be useful for future studies of pathways that mediate increased cancer cell survival and drug resistance mechanisms. Copyright© 2018 Ferrata Storti Foundation.

  3. A schizophrenia gene locus on chromosome 17q21 in a new set of families of Mexican and Central American ancestry: evidence from the NIMH Genetics of Schizophrenia in Latino Populations study.

    Science.gov (United States)

    Escamilla, Michael; Hare, Elizabeth; Dassori, Albana M; Peralta, Juan Manuel; Ontiveros, Alfonso; Nicolini, Humberto; Raventós, Henriette; Medina, Rolando; Mendoza, Ricardo; Jerez, Alvaro; Muñoz, Rodrigo; Almasy, Laura

    2009-04-01

    The present study investigated a new set of families of Latin American ancestry in order to detect the location of genes predisposing to schizophrenia and related psychotic disorders. A genome-wide scan was performed for 175 newly recruited families with at least two siblings suffering from a psychotic disorder. Best-estimate consensus procedures were used to arrive at diagnoses, and nonparametric allele-sharing statistics were calculated to detect linkage. Genome-wide significant evidence for linkage for the phenotype of DSM-IV schizophrenia or schizoaffective disorder was found in a region on chromosome 17q21 (lod score, 3.33). A region on chromosome 15q22-23 showed suggestive evidence of linkage with this same phenotype (lod score, 2.11). Analyses using a broader model (any psychosis) yielded evidence of suggestive linkage for the 17q21 region only, and no region achieved genome-wide significance of linkage. The new set of 175 families of Mexican and Central American ancestry delineates two new loci likely to harbor predisposition genes for schizophrenia and schizoaffective disorder. The region with the strongest support for linkage in this sample, 17q21, has been implicated in meta-analyses of schizophrenia genome screens, but the authors found no previous reports of it as a locus for schizophrenia in specific population- or family-based studies, and it may represent the location of a schizophrenia predisposition gene (or genes) of special relevance in Mexican and Central American populations.

  4. Definition of the low molecular weight glutenin subunit gene family members in a set of standard bread wheat (Triticum aestivum L.) varieties

    Science.gov (United States)

    Low-molecular-weight glutenin subunits (LMW-GS) are a class of seed storage proteins that play a major role in the determination of the viscoelastic properties of wheat dough. Most of the LMW-GSs are encoded by a multi-gene family located on the short arms of the homoeologous group 1 chromosomes, at...

  5. Multiplex reverse transcription-polymerase chain reaction combined with on-chip electrophoresis as a rapid screening tool for candidate gene sets

    DEFF Research Database (Denmark)

    Wittig, Rainer; Salowsky, Rüdiger; Blaich, Stephanie

    2005-01-01

    Combining multiplex reverse transcription-polymerase chain reaction (mRT-PCR) with microfluidic amplicon analysis, we developed an assay for the rapid and reliable semiquantitative expression screening of 11 candidate genes for drug resistance in human malignant melanoma. The functionality...

  6. Characterization of the CrbS/R Two-Component System in Pseudomonas fluorescens Reveals a New Set of Genes under Its Control and a DNA Motif Required for CrbR-Mediated Transcriptional Activation

    Directory of Open Access Journals (Sweden)

    Edgardo Sepulveda

    2017-11-01

    The CrbS/R system is a two-component signal transduction system that regulates acetate utilization in Vibrio cholerae, P. aeruginosa, and P. entomophila. CrbS is a hybrid histidine kinase that belongs to a recently identified family, in which the signaling domain is fused to an SLC5 solute symporter domain through a STAC domain. Upon activation by CrbS, CrbR activates transcription of the acs gene, which encodes an acetyl-CoA synthase (ACS), and the actP gene, which encodes an acetate/solute symporter. In this work, we characterized the CrbS/R system in Pseudomonas fluorescens SBW25. Through the quantitative proteome analysis of different mutants, we were able to identify a new set of genes under its control, which play an important role during growth on acetate. These results led us to the identification of a conserved DNA motif in the putative promoter region of acetate-utilization genes in the Gammaproteobacteria that is essential for the CrbR-mediated transcriptional activation of genes under acetate-utilizing conditions. Finally, we took advantage of the existence of a second SLC5-containing two-component signal transduction system in P. fluorescens, CbrA/B, to demonstrate that the activation of the response regulator by the histidine kinase is not dependent on substrate transport through the SLC5 domain.

  7. Witnessing stressful events induces glutamatergic synapse pathway alterations and gene set enrichment of positive EPSP regulation within the VTA of adult mice: An ontology based approach

    Science.gov (United States)

    Brewer, Jacob S.

    It is well known that exposure to severe stress increases the risk of developing mood disorders. Currently, the neurobiological and genetic mechanisms underlying the functional effects of psychological stress are poorly understood. A major obstacle to the study of psychological stress is the inability of current animal models of stress to distinguish between physical and psychological stressors. A novel paradigm recently developed by Warren et al. teases apart the effects of physical and psychological stress in adult mice by allowing these mice to "witness" the social defeat of another mouse, thus removing confounding variables associated with physical stressors. Using this "witness" model of stress and RNA-Seq technology, the current study aims to characterize the genetic effects of psychological stress. After the mice witnessed the social defeat of another mouse, VTA tissue was extracted, sequenced, and analyzed for differential expression. Since genes often work together in complex networks, a pathway and gene ontology (GO) analysis was performed using data from the differential expression analysis. The pathway and GO analyses revealed a perturbation of the glutamatergic synapse pathway and an enrichment of positive excitatory post-synaptic potential regulation. This is consistent with the excitatory synapse theory of depression. Together, these findings demonstrate a dysregulation of the mesolimbic reward pathway at the gene level as a result of psychological stress, potentially contributing to depressive-like behaviors.
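
    Enrichment of a GO term among differentially expressed genes is commonly scored with a one-sided hypergeometric test. The abstract does not say which test was used, so the Python sketch below is only an assumption about how such a p-value could be computed; the function name and every count in it are hypothetical.

```python
from math import comb


def enrichment_p_value(total_genes: int, genes_in_term: int,
                       de_genes: int, de_in_term: int) -> float:
    """One-sided hypergeometric p-value, P(overlap >= de_in_term).

    total_genes   -- size of the background gene universe
    genes_in_term -- background genes annotated with the GO term
    de_genes      -- differentially expressed (DE) genes drawn from the universe
    de_in_term    -- DE genes that carry the GO term (observed overlap)
    """
    largest_overlap = min(genes_in_term, de_genes)
    favourable = sum(
        comb(genes_in_term, k) * comb(total_genes - genes_in_term, de_genes - k)
        for k in range(de_in_term, largest_overlap + 1)
    )
    return favourable / comb(total_genes, de_genes)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the call.
    p = enrichment_p_value(total_genes=20000, genes_in_term=100,
                           de_genes=300, de_in_term=8)
    print(f"enrichment p-value: {p:.3g}")
```

    In practice the p-values for all tested terms would also need a multiple-testing correction, which this sketch leaves out.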

  8. Automatic sets and Delone sets

    International Nuclear Information System (INIS)

    Barbe, A; Haeseler, F von

    2004-01-01

    Automatic sets D ⊂ Z^m are characterized by having a finite number of decimations. They are equivalently generated by fixed points of certain substitution systems, or by certain finite automata. As examples, two-dimensional versions of the Thue-Morse, Baum-Sweet, Rudin-Shapiro and paperfolding sequences are presented. We give a necessary and sufficient condition for an automatic set D ⊂ Z^m to be a Delone set in R^m. The result is then extended to automatic sets that are defined as fixed points of certain substitutions. The morphology of automatic sets is discussed by means of examples.
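
    The idea of "a finite number of decimations" can be illustrated with a small Python sketch. It assumes one common two-dimensional generalization of the Thue-Morse sequence (XOR of the binary digit parities of the two coordinates), restricted to non-negative indices; the abstract does not specify the construction the authors use, so the definition and all function names here are illustrative only.

```python
def bit_parity(n: int) -> int:
    """Parity of the number of 1 bits in the binary expansion of n >= 0."""
    return bin(n).count("1") % 2


def thue_morse_2d(i: int, j: int) -> int:
    """Assumed 2D Thue-Morse value at (i, j); the set D is where this equals 1."""
    return bit_parity(i) ^ bit_parity(j)


def decimation_patch(a: int, b: int, size: int = 8) -> tuple:
    """Finite window of the decimated array (i, j) -> t(2i + a, 2j + b)."""
    return tuple(
        tuple(thue_morse_2d(2 * i + a, 2 * j + b) for j in range(size))
        for i in range(size)
    )


def distinct_decimations(size: int = 8) -> set:
    """Distinct windows among the four decimations by 2 in each coordinate."""
    return {decimation_patch(a, b, size) for a in (0, 1) for b in (0, 1)}


if __name__ == "__main__":
    for i in range(8):
        print("".join("#" if thue_morse_2d(i, j) else "." for j in range(8)))
    print("distinct decimations on this window:", len(distinct_decimations()))
```

    Each decimation of this array is either the array itself or its complement, so the family of decimations stays finite. The check on a finite window is an illustration, not a proof.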

  9. DNA sequence profiles of the colorectal cancer critical gene set KRAS-BRAF-PIK3CA-PTEN-TP53 related to age at disease onset.

    Directory of Open Access Journals (Sweden)

    Marianne Berg

    2010-11-01

    The incidence of colorectal cancer (CRC) increases with age, and early onset indicates an increased likelihood for genetic predisposition for this disease. The somatic genetics of tumor development in relation to patient age remains mostly unknown. We have examined the mutation status of five known cancer critical genes in relation to age at diagnosis, and compared the genomic complexity of tumors from young patients without known CRC syndromes with those from elderly patients. Among 181 CRC patients, stratified by microsatellite instability status, DNA sequence changes were identified in KRAS (32%), BRAF (16%), PIK3CA (4%), PTEN (14%) and TP53 (51%). In patients younger than 50 years (n = 45), PIK3CA mutations were not observed and TP53 mutations were more frequent than in the older age groups. The total gene mutation index was lowest in tumors from the youngest patients. In contrast, the genome complexity, assessed as copy number aberrations, was highest in tumors from the youngest patients. A comparable number of tumors from young (<50 years) and elderly (>70 years) patients was quadruple negative for the four predictive gene markers (KRAS-BRAF-PIK3CA-PTEN); however, 16% of young versus only 1% of the old patients had tumor mutations in PTEN/PIK3CA exclusively. This implies that mutation testing for prediction of EGFR treatment response may be restricted to KRAS and BRAF in elderly (>70 years) patients. Distinct genetic differences found in tumors from young and elderly patients, who are comparable for known clinical and pathological variables, indicate that young patients have a different genetic risk profile for CRC development than older patients.

  10. Drug-induced Liver Fibrosis: Testing Nevirapine in a Viral-like Liver Setting Using Histopathology, MALDI IMS, and Gene Expression.

    Science.gov (United States)

    Brown, H Roger; Castellino, Stephen; Groseclose, M Reid; Elangbam, Chandikumar S; Mellon-Kusibab, Kathryn; Yoon, Lawrence W; Gates, Lisa D; Krull, David L; Cariello, Neal F; Arrington-Brown, Leigh; Tillman, Tony; Fowler, Serita; Shah, Vishal; Bailey, David; Miller, Richard T

    2016-01-01

    Nevirapine (NVP) is associated with hepatotoxicity in 1-5% of patients. In rodent studies, NVP has been shown to cause hepatic enzyme induction, centrilobular hypertrophy, and skin rash in various rat strains but not liver toxicity. In an effort to understand whether NVP is metabolized differently in a transiently inflamed liver and whether a heightened immune response alters NVP-induced hepatic responses, female brown Norway rats were dosed with either vehicle or NVP alone (75 mg/kg/day for 15 days) or galactosamine alone (single intraperitoneal [ip] injection on day 7 to mimic viral hepatitis) or a combination of NVP (75/100/150 mg/kg/day for 15 days) and galactosamine (single 750 mg/kg ip on day 7). Livers were collected at necropsy for histopathology, matrix-assisted laser desorption/ionization imaging mass spectrometry and gene expression. Eight days after galactosamine, hepatic fibrosis was noted in rats dosed with the combination of NVP and galactosamine. No fibrosis occurred with NVP alone or galactosamine alone. Gene expression data suggested a viral-like response initiated by galactosamine via RNA sensors leading to apoptosis, toll-like receptor, and dendritic cell responses. These were exacerbated by NVP-induced growth factor, retinol, apoptosis, and periostin effects. This finding supports clinical reports warning against exacerbation of fibrosis by NVP in patients with hepatitis C. © The Author(s) 2016.

  11. Expression patterns of porcine Toll-like receptors family set of genes (TLR1-10) in gut-associated lymphoid tissues alter with age.

    Science.gov (United States)

    Uddin, Muhammad Jasim; Kaewmala, Kanokwan; Tesfaye, Dawit; Tholen, Ernst; Looft, Christian; Hoelker, Michael; Schellander, Karl; Cinar, Mehmet Ulas

    2013-08-01

    The aim was to study the expression pattern of the porcine TLR family (TLR1-10) genes in gut-associated lymphoid tissues (GALT) of pigs of varying ages. A total of nine clinically healthy pigs in three age groups (1 day, 2 months and 5 months old) were selected for this experiment (three pigs in each group). Tissues from intestinal mucosa in stomach, duodenum, jejunum and ileum and mesenteric lymph node (MLN) were used. mRNA expression of TLRs (1-10) was detectable in all tissues and TLR3 showed the highest mRNA abundance among TLRs. TLR3 expression in stomach, and TLR1 and TLR6 expression in MLN were higher in adult than newborn pigs. The western blot results for TLR2, 3 and 9 did not, in some cases, coincide with the mRNA expression results. The protein localization of TLR2, 3 and 9 showed that TLR-expressing cells were abundant in the lamina propria, Peyer's patches in intestine, and around and within the lymphoid follicles in the MLN. This expression study sheds the first light on the expression patterns of all TLR genes in GALT in pigs of different ages. Copyright © 2013 Elsevier Ltd. All rights reserved.

  12. A common multiple cloning site in a set of vectors for expression of eukaryotic genes in mammalian, insect and bacterial cells

    DEFF Research Database (Denmark)

    Pallisgaard, N; Pedersen, FS; Birkelund, Svend

    1994-01-01

    a start Met codon was included in the same reading frame as in lambda gt11Sfi-Not to support expression of partial cDNA clones. Thus a cDNA insert of lambda gt11Sfi-Not could be shuttled among the new vectors for expression. The other set of vectors without a start codon were suitable for expression of cDNA carrying their own start Met codon. By Western blot analysis and by transactivation of a reporter plasmid in co-transfections we show that cDNA is very efficiently expressed in NIH 3T3 cells under control of the elongation factor 1 alpha promoter....

  13. Coordination of MicroRNAs, PhasiRNAs, and NB-LRR Genes in Response to a Plant Pathogen: Insights from Analyses of a Set of Soybean Rps Gene Near-Isogenic Lines

    Directory of Open Access Journals (Sweden)

    Meixia Zhao

    2015-03-01

    Disease-related genes, particularly the nucleotide binding site (NB)–leucine-rich repeat (LRR) class of plant genes can be triggered by microRNAs (miRNAs) to generate phased small interfering RNAs (phasiRNAs), which could reduce the transcript levels of their targets. However, how global changes in transcript levels coordinate with changes in miRNA and phasiRNA levels in defense responses remains largely unknown. Here, we investigated changes in the relative abundance of small RNAs (sRNAs), with a focus on miRNAs and phasiRNAs, and their potential targets in response to the pathogen in the susceptible soybean [Glycine max (L.) Merr.] ‘Williams’ and nine resistant near-isogenic lines (NILs), each carrying a unique ( gene. In total, 369 distinct miRNAs, including 78 new ones, were identified in the 10 soybean lines. The majority of miRNAs were downregulated by the pathogen. Of the 525 genes found in the soybean reference genome, 257 were predicted to be the targets of eight abundant miRNA families and 126 (dubbed or were predicted to have produced phasiRNAs. Upregulation of 15 was associated with downregulation of their corresponding phasiRNAs in the NILs; these phasiRNAs were predicted to regulate 75 additional s in . In addition, we identified putative 24-nucleotide (nt) phasiRNAs from transposons, possibly representing a novel general epigenetic mechanism for regulation of transposon activity under biotic stresses. Together, these observations suggest that miRNAs and phasiRNAs play an important role in response to plant pathogens through complex, multiple layers of post-transcriptional regulation.

  14. High-resolution definition of the Vibrio cholerae essential gene set with hidden Markov model-based analyses of transposon-insertion sequencing data.

    Science.gov (United States)

    Chao, Michael C; Pritchard, Justin R; Zhang, Yanjia J; Rubin, Eric J; Livny, Jonathan; Davis, Brigid M; Waldor, Matthew K

    2013-10-01

    The coupling of high-density transposon mutagenesis to high-throughput DNA sequencing (transposon-insertion sequencing) enables simultaneous and genome-wide assessment of the contributions of individual loci to bacterial growth and survival. We have refined analysis of transposon-insertion sequencing data by normalizing for the effect of DNA replication on sequencing output and using a hidden Markov model (HMM)-based filter to exploit heretofore unappreciated information inherent in all transposon-insertion sequencing data sets. The HMM can smooth variations in read abundance and thereby reduce the effects of read noise, as well as permit fine scale mapping that is independent of genomic annotation and enable classification of loci into several functional categories (e.g. essential, domain essential or 'sick'). We generated a high-resolution map of genomic loci (encompassing both intra- and intergenic sequences) that are required or beneficial for in vitro growth of the cholera pathogen, Vibrio cholerae. This work uncovered new metabolic and physiologic requirements for V. cholerae survival, and by combining transposon-insertion sequencing and transcriptomic data sets, we also identified several novel noncoding RNA species that contribute to V. cholerae growth. Our findings suggest that HMM-based approaches will enhance extraction of biological meaning from transposon-insertion sequencing genomic data.
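
    How a hidden Markov model smooths read noise along the genome can be sketched with a toy two-state model decoded by the Viterbi algorithm. This is a minimal Python illustration under assumed parameters, not the authors' model: the abstract mentions several functional categories and a replication-bias normalization, neither of which is reproduced here, and the state names, probabilities and read counts are hypothetical.

```python
import math

# Hypothetical two-state model: "essential" loci rarely tolerate insertions,
# "neutral" loci usually show reads at an insertion site.
STATES = ("essential", "neutral")
START = {"essential": 0.5, "neutral": 0.5}
TRANS = {
    "essential": {"essential": 0.9, "neutral": 0.1},
    "neutral": {"essential": 0.1, "neutral": 0.9},
}
# Emission probabilities for the observation "site has reads" (1) or not (0).
EMIT = {
    "essential": {0: 0.95, 1: 0.05},
    "neutral": {0: 0.30, 1: 0.70},
}


def viterbi(read_counts):
    """Most likely state per insertion site, given read counts in genome order."""
    observations = [1 if count > 0 else 0 for count in read_counts]
    if not observations:
        return []
    # Log-probability of the best path ending in each state at the first site.
    score = {s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
             for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES,
                            key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[best_prev] + math.log(TRANS[best_prev][s])
                            + math.log(EMIT[s][obs]))
            pointers[s] = best_prev
        backpointers.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    counts = [12, 0, 7, 9, 0, 0, 0, 0, 0, 0, 5, 8]  # hypothetical reads per site
    for count, state in zip(counts, viterbi(counts)):
        print(f"{count:3d}  {state}")
```

    With these assumed parameters an isolated empty site inside a covered region is treated as noise, whereas a sustained run of empty sites is called essential. That is the smoothing behaviour the abstract attributes to the HMM filter.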

  15. RRM domain of Arabidopsis splicing factor SF1 is important for pre-mRNA splicing of a specific set of genes

    KAUST Repository

    Lee, Keh Chien

    2017-04-11

    The RNA recognition motif of Arabidopsis splicing factor SF1 affects the alternative splicing of FLOWERING LOCUS M pre-mRNA and a heat shock transcription factor HsfA2 pre-mRNA. Splicing factor 1 (SF1) plays a crucial role in 3' splice site recognition by binding directly to the intron branch point. Although plant SF1 proteins possess an RNA recognition motif (RRM) domain that is absent in their fungal and metazoan counterparts, the role of the RRM domain in SF1 function has not been characterized. Here, we show that the RRM domain differentially affects the full function of the Arabidopsis thaliana AtSF1 protein under different experimental conditions. For example, deletion of the RRM domain influences AtSF1-mediated control of flowering time, but not the abscisic acid sensitivity response during seed germination. The alternative splicing of FLOWERING LOCUS M (FLM) pre-mRNA is involved in flowering time control. We found that the RRM domain of the AtSF1 protein alters the production of alternatively spliced FLM-β transcripts. We also found that the RRM domain affects the alternative splicing of a heat shock transcription factor HsfA2 pre-mRNA, thereby mediating the heat stress response. Taken together, our results suggest the importance of the RRM domain for AtSF1-mediated alternative splicing of a subset of genes involved in the regulation of flowering and adaptation to heat stress.

  16. Bioinformatic Description of Immunotherapy Targets for Pediatric T-Cell Leukemia and the Impact of Normal Gene Sets Used for Comparison

    Directory of Open Access Journals (Sweden)

    Rimas J Orentas

    2014-06-01

    Pediatric lymphoid leukemia has the highest cure rate of all pediatric malignancies, yet due to its prevalence, still accounts for the majority of childhood cancer deaths and requires long-term highly toxic therapy. The ability to target B-cell ALL with immunoglobulin-like binders, whether anti-CD22 antibody or anti-CD19 CAR-Ts, has impacted treatment options for some patients. The development of new ways to target B cell antigens continues at a rapid pace. T-cell ALL accounts for up to 20% of childhood leukemia but has yet to see a set of high value immunotherapeutic targets identified. To find new targets for T-ALL immunotherapy, we employed a bioinformatic comparison to broad normal tissue arrays, hematopoietic stem cells (HSC), and mature lymphocytes, then filtered the results for transcripts encoding plasma membrane proteins. T-ALL bears a core T cell signature and transcripts encoding TCR/CD3 components and canonical markers of T cell development predominate, especially when comparison was made to normal tissue or HSC. However, when comparison to mature lymphocytes was also undertaken, we identified two antigens that may drive, or be associated with, leukemogenesis: TALLA-1 and hedgehog interacting protein, HHIP. In addition, TCR subfamilies, CD1, activation and adhesion markers, membrane organizing molecules, and receptors linked to metabolism and inflammation were also identified. Of these, only CD52, CD37, and CD98 are currently being targeted clinically. This work provides a set of targets to be considered for future development of immunotherapies for T-ALL.

  17. Mining microbial metatranscriptomes for expression of antibiotic resistance genes under natural conditions

    Science.gov (United States)

    Versluis, Dennis; D'Andrea, Marco Maria; Ramiro Garcia, Javier; Leimena, Milkha M.; Hugenholtz, Floor; Zhang, Jing; Öztürk, Başak; Nylund, Lotta; Sipkema, Detmer; Schaik, Willem Van; de Vos, Willem M.; Kleerebezem, Michiel; Smidt, Hauke; Passel, Mark W. J. Van

    2015-07-01

    Antibiotic resistance genes are found in a broad range of ecological niches associated with complex microbiota. Here we investigated if resistance genes are not only present, but also transcribed under natural conditions. Furthermore, we examined the potential for antibiotic production by assessing the expression of associated secondary metabolite biosynthesis gene clusters. Metatranscriptome datasets from intestinal microbiota of four human adults, one human infant, 15 mice and six pigs, of which only the latter have received antibiotics prior to the study, as well as from sea bacterioplankton, a marine sponge, forest soil and sub-seafloor sediment, were investigated. We found that resistance genes are expressed in all studied ecological niches, albeit with niche-specific differences in relative expression levels and diversity of transcripts. For example, in mice and human infant microbiota predominantly tetracycline resistance genes were expressed while in human adult microbiota the spectrum of expressed genes was more diverse, and also included β-lactam, aminoglycoside and macrolide resistance genes. Resistance gene expression could result from the presence of natural antibiotics in the environment, although we could not link it to expression of corresponding secondary metabolites biosynthesis clusters. Alternatively, resistance gene expression could be constitutive, or these genes serve alternative roles besides antibiotic resistance.

  18. Toolbox Approaches Using Molecular Markers and 16S rRNA Gene Amplicon Data Sets for Identification of Fecal Pollution in Surface Water.

    Science.gov (United States)

    Ahmed, W; Staley, C; Sadowsky, M J; Gyawali, P; Sidhu, J P S; Palmer, A; Beale, D J; Toze, S

    2015-10-01

    In this study, host-associated molecular markers and bacterial 16S rRNA gene community analysis using high-throughput sequencing were used to identify the sources of fecal pollution in environmental waters in Brisbane, Australia. A total of 92 fecal and composite wastewater samples were collected from different host groups (cat, cattle, dog, horse, human, and kangaroo), and 18 water samples were collected from six sites (BR1 to BR6) along the Brisbane River in Queensland, Australia. Bacterial communities in the fecal, wastewater, and river water samples were sequenced. Water samples were also tested for the presence of bird-associated (GFD), cattle-associated (CowM3), horse-associated, and human-associated (HF183) molecular markers, to provide multiple lines of evidence regarding the possible presence of fecal pollution associated with specific hosts. Among the 18 water samples tested, 83%, 33%, 17%, and 17% were real-time PCR positive for the GFD, HF183, CowM3, and horse markers, respectively. Among the potential sources of fecal pollution in water samples from the river, DNA sequencing tended to show relatively small contributions from wastewater treatment plants (up to 13% of sequence reads). Contributions from other animal sources were rarely detected and were very small (pollution in an urban river. This study is a proof of concept, and based on the results, we recommend using bacterial community analysis (where possible) along with PCR detection or quantification of host-associated molecular markers to provide information on the sources of fecal pollution in waterways. Copyright © 2015, American Society for Microbiology. All Rights Reserved.

  19. Niche-Specific Requirement for Hyphal Wall protein 1 in Virulence of Candida albicans

    Science.gov (United States)

    Staab, Janet F.; Datta, Kausik; Rhee, Peter

    2013-01-01

    Specialized Candida albicans cell surface proteins called adhesins mediate binding of the fungus to host cells. The mammalian transglutaminase (TG) substrate and adhesin, Hyphal wall protein 1 (Hwp1), is expressed on the hyphal form of C. albicans where it mediates fungal adhesion to epithelial cells. Hwp1 is also required for biofilm formation and mating; thus the protein functions in both fungal-host and self-interactions. Hwp1 is required for full virulence of C. albicans in murine models of disseminated candidiasis and of esophageal candidiasis. Previous studies correlated TG activity on the surface of oral epithelial cells, produced by epithelial TG (TG1), with tight binding of C. albicans via Hwp1 to the host cell surfaces. However, the contribution of other TGs, specifically tissue TG (TG2), to disseminated candidiasis mediated by Hwp1 was not known. A newly created hwp1 null strain in the wild type SC5314 background was as virulent as the parental strain in C57BL/6 mice, and virulence was retained in C57BL/6 mice deleted for Tgm2 (TG2). Further, the hwp1 null strains displayed modestly reduced virulence in BALB/c mice, as did strain DD27-U1, an independently created hwp1Δ/Δ in CAI4 corrected for its ura3Δ defect at the URA3 locus. Hwp1 was still needed to produce wild type biofilms and to persist on murine tongues in an oral model of oropharyngeal candidiasis, consistent with previous studies by us and others. Finally, lack of Hwp1 affected the translocation of C. albicans from the mouse intestine into the bloodstream of mice. Together, Hwp1 appears to have a minor role in disseminated candidiasis, independent of tissue TG, but a key function in host- and self-association to the surface of oral mucosa. PMID:24260489

  20. Regulation of GacA in Pseudomonas chlororaphis Strains Shows a Niche Specificity.

    Directory of Open Access Journals (Sweden)

    Jun Li

    The GacS/GacA two-component system plays a central role in the regulation of a broad range of biological functions in many bacteria. In the biocontrol organism Pseudomonas chlororaphis, the Gac system has been shown to positively control quorum sensing, biofilm formation, and phenazine production, but has an overall negative impact on motility. These studies have been performed predominantly with strains originating from the rhizosphere. To investigate the level of conservation between the GacA regulation of biocontrol-related traits in P. chlororaphis isolates from different habitats, the studies presented here focused on the endophytic isolate G5 of P. chlororaphis subsp. aurantiaca. A gacA mutant deficient in the production of N-acylhomoserine lactones (AHLs) and phenazine was isolated through transposon mutagenesis. Further phenotypic characterization revealed that in strain G5, similar to other P. chlororaphis strains, a gacA mutation caused inability to produce biocontrol factors such as phenazine, HCN and proteases responsible for antifungal activity, but overproduced siderophores. LC-MS/MS analysis revealed that AHL production was also practically abolished in this mutant. However, the wild type exhibited an extremely diverse AHL pattern which has never been identified in P. chlororaphis. In contrast to other isolates of this organism, GacA in strain G5 was shown to negatively regulate biofilm formation and oxidative stress response whilst positively regulating cell motility and biosynthesis of indole-3-acetic acid (IAA). To gain a better understanding of the overall impact of GacA in G5, a comparative proteomic analysis was performed revealing that, in addition to some of the traits like phenazine mentioned above, GacA also negatively regulated lipopolysaccharide (LPS) and trehalose biosynthesis whilst having a positive impact on energy metabolism, an effect not previously described in P. chlororaphis. Consequently, GacA regulation shows a differential strain dependency which is likely to be in line with their niche of origin.

  1. Stem cell niche-specific Ebf3 maintains the bone marrow cavity.

    Science.gov (United States)

    Seike, Masanari; Omatsu, Yoshiki; Watanabe, Hitomi; Kondoh, Gen; Nagasawa, Takashi

    2018-03-01

    Bone marrow is the tissue filling the space between bone surfaces. Hematopoietic stem cells (HSCs) are maintained by special microenvironments known as niches within bone marrow cavities. Mesenchymal cells, termed CXC chemokine ligand 12 (CXCL12)-abundant reticular (CAR) cells or leptin receptor-positive (LepR+) cells, are a major cellular component of HSC niches that gives rise to osteoblasts in bone marrow. However, it remains unclear how osteogenesis is prevented in most CAR/LepR+ cells to maintain HSC niches and marrow cavities. Here, using lineage tracing, we found that the transcription factor early B-cell factor 3 (Ebf3) is preferentially expressed in CAR/LepR+ cells and that Ebf3-expressing cells are self-renewing mesenchymal stem cells in adult marrow. When Ebf3 is deleted in CAR/LepR+ cells, HSC niche function is severely impaired, and bone marrow is osteosclerotic with increased bone in aged mice. In mice lacking Ebf1 and Ebf3, CAR/LepR+ cells exhibiting a normal morphology are abundantly present, but their niche function is markedly impaired with depleted HSCs in infant marrow. Subsequently, the mutants become progressively more osteosclerotic, leading to the complete occlusion of marrow cavities in early adulthood. CAR/LepR+ cells differentiate into bone-producing cells with reduced HSC niche factor expression in the absence of Ebf1/Ebf3. Thus, HSC cellular niches express Ebf3 that is required to create HSC niches, to inhibit their osteoblast differentiation, and to maintain spaces for HSCs. © 2018 Seike et al.; Published by Cold Spring Harbor Laboratory Press.

  2. Distinct properties of Escherichia coli products of plant-type ribulose-1,5-bisphosphate carboxylase/oxygenase directed by two sets of genes from the photosynthetic bacterium Chromatium vinosum.

    Science.gov (United States)

    Viale, A M; Kobayashi, H; Akazawa, T

    1990-10-25

    We have recently described the existence of two sets of genes encoding ribulose-1,5-bisphosphate carboxylase/oxygenase (Rbu-P2 carboxylase), rbcA-rbcB and rbcL-rbcS, in the photosynthetic purple sulfur bacterium Chromatium vinosum (Viale, A.M., Kobayashi, H., and Akazawa, T. (1989) J. Bacteriol. 171, 2391-2400). These genes were cloned in plasmid vectors, and their expression was studied in Escherichia coli. Expression of rbcA-rbcB in E. coli was obtained under the control of its own promoter. On the other hand, expression of rbcL-rbcS in this host was not observed unless these genes were cloned under the control of the tac promoter. Purified rbcA-rbcB and rbcL-rbcS products from E. coli consisted of large and small subunits in equimolar ratios. They also showed very close elution profiles to Rbu-P2 carboxylase isolated from C. vinosum in size-exclusion chromatography columns, thus suggesting hexadecameric (L8S8) structures. Vmax of Rbu-P2 carboxylase were very similar for both enzymes, but the Km values for CO2 and ribulose 1,5-bisphosphate showed some differences. Immunochemical and N-terminal amino acid sequence analyses of the large and small subunits encoded by rbcA-rbcB and rbcL-rbcS also differed, especially at the level of the small subunits. The comparisons described above as well as the analysis of C. vinosum crude extracts by anion-exchange chromatography indicated that Rbu-P2 carboxylase encoded by rbcA-rbcB was the only species detected in the photosynthetic bacterium.

  3. Essential Bacillus subtilis genes

    NARCIS (Netherlands)

    Kobayashi, K.; Ehrlich, S.D.; Albertini, A.; Amati, G.; Andersen, K.K.; Arnaud, M.; Asai, K.; Ashikaga, S.; Aymerich, S.; Bessieres, P.; Boland, F.; Brignell, S.C.; Bron, S; Bunai, K.; Chapuis, J; Christiansen, L.C.; Danchin, A.; Debarbouille, M.; Dervyn, E.; Deuerling, E.; Devine, K.; Devine, S.K.; Dreesen, O.; Errington, J.; Fillinger, S.; Foster, S.J.; Fujita, Y.; Galizzi, A.; Gardan, R.; Eschevins, C.; Fukushima, T.; Haga, K.; Harwood, C.R; Hecker, M.; Hosoya, D.; Hullo, M.F.; Kakeshita, H.; Karamata, D.; Kasahara, Y.; Kawamura, F.; Koga, K.; Koski, P.; Kuwana, R.; Imamura, D.; Ishimaru, M.; Ishikawa, S.; Ishio, I.; Le Coq, D.; Masson, A.; Mauel, C.; Meima, Roelf; Mellado, R.P.; Moir, A.; Moriya, S.; Nagakawa, E.; Nanamiya, H.; Nakai, S.; Nygaard, P.; Ogura, M.; Ohanan, T.; O'Reilly, M.; O'Rourke, M.; Pragai, Z.; Pooley, H.M.; Rapoport, G.; Rawlins, J.P.; Rivas, L.A.; Rivolta, C.; Sadaie, A.; Sadaie, Y.; Sarvas, M; Sato, T.; Saxild, H.H.; Scanlan, E.; Schumann, W; Seegers, J.F. M. L.; Sekiguchi, J.; Sekowska, A.; Seror, S.J.; Simon, M.; Stragier, P.; Studer, R.; Takamatsu, H.; Tanaka, T.; Takeuchi, M.; Thomaides, H.B.; Vagner, V.; van Dijl, J.M.; Watabe, K.; Wipat, A; Yamamoto, H.; Yamamoto, M.; Yamamoto, Y.; Yamane, K.; Yata, K.; Yoshida, K.; Yoshikawa, H.; Zuber, U.; Ogasawara, N.

    2003-01-01

    To estimate the minimal gene set required to sustain bacterial life in nutritious conditions, we carried out a systematic inactivation of Bacillus subtilis genes. Among approximately 4,100 genes of the organism, only 192 were shown to be indispensable by this or previous work. Another 79 genes were

  4. Genes and Gene Therapy

    Science.gov (United States)

    ... correctly, a child can have a genetic disorder. Gene therapy is an experimental technique that uses genes to ... or prevent disease. The most common form of gene therapy involves inserting a normal gene to replace an ...

  5. Simultaneous assay of every Salmonella Typhi gene using one million transposon mutants

    Science.gov (United States)

    Langridge, Gemma C.; Phan, Minh-Duy; Turner, Daniel J.; Perkins, Timothy T.; Parts, Leopold; Haase, Jana; Charles, Ian; Maskell, Duncan J.; Peters, Sarah E.; Dougan, Gordon; Wain, John; Parkhill, Julian; Turner, A. Keith

    2009-01-01

    Very high-throughput sequencing technologies need to be matched by high-throughput functional studies if we are to make full use of the current explosion in genome sequences. We have generated a very large bacterial mutant pool, consisting of an estimated 1.1 million transposon mutants and we have used genomic DNA from this mutant pool, and Illumina nucleotide sequencing to prime from the transposon and sequence into the adjacent target DNA. With this method, which we have called TraDIS (transposon directed insertion-site sequencing), we have been able to map 370,000 unique transposon insertion sites to the Salmonella enterica serovar Typhi chromosome. The unprecedented density and resolution of mapped insertion sites, an average of one every 13 base pairs, has allowed us to assay simultaneously every gene in the genome for essentiality and generate a genome-wide list of candidate essential genes. In addition, the semiquantitative nature of the assay allowed us to identify genes that are advantageous and those that are disadvantageous for growth under standard laboratory conditions. Comparison of the mutant pool following growth in the presence or absence of ox bile enabled every gene to be assayed for its contribution toward bile tolerance, a trait required of any enteric bacterium and for carriage of S. Typhi in the gall bladder. This screen validated our hypothesis that we can simultaneously assay every gene in the genome to identify niche-specific essential genes. PMID:19826075

  6. Combination of interval set and soft set

    Directory of Open Access Journals (Sweden)

    Keyun Qin

    2013-04-01

    Soft set theory and interval set theory are both mathematical tools for dealing with uncertainties. This paper is devoted to the discussion of soft interval sets and their application. The notion of soft interval sets is introduced by combining soft sets and interval sets. Several operations on soft interval sets are presented in a manner parallel to that used in defining operations on soft sets, and the lattice structures of soft interval sets are established. In addition, a soft interval set based decision making problem is analyzed.
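
    The combination of the two notions can be sketched in a few lines. The abstract does not give the paper's definitions, so the following is an assumption-laden illustration: an interval set is modelled as a pair of bounds with componentwise intersection and union, a soft interval set as a mapping from parameters to interval sets, and the two are combined only on shared parameters. The names `IntervalSet`, `make`, and `restricted_intersection`, and the house example, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalSet:
    """All sets X with lower <= X <= upper under the subset order."""
    lower: frozenset
    upper: frozenset

    def __post_init__(self):
        if not self.lower <= self.upper:
            raise ValueError("lower bound must be a subset of upper bound")

    def __and__(self, other):
        # Componentwise intersection of the bounds (assumed operation).
        return IntervalSet(self.lower & other.lower, self.upper & other.upper)

    def __or__(self, other):
        # Componentwise union of the bounds (assumed operation).
        return IntervalSet(self.lower | other.lower, self.upper | other.upper)

    def contains(self, candidate):
        """True if the candidate set lies between the two bounds."""
        return self.lower <= frozenset(candidate) <= self.upper


def make(lower, upper):
    """Convenience constructor from any two iterables."""
    return IntervalSet(frozenset(lower), frozenset(upper))


def restricted_intersection(f, g):
    """Combine two soft interval sets on the parameters they share."""
    return {p: f[p] & g[p] for p in f.keys() & g.keys()}


# Hypothetical universe of houses h1..h4 described by two parameters.
f = {"cheap": make({"h1"}, {"h1", "h2", "h3"}), "modern": make({"h2"}, {"h2", "h4"})}
g = {"cheap": make({"h1", "h3"}, {"h1", "h3", "h4"})}
both = restricted_intersection(f, g)
```

    Here the lower bound holds the objects known to satisfy a parameter and the upper bound those that possibly do, so a decision procedure could rank objects by how often they appear in each bound.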

  7. Hierarchical Sets: Analyzing Pangenome Structure through Scalable Set Visualizations

    DEFF Research Database (Denmark)

    Pedersen, Thomas Lin

    2017-01-01

    information to increase in knowledge. As the pangenome data structure is essentially a collection of sets we explore the potential for scalable set visualization as a tool for pangenome analysis. We present a new hierarchical clustering algorithm based on set arithmetics that optimizes the intersection sizes...... not correspond with the hierarchy, can be visualized using hierarchical edge bundles. When applied to pangenome data this plot shows putative horizontal gene transfers between the genomes and can highlight relationships between genomes that are not represented by the hierarchy. We illustrate the utility...... of hierarchical sets by applying it to a pangenome based on 113 Escherichia and Shigella genomes and find it provides a powerful addition to pangenome analysis. The described clustering algorithm and visualizations are implemented in the hierarchicalSets R package available from CRAN (https://cran.r-project...

  8. Prostate cancer gene 3 (PCA3) is of additional predictive value in patients with PI-RADS grade III (intermediate) lesions in the MR-guided re-biopsy setting for prostate cancer.

    Science.gov (United States)

    Kaufmann, S; Bedke, J; Gatidis, S; Hennenlotter, J; Kramer, U; Notohamiprodjo, M; Nikolaou, K; Stenzl, A; Kruck, S

    2016-04-01

    Multiparametric magnetic resonance imaging (mpMRI) improves diagnostic accuracy in re-biopsies of men with prostate cancer (PC) suspicion, but predictive value is limited despite the use of the new Prostate Imaging Reporting and Data System (PI-RADS). Prognostic value of the PC-specific biomarker prostate cancer gene 3 (PCA3) added to the PI-RADS score was evaluated. The study was a retrospective analysis of the institutional database for men with MR-guided biopsy (MR-GB) for suspicious lesion in mpMRI and who had an additional pre-MR-GB PCA3 testing for ongoing PC suspicion. All men had ≥ 1 negative ultrasound GB. Lesions were retrospectively scored by PI-RADS in three MRI sequences (T2w, DCE, and DWI). PCA3 was analyzed with cutoffs of 25 and 35. The prognostic value of mpMRI and PCA3 and the additional value of both were explored. Tumor detection rate (49 men, mean PSA 10 ng/ml, lesion size 40 mm²) was 45% (22/49 patients). In the subgroup of PI-RADS IV°, 17/17 patients had PC; in PI-RADS III° (intermediate) 5/15 had PC, and all 5 had a PCA3 > 35. PCA3 > 35 had no additional prognostic value in the whole cohort. Out of the 10/15 PC negative patients (PI-RADS III°), PCA3 was PI-RADS III° patients improved predictive accuracy to 91.8%. MpMRI and subsequent grading to PI-RADS significantly improves PC detection in the re-biopsy setting. The diagnostic uncertainty in the PI-RADS intermediate group can be ameliorated by the addition of PCA3 cutoff of 35 to avoid potential unnecessary biopsies.

  9. Analysis of the real EADGENE data set:

    DEFF Research Database (Denmark)

    Sørensen, Peter; Bonnet, Agnès; Buitenhuis, Bart

    2007-01-01

    ) or principal component analysis (PCA) to identify groups of differentially expressed genes with a similar expression pattern over time points and infective agent (E. coli or S. aureus). The main result from these analyses was that HC and PCA were able to separate tissue samples taken at 24 h following E. coli...... approach looked at differential expression of predefined gene sets. Gene sets were defined based on information retrieved from biological databases such as Gene Ontology. Based on these annotation sources the teams used either the GlobalTest or the Fisher exact test to identify differentially expressed...

  10. UpSet: Visualization of Intersecting Sets

    Science.gov (United States)

    Lex, Alexander; Gehlenborg, Nils; Strobelt, Hendrik; Vuillemot, Romain; Pfister, Hanspeter

    2016-01-01

    Understanding relationships between sets is an important analysis task that has received widespread attention in the visualization community. The major challenge in this context is the combinatorial explosion of the number of set intersections if the number of sets exceeds a trivial threshold. In this paper we introduce UpSet, a novel visualization technique for the quantitative analysis of sets, their intersections, and aggregates of intersections. UpSet is focused on creating task-driven aggregates, communicating the size and properties of aggregates and intersections, and a duality between the visualization of the elements in a dataset and their set membership. UpSet visualizes set intersections in a matrix layout and introduces aggregates based on groupings and queries. The matrix layout enables the effective representation of associated data, such as the number of elements in the aggregates and intersections, as well as additional summary statistics derived from subset or element attributes. Sorting according to various measures enables a task-driven analysis of relevant intersections and aggregates. The elements represented in the sets and their associated attributes are visualized in a separate view. Queries based on containment in specific intersections, aggregates or driven by attribute filters are propagated between both views. We also introduce several advanced visual encodings and interaction methods to overcome the problems of varying scales and to address scalability. UpSet is web-based and open source. We demonstrate its general utility in multiple use cases from various domains. PMID:26356912
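
    The matrix layout described in this abstract rests on assigning each element to exactly one exclusive intersection, namely the combination of sets it belongs to. The sketch below is not UpSet's code; it is a minimal illustration with hypothetical function names and gene sets, and it renders membership with plain characters instead of the tool's graphics.

```python
from collections import Counter


def exclusive_intersections(named_sets):
    """Count elements by the exact combination of sets they belong to."""
    names = sorted(named_sets)
    universe = set().union(*named_sets.values())
    counts = Counter()
    for element in universe:
        membership = tuple(name for name in names if element in named_sets[name])
        counts[membership] += 1
    return counts


def matrix_rows(named_sets):
    """One row per non-empty intersection: a membership pattern and its size."""
    names = sorted(named_sets)
    counts = exclusive_intersections(named_sets)
    # Sort by decreasing size, then by combination for a stable order.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        ("".join("x" if name in combo else "-" for name in names), size)
        for combo, size in ordered
    ]


# Hypothetical gene sets, for illustration only.
sets = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g3", "g4", "g5"},
    "C": {"g4", "g6"},
}
rows = matrix_rows(sets)
for pattern, size in rows:
    print(pattern, size)
```

    Only combinations that actually occur are stored, so the number of rows is bounded by the number of elements rather than by the 2^n possible intersections, which is one way to sidestep the combinatorial explosion the abstract mentions. Sorting by size corresponds to one of the "various measures" a task-driven view might use.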

  11. Fuzzy sets, rough sets, multisets and clustering

    CERN Document Server

    Dahlbom, Anders; Narukawa, Yasuo

    2017-01-01

    This book is dedicated to Prof. Sadaaki Miyamoto and presents cutting-edge papers in some of the areas in which he contributed. Bringing together contributions by leading researchers in the field, it concretely addresses clustering, multisets, rough sets and fuzzy sets, as well as their applications in areas such as decision-making. The book is divided in four parts, the first of which focuses on clustering and classification. The second part puts the spotlight on multisets, bags, fuzzy bags and other fuzzy extensions, while the third deals with rough sets. Rounding out the coverage, the last part explores fuzzy sets and decision-making.

  12. Essential Bacillus subtilis genes

    DEFF Research Database (Denmark)

    Kobayashi, K.; Ehrlich, S.D.; Albertini, A.

    2003-01-01

    To estimate the minimal gene set required to sustain bacterial life in nutritious conditions, we carried out a systematic inactivation of Bacillus subtilis genes. Among approximately 4,100 genes of the organism, only 192 were shown to be indispensable by this or previous work. Another 79 genes were...... predicted to be essential. The vast majority of essential genes were categorized in relatively few domains of cell metabolism, with about half involved in information processing, one-fifth involved in the synthesis of cell envelope and the determination of cell shape and division, and one-tenth related...... to cell energetics. Only 4% of essential genes encode unknown functions. Most essential genes are present throughout a wide range of Bacteria, and almost 70% can also be found in Archaea and Eucarya. However, essential genes related to cell envelope, shape, division, and respiration tend to be lost from...

  13. Gene panel testing of 5589 BRCA1/2-negative index patients with breast cancer in a routine diagnostic setting: results of the German Consortium for Hereditary Breast and Ovarian Cancer.

    Science.gov (United States)

    Hauke, Jan; Horvath, Judit; Groß, Eva; Gehrig, Andrea; Honisch, Ellen; Hackmann, Karl; Schmidt, Gunnar; Arnold, Norbert; Faust, Ulrike; Sutter, Christian; Hentschel, Julia; Wang-Gohrke, Shan; Smogavec, Mateja; Weber, Bernhard H F; Weber-Lassalle, Nana; Weber-Lassalle, Konstantin; Borde, Julika; Ernst, Corinna; Altmüller, Janine; Volk, Alexander E; Thiele, Holger; Hübbel, Verena; Nürnberg, Peter; Keupp, Katharina; Versmold, Beatrix; Pohl, Esther; Kubisch, Christian; Grill, Sabine; Paul, Victoria; Herold, Natalie; Lichey, Nadine; Rhiem, Kerstin; Ditsch, Nina; Ruckert, Christian; Wappenschmidt, Barbara; Auber, Bernd; Rump, Andreas; Niederacher, Dieter; Haaf, Thomas; Ramser, Juliane; Dworniczak, Bernd; Engel, Christoph; Meindl, Alfons; Schmutzler, Rita K; Hahnen, Eric

    2018-03-09

    The prevalence of germ line mutations in non-BRCA1/2 genes associated with hereditary breast cancer (BC) is low, and the role of some of these genes in BC predisposition and pathogenesis is conflicting. In this study, 5589 consecutive BC index patients negative for pathogenic BRCA1/2 mutations and 2189 female controls were screened for germ line mutations in eight cancer predisposition genes (ATM, CDH1, CHEK2, NBN, PALB2, RAD51C, RAD51D, and TP53). All patients met the inclusion criteria of the German Consortium for Hereditary Breast and Ovarian Cancer for germ line testing. The highest mutation prevalence was observed in the CHEK2 gene (2.5%), followed by ATM (1.5%) and PALB2 (1.2%). The mutation prevalence in each of the remaining genes was 0.3% or lower. Using Exome Aggregation Consortium control data, we confirm significant associations of heterozygous germ line mutations with BC for ATM (OR: 3.63, 95%CI: 2.67-4.94), CDH1 (OR: 17.04, 95%CI: 3.54-82), CHEK2 (OR: 2.93, 95%CI: 2.29-3.75), PALB2 (OR: 9.53, 95%CI: 6.25-14.51), and TP53 (OR: 7.30, 95%CI: 1.22-43.68). NBN germ line mutations were not significantly associated with BC risk (OR: 1.39, 95%CI: 0.73-2.64). Due to their low mutation prevalence, the RAD51C and RAD51D genes require further investigation. Compared with control datasets, predicted damaging rare missense variants were significantly more prevalent in CHEK2 and TP53 in BC index patients. Compared with the overall sample, only TP53 mutation carriers show a significantly younger age at first BC diagnosis. We demonstrate a significant association of deleterious variants in the CHEK2, PALB2, and TP53 genes with bilateral BC. Both ATM and CHEK2 were negatively associated with triple-negative breast cancer (TNBC) and estrogen receptor (ER)-negative tumor phenotypes. A particularly high CHEK2 mutation prevalence (5.2%) was observed in patients with human epidermal growth factor receptor 2 (HER2)-positive tumors. © 2018 The Authors. Cancer Medicine
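
    The odds ratios and 95% confidence intervals quoted in this abstract summarize 2×2 carrier-by-status tables. The abstract does not state which interval method the authors used, so the sketch below assumes the common Wald interval on the log scale; the function name and the counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_ci(case_carriers, case_total, control_carriers, control_total, z=1.96):
    """Odds ratio with a Wald confidence interval computed on the log scale."""
    a = case_carriers                      # carriers among cases
    b = case_total - case_carriers         # non-carriers among cases
    c = control_carriers                   # carriers among controls
    d = control_total - control_carriers   # non-carriers among controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: 30 carriers among 1000 cases, 10 among 1000 controls.
estimate, low, high = odds_ratio_ci(30, 1000, 10, 1000)
print(f"OR: {estimate:.2f}, 95%CI: {low:.2f}-{high:.2f}")
```

    An interval that excludes 1 corresponds to a significant association at the chosen level, which is how the NBN result (interval spanning 1) differs from the others. Very wide intervals, such as the one reported for CDH1, typically reflect a small count in at least one cell.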

  14. A new set of ESTs from chickpea (Cicer arietinum L.) embryo reveals two novel F-box genes, CarF-box_PP2 and CarF-box_LysM, with potential roles in seed development.

    Directory of Open Access Journals (Sweden)

    Shefali Gupta

    Considering the economic importance of chickpea (C. arietinum L.) seeds, it is important to understand the mechanisms underlying seed development for which a cDNA library was constructed from 6 day old chickpea embryos. A total of 8,186 ESTs were obtained from which 4,048 high quality ESTs were assembled into 1,480 unigenes that majorly encoded genes involved in various metabolic and regulatory pathways. Of these, 95 ESTs were found to be involved in ubiquitination related protein degradation pathways and 12 ESTs coded specifically for putative F-box proteins. Differential transcript accumulation of these putative F-box genes was observed in chickpea tissues as evidenced by quantitative real-time PCR. Further, to explore the role of F-box proteins in chickpea seed development, two F-box genes were selected for molecular characterization. These were named as CarF-box_PP2 and CarF-box_LysM depending on their C-terminal domains, PP2 and LysM, respectively. Their highly conserved structures led us to predict their target substrates. Subcellular localization experiment revealed that CarF-box_PP2 was localized in the cytoplasm and CarF-box_LysM was localized in the nucleus. We demonstrated their physical interactions with SKP1 protein, which validated that they function as F-box proteins in the formation of SCF complexes. Sequence analysis of their promoter regions revealed certain seed specific cis-acting elements that may be regulating their preferential transcript accumulation in the seed. Overall, the study helped in expanding the EST database of chickpea, which was further used to identify two novel F-box genes having a potential role in seed development.

  15. A new set of ESTs from chickpea (Cicer arietinum L.) embryo reveals two novel F-box genes, CarF-box_PP2 and CarF-box_LysM, with potential roles in seed development.

    Science.gov (United States)

    Gupta, Shefali; Garg, Vanika; Bhatia, Sabhyata

    2015-01-01

    Considering the economic importance of chickpea (C. arietinum L.) seeds, it is important to understand the mechanisms underlying seed development for which a cDNA library was constructed from 6 day old chickpea embryos. A total of 8,186 ESTs were obtained from which 4,048 high quality ESTs were assembled into 1,480 unigenes that majorly encoded genes involved in various metabolic and regulatory pathways. Of these, 95 ESTs were found to be involved in ubiquitination related protein degradation pathways and 12 ESTs coded specifically for putative F-box proteins. Differential transcript accumulation of these putative F-box genes was observed in chickpea tissues as evidenced by quantitative real-time PCR. Further, to explore the role of F-box proteins in chickpea seed development, two F-box genes were selected for molecular characterization. These were named as CarF-box_PP2 and CarF-box_LysM depending on their C-terminal domains, PP2 and LysM, respectively. Their highly conserved structures led us to predict their target substrates. Subcellular localization experiment revealed that CarF-box_PP2 was localized in the cytoplasm and CarF-box_LysM was localized in the nucleus. We demonstrated their physical interactions with SKP1 protein, which validated that they function as F-box proteins in the formation of SCF complexes. Sequence analysis of their promoter regions revealed certain seed specific cis-acting elements that may be regulating their preferential transcript accumulation in the seed. Overall, the study helped in expanding the EST database of chickpea, which was further used to identify two novel F-box genes having a potential role in seed development.

  16. Rapid and simple method by combining FTA™ card DNA extraction with two set multiplex PCR for simultaneous detection of non-O157 Shiga toxin-producing Escherichia coli strains and virulence genes in food samples.

    Science.gov (United States)

    Kim, S A; Park, S H; Lee, S I; Ricke, S C

    2017-12-01

    The aim of this research was to optimize two multiplex polymerase chain reaction (PCR) assays that could simultaneously detect six non-O157 Shiga toxin-producing Escherichia coli (STEC) as well as the three virulence genes. We also investigated the potential of combining the FTA™ card-based DNA extraction with the multiplex PCR assays. Two multiplex PCR assays were optimized using six primer pairs for each non-O157 STEC serogroup and three primer pairs for virulence genes respectively. Each STEC strain specific primer pair only amplified 155, 238, 321, 438, 587 and 750 bp product for O26, O45, O103, O111, O121 and O145 respectively. Three virulence genes were successfully multiplexed: 375 bp for eae, 655 bp for stx1 and 477 bp for stx2. When two multiplex PCR assays were validated with ground beef samples, distinctive bands were also successfully produced. Since the two multiplex PCR examined here can be conducted under the same PCR conditions, the six non-O157 STEC and their virulence genes could be concurrently detected with one run on the thermocycler. In addition, all bands clearly appeared to be amplified by FTA card DNA extraction in the multiplex PCR assay from the ground beef sample, suggesting that an FTA card could be a viable sampling approach for rapid and simple DNA extraction to reduce time and labour and therefore may have practical use for the food industry. Two multiplex polymerase chain reaction (PCR) assays were optimized for discrimination of six non-O157 Shiga toxin-producing Escherichia coli (STEC) and identification of their major virulence genes within a single reaction, simultaneously. This study also determined the successful ability of the FTA™ card as an alternative to commercial DNA extraction method for conducting multiplex STEC PCR assays. The FTA™ card combined with multiplex PCR holds promise for the food industry by offering a simple and rapid DNA sample method for reducing time, cost and labour for detection of STEC in
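    The band sizes reported in the abstract above can be read as a lookup table from amplicon length to target. The following is a minimal sketch of that reading, not the authors' analysis: the function name `call_bands`, the 15 bp tolerance, and the example lane values are all hypothetical and chosen only for illustration.

```python
# Expected amplicon sizes (bp) taken from the abstract above.
SEROGROUP_BANDS = {155: "O26", 238: "O45", 321: "O103",
                   438: "O111", 587: "O121", 750: "O145"}
VIRULENCE_BANDS = {375: "eae", 477: "stx2", 655: "stx1"}


def call_bands(observed_bp, expected, tolerance_bp=15):
    """Return targets whose expected size lies within tolerance_bp of an observed band.

    tolerance_bp is a hypothetical gel-reading tolerance, not a value from the study.
    """
    calls = []
    for size, target in sorted(expected.items()):
        if any(abs(band - size) <= tolerance_bp for band in observed_bp):
            calls.append(target)
    return calls


# Hypothetical band estimates read from one gel lane per multiplex assay.
lane_serogroup = [240, 590]
lane_virulence = [372, 660]
print(call_bands(lane_serogroup, SEROGROUP_BANDS))
print(call_bands(lane_virulence, VIRULENCE_BANDS))
```

    Because the closest pair of expected sizes within either assay is more than 80 bp apart, a tolerance of this order cannot assign one band to two targets.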

  17. Studying Genes

    Science.gov (United States)

    ... What are genes? Genes are segments of DNA that contain instructions ...

  18. Discovering genes underlying QTL

    Energy Technology Data Exchange (ETDEWEB)

    Vanavichit, Apichart [Kasetsart University, Kamphaengsaen, Nakorn Pathom (Thailand)]

    2002-02-01

    A map-based approach has allowed scientists to discover few genes at a time. In addition, the reproductive barrier between cultivated rice and wild relatives has prevented us from utilizing the germ plasm by a map-based approach. Most genetic traits important to agriculture or human diseases are manifested as observable, quantitative phenotypes called Quantitative Trait Loci (QTL). In many instances, the complexity of the phenotype/genotype interaction and the general lack of clearly identifiable gene products render the direct molecular cloning approach ineffective, thus additional strategies like genome mapping are required to identify the QTL in question. Genome mapping requires no prior knowledge of the gene function, but utilizes statistical methods to identify the most likely gene location. To completely characterize genes of interest, the initially mapped region of a gene location will have to be narrowed down to a size that is suitable for cloning and sequencing. Strategies for gene identification within the critical region have to be applied after the sequencing of a potentially large clone or set of clones that contains this gene(s). Tremendous success of positional cloning has been shown for cloning many genes responsible for human diseases, including cystic fibrosis and muscular dystrophy as well as plant disease resistance genes. Genome and QTL mapping, positional cloning: the pre-genomics era, comparative approaches to gene identification, and positional cloning: the genomics era are discussed in the report. (M. Suetake)

  19. Discovering genes underlying QTL

    International Nuclear Information System (INIS)

    Vanavichit, Apichart

    2002-01-01

    A map-based approach has allowed scientists to discover few genes at a time. In addition, the reproductive barrier between cultivated rice and wild relatives has prevented us from utilizing the germ plasm by a map-based approach. Most genetic traits important to agriculture or human diseases are manifested as observable, quantitative phenotypes called Quantitative Trait Loci (QTL). In many instances, the complexity of the phenotype/genotype interaction and the general lack of clearly identifiable gene products render the direct molecular cloning approach ineffective, thus additional strategies like genome mapping are required to identify the QTL in question. Genome mapping requires no prior knowledge of the gene function, but utilizes statistical methods to identify the most likely gene location. To completely characterize genes of interest, the initially mapped region of a gene location will have to be narrowed down to a size that is suitable for cloning and sequencing. Strategies for gene identification within the critical region have to be applied after the sequencing of a potentially large clone or set of clones that contains this gene(s). Tremendous success of positional cloning has been shown for cloning many genes responsible for human diseases, including cystic fibrosis and muscular dystrophy as well as plant disease resistance genes. Genome and QTL mapping, positional cloning: the pre-genomics era, comparative approaches to gene identification, and positional cloning: the genomics era are discussed in the report. (M. Suetake)

  20. Gene Therapy

    Science.gov (United States)

    Gene therapy Overview Gene therapy involves altering the genes inside your body's cells in an effort to treat or stop disease. Genes contain your ... that don't work properly can cause disease. Gene therapy replaces a faulty gene or adds a new ...

  1. Invariant sets for Windows

    CERN Document Server

    Morozov, Albert D; Dragunov, Timothy N; Malysheva, Olga V

    1999-01-01

    This book deals with the visualization and exploration of invariant sets (fractals, strange attractors, resonance structures, patterns etc.) for various kinds of nonlinear dynamical systems. The authors have created a special Windows 95 application called WInSet, which allows one to visualize the invariant sets. A WInSet installation disk is enclosed with the book. The book consists of two parts. Part I contains a description of WInSet and a list of the built-in invariant sets which can be plotted using the program. This part is intended for a wide audience with interests ranging from dynamical

  2. Hierarchical sets: analyzing pangenome structure through scalable set visualizations

    Science.gov (United States)

    2017-01-01

    Abstract Motivation: The increase in available microbial genome sequences has resulted in an increase in the size of the pangenomes being analyzed. Current pangenome visualizations are not intended for the pangenome sizes possible today and new approaches are necessary in order to convert the increase in available information to increase in knowledge. As the pangenome data structure is essentially a collection of sets we explore the potential for scalable set visualization as a tool for pangenome analysis. Results: We present a new hierarchical clustering algorithm based on set arithmetics that optimizes the intersection sizes along the branches. The intersection and union sizes along the hierarchy are visualized using a composite dendrogram and icicle plot, which, in pangenome context, shows the evolution of pangenome and core size along the evolutionary hierarchy. Outlying elements, i.e. elements whose presence pattern does not correspond with the hierarchy, can be visualized using hierarchical edge bundles. When applied to pangenome data this plot shows putative horizontal gene transfers between the genomes and can highlight relationships between genomes that are not represented by the hierarchy. We illustrate the utility of hierarchical sets by applying it to a pangenome based on 113 Escherichia and Shigella genomes and find it provides a powerful addition to pangenome analysis. Availability and Implementation: The described clustering algorithm and visualizations are implemented in the hierarchicalSets R package available from CRAN (https://cran.r-project.org/web/packages/hierarchicalSets) Contact: thomasp85@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28130242
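    To make the "collection of sets" view concrete, the sketch below tracks core size (intersection) and pangenome size (union) while genomes are merged. It is a simplified greedy illustration written for this listing and is not the algorithm implemented in the hierarchicalSets package; the function name `greedy_merge_order` and the toy genomes are hypothetical.

```python
from itertools import combinations


def greedy_merge_order(genomes):
    """Repeatedly merge the two clusters that share the largest core.

    genomes maps a name to a set of gene identifiers. Returns a list of
    (cluster_name, core_size, pan_size) tuples, one per merge.
    This greedy rule is an illustrative assumption only.
    """
    # Each cluster stores (core, pan): genes shared by all members, genes in any member.
    clusters = {name: (frozenset(g), frozenset(g)) for name, g in genomes.items()}
    history = []
    while len(clusters) > 1:
        a, b = max(
            combinations(sorted(clusters), 2),
            key=lambda pair: len(clusters[pair[0]][0] & clusters[pair[1]][0]),
        )
        core = clusters[a][0] & clusters[b][0]
        pan = clusters[a][1] | clusters[b][1]
        del clusters[a], clusters[b]
        merged = f"({a},{b})"
        clusters[merged] = (core, pan)
        history.append((merged, len(core), len(pan)))
    return history


# Hypothetical toy pangenome with three genomes.
toy = {
    "A": {"g1", "g2", "g3", "g4"},
    "B": {"g1", "g2", "g3", "g5"},
    "C": {"g1", "g6"},
}
for step in greedy_merge_order(toy):
    print(step)
```

    Moving up the hierarchy, the core can only shrink and the pangenome can only grow, which is the behaviour the composite dendrogram and icicle plot are described as showing.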

  3. Fusion of NUP98 and the SET binding protein 1 (SETBP1) gene in a paediatric acute T cell lymphoblastic leukaemia with t(11;18)(p15;q12).

    Science.gov (United States)

    Panagopoulos, Ioannis; Kerndrup, Gitte; Carlsen, Niels; Strömbeck, Bodil; Isaksson, Margareth; Johansson, Bertil

    2007-01-01

    Three NUP98 chimaeras have previously been reported in T cell acute lymphoblastic leukaemia (T-ALL): NUP98/ADD3, NUP98/CCDC28A, and NUP98/RAP1GDS1. We report a T-ALL with t(11;18)(p15;q12) resulting in a novel NUP98 fusion. Fluorescent in situ hybridisation showed NUP98 and SET binding protein 1 (SETBP1) fusion signals; other analyses showed that exon 12 of NUP98 was fused in-frame with exon 5 of SETBP1. Nested polymerase chain reaction did not amplify the reciprocal SETBP1/NUP98, suggesting that NUP98/SETBP1 transcript is pathogenetically important. SETBP1 has previously not been implicated in leukaemias; however, it encodes a protein that specifically interacts with SET, fused to NUP214 in a case of acute undifferentiated leukaemia.

  4. Initial description of primate-specific cystine-knot Prometheus genes and differential gene expansions of D-dopachrome tautomerase genes.

    Science.gov (United States)

    Premzl, Marko

    2015-06-01

    Using the eutherian comparative genomic analysis protocol and public genomic sequence data sets, the present work attempted to update and revise two gene data sets. The most comprehensive third party annotation gene data sets of eutherian adenohypophysis cystine-knot genes (128 complete coding sequences), and D-dopachrome tautomerases and macrophage migration inhibitory factor genes (30 complete coding sequences) were annotated. For example, the present study first described primate-specific cystine-knot Prometheus genes, as well as differential gene expansions of D-dopachrome tautomerase genes. Furthermore, new frameworks of future experiments of two eutherian gene data sets were proposed.

  5. Value Set Authority Center

    Data.gov (United States)

    U.S. Department of Health & Human Services — The VSAC provides downloadable access to all official versions of vocabulary value sets contained in the 2014 Clinical Quality Measures (CQMs). Each value set...

  6. Alternate superior Julia sets

    International Nuclear Information System (INIS)

    Yadav, Anju; Rani, Mamta

    2015-01-01

    Alternate Julia sets have been studied in Picard iterative procedures. The purpose of this paper is to study the quadratic and cubic maps using superior iterates to obtain Julia sets with different alternate structures. Analytically, graphically and computationally it has been shown that alternate superior Julia sets can be connected, disconnected and totally disconnected, and also fatter than the corresponding alternate Julia sets. A few examples have been studied by applying different types of alternate structures.
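    A rough escape-time sketch may help picture what is being iterated. It assumes, as is common in this literature but not stated in the abstract, that "superior iterates" means the Mann iteration z_{n+1} = (1 - s) z_n + s f(z_n) with 0 < s <= 1, and that "alternate" means switching between two quadratic maps z^2 + c1 and z^2 + c2 on even and odd steps. The bailout radius and iteration cap are arbitrary illustrative choices, not the paper's escape criterion.

```python
def alternate_superior_escape(z0, c1, c2, s=0.5, max_iter=200, bailout=1e6):
    """Escape-time test for an alternated quadratic map under Mann iteration.

    Returns the step at which |z| first exceeds bailout, or None if the
    orbit stays within bailout for max_iter steps. With s = 1 the update
    reduces to the ordinary (Picard) iteration.
    """
    z = complex(z0)
    for n in range(max_iter):
        c = c1 if n % 2 == 0 else c2       # alternate between the two maps
        z = (1 - s) * z + s * (z * z + c)  # Mann ("superior") update
        if abs(z) > bailout:
            return n + 1
    return None


# The origin is a fixed point when c1 = c2 = 0, so the orbit never escapes.
print(alternate_superior_escape(0, 0, 0))
# A starting point far from the origin escapes within a few steps.
print(alternate_superior_escape(10, 0, 0))
```

    Colouring a grid of starting points by the returned step count would give a picture of the filled set under these assumptions.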

  7. Sets, Planets, and Comets

    Science.gov (United States)

    Baker, Mark; Beltran, Jane; Buell, Jason; Conrey, Brian; Davis, Tom; Donaldson, Brianna; Detorre-Ozeki, Jeanne; Dibble, Leila; Freeman, Tom; Hammie, Robert; Montgomery, Julie; Pickford, Avery; Wong, Justine

    2013-01-01

    Sets in the game "Set" are lines in a certain four-dimensional space. Here we introduce planes into the game, leading to interesting mathematical questions, some of which we solve, and to a wonderful variation on the game "Set," in which every tableau of nine cards must contain at least one configuration for a player to pick up.
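    The "lines in a four-dimensional space" statement can be checked directly. In the sketch below each card is encoded as a vector in (Z/3Z)^4, one coordinate per attribute; the encoding and the function names are illustrative choices, but the underlying fact is standard: three cards form a set exactly when their coordinate-wise sum is 0 modulo 3.

```python
from itertools import combinations, product


def is_set(a, b, c):
    """Three cards form a set iff each attribute is all-same or all-different,
    which is equivalent to the coordinate-wise sum being 0 mod 3."""
    return all((x + y + z) % 3 == 0 for x, y, z in zip(a, b, c))


def third_card(a, b):
    """The unique card completing a set (the third point on the line through a and b)."""
    return tuple((-x - y) % 3 for x, y in zip(a, b))


# Full deck: every vector with four coordinates in {0, 1, 2}.
deck = list(product(range(3), repeat=4))
print(len(deck))

# Count all sets by brute force over triples of cards.
num_sets = sum(1 for triple in combinations(deck, 3) if is_set(*triple))
print(num_sets)
```

    Since any two cards determine the third, the number of sets is 81 * 80 / 6, which the brute-force count reproduces.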

  8. Axiomatic set theory

    CERN Document Server

    Suppes, Patrick

    1972-01-01

    This clear and well-developed approach to axiomatic set theory is geared toward upper-level undergraduates and graduate students. It examines the basic paradoxes and history of set theory and advanced topics such as relations and functions, equipollence, finite sets and cardinal numbers, rational and real numbers, and other subjects. 1960 edition.

  9. Paired fuzzy sets

    DEFF Research Database (Denmark)

    Rodríguez, J. Tinguaro; Franco de los Ríos, Camilo; Gómez, Daniel

    2015-01-01

    In this paper we want to stress the relevance of paired fuzzy sets, as already proposed in previous works of the authors, as a family of fuzzy sets that offers a unifying view for different models based upon the opposition of two fuzzy sets, simply allowing the existence of different types...

  10. Elements of set theory

    CERN Document Server

    Enderton, Herbert B

    1977-01-01

    This is an introductory undergraduate textbook in set theory. In mathematics these days, essentially everything is a set. Some knowledge of set theory is a necessary part of the background everyone needs for further study of mathematics. It is also possible to study set theory for its own interest--it is a subject with intriguing results about simple objects. This book starts with material that nobody can do without. There is no end to what can be learned of set theory, but here is a beginning.

  11. Metric graphic sets

    Science.gov (United States)

    Garces, I. J. L.; Rosario, J. B.

    2017-10-01

    For an ordered subset W = {w_1, w_2, …, w_k} of vertices in a connected graph G and a vertex v of G, the metric representation of v with respect to W is the k-vector r(v|W) = (d(v, w_1), d(v, w_2), …, d(v, w_k)), where d(v, w_i) is the distance between the vertices v and w_i in G. The set W is called a resolving set of G if r(u|W) = r(v|W) implies u = v. The metric dimension of G, denoted by β(G), is the minimum cardinality of a resolving set of G, and a resolving set of G with cardinality equal to its metric dimension is called a metric basis of G. A set T of vectors is called a positive lattice set if all the coordinates in each vector of T are positive integers. A positive lattice set T consisting of n k-vectors is called a metric graphic set if there exists a simple connected graph G of order n + k with β(G) = k such that T = {r(u_i|S) : u_i ∈ V(G)\S, 1 ≤ i ≤ n} for some metric basis S = {s_1, s_2, …, s_k} of G. If such G exists, then we say G is a metric graphic realization of T. In this paper, we introduce the concept of metric graphic sets anchored on the concept of metric dimension and provide some characterizations. We also give necessary and sufficient conditions for any positive lattice set consisting of two k-vectors to be a metric graphic set. We provide an upper bound for the sum of all the coordinates of any metric graphic set and enumerate some properties of positive lattice sets consisting of n 2-vectors that are not metric graphic sets.
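    The definitions of metric representation, resolving set, and metric dimension translate directly into a brute-force computation. The sketch below is only an illustration of those definitions for small connected graphs given as adjacency dictionaries; the function names are hypothetical and the exhaustive search is exponential in the number of vertices.

```python
from collections import deque
from itertools import combinations


def bfs_distances(graph, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def metric_representation(graph, v, W):
    """r(v|W): the tuple of distances from v to each landmark in the ordered set W."""
    return tuple(bfs_distances(graph, w)[v] for w in W)


def is_resolving(graph, W):
    """W resolves G when all vertices have pairwise distinct representations."""
    reps = [metric_representation(graph, v, W) for v in graph]
    return len(set(reps)) == len(reps)


def metric_dimension(graph):
    """Smallest k with a resolving set of size k, plus one such metric basis.

    Assumes the graph is connected.
    """
    nodes = sorted(graph)
    for k in range(1, len(nodes) + 1):
        for W in combinations(nodes, k):
            if is_resolving(graph, W):
                return k, W


path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(metric_dimension(path4))
print(metric_dimension(cycle4))
```

    For the 4-cycle with basis S = (0, 1), the representations of the two remaining vertices are (2, 1) and (1, 2), which is the kind of positive lattice set the abstract calls metric graphic.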

  12. Acronical Risings and Settings

    Science.gov (United States)

    Hockey, Thomas A.

    2012-01-01

    A concept found in historical primary sources, and useful in contemporary historiography, is the acronical rising and setting of stars (or planets). Topocentric terms, they provide information about a star's relationship to the Sun and thus its visibility in the sky. Yet there remains ambiguity as to what these two phrases actually mean. "Acronical" is said to have come from the Greek akros ("point," "summit," or "extremity") and nux ("night"). While all sources agree that the word is originally Greek, there are alternate etymologies for it. A more serious difficulty with acronical rising and setting is that there are two competing definitions. One I call the Poetical Definition. Acronical rising (or setting) is one of the three Poetical Risings (or Settings) known to classicists. (The other two are cosmical rising/setting, discussed below, and the more familiar heliacal rising/setting.) The term "poetical" refers to these words' use in classical poetry, e.g., that of Columella, Hesiod, Ovid, Pliny the Younger, and Virgil. The Poetical Definition of "acronical" usually is meant in this context. The Poetical Definition of "acronical" is as follows: When a star rises as the Sun sets, it rises acronically. When a star sets as the Sun sets, it sets acronically. In contrast with the Poetical Definition, there also is what I call the Astronomical Definition. The Astronomical Definition is somewhat more likely to appear in astronomical, mathematical, or navigational works. When the Astronomical Definition is recorded in dictionaries, it is often with the protasis "In astronomy, . . . ." The Astronomical Definition of "acronical" is as follows: When a star rises as the Sun sets, it rises acronically. When a star sets as the Sun rises, it sets acronically. I will attempt to sort this all out in my talk.

  13. Fusion of NUP98 and the SET binding protein 1 (SETBP1) gene in a paediatric acute T cell lymphoblastic leukaemia with t(11;18)(p15;q12)

    DEFF Research Database (Denmark)

    Panagopoulos, Ioannis; Kerndrup, Gitte; Carlsen, Niels

    2007-01-01

    Three NUP98 chimaeras have previously been reported in T cell acute lymphoblastic leukaemia (T-ALL): NUP98/ADD3, NUP98/CCDC28A, and NUP98/RAP1GDS1. We report a T-ALL with t(11;18)(p15;q12) resulting in a novel NUP98 fusion. Fluorescent in situ hybridisation showed NUP98 and SET binding protein 1......(SETBP1) fusion signals; other analyses showed that exon 12 of NUP98 was fused in-frame with exon 5 of SETBP1. Nested polymerase chain reaction did not amplify the reciprocal SETBP1/NUP98, suggesting that NUP98/SETBP1 transcript is pathogenetically important. SETBP1 has previously not been implicated...

  14. Social Set Analysis

    DEFF Research Database (Denmark)

    Vatrapu, Ravi; Mukkamala, Raghava Rao; Hussain, Abid

    2016-01-01

    Current analytical approaches in computational social science can be characterized by four dominant paradigms: text analysis (information extraction and classification), social network analysis (graph theory), social complexity analysis (complex systems science), and social simulations (cellular...... this limitation, based on the sociology of associations and the mathematics of set theory, this paper presents a new approach to big data analytics called social set analysis. Social set analysis consists of a generative framework for the philosophies of computational social science, theory of social data...... analysis, crisp set-theoretical interaction analysis, and event-studies-oriented set-theoretical visualizations. Implications for big data analytics, current limitations of the set-theoretical approach, and future directions are outlined....

  15. Settings for Suicide Prevention

    Science.gov (United States)

    ... Settings Behavioral Health Care Inpatient Mental Health Outpatient Mental Health Substance Abuse Treatment ... Emergency Departments Primary Care Justice System Adult Justice System Juvenile Justice ...

  16. gene structure, gene expression

    Indian Academy of Sciences (India)

    and seedling leaves were sampled at 6 h after the treatment. For cold stress, the seedlings were transferred to 4◦C growth chamber for 30 min. Control seedlings were exposed to none of these treatments. To examine the expression patterns of these predicted genes in Poplar and to further confirm their stress responsive-.

  17. Profiling of the toxicity mechanisms of coated and uncoated silver nanoparticles to yeast Saccharomyces cerevisiae BY4741 using a set of its 9 single-gene deletion mutants defective in oxidative stress response, cell wall or membrane integrity and endocytosis.

    Science.gov (United States)

    Käosaar, Sandra; Kahru, Anne; Mantecca, Paride; Kasemets, Kaja

    2016-09-01

    The widespread use of nanosilver in various antibacterial, antifungal, and antiviral products warrants the studies of the toxicity pathways of nanosilver-enabled materials toward microbes and viruses. We profiled the toxicity mechanisms of uncoated, casein-coated, and polyvinylpyrrolidone-coated silver nanoparticles (AgNPs) using Saccharomyces cerevisiae wild-type (wt) and its 9 single-gene deletion mutants defective in oxidative stress (OS) defense, cell wall/membrane integrity, and endocytosis. The 48-h growth inhibition assay in organic-rich growth medium and 24-h cell viability assay in deionized (DI) water were applied, with AgNO3, H2O2, and SDS serving as positive controls. Both coated AgNPs (primary size 8-12 nm) were significantly more toxic than the uncoated (~85 nm) AgNPs. All studied AgNPs were ~30 times more toxic if exposed to yeast cells in DI water than in the rich growth medium: the IC50 based on nominal concentration of AgNPs in the growth inhibition test ranged from 77 to 576 mg Ag/L and in the cell viability test from 2.7 to 18.7 mg Ag/L, respectively. Confocal microscopy showed that wt but not endocytosis mutant (end3Δ) internalized AgNPs. Comparison of toxicity patterns of wt and mutant strains defective in OS defense and membrane integrity revealed that the toxicity of the studied AgNPs to S. cerevisiae was not caused by the OS or cell wall/membrane permeabilization. Copyright © 2016 Elsevier Ltd. All rights reserved.

  18. Archaeological predictive model set.

    Science.gov (United States)

    2015-03-01

    This report is the documentation for Task 7 of the Statewide Archaeological Predictive Model Set. The goal of this project is to develop a set of statewide predictive models to assist the planning of transportation projects. PennDOT is developing t...

  19. The Model Confidence Set

    DEFF Research Database (Denmark)

    Hansen, Peter Reinhard; Lunde, Asger; Nason, James M.

    The paper introduces the model confidence set (MCS) and applies it to the selection of models. A MCS is a set of models that is constructed such that it will contain the best model with a given level of confidence. The MCS is in this sense analogous to a confidence interval for a parameter. The MCS...

  20. "Ready, Set, FLOW!"

    Science.gov (United States)

    Stroud, Wesley

    2018-01-01

    All educators want their classrooms to be inviting areas that support investigations. However, a common mistake is to fill learning spaces with items or objects that are set up by the teacher or are simply "for show." This type of setting, although it may create a comfortable space for students, fails to stimulate investigations and…

  1. Haar meager sets revisited

    Czech Academy of Sciences Publication Activity Database

    Doležal, Martin; Rmoutil, M.; Vejnar, B.; Vlasák, V.

    2016-01-01

    Vol. 440, No. 2 (2016), pp. 922-939 ISSN 0022-247X Institutional support: RVO:67985840 Keywords: Haar meager set * Haar null set * Polish group Subject RIV: BA - General Mathematics Impact factor: 1.064, year: 2016 http://www.sciencedirect.com/science/article/pii/S0022247X1600305X

  2. Pseudo-set framing.

    Science.gov (United States)

    Barasz, Kate; John, Leslie K; Keenan, Elizabeth A; Norton, Michael I

    2017-10-01

    Pseudo-set framing (arbitrarily grouping items or tasks together as part of an apparent "set") motivates people to reach perceived completion points. Pseudo-set framing changes gambling choices (Study 1), effort (Studies 2 and 3), giving behavior (Field Data and Study 4), and purchase decisions (Study 5). These effects persist in the absence of any reward, when a cost must be incurred, and after participants are explicitly informed of the arbitrariness of the set. Drawing on Gestalt psychology, we develop a conceptual account that predicts what will (and will not) act as a pseudo-set, and defines the psychological process through which these pseudo-sets affect behavior: over and above typical reference points, pseudo-set framing alters perceptions of (in)completeness, making intermediate progress seem less complete. In turn, these feelings of incompleteness motivate people to persist until the pseudo-set has been fulfilled. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  3. Descriptive set theory

    CERN Document Server

    Moschovakis, YN

    1987-01-01

    Now available in paperback, this monograph is a self-contained exposition of the main results and methods of descriptive set theory. It develops all the necessary background material from logic and recursion theory, and treats both classical descriptive set theory and the effective theory developed by logicians.

  4. Theory of random sets

    CERN Document Server

    Molchanov, Ilya

    2017-01-01

    This monograph, now in a thoroughly revised second edition, offers the latest research on random sets. It has been extended to include substantial developments achieved since 2005, some of them motivated by applications of random sets to econometrics and finance. The present volume builds on the foundations laid by Matheron and others, including the vast advances in stochastic geometry, probability theory, set-valued analysis, and statistical inference. It shows the various interdisciplinary relationships of random set theory within other parts of mathematics, and at the same time fixes terminology and notation that often vary in the literature, establishing it as a natural part of modern probability theory and providing a platform for future development. It is completely self-contained, systematic and exhaustive, with the full proofs that are necessary to gain insight. Aimed at research level, Theory of Random Sets will be an invaluable reference for probabilists; mathematicians working in convex and integ...

  5. Economic communication model set

    Science.gov (United States)

    Zvereva, Olga M.; Berg, Dmitry B.

    2017-06-01

    This paper details findings from research that investigates economic communications using agent-based models. A set of agent-based models was engineered to simulate economic communications. Money, in the form of internal and external currencies, was introduced into the models to support exchanges in communications. Every model is based on the same general concept but has its own peculiarities in algorithm and input data set, since each was engineered to solve a specific problem. Several data sets of different origin were used in the experiments: theoretical sets were estimated on the basis of the static Leontief equilibrium equation, and the real set was constructed from statistical data. During the simulation experiments, the communication process was observed in dynamics and system macroparameters were estimated. The research confirmed that combining an agent-based model with a mathematical model can produce a synergetic effect.
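    For readers unfamiliar with the Leontief equation mentioned above, the sketch below assumes it refers to the standard static open input-output model, x = A x + d, so that gross output is x = (I - A)^(-1) d. The abstract does not give the authors' formulation or data; the two-sector coefficient matrix and demand vector here are hypothetical, and the small Gaussian-elimination helper is included only to keep the example free of third-party dependencies.

```python
def solve_linear(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def leontief_output(A, d):
    """Gross output x satisfying x = A x + d, i.e. (I - A) x = d."""
    n = len(d)
    i_minus_a = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)]
                 for i in range(n)]
    return solve_linear(i_minus_a, d)


# Hypothetical two-sector economy: A[i][j] is the input from sector i
# needed per unit of output of sector j; d is final demand.
A = [[0.2, 0.3],
     [0.4, 0.1]]
d = [100.0, 50.0]
x = leontief_output(A, d)
print([round(v, 2) for v in x])
```

    Substituting the result back, 0.8 * x[0] - 0.3 * x[1] recovers the final demand of the first sector, which is a quick consistency check on the equilibrium.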

  6. Evidence for homosexuality gene

    Energy Technology Data Exchange (ETDEWEB)

    Pool, R.

    1993-07-16

    A genetic analysis of 40 pairs of homosexual brothers has uncovered a region on the X chromosome that appears to contain a gene or genes for homosexuality. When analyzing the pedigrees of homosexual males, the researchers found evidence that the trait has a higher likelihood of being passed through maternal genes. This led them to search the X chromosome for genes predisposing to homosexuality. The researchers examined the X chromosomes of pairs of homosexual brothers for regions of DNA that most or all had in common. Of the 40 sets of brothers, 33 shared a set of five markers in the q28 region of the long arm of the X chromosome. The linkage has a LOD score of 4.0, which translates into a 99.5% certainty that there is a gene or genes in this area that predispose males to homosexuality. The chief researcher warns, however, that this one site cannot explain all instances of homosexuality, since there were some cases where the trait seemed to be passed paternally. And even among those brothers where there was no evidence that the trait was passed paternally, seven sets of brothers did not share the Xq28 markers. It seems likely that homosexuality arises from a variety of causes.

  7. Analysis of the real EADGENE data set:

    DEFF Research Database (Denmark)

    Jaffrézic, Florence; de Koning, Dirk-Jan; Boettcher, Paul J

    2007-01-01

    different mastitis causing bacteria: Escherichia coli and Staphylococcus aureus. It was reassuring to see that most of the teams found the same main biological results. In fact, most of the differentially expressed genes were found for infection by E. coli between uninfected and 24 h challenged udder...... methods for the detection of differentially expressed genes in order to provide some more general data analysis guidelines. All the workshop participants were given a real data set obtained in an EADGENE funded microarray study looking at the gene expression changes following artificial infection with two...... quarters. Very little transcriptional variation was observed for the bacteria S. aureus. Lists of differentially expressed genes found by the different research teams were, however, quite dependent on the method used, especially concerning the data quality control step. These analyses also emphasised...

  8. Stationary Markov Sets.

    Science.gov (United States)

    1986-04-01

    ... (2.10) ... (2.11) In particular, when Π is an infinite measure ... limits of regenerative sets. Z. Wahrscheinlichkeitstheorie verw. Gebiete 70, 157-173 (1985). 4. Hoffmann-Jørgensen, J.; Markov sets. Math. Scand. 24 ... (1969). 5. Krylov, N.V., Yushkevich, A.A.; Markov random sets. Trans. Mosc. Math. Soc. 13, 127-153 (1965). 6. Maisonneuve, B.; Ensembles

  9. Set theory essentials

    CERN Document Server

    Milewski, Emil G

    2012-01-01

    REA's Essentials provide quick and easy access to critical information in a variety of different fields, ranging from the most basic to the most advanced. As its name implies, these concise, comprehensive study guides summarize the essentials of the field covered. Essentials are helpful when preparing for exams, doing homework and will remain a lasting reference source for students, teachers, and professionals. Set Theory includes elementary logic, sets, relations, functions, denumerable and non-denumerable sets, cardinal numbers, Cantor's theorem, axiom of choice, and order relations.

  10. Symmetry Adapted Basis Sets

    DEFF Research Database (Denmark)

    Avery, John Scales; Rettrup, Sten; Avery, James Emil

    In theoretical physics, theoretical chemistry and engineering, one often wishes to solve partial differential equations subject to a set of boundary conditions. This gives rise to eigenvalue problems of which some solutions may be very difficult to find. For example, the problem of finding...... in such problems can be much reduced by making use of symmetry-adapted basis functions. The conventional method for generating symmetry-adapted basis sets is through the application of group theory, but this can be difficult. This book describes an easier method for generating symmetry-adapted basis sets...

  11. Basic set theory

    CERN Document Server

    Levy, Azriel

    2002-01-01

    An advanced-level treatment of the basics of set theory, this text offers students a firm foundation, stopping just short of the areas employing model-theoretic methods. Geared toward upper-level undergraduate and graduate students, it consists of two parts: the first covers pure set theory, including the basic notions, order and well-foundedness, cardinal numbers, the ordinals, and the axiom of choice and some of its consequences; the second deals with applications and advanced topics such as point set topology, real spaces, Boolean algebras, and infinite combinatorics and large cardinals. An

  12. Setting the Minimum Wage

    OpenAIRE

    Boeri, Tito

    2009-01-01

    The process leading to the setting of the minimum wage has so far been largely overlooked by economists. This paper suggests that this is a serious limitation, as the setting regime helps to explain cross-country variation in the fine-tuning of the minimum wage, and hence in the way in which the trade-off between reducing poverty among working people and shutting down low-productivity jobs is addressed. There are two common ways of setting national minimum wages: they are either government le...

  13. Combinatorics of set partitions

    CERN Document Server

    Mansour, Toufik

    2012-01-01

    Focusing on a very active area of mathematical research in the last decade, Combinatorics of Set Partitions presents methods used in the combinatorics of pattern avoidance and pattern enumeration in set partitions. Designed for students and researchers in discrete mathematics, the book is a one-stop reference on the results and research activities of set partitions from 1500 A.D. to today. Each chapter gives historical perspectives and contrasts different approaches, including generating functions, kernel method, block decomposition method, generating tree, and Wilf equivalences. Methods and d

  14. Lebesgue Sets Immeasurable Existence

    Directory of Open Access Journals (Sweden)

    Diana Marginean Petrovai

    2012-12-01

    It is well known that the notions of measure and integral arose early, in close connection with practical problems of measuring geometric figures. The notion of measure was outlined in the early 20th century through the research of H. Lebesgue, founder of the modern theory of measure and integral. A technique for integrating functions was developed concurrently. Gradually, a specific area formed, today called the theory of measure and integral. Essential contributions to building this theory were made by a large number of mathematicians: C. Carathéodory, J. Radon, O. Nikodym, S. Bochner, J. Pettis, P. Halmos and many others. In the following we present several abstract sets and classes of sets. There exist sets which are not Lebesgue measurable, and sets which are Lebesgue measurable but not Borel measurable. Hence B ⊂ L ⊂ P(X).

  15. General Paleoclimatology Data Sets

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Data of past climate and environment derived from unusual proxy evidence. Parameter keywords describe what was measured in this data set. Additional summary...

  16. HEDIS Limited Data Set

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Healthcare Effectiveness Data and Information Set (HEDIS) is a tool used by more than 90 percent of America's health plans to measure performance on important...

  17. Norovirus in Healthcare Settings

    Science.gov (United States)

    ... MRSA Mycobacterium abscessus Norovirus Pseudomonas aeruginosa Tracking CRPA Staphylococcus aureus Tuberculosis VISA / VRSA Vancomycin-resistant Enterococci (VRE) in Healthcare Settings Preventing HAIs Targeted Assessment for Prevention (TAP) TAP CAUTI Implementation Guide TAP CDI Implementation ...

  18. Set theory and physics

    Energy Technology Data Exchange (ETDEWEB)

    Svozil, K. [Univ. of Technology, Vienna (Austria)

    1995-11-01

    Inasmuch as physical theories are formalizable, set theory provides a framework for theoretical physics. Four speculations about the relevance of set theoretical modeling for physics are presented: the role of transcendental set theory (i) in chaos theory, (ii) for paradoxical decompositions of solid three-dimensional objects, (iii) in the theory of effective computability (Church-Turing thesis) related to the possible "solution of supertasks," and (iv) for weak solutions. Several approaches to set theory and their advantages and disadvantages for physical applications are discussed: Cantorian "naive" (i.e., nonaxiomatic) set theory, constructivism, and operationalism. In the author's opinion, an attitude of "suspended attention" (a term borrowed from psychoanalysis) seems most promising for progress. Physical and set theoretical entities must be operationalized wherever possible. At the same time, physicists should be open to "bizarre" or "mindboggling" new formalisms, which need not be operationalizable or testable at the time of their creation, but which may successfully lead to novel fields of phenomenology and technology.

  19. Multicriteria identification sets method

    Science.gov (United States)

    Kamenev, G. K.

    2016-11-01

    A multicriteria identification and prediction method for mathematical models of simulation type in the case of several identification criteria (error functions) is proposed. The necessity of the multicriteria formulation arises, for example, when one needs to take into account errors of completely different origins (not reducible to a single characteristic) or when there is no information on the class of noise in the data to be analyzed. An identification sets method is described based on the approximation and visualization of the multidimensional graph of the identification error function and sets of suboptimal parameters. This method allows for additional advantages of the multicriteria approach, namely, the construction and visual analysis of the frontier and the effective identification set (frontier and the Pareto set for identification criteria), various representations of the sets of Pareto effective and subeffective parameter combinations, and the corresponding predictive trajectory tubes. The approximation is based on the deep holes method, which yields metric ɛ-coverings with nearly optimal properties, and on multiphase approximation methods for the Edgeworth-Pareto hull. The visualization relies on the approach of interactive decision maps. With the use of the multicriteria method, multiple-choice solutions of identification and prediction problems can be produced and justified by analyzing the stability of the optimal solution not only with respect to the parameters (robustness with respect to data) but also with respect to the chosen set of identification criteria (robustness with respect to the given collection of functionals).
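
    The Pareto set for several identification criteria mentioned above can be illustrated with a minimal sketch. It assumes that every criterion is an error to be minimized and that candidate parameter sets have already been scored. The helper names and the toy error values are hypothetical, and the brute-force filter below stands in for, rather than reproduces, the deep holes and Edgeworth-Pareto hull approximations described in the abstract.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every criterion and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_set(errors: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, e in enumerate(errors)
        if not any(dominates(o, e) for j, o in enumerate(errors) if j != i)
    ]


# Hypothetical values of two error functions for five candidate parameter sets.
candidate_errors = [(0.10, 0.90), (0.20, 0.40), (0.30, 0.50), (0.60, 0.10), (0.70, 0.70)]
front = pareto_set(candidate_errors)
print(front)  # indices of the Pareto-effective candidates
```

    Candidates outside the returned list are worse than some other candidate on every criterion at once, so only the listed ones need to be compared when choosing a trade-off.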

  20. Radionuclide reporter gene imaging for cardiac gene therapy

    International Nuclear Information System (INIS)

    Inubushi, Masayuki; Tamaki, Nagara

    2007-01-01

    In the field of cardiac gene therapy, angiogenic gene therapy has been most extensively investigated. The first clinical trial of cardiac angiogenic gene therapy was reported in 1998, and at the peak, more than 20 clinical trial protocols were under evaluation. However, most trials have ceased owing to the lack of decisive proof of therapeutic effects and the potential risks of viral vectors. In order to further advance cardiac angiogenic gene therapy, remaining open issues need to be resolved: there needs to be improvement of gene transfer methods, regulation of gene expression, development of much safer vectors and optimisation of therapeutic genes. For these purposes, imaging of gene expression in living organisms is of great importance. In radionuclide reporter gene imaging, ''reporter genes'' transferred into cell nuclei encode for a protein that retains a complementary ''reporter probe'' of a positron or single-photon emitter; thus expression of the reporter genes can be imaged with positron emission tomography or single-photon emission computed tomography. Accordingly, in the setting of gene therapy, the location, magnitude and duration of the therapeutic gene co-expression with the reporter genes can be monitored non-invasively. In the near future, gene therapy may evolve into combination therapy with stem/progenitor cell transplantation, so-called cell-based gene therapy or gene-modified cell therapy. Radionuclide reporter gene imaging is now expected to contribute in providing evidence on the usefulness of this novel therapeutic approach, as well as in investigating the molecular mechanisms underlying neovascularisation and safety issues relevant to further progress in conventional gene therapy. (orig.)

  1. The gentle art of gene arrangement: the meaning of gene clusters

    Science.gov (United States)

    Trowsdale, John

    2002-01-01

    Genome sequence comparisons reveal that some sets of genes are in similar linkage groups in different organisms while other sets are dispersed. Are some linkage groups maintained by chance, or is there an advantage to such an arrangement? Some insights may come from large clusters of genes, such as the major histocompatibility complex which includes many genes involved in immune defense. PMID:11897017

  2. The gentle art of gene arrangement: the meaning of gene clusters

    OpenAIRE

    Trowsdale, John

    2002-01-01

    Genome sequence comparisons reveal that some sets of genes are in similar linkage groups in different organisms while other sets are dispersed. Are some linkage groups maintained by chance, or is there an advantage to such an arrangement? Some insights may come from large clusters of genes, such as the major histocompatibility complex which includes many genes involved in immune defense.

  3. State-set branching

    DEFF Research Database (Denmark)

    Jensen, Rune Møller; Veloso, Manuela M.; Bryant, Randal E.

    2008-01-01

    In this article, we present a framework called state-set branching that combines symbolic search based on reduced ordered Binary Decision Diagrams (BDDs) with best-first search, such as A* and greedy best-first search. The framework relies on an extension of these algorithms from expanding a single...... state in each iteration to expanding a set of states. We prove that it is generally sound and optimal for two A* implementations and show how a new BDD technique called branching partitioning can be used to efficiently expand sets of states. The framework is general. It applies to any heuristic function...... framework. The algorithms outperform the ordinary A* algorithm in almost all domains. In addition, they can improve the complexity of A* exponentially and often dominate both A* and blind BDD-based search by several orders of magnitude. Moreover, they have substantially better performance than BDDA...

  4. Social Set Analysis

    DEFF Research Database (Denmark)

    Vatrapu, Ravi; Hussain, Abid; Buus Lassen, Niels

    2015-01-01

    This paper argues that the basic premise of Social Network Analysis (SNA) -- namely that social reality is constituted by dyadic relations and that social interactions are determined by structural properties of networks -- is neither necessary nor sufficient for Big Social Data analytics...... of Facebook or Twitter data. However, there exists no other holistic computational social science approach beyond the relational sociology and graph theory of SNA. To address this limitation, this paper presents an alternative holistic approach to Big Social Data analytics called Social Set Analysis (SSA......). Based on the sociology of associations and the mathematics of classical, fuzzy and rough set theories, this paper proposes a research program, the function of which is to design, develop and evaluate social set analytics in terms of fundamentally novel formal models, predictive methods and visual...

  5. Combinatorics of finite sets

    CERN Document Server

    Anderson, Ian

    2011-01-01

    Coherent treatment provides comprehensive view of basic methods and results of the combinatorial study of finite set systems. The Clements-Lindstrom extension of the Kruskal-Katona theorem to multisets is explored, as is the Greene-Kleitman result concerning k-saturated chain partitions of general partially ordered sets. Connections with Dilworth's theorem, the marriage problem, and probability are also discussed. Each chapter ends with a helpful series of exercises and outline solutions appear at the end. "An excellent text for a topics course in discrete mathematics." - Bulletin of the Ame

  6. Set theory and logic

    CERN Document Server

    Stoll, Robert R

    1979-01-01

    Set Theory and Logic is the result of a course of lectures for advanced undergraduates, developed at Oberlin College for the purpose of introducing students to the conceptual foundations of mathematics. Mathematics, specifically the real number system, is approached as a unity whose operations can be logically ordered through axioms. One of the most complex and essential of modern mathematical innovations, the theory of sets (crucial to quantum mechanics and other sciences), is introduced in a most careful concept manner, aiming for the maximum in clarity and stimulation for further study in

  7. Social Set Visualizer

    DEFF Research Database (Denmark)

    Flesch, Benjamin; Hussain, Abid; Vatrapu, Ravi

    2015-01-01

    This paper presents a state-of-the-art visual analytics dashboard, Social Set Visualizer (SoSeVi), of approximately 90 million Facebook actions from 11 different companies that have been mentioned in the traditional media in relation to garment factory accidents in Bangladesh. The enterprise...... application domain for the dashboard is Corporate Social Responsibility (CSR) and the targeted end-users are CSR researchers and practitioners. The design of the dashboard was based on the "social set analytics" approach to computational social science. The development of the dashboard involved cutting...

  8. Why quasi-sets?

    Directory of Open Access Journals (Sweden)

    Décio Krause

    2002-11-01

    Quasi-set theory was developed to deal with collections of indistinguishable objects. In standard mathematics, there are no such entities, for indistinguishability (agreement with respect to all properties) entails numerical identity. The main motivation underlying such a theory is of course quantum physics, for collections of indistinguishable ('identical' in the physicists' jargon) particles cannot be regarded as 'sets' of standard set theories, which are collections of distinguishable objects. In this paper, a rationale for the development of such a theory is presented, motivated by Heinz Post's claim that indistinguishability of quantum entities should be attributed 'right at the start'.

  9. Determining Semantically Related Significant Genes.

    Science.gov (United States)

    Taha, Kamal

    2014-01-01

    GO relation embodies some aspects of existence dependency. If GO term x is existence-dependent on GO term y, the presence of y implies the presence of x. Therefore, the genes annotated with the function of the GO term y are usually functionally and semantically related to the genes annotated with the function of the GO term x. A large number of gene set enrichment analysis methods have been developed in recent years for analyzing gene set enrichment. However, most of these methods overlook the structural dependencies between GO terms in the GO graph by not considering the concept of existence dependency. We propose in this paper a biological search engine called RSGSearch that identifies enriched sets of genes annotated with different functions using the concept of existence dependency. We observe that GO term x cannot be existence-dependent on GO term y if x and y have the same specificity (biological characteristics). After encoding into a numeric format the contributions of GO terms annotating target genes to the semantics of their lowest common ancestors (LCAs), RSGSearch uses microarray experiments to identify the most significant LCA that annotates the result genes. We evaluated RSGSearch experimentally and compared it with five gene set enrichment systems. Results showed marked improvement.
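
    The notion of a lowest common ancestor (LCA) in a term graph can be sketched as follows. This is only an illustration on a toy hierarchy: the term names are hypothetical rather than real GO identifiers, and the code shows the generic LCA idea in a directed acyclic graph, not RSGSearch's scoring or its treatment of existence dependency.

```python
from typing import Dict, List, Set

# Hypothetical toy hierarchy mapping each term to its parent terms.
PARENTS: Dict[str, List[str]] = {
    "root": [],
    "metabolism": ["root"],
    "signaling": ["root"],
    "lipid_metabolism": ["metabolism"],
    "sugar_metabolism": ["metabolism"],
}


def ancestors(term: str) -> Set[str]:
    """Return the term itself plus every term reachable through parent links."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def lowest_common_ancestors(a: str, b: str) -> Set[str]:
    """Common ancestors of a and b that are not ancestors of another common ancestor."""
    common = ancestors(a) & ancestors(b)
    return {t for t in common if not any(t != u and t in ancestors(u) for u in common)}


print(lowest_common_ancestors("lipid_metabolism", "sugar_metabolism"))
print(lowest_common_ancestors("lipid_metabolism", "signaling"))
```

    A set is returned because a term in a graph with multiple parents can have more than one lowest common ancestor with another term.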

  10. Building Temperature Set Point

    Energy Technology Data Exchange (ETDEWEB)

    Meincke, Carol L. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Evans, Christopher A. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2014-09-01

    This white paper provides information and recommendations for an actionable and enforceable corporate policy statement on temperature set points for office and related spaces at Sandia and presents a strategy that balances the need to achieve the energy goals with optimizing employee comfort and productivity.

  11. Therapists in Oncology Settings

    Science.gov (United States)

    Hendrick, Susan S.

    2013-01-01

    This article describes the author's experiences of working with cancer patients/survivors both individually and in support groups for many years, across several settings. It also documents current best-practice guidelines for the psychosocial treatment of cancer patients/survivors and their families. The author's view of the important qualities…

  12. Dynamics Of Causal Sets

    CERN Document Server

    Rideout, D P

    2001-01-01

    The Causal Set approach to quantum gravity asserts that spacetime, at its smallest length scale, has a discrete structure. This discrete structure takes the form of a locally finite order relation, where the order, corresponding with the macroscopic notion of spacetime causality, is taken to be a fundamental aspect of nature. After an introduction to the Causal Set approach, this thesis considers a simple toy dynamics for causal sets. Numerical simulations of the model provide evidence for the existence of a continuum limit. While studying this toy dynamics, a picture arises of how the dynamics can be generalized in such a way that the theory could hope to produce more physically realistic causal sets. By thinking in terms of a stochastic growth process, and positing some fundamental principles, we are led almost uniquely to a family of dynamical laws (stochastic processes) parameterized by a countable sequence of coupling constants. This result is quite promising in that we now know how to speak of dynamics ...

  13. SET-Routes programme

    CERN Multimedia

    Marietta Schupp, EMBL Photolab

    2008-01-01

    Dr Sabine Hentze, specialist in human genetics, giving an Insight Lecture entitled "Human Genetics – Diagnostics, Indications and Ethical Issues" on 23 September 2008 at EMBL Heidelberg. Activities in a school in Budapest during a visit of Angela Bekesi, Ambassador for the SET-Routes programme.

  14. Cobham recursive set functions

    Czech Academy of Sciences Publication Activity Database

    Beckmann, A.; Buss, S.; Friedman, S.-D.; Müller, M.; Thapen, Neil

    2016-01-01

    Roč. 167, č. 3 (2016), s. 335-369 ISSN 0168-0072 R&D Projects: GA ČR GBP202/12/G061 Institutional support: RVO:67985840 Keywords : set function * polynomial time * Cobham recursion Subject RIV: BA - General Mathematics Impact factor: 0.647, year: 2016 http://www.sciencedirect.com/science/article/pii/S0168007215001293

  15. The Crystal Set

    Science.gov (United States)

    Greenslade, Thomas B., Jr.

    2014-01-01

    In past issues of this journal, the late H. R. Crane wrote a long series of articles under the running title of "How Things Work." In them, Dick dealt with many questions that physics teachers asked themselves, but did not have the time to answer. This article is my attempt to work through the physics of the crystal set, which I thought…

  16. Using RNA-Seq data to select reference genes for normalizing gene expression in apple roots

    Science.gov (United States)

    Gene expression in apple roots in response to various stress conditions is a less-explored research subject. Reliable reference genes for normalizing quantitative gene expression data have not been carefully investigated. In this study, the suitability of a set of 15 apple genes was evaluated for t...

  17. Tuberculosis diagnosis in resource-limited settings: Clinical use of ...

    African Journals Online (AJOL)

    Tuberculosis diagnosis in resource-limited settings: Clinical use of GeneXpert in the diagnosis of smear-negative PTB: a case report. ... studies are needed to provide evidence to policy makers in order to improve access to GeneXpert. Key words: Tuberculosis; developing countries; molecular diagnostic techniques.

  18. Some remarks on good sets

    Indian Academy of Sciences (India)

    they are full, (2) loops correspond one-to-one to extreme points of a convex set. Some other properties of good sets are discussed. Keywords. Good set; full set; related component; loop; relatively full set. Introduction and preliminaries. In this note we make some remarks on good sets in n-fold Cartesian product as defined in.

  19. Frame scaling function sets and frame wavelet sets in Rd

    International Nuclear Information System (INIS)

    Liu Zhanwei; Hu Guoen; Wu Guochang

    2009-01-01

    In this paper, we classify frame wavelet sets and frame scaling function sets in higher dimensions. Firstly, we obtain a necessary condition for a set to be a frame wavelet set. Then, we present a necessary and sufficient condition for a set to be a frame scaling function set. We also give a property of frame scaling function sets. Corresponding examples are given in each section to illustrate our theory.

  20. Soft Expert Sets

    Directory of Open Access Journals (Sweden)

    Shawkat Alkhazaleh

    2011-01-01

    In 1999, Molodtsov introduced the concept of soft set theory as a general mathematical tool for dealing with uncertainty. Many researchers have studied this theory, and they created some models to solve problems in decision making and medical diagnosis, but most of these models deal only with one expert. This causes a problem for the user, especially for those who use questionnaires in their work and studies. In our model, the user can know the opinion of all experts in one model. So, in this paper, we introduce the concept of a soft expert set, which will be more effective and useful. We also define its basic operations, namely, complement, union, intersection, AND, and OR. Finally, we show an application of this concept in a decision-making problem.

  1. Hesitant fuzzy sets theory

    CERN Document Server

    Xu, Zeshui

    2014-01-01

    This book provides the readers with a thorough and systematic introduction to hesitant fuzzy theory. It presents the most recent research results and advanced methods in the field. These include: hesitant fuzzy aggregation techniques, hesitant fuzzy preference relations, hesitant fuzzy measures, hesitant fuzzy clustering algorithms and hesitant fuzzy multi-attribute decision making methods. Since its introduction by Torra and Narukawa in 2009, hesitant fuzzy sets have become more and more popular and have been used for a wide range of applications, from decision-making problems to cluster analysis, from medical diagnosis to personnel appraisal and information retrieval. This book offers a comprehensive report on the state-of-the-art in hesitant fuzzy sets theory and applications, aiming at becoming a reference guide for both researchers and practitioners in the area of fuzzy mathematics and other applied research fields (e.g. operations research, information science, management science and engineering) chara...

  2. Social Set Visualizer

    DEFF Research Database (Denmark)

    Flesch, Benjamin; Vatrapu, Ravi; Mukkamala, Raghava Rao

    2015-01-01

    Current state-of-the-art in big social data analytics is largely limited to graph theoretical approaches such as social network analysis (SNA) informed by the social philosophical approach of relational sociology. This paper proposes and illustrates an alternate holistic approach to big social data...... platforms. We present and discuss a theoretical and conceptual model of social data followed by a formal description of our technique based on set theory and event studies with a real-world social data example from Facebook. We then illustrate our new approach by reporting on the design, development...... in relation to the garment factory accidents in Bangladesh, and analyze the results. The enterprise application domain for the dashboard is corporate social responsibility (CSR) and the targeted end-users are CSR researchers and practitioners. The design of the dashboard was based on the social set analysis...

  3. Setting goals in psychotherapy

    DEFF Research Database (Denmark)

    Emiliussen, Jakob; Wagoner, Brady

    2013-01-01

    The present study is concerned with the ethical dilemmas of setting goals in therapy. The main questions that it aims to answer are: who is to set the goals for therapy and who is to decide when they have been reached? The study is based on four semi-structured, phenomenological interviews...... with psychologists, which were analyzed using the framework of the Interpretative Phenomenological Analysis (IPA), with minor changes to the procedure of categorization. Using Harré's (2002, 2012) Positioning Theory, it is shown that determining goals and deciding if they have been reached are processes...... that are based on asymmetric collaboration between the therapist and the client. Determining goals and deciding when they are reached are not "sterile" procedures, as both the client and the therapist might have different agendas when working therapeutically. The psychologists that participated in this study...

  4. Cobham recursive set functions

    Czech Academy of Sciences Publication Activity Database

    Beckmann, A.; Buss, S.; Friedman, S.-D.; Müller, M.; Thapen, Neil

    2016-01-01

    Roč. 167, č. 3 (2016), s. 335-369 ISSN 0168-0072 R&D Projects: GA ČR GBP202/12/G061 Institutional support: RVO:67985840 Keywords : set function * polynomial time * Cobham recursion Subject RIV: BA - General Mathematics Impact factor: 0.647, year: 2016 http://www.sciencedirect.com/science/article/pii/S0168007215001293

  5. Triage in military settings.

    Science.gov (United States)

    Falzone, E; Pasquier, P; Hoffmann, C; Barbier, O; Boutonnet, M; Salvadori, A; Jarrassier, A; Renner, J; Malgras, B; Mérat, S

    2017-02-01

    Triage, a medical term derived from the French word "trier", is the practical process of sorting casualties to rationally allocate limited resources. In combat settings with limited medical resources and long transportation times, triage is challenging since the objectives are to avoid overcrowding medical treatment facilities while saving a maximum of soldiers and to get as many of them back into action as possible. The new face of modern warfare, asymmetric and non-conventional, has led to the integrative evolution of triage into the theatre of operations. This article defines different triage scores and algorithms currently implemented in military settings. The discrepancies associated with these military triage systems are highlighted. The assessment of combat casualty severity requires several scores and each nation adopts different systems for triage on the battlefield with the same aim of quickly identifying those combat casualties requiring lifesaving and damage control resuscitation procedures. Other areas of interest for triage in military settings are discussed, including predicting the need for massive transfusion, haemodynamic parameters and ultrasound exploration. Copyright © 2016 Société française d’anesthésie et de réanimation (Sfar). Published by Elsevier Masson SAS. All rights reserved.

  6. Gene expression

    International Nuclear Information System (INIS)

    Hildebrand, C.E.; Crawford, B.D.; Walters, R.A.; Enger, M.D.

    1983-01-01

    We prepared probes for isolating functional pieces of the metallothionein locus. The probes enabled a variety of experiments, eventually revealing two mechanisms for metallothionein gene expression, the order of the DNA coding units at the locus, and the location of the gene site in its chromosome. Once the switch regulating metallothionein synthesis was located, it could be joined by recombinant DNA methods to other, unrelated genes, then reintroduced into cells by gene-transfer techniques. The expression of these recombinant genes could then be induced by exposing the cells to Zn2+ or Cd2+. We would thus take advantage of the clearly defined switching properties of the metallothionein gene to manipulate the expression of other, perhaps normally constitutive, genes. Already, despite an incomplete understanding of how the regulatory switch of the metallothionein locus operates, such experiments have been performed successfully.

  7. Classical Sets and Non-Classical Sets: An Overview -38 ...

    Indian Academy of Sciences (India)

    Classical Sets and Non-Classical Sets: An Overview. Sumita Basu is assistant professor of mathematics at Lady Brabourne College, Kolkata. Her research interests include artificial intelligence, automata theory, and mathematical logic. Keywords: fuzzy sets, crisp sets, rough sets, law of excluded middle, DeMorgan's laws.

  8. Genes2FANs: connecting genes through functional association networks

    Science.gov (United States)

    2012-01-01

    Background: Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent. Results: Genes2FANs is a web-based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user's PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting disease genes through FANs, separating diseases into two categories. Conclusions: Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our finding that disease genes in

  9. Ordered sets and lattices

    CERN Document Server

    Drashkovicheva, Kh; Igoshin, V I; Katrinyak, T; Kolibiar, M

    1989-01-01

    This book is another publication in the recent surveys of ordered sets and lattices. The papers, which might be characterized as "reviews of reviews," are based on articles reviewed in the Referativnyĭ Zhurnal: Matematika from 1978 to 1982. For the sake of completeness, the authors also attempted to integrate information from other relevant articles from that period. The bibliography of each paper provides references to the reviews in RZhMat and Mathematical Reviews where one can seek more detailed information. Specifically excluded from consideration in this volume were such topics as al

  10. Analysis of successive data sets

    NARCIS (Netherlands)

    Spreeuwers, Lieuwe Jan; Breeuwer, Marcel; Haselhoff, Eltjo Hans

    2008-01-01

    The invention relates to the analysis of successive data sets. A local intensity variation is formed from such successive data sets, that is, from data values in successive data sets at corresponding positions in each of the data sets. A region of interest is localized in the individual data sets on

  11. Analysis of successive data sets

    NARCIS (Netherlands)

    Spreeuwers, Lieuwe Jan; Breeuwer, Marcel; Haselhoff, Eltjo Hans

    2002-01-01

    The invention relates to the analysis of successive data sets. A local intensity variation is formed from such successive data sets, that is, from data values in successive data sets at corresponding positions in each of the data sets. A region of interest is localized in the individual data sets on

  12. Identification of temporal association rules from time-series microarray data sets.

    Science.gov (United States)

    Nam, Hojung; Lee, KiYoung; Lee, Doheon

    2009-03-19

    One of the most challenging problems in mining gene expression data is to identify how the expression of any particular gene affects the expression of other genes. To elucidate the relationships between genes, an association rule mining (ARM) method has been applied to microarray gene expression data. However, a conventional ARM method is limited in extracting temporal dependencies between gene expressions, though the temporal information is indispensable for discovering underlying regulation mechanisms in biological pathways. In this paper, we propose a novel method, referred to as temporal association rule mining (TARM), which can extract temporal dependencies among related genes. A temporal association rule has the form [gene A↑, gene B↓] → (7 min) [gene C↑], which states that a high expression level of gene A and significant repression of gene B are followed by significant expression of gene C after 7 minutes. The proposed TARM method is tested with a Saccharomyces cerevisiae cell cycle time-series microarray gene expression data set. In the parameter fitting phase of TARM, the fitted parameter set [threshold = ±0.8, support ≥ 3 transactions, confidence ≥ 90%] with the best precision score for the KEGG cell cycle pathway was chosen for the rule mining phase. With the fitted parameter set, a number of temporal association rules with five transcriptional time delays (0, 7, 14, 21, 28 minutes) are extracted from gene expression data of 799 genes, which are pre-identified cell cycle relevant genes. From the extracted temporal association rules, associated genes that play the same role in biological processes within a short transcriptional time delay, and some temporal dependencies between genes with specific biological processes, are identified. In this work, we proposed TARM, which is an applied form of conventional ARM. TARM showed a higher precision score than Dynamic Bayesian network and Bayesian network. Advantages of TARM are

  13. Dynamical basis set

    International Nuclear Information System (INIS)

    Blanco, M.; Heller, E.J.

    1985-01-01

    A new Cartesian basis set is defined that is suitable for the representation of molecular vibration-rotation bound states. The Cartesian basis functions are superpositions of semiclassical states generated through the use of classical trajectories that conform to the intrinsic dynamics of the molecule. Although semiclassical input is employed, the method becomes ab initio through the standard matrix diagonalization variational method. Special attention is given to classical-quantum correspondences for angular momentum. In particular, it is shown that the use of semiclassical information preferentially leads to angular momentum eigenstates with magnetic quantum number |M| equal to the total angular momentum J. The present method offers a reliable technique for representing highly excited vibrational-rotational states where perturbation techniques are no longer applicable.

  14. Numbers and sets

    Directory of Open Access Journals (Sweden)

    Marco Ruffino

    2001-12-01

    In this paper I discuss the intuition behind Frege's and Russell's definitions of numbers as sets, as well as Benacerraf's criticism of it. I argue that Benacerraf's argument is not as strong as some philosophers tend to think. Moreover, I examine an alternative to the Fregean-Russellian definition of numbers proposed by Maddy, and point out some problems faced by it.

  15. Revitalizing the setting approach

    DEFF Research Database (Denmark)

    Bloch, Paul; Toft, Ulla; Reinbach, Helene Christine

    2014-01-01

    Background: The concept of health promotion rests on aspirations aiming at enabling people to increase control over and improve their health. Health promotion action is facilitated in settings such as schools, homes and work places. As a contribution to the promotion of healthy lifestyles, we have......-based. Based on a presentation of “Health and Local Community”, a supersetting initiative addressing the prevention of lifestyle diseases in a Danish municipality, the paper discusses the potentials and challenges of supporting local community interventions using the supersetting approach...... impact of supersetting initiatives. The supersetting approach is an ecological approach, which places the individual in a social, environmental and cultural context, and calls for a holistic perspective to change potentials and developmental processes with a starting point in the circumstances of people...

  16. Hesitant intuitionistic fuzzy soft sets

    Science.gov (United States)

    Nazra, Admi; Syafruddin; Lestari, Riri; Catur Wicaksono, Gandung

    2017-09-01

    This paper aims to extend hesitant fuzzy soft sets to hesitant intuitionistic fuzzy soft sets by merging the concepts of hesitant intuitionistic fuzzy sets and soft sets. The authors define some operations on hesitant intuitionistic fuzzy sets, such as complement, union and intersection, and obtain related properties. Similar operations are defined on hesitant intuitionistic fuzzy soft sets, and some properties, such as the associative and De Morgan’s laws, are also obtained.

  17. Thesaurus-based disambiguation of gene symbols

    Directory of Open Access Journals (Sweden)

    Wain Hester M

    2005-06-01

    Background: Massive text mining of the biological literature holds great promise of relating disparate information and discovering new knowledge. However, disambiguation of gene symbols is a major bottleneck. Results: We developed a simple thesaurus-based disambiguation algorithm that can operate with very little training data. The thesaurus comprises the information from five human genetic databases and MeSH. The extent of the homonym problem for human gene symbols is shown to be substantial (33% of the genes in our combined thesaurus had one or more ambiguous symbols), not only because one symbol can refer to multiple genes, but also because a gene symbol can have many non-gene meanings. A test set of 52,529 Medline abstracts, containing 690 ambiguous human gene symbols taken from OMIM, was automatically generated. Overall accuracy of the disambiguation algorithm was up to 92.7% on the test set. Conclusion: The ambiguity of human gene symbols is substantial, not only because one symbol may denote multiple genes but particularly because many symbols have other, non-gene meanings. The proposed disambiguation approach resolves most ambiguities in our test set with high accuracy, including the important gene/not a gene decisions. The algorithm is fast and scalable, enabling gene-symbol disambiguation in massive text mining applications.

  18. Trichoderma genes

    Science.gov (United States)

    Foreman, Pamela [Los Altos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Van Solingen, Pieter [Naaldwijk, NL]; Ward, Michael [San Francisco, CA]

    2012-06-19

    Described herein are novel gene sequences isolated from Trichoderma reesei. Two genes encoding proteins comprising a cellulose binding domain, one encoding an arabinofuranosidase and one encoding an acetylxylan esterase are described. The sequences, CIP1 and CIP2, contain a cellulose binding domain. These proteins are especially useful in the textile and detergent industries and in the pulp and paper industry.

  19. Ready, set, move!

    CERN Document Server

    Anaïs Schaeffer

    2012-01-01

    This year, the CERN Medical Service is launching a new public health campaign. Advertised by the catchphrase “Move! & Eat Better”, the particular aim of the campaign is to encourage people at CERN to take more regular exercise, of whatever kind.   The CERN annual relay race is scheduled on 24 May this year. The CERN Medical Service will officially launch its “Move! & Eat Better” campaign at this popular sporting event. “We shall be on hand on the day of the race to strongly advocate regular physical activity,” explains Rachid Belkheir, one of the Medical Service doctors. "We really want to pitch our campaign and answer any questions people may have. Above all we want to set an example. So we are going to walk the same circuit as the runners to underline to people that they can easily incorporate movement into their daily routine.” An underlying concern has prompted this campaign: during their first few year...

  20. Setting the scene

    International Nuclear Information System (INIS)

    Curran, S.

    1977-01-01

    The reasons for the special meeting on the breeder reactor are outlined with some reference to the special Scottish interest in the topic. Approximately 30% of the electrical energy generated in Scotland is nuclear and the special developments at Dounreay make policy decisions on the future of the commercial breeder reactor urgent. The participants review the major questions arising in arriving at such decisions. In effect an attempt is made to respond to the wish of the Secretary of State for Energy to have informed debate. To set the scene the importance of energy availability as regards to the strength of the national economy is stressed and the reasons for an increasing energy demand put forward. Examination of alternative sources of energy shows that none is definitely capable of filling the foreseen energy gap. This implies an integrated thermal/breeder reactor programme as the way to close the anticipated gap. The problems of disposal of radioactive waste and the safeguards in the handling of plutonium are outlined. Longer-term benefits, including the consumption of plutonium and naturally occurring radioactive materials, are examined. (author)

  1. Investigation progress of PET reporter gene imaging

    International Nuclear Information System (INIS)

    Chen Yumei; Huang Gang

    2006-01-01

    Molecular imaging of gene therapy and gene expression has become increasingly attractive, while gene therapy itself has been widely investigated, and intensive research over the last two decades has brought it into the clinical setting. In vivo imaging with positron emission tomography (PET), combining an appropriate PET reporter gene and PET reporter probe, could provide qualitative and quantitative information for gene therapy. PET imaging could also obtain some valuable parameters not available by other techniques. This technology is useful for understanding the process and development of gene therapy and how to apply it in clinical practice in the future. (authors)

  2. Set discrimination of quantum states

    International Nuclear Information System (INIS)

    Zhang Shengyu; Ying Mingsheng

    2002-01-01

    We introduce a notion of set discrimination, which is an interesting extension of quantum state discrimination. A state is secretly chosen from a number of quantum states, which are partitioned into some disjoint sets. A set discrimination is required to identify which set the given state belongs to. Several essential problems are addressed in this paper, including the condition of perfect set discrimination, unambiguous set discrimination, and in the latter case, the efficiency of the discrimination. This generalizes some important results on quantum state discrimination in the literature. A combination of state and set discrimination and the efficiency are also studied

  3. Soft sets combined with interval valued intuitionistic fuzzy sets of type-2 and rough sets

    Directory of Open Access Journals (Sweden)

    Anjan Mukherjee

    2015-03-01

    Fuzzy set theory, rough set theory and soft set theory are all mathematical tools dealing with uncertainties. The concept of type-2 fuzzy sets was introduced by Zadeh in 1975, which was extended to interval valued intuitionistic fuzzy sets of type-2 by the authors. This paper is devoted to the discussion of the combinations of interval valued intuitionistic sets of type-2, soft sets and rough sets. Three different types of new hybrid models, namely interval valued intuitionistic fuzzy soft sets of type-2, soft rough interval valued intuitionistic fuzzy sets of type-2 and soft interval valued intuitionistic fuzzy rough sets of type-2, are proposed and their properties are derived.

  4. Cloning and selection of reference genes for gene expression ...

    African Journals Online (AJOL)

    Full length mRNA sequences of Ac-β-actin and Ac-gapdh, and partial mRNA sequences of Ac-18SrRNA and Ac-ubiquitin were cloned from pineapple in this study. The four genes were tested as housekeeping genes in three experimental sets. GeNorm and NormFinder analysis revealed that β-actin was the most ...

  5. Gene coexpression network analysis as a source of functional annotation for rice genes.

    Directory of Open Access Journals (Sweden)

    Kevin L Childs

    With the existence of large publicly available plant gene expression data sets, many groups have undertaken data analyses to construct gene coexpression networks and functionally annotate genes. Often, a large compendium of unrelated or condition-independent expression data is used to construct gene networks. Condition-dependent expression experiments consisting of well-defined conditions/treatments have also been used to create coexpression networks to help examine particular biological processes. Gene networks derived from either condition-dependent or condition-independent data can be difficult to interpret if a large number of genes and connections are present. However, algorithms exist to identify modules of highly connected and biologically relevant genes within coexpression networks. In this study, we have used publicly available rice (Oryza sativa) gene expression data to create gene coexpression networks using both condition-dependent and condition-independent data and have identified gene modules within these networks using the Weighted Gene Coexpression Network Analysis method. We compared the number of genes assigned to modules and the biological interpretability of gene coexpression modules to assess the utility of condition-dependent and condition-independent gene coexpression networks. For the purpose of providing functional annotation to rice genes, we found gene modules identified by coexpression analysis of condition-dependent gene expression experiments to be more useful than gene modules identified by analysis of a condition-independent data set. We have incorporated our results into the MSU Rice Genome Annotation Project database as additional expression-based annotation for 13,537 genes, 2,980 of which lack a functional annotation description. These results provide two new types of functional annotation for our database. Genes in modules are now associated with groups of genes that constitute a collective functional

  6. On a class of sets between μ-closed sets and μg-closed sets

    Directory of Open Access Journals (Sweden)

    Bishwambhar Roy

    2017-03-01

    In this paper, a new class of sets called μ*g-closed sets is introduced and investigated with the help of μ-open and μg-open sets. Relationships between this new class and other related classes of sets are established. Some separation axioms have also been studied. Finally, some preservation theorems are given.

  7. Immunoglobulin genes

    National Research Council Canada - National Science Library

    Honjo, T; Alt, F. W; Rabbitts, T. H

    1989-01-01

    ... Cataloguing in Publication Data Immunoglobulin genes 1. Vertebrates. Immunoglobulins I. Honjo, T. II. Alt, F.W. III. Rabbitts, T.H. 612'.118223 ISBN 0-12-354865-9 This book is printed on acid-free paper ( T...

  8. Ageing genes

    DEFF Research Database (Denmark)

    Rattan, Suresh

    2018-01-01

    The idea of gerontogenes is in line with the evolutionary explanation of ageing as being an emergent phenomenon as a result of the imperfect maintenance and repair systems. Although evolutionary processes did not select for any specific ageing genes that restrict and determine the lifespan...... of an individual, the term ‘gerontogenes’ primarily refers to any genes that may seem to influence ageing and longevity, without being specifically selected for that role. Such genes can also be called ‘virtual gerontogenes’ by virtue of their indirect influence on the rate and process of ageing. More than 1000...... virtual gerontogenes have been associated with ageing and longevity in model organisms and humans. The ‘real’ genes, which do influence the essential lifespan of a species, and have been selected for in accordance with the evolutionary life history of the species, are known as the longevity assurance...

  9. Catalytic and functional roles of conserved amino acids in the SET domain of the S. cerevisiae lysine methyltransferase Set1.

    Directory of Open Access Journals (Sweden)

    Kelly Williamson

    In S. cerevisiae, the lysine methyltransferase Set1 is a member of the multiprotein complex COMPASS. Set1 catalyzes mono-, di- and trimethylation of the fourth residue, lysine 4, of histone H3 using methyl groups from S-adenosylmethionine, and requires a subset of COMPASS proteins for this activity. The methylation activity of COMPASS regulates gene expression and chromosome segregation in vivo. To improve understanding of the catalytic mechanism of Set1, single amino acid substitutions were made within the SET domain. These Set1 mutants were evaluated in vivo by determining the levels of K4-methylated H3, assaying the strength of gene silencing at the rDNA and using a genetic assessment of kinetochore function as a proxy for defects in Dam1 methylation. The findings indicate that no single conserved active site base is required for H3K4 methylation by Set1. Instead, our data suggest that a number of aromatic residues in the SET domain contribute to the formation of an active site that facilitates substrate binding and dictates product specificity. Further, the results suggest that the attributes of Set1 required for trimethylation of histone H3 are those required for Pol II gene silencing at the rDNA and kinetochore function.

  10. Neurobiology: Setting the Set Point for Neural Homeostasis.

    Science.gov (United States)

    Truszkowski, Torrey L S; Aizenman, Carlos D

    2015-12-07

    Neural homeostasis allows neural networks to maintain a dynamic range around a given set point. How this set point is determined remains unknown. New evidence shows that alterations of activity during a critical developmental period can alter the homeostatic set point, resulting in epilepsy-like activity. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Re-Setting Music Education's "Default Settings"

    Science.gov (United States)

    Regelski, Thomas A.

    2013-01-01

    This paper explores the effects and problems of one highly influential default setting of the "normal style template" of music education and proposes some alternatives. These do not require abandoning all traditional templates for school music. But re-setting the default settings does depend on reconsidering the promised function of…

  12. Classical Sets and Non-Classical Sets: An Overview -38 ...

    Indian Academy of Sciences (India)

    Mathematicians, logicians, and computer scientists are trying to model uncertain, imprecise or vague concepts. Here we present two models of vague concepts and draw a comparison between such imprecise sets and the standard classical sets. In Section 1, we define classical sets, which model precise concepts.

  13. Genome-wide Analysis of Gene Regulation

    DEFF Research Database (Denmark)

    Chen, Yun

    cells are capable of regulating their gene expression, so that each cell can only express a particular set of genes yielding limited numbers of proteins with specialized functions. Therefore a rigid control of differential gene expression is necessary for cellular diversity. On the other hand, aberrant...... gene regulation will disrupt the cell’s fundamental processes, which in turn can cause disease. Hence, understanding gene regulation is essential for deciphering the code of life. Along with the development of high throughput sequencing (HTS) technology and the subsequent large-scale data analysis......, genome-wide assays have increased our understanding of gene regulation significantly. This thesis describes the integration and analysis of HTS data across different important aspects of gene regulation. Gene expression can be regulated at different stages when the genetic information is passed from gene...

  14. Definition of Intuitive Set Theory

    OpenAIRE

    Nambiar, Kannan

    2001-01-01

    The two axioms which define intuitive set theory, Axiom of Combinatorial Sets and Axiom of Infinitesimals, are stated. Generalized Continuum Hypothesis is derived from the first axiom, and the infinitesimal is visualized using the latter.

  15. Hausdorff convergence of Julia sets

    NARCIS (Netherlands)

    Krauskopf, B; Kriete, H

    1999-01-01

    Consider a sequence {g_d}, d ∈ N, converging uniformly on compact sets to g, where g and g_d are meromorphic functions on C. We show that the Julia sets J(g_d) converge to the Julia set J(g) in the Hausdorff metric, if the Fatou set F(g) is the union of basins of attracting periodic

  16. SET oncoprotein accumulation regulates transcription through DNA demethylation and histone hypoacetylation.

    Science.gov (United States)

    Almeida, Luciana O; Neto, Marinaldo P C; Sousa, Lucas O; Tannous, Maryna A; Curti, Carlos; Leopoldino, Andreia M

    2017-04-18

    Epigenetic modifications are essential in the control of normal cellular processes and cancer development. DNA methylation and histone acetylation are major epigenetic modifications involved in gene transcription and abnormal events driving the oncogenic process. SET protein accumulates in many cancer types, including head and neck squamous cell carcinoma (HNSCC); SET is a member of the INHAT complex that inhibits gene transcription associating with histones and preventing their acetylation. We explored how SET protein accumulation impacts on the regulation of gene expression, focusing on DNA methylation and histone acetylation. DNA methylation profile of 24 tumour suppressors evidenced that SET accumulation decreased DNA methylation in association with loss of 5-methylcytidine, formation of 5-hydroxymethylcytosine and increased TET1 levels, indicating an active DNA demethylation mechanism. However, the expression of some suppressor genes was lowered in cells with high SET levels, suggesting that loss of methylation is not the main mechanism modulating gene expression. SET accumulation also downregulated the expression of 32 genes of a panel of 84 transcription factors, and SET directly interacted with chromatin at the promoter of the downregulated genes, decreasing histone acetylation. Gene expression analysis after cell treatment with 5-aza-2'-deoxycytidine (5-AZA) and Trichostatin A (TSA) revealed that histone acetylation reversed transcription repression promoted by SET. These results suggest a new function for SET in the regulation of chromatin dynamics. In addition, TSA diminished both SET protein levels and SET capability to bind to gene promoter, suggesting that administration of epigenetic modifier agents could be efficient to reverse SET phenotype in cancer.

  17. Identification of key player genes in gene regulatory networks.

    Science.gov (United States)

    Nazarieh, Maryam; Wiese, Andreas; Will, Thorsten; Hamed, Mohamed; Helms, Volkhard

    2016-09-06

    Identifying the gene regulatory networks governing the workings and identity of cells is one of the main challenges in understanding processes such as cellular differentiation, reprogramming or cancerogenesis. One particular challenge is to identify the main drivers and master regulatory genes that control such cell fate transitions. In this work, we reformulate this problem as the optimization problems of computing a Minimum Dominating Set and a Minimum Connected Dominating Set for directed graphs. Both MDS and MCDS are applied to the well-studied gene regulatory networks of the model organisms E. coli and S. cerevisiae and to a pluripotency network for mouse embryonic stem cells. The results show that MCDS can capture most of the known key player genes identified so far in the model organisms. Moreover, this method suggests an additional small set of transcription factors as novel key players for governing the cell-specific gene regulatory network which can also be investigated with regard to diseases. To this aim, we investigated the ability of MCDS to define key drivers in breast cancer. The method identified many known drug targets as members of the MDS and MCDS. This paper proposes a new method to identify key player genes in gene regulatory networks. The Java implementation of the heuristic algorithm explained in this paper is available as a Cytoscape plugin at http://apps.cytoscape.org/apps/mcds . The SageMath programs for solving integer linear programming formulations used in the paper are available at https://github.com/maryamNazarieh/KeyRegulatoryGenes and as supplementary material.
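
    As a rough illustration of the dominating-set formulation described in this abstract, the following Python sketch applies a simple greedy heuristic to a toy directed network. It is not the authors' integer linear programming formulation or their Cytoscape plugin. The function name `greedy_dominating_set`, the toy gene names, and the convention that a node dominates itself and its direct targets are assumptions made for this example.

```python
def greedy_dominating_set(nodes, edges):
    """Greedy approximation of a minimum dominating set in a directed graph.

    A node is taken to dominate itself and every node it points to
    (assumed convention: regulator -> target).
    """
    targets = {n: set() for n in nodes}
    for src, dst in edges:
        targets[src].add(dst)

    uncovered = set(nodes)
    chosen = []
    ordered = sorted(nodes)  # fixed order makes tie-breaking deterministic
    while uncovered:
        # Pick the node that dominates the most still-uncovered nodes.
        best = max(ordered, key=lambda n: len((targets[n] | {n}) & uncovered))
        chosen.append(best)
        uncovered -= targets[best] | {best}
    return chosen


# Hypothetical toy regulatory network: (regulator, target) pairs.
toy_nodes = ["TF1", "TF2", "g1", "g2", "g3", "g4"]
toy_edges = [("TF1", "g1"), ("TF1", "g2"), ("TF2", "g3"),
             ("TF2", "TF1"), ("g3", "g4")]

if __name__ == "__main__":
    print(greedy_dominating_set(toy_nodes, toy_edges))
```

    A greedy heuristic of this kind gives no optimality guarantee; it only shows what it means for a small set of regulators to cover the rest of the network. A connected variant (MCDS) would additionally require the chosen nodes to form a connected subgraph, which this sketch does not enforce.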

  18. Computational algorithms to predict Gene Ontology annotations.

    Science.gov (United States)

    Pinoli, Pietro; Chicco, Davide; Masseroli, Marco

    2015-01-01

    Gene function annotations, which are associations between a gene and a term of a controlled vocabulary describing gene functional features, are of paramount importance in modern biology. Datasets of these annotations, such as the ones provided by the Gene Ontology Consortium, are used to design novel biological experiments and interpret their results. Despite their importance, these sources of information have some known issues. They are incomplete, since biological knowledge is far from being definitive and it rapidly evolves, and some erroneous annotations may be present. Since the curation process of novel annotations is a costly procedure, both in economic and time terms, computational tools that can reliably predict likely annotations, and thus quicken the discovery of new gene annotations, are very useful. We used a set of computational algorithms and weighting schemes to infer novel gene annotations from a set of known ones. We used the latent semantic analysis approach, implementing two popular algorithms (Latent Semantic Indexing and Probabilistic Latent Semantic Analysis) and propose a novel method, the Semantic IMproved Latent Semantic Analysis, which adds a clustering step on the set of considered genes. Furthermore, we propose the improvement of these algorithms by weighting the annotations in the input set. We tested our methods and their weighted variants on the Gene Ontology annotation sets of three model organism genes (Bos taurus, Danio rerio and Drosophila melanogaster). The methods showed their ability in predicting novel gene annotations and the weighting procedures led to a valuable improvement, although the obtained results vary according to the dimension of the input annotation set and the considered algorithm. Out of the three considered methods, the Semantic IMproved Latent Semantic Analysis is the one that provides better results. In particular, when coupled with a proper weighting policy, it is able to predict a

  19. Quantifying inhomogeneity in fractal sets

    Science.gov (United States)

    Fraser, Jonathan M.; Todd, Mike

    2018-04-01

    An inhomogeneous fractal set is one which exhibits different scaling behaviour at different points. The Assouad dimension of a set is a quantity which finds the ‘most difficult location and scale’ at which to cover the set and its difference from box dimension can be thought of as a first-level overall measure of how inhomogeneous the set is. For the next level of analysis, we develop a quantitative theory of inhomogeneity by considering the measure of the set of points around which the set exhibits a given level of inhomogeneity at a certain scale. For a set of examples, a family of -invariant subsets of the 2-torus, we show that this quantity satisfies a large deviations principle. We compare members of this family, demonstrating how the rate function gives us a deeper understanding of their inhomogeneity.

  20. Analysis of gene expression using gene sets discriminates cancer patients with and without late radiation toxicity

    NARCIS (Netherlands)

    Svensson, J. Peter; Stalpers, Lukas J. A.; Esveldt-van Lange, Rebecca E. E.; Franken, Nicolaas A. P.; Haveman, Jaap; Klein, Binie; Turesson, Ingela; Vrieling, Harry; Giphart-Gassler, Micheline

    2006-01-01

    BACKGROUND: Radiation is an effective anti-cancer therapy but leads to severe late radiation toxicity in 5%-10% of patients. Assuming that genetic susceptibility impacts this risk, we hypothesized that the cellular response of normal tissue to X-rays could discriminate patients with and without late

  1. Gene Locater

    DEFF Research Database (Denmark)

    Anwar, Muhammad Zohaib; Sehar, Anoosha; Rehman, Inayat-Ur

    2012-01-01

    software's for calculating recombination frequency is mostly limited to the range and flexibility of this type of analysis. GENE LOCATER is a fully customizable program for calculating recombination frequency, written in JAVA. Through an easy-to-use interface, GENE LOCATER allows users a high degree...... of flexibility in calculating genetic linkage and displaying linkage groups. Among other features, this software enables the user to identify linkage groups with output visualized graphically. The program calculates interference and coefficient of coincidence with elevated accuracy in sample datasets. AVAILABILITY......: The database is available for free at http://www.moperandib.com....
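
    The quantities named in this record can be illustrated with the textbook three-point cross formulas. The Python sketch below is only an assumption about how such values are commonly computed, not a description of GENE LOCATER's Java implementation; the function names and the offspring counts are hypothetical.

```python
def recombination_frequency(recombinants, total):
    """Fraction of offspring that are recombinant between two loci."""
    return recombinants / total


def coincidence_and_interference(single_ab, single_bc, double, total):
    """Coefficient of coincidence and interference for gene order A-B-C.

    single_ab / single_bc: offspring with one crossover in the A-B or B-C interval
    double: offspring with crossovers in both intervals
    """
    # Double crossovers count as recombinants in both intervals.
    rf_ab = recombination_frequency(single_ab + double, total)
    rf_bc = recombination_frequency(single_bc + double, total)
    expected_double = rf_ab * rf_bc * total
    coincidence = double / expected_double
    return coincidence, 1.0 - coincidence


if __name__ == "__main__":
    # Hypothetical counts from a three-point test cross.
    coc, interference = coincidence_and_interference(90, 190, 10, 1000)
    print(f"coincidence={coc:.2f}, interference={interference:.2f}")
```

    With these made-up counts the two recombination frequencies are 0.10 and 0.20, so about 20 double crossovers would be expected if crossovers were independent; observing 10 gives a coefficient of coincidence of about 0.5 and an interference of about 0.5.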

  2. Functional analysis of the molecular interactions of TATA box-containing genes and essential genes.

    Science.gov (United States)

    Bae, Sang-Hun; Han, Hyun Wook; Moon, Jisook

    2015-01-01

    Genes can be divided into TATA-containing genes and TATA-less genes according to the presence of TATA box elements at promoter regions. TATA-containing genes tend to be stress-responsive, whereas many TATA-less genes are known to be related to cell growth or "housekeeping" functions. In a previous study, we demonstrated that there are striking differences among four gene sets defined by the presence of TATA box (TATA-containing) and essentiality (TATA-less) with respect to number of associated transcription factors, amino acid usage, and functional annotation. Extending this research in yeast, we identified KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways that are statistically enriched in TATA-containing or TATA-less genes and evaluated the possibility that the enriched pathways are related to stress or growth as reflected by the individual functions of the genes involved. According to their enrichment for either of these two gene sets, we sorted KEGG pathways into TATA-containing-gene-enriched pathways (TEPs) and essential-gene-enriched pathways (EEPs). As expected, genes in TEPs and EEPs exhibited opposite results in terms of functional category, transcriptional regulation, codon adaptation index, and network properties, suggesting the possibility that the bipolar patterns in these pathways also contribute to the regulation of the stress response and to cell survival. Our findings provide the novel insight that significant enrichment of TATA-binding or TATA-less genes defines pathways as stress-responsive or growth-related.

  3. Handling gene redundancy in microarray data using Grey Relational Analysis.

    Science.gov (United States)

    Zhang, Li-Juan; Li, Zhou-Jun; Chen, Huo-Wang

    2008-01-01

    Gene selection is one of the important and frequently used techniques for microarray data classification. In this paper, we introduce a new metric to measure gene-class relevance and gene-gene redundancy. The new metric, called Grey Relational Grade (GRG), is based on Grey Relational Analysis (GRA) and has not been used in gene selection before. Based on the GRG, we develop a new gene selection method, which uses GRG to group similar genes into clusters and then selects informative genes from each cluster to avoid redundancy. Experiments on public data sets demonstrate the effectiveness of the proposed method.
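
As a rough illustration of the metric, the sketch below computes the grey relational grade in its common textbook form: the mean over sample points of the grey relational coefficient, with distinguishing coefficient `rho` (0.5 is a conventional default). The abstract does not say how the paper normalizes expression profiles or applies the grade, so the function name, the pre-scaled inputs and the example data are all assumptions.

```python
def grey_relational_grades(reference, candidates, rho=0.5):
    """Grey relational grade of each candidate sequence against a reference.

    Assumes all sequences have equal length and are already scaled
    to a comparable range (for example [0, 1]).
    """
    deltas = []
    for cand in candidates:
        if len(cand) != len(reference):
            raise ValueError("sequences must have equal length")
        deltas.append([abs(r - c) for r, c in zip(reference, cand)])

    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every candidate coincides with the reference.
        return [1.0] * len(candidates)

    grades = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades


if __name__ == "__main__":
    class_profile = [0.0, 0.5, 1.0]        # hypothetical reference sequence
    genes = [[0.0, 0.5, 1.0],              # identical to the reference
             [1.0, 0.5, 0.0]]              # reversed trend
    print(grey_relational_grades(class_profile, genes))
```

A grade near 1 means the candidate follows the reference closely. The same function could score gene-gene similarity by using one gene as the reference.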

  4. Dissociating Stimulus-Set and Response-Set in the Context of Task-Set Switching

    Science.gov (United States)

    Kieffaber, Paul D.; Kruschke, John K.; Cho, Raymond Y.; Walker, Philip M.; Hetrick, William P.

    2014-01-01

    The primary aim of the present research was to determine how stimulus-set and response-set components of task-set contribute to switch costs and conflict processing. Three experiments are described wherein participants completed an explicitly cued task-switching procedure. Experiment 1 established that task switches requiring a reconfiguration of both stimulus- and response-set incurred larger residual switch costs than task switches requiring the reconfiguration of stimulus-set alone. Between-task interference was also drastically reduced for response-set conflict compared with stimulus-set conflict. A second experiment replicated these findings and demonstrated that stimulus- and response-conflict have dissociable effects on the “decision time” and “motor time” components of total response time. Finally, a third experiment replicated Experiment 2 and demonstrated that the stimulus- and response- components of task switching and conflict processing elicit dissociable neural activity as evidenced by event-related brain potentials. PMID:22984990

  5. Social settings and addiction relapse.

    Science.gov (United States)

    Walton, M A; Reischl, T M; Ramanthan, C S

    1995-01-01

    Despite addiction theorists' acknowledgment of the impact of environmental factors on relapse, researchers have not adequately investigated these influences. Ninety-six substance users provided data regarding their perceived risk for relapse, exposure to substances, and involvement in reinforcing activities. These three setting attributes were assessed in their home, work, and community settings. Reuse was assessed 3 months later. When controlling for confounding variables, aspects of the home settings significantly distinguished abstainers from reusers; perceived risk for relapse was the strongest predictor of reuse. Exposure to substances and involvement in reinforcing activities were not robust reuse indicators. The work and community settings were not significant determinants of reuse. These findings offer some initial support for the utility of examining social settings to better understand addiction relapse and recovery. Identification of setting-based relapse determinants provides concrete targets for relapse prevention interventions.

  6. Reranking candidate gene models with cross-species comparison for improved gene prediction

    Directory of Open Access Journals (Sweden)

    Pereira Fernando CN

    2008-10-01

    Full Text Available Abstract Background Most gene finders score candidate gene models with state-based methods, typically HMMs, by combining local properties (coding potential, splice donor and acceptor patterns, etc.). Competing models with similar state-based scores may be distinguishable with additional information. In particular, functional and comparative genomics datasets may help to select among competing models of comparable probability by exploiting features likely to be associated with the correct gene models, such as conserved exon/intron structure or protein sequence features. Results We have investigated the utility of a simple post-processing step for selecting among a set of alternative gene models, using global scoring rules to rerank competing models for more accurate prediction. For each gene locus, we first generate the K best candidate gene models using the gene finder Evigan, and then rerank these models using comparisons with putative orthologous genes from closely-related species. Candidate gene models with lower scores in the original gene finder may be selected if they exhibit strong similarity to probable orthologs in coding sequence, splice site location, or signal peptide occurrence. Experiments on Drosophila melanogaster demonstrate that reranking based on cross-species comparison outperforms the best gene models identified by Evigan alone, and also outperforms the comparative gene finders GeneWise and Augustus+. Conclusion Reranking gene models with cross-species comparison improves gene prediction accuracy. This straightforward method can be readily adapted to incorporate additional lines of evidence, as it requires only a ranked source of candidate gene models.

  7. Soft set theory and topology

    Directory of Open Access Journals (Sweden)

    D. N. Georgiou

    2014-04-01

    Full Text Available In this paper we study and discuss the soft set theory giving new definitions, examples, new classes of soft sets, and properties for mappings between different classes of soft sets. Furthermore, we investigate the theory of soft topological spaces and we present new definitions, characterizations, and properties concerning the soft closure, the soft interior, the soft boundary, the soft continuity, the soft open and closed maps, and the soft homeomorphism.

  8. Programming services with correlation sets

    DEFF Research Database (Denmark)

    Montesi, Fabrizio; Carbone, Marco

    2011-01-01

    Correlation sets define a powerful mechanism for routing incoming communications to the correct running session within a server, by inspecting the content of the received messages. We present a language for programming services based on correlation sets taking into account key aspects of service...... properties of programs with respect to correlation sets. We provide an implementation as an extension of the JOLIE language and apply it to a nontrivial real-world example of a fully-functional distributed user authentication system....
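
The routing idea in this abstract can be sketched in a few lines: the server reads a designated field of each incoming message and hands the message to the session that already holds the same value. This is a deliberately simplified illustration in Python, not JOLIE syntax and not the authors' language; the class name, the single-field correlation set and the message layout are hypothetical.

```python
class CorrelatingServer:
    """Routes messages to sessions by a value found inside the message."""

    def __init__(self, correlation_field: str):
        # Assumption: the correlation set is a single message field.
        self.correlation_field = correlation_field
        self.sessions = {}  # correlation value -> messages seen by that session

    def receive(self, message: dict) -> list:
        if self.correlation_field not in message:
            raise KeyError("message carries no correlation value")
        value = message[self.correlation_field]
        # Start a new session on first contact, otherwise reuse the running one.
        session = self.sessions.setdefault(value, [])
        session.append(message)
        return session


if __name__ == "__main__":
    server = CorrelatingServer("sid")
    server.receive({"sid": "a", "op": "login"})
    server.receive({"sid": "b", "op": "login"})
    server.receive({"sid": "a", "op": "logout"})
    print({sid: len(msgs) for sid, msgs in server.sessions.items()})
```

No transport-level connection identifies the session here; the message content alone decides where each message goes.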

  9. Algorithms over partially ordered sets

    DEFF Research Database (Denmark)

    Baer, Robert M.; Østerby, Ole

    1969-01-01

    We here study some problems concerned with the computational analysis of finite partially ordered sets. We begin (in § 1) by showing that the matrix representation of a binary relation R may always be taken in triangular form if R is a partial ordering. We consider (in § 2) the chain structure...... in partially ordered sets, answer the combinatorial question of how many maximal chains might exist in a partially ordered set with n elements, and we give an algorithm for enumerating all maximal chains. We give (in § 3) algorithms which decide whether a partially ordered set is a (lower or upper) semi...
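
One straightforward way to enumerate maximal chains is sketched below. It relies on the fact that, in a finite poset, the maximal chains are exactly the paths in the cover (Hasse) relation that run from a minimal element to a maximal element. This is not necessarily the algorithm given in the paper; the function name and the divisibility example are illustrative assumptions.

```python
def maximal_chains(elements, less):
    """Enumerate all maximal chains of a finite poset.

    `less(a, b)` must return True exactly when a is strictly below b.
    """
    elements = list(elements)

    # b covers a when a < b and no c lies strictly between them.
    covers = {
        a: [b for b in elements
            if less(a, b)
            and not any(less(a, c) and less(c, b) for c in elements)]
        for a in elements
    }
    minimal = [a for a in elements if not any(less(b, a) for b in elements)]

    chains = []

    def extend(chain):
        successors = covers[chain[-1]]
        if not successors:           # reached a maximal element
            chains.append(tuple(chain))
            return
        for b in successors:
            extend(chain + [b])

    for a in minimal:
        extend([a])
    return chains


if __name__ == "__main__":
    divisors = [1, 2, 3, 4, 6, 12]
    divides = lambda a, b: a != b and b % a == 0
    print(maximal_chains(divisors, divides))
```

Computing the cover relation this way takes cubic time in the number of elements, which is adequate for small examples. The number of maximal chains itself can grow very quickly with the size of the poset.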

  10. A book of set theory

    CERN Document Server

    Pinter, Charles C

    2014-01-01

    Suitable for upper-level undergraduates, this accessible approach to set theory poses rigorous but simple arguments. Each definition is accompanied by commentary that motivates and explains new concepts. Starting with a repetition of the familiar arguments of elementary set theory, the level of abstract thinking gradually rises for a progressive increase in complexity. A historical introduction presents a brief account of the growth of set theory, with special emphasis on problems that led to the development of the various systems of axiomatic set theory. Subsequent chapters explore classes and

  11. Closed sets of nonlocal correlations

    International Nuclear Information System (INIS)

    Allcock, Jonathan; Linden, Noah; Brunner, Nicolas; Popescu, Sandu; Skrzypczyk, Paul; Vertesi, Tamas

    2009-01-01

    We present a fundamental concept - closed sets of correlations - for studying nonlocal correlations. We argue that sets of correlations corresponding to information-theoretic principles, or more generally to consistent physical theories, must be closed under a natural set of operations. Hence, studying the closure of sets of correlations gives insight into which information-theoretic principles are genuinely different, and which are ultimately equivalent. This concept also has implications for understanding why quantum nonlocality is limited, and for finding constraints on physical theories beyond quantum mechanics.

  12. Plant SET domain-containing proteins: structure, function and regulation

    Czech Academy of Sciences Publication Activity Database

    Ng, D.W.K.; Wang, T.; Chandrasekharan, M.B.; Aramayo, R.; Kertbundit, Sunee; Hall, T.C.

    2007-01-01

    Roč. 1769, 5-6 (2007), s. 316-329 ISSN 0167-4781 Institutional research plan: CEZ:AV0Z50380511 Keywords : arabidopsis SET genes * alternative splicing * epigenetics Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 1.704, year: 2007

  13. Gene Prioritization by Compressive Data Fusion and Chaining.

    Directory of Open Access Journals (Sweden)

    Marinka Žitnik

    2015-10-01

    Full Text Available Data integration procedures combine heterogeneous data sets into predictive models, but they are limited to data explicitly related to the target object type, such as genes. Collage is a new data fusion approach to gene prioritization. It considers data sets of various association levels with the prediction task, utilizes collective matrix factorization to compress the data, and chaining to relate different object types contained in a data compendium. Collage prioritizes genes based on their similarity to several seed genes. We tested Collage by prioritizing bacterial response genes in Dictyostelium as a novel model system for prokaryote-eukaryote interactions. Using 4 seed genes and 14 data sets, only one of which was directly related to the bacterial response, Collage proposed 8 candidate genes that were readily validated as necessary for the response of Dictyostelium to Gram-negative bacteria. These findings establish Collage as a method for inferring biological knowledge from the integration of heterogeneous and coarsely related data sets.

  14. Gene Prioritization by Compressive Data Fusion and Chaining.

    Science.gov (United States)

    Žitnik, Marinka; Nam, Edward A; Dinh, Christopher; Kuspa, Adam; Shaulsky, Gad; Zupan, Blaž

    2015-10-01

    Data integration procedures combine heterogeneous data sets into predictive models, but they are limited to data explicitly related to the target object type, such as genes. Collage is a new data fusion approach to gene prioritization. It considers data sets of various association levels with the prediction task, utilizes collective matrix factorization to compress the data, and chaining to relate different object types contained in a data compendium. Collage prioritizes genes based on their similarity to several seed genes. We tested Collage by prioritizing bacterial response genes in Dictyostelium as a novel model system for prokaryote-eukaryote interactions. Using 4 seed genes and 14 data sets, only one of which was directly related to the bacterial response, Collage proposed 8 candidate genes that were readily validated as necessary for the response of Dictyostelium to Gram-negative bacteria. These findings establish Collage as a method for inferring biological knowledge from the integration of heterogeneous and coarsely related data sets.

  15. Industrial scale gene synthesis.

    Science.gov (United States)

    Notka, Frank; Liss, Michael; Wagner, Ralf

    2011-01-01

    The most recent developments in the area of deep DNA sequencing and downstream quantitative and functional analysis are rapidly adding a new dimension to understanding biochemical pathways and metabolic interdependencies. These increasing insights pave the way to designing new strategies that address public needs, including environmental applications and therapeutic inventions, or novel cell factories for sustainable and reconcilable energy or chemicals sources. Adding yet another level is building upon nonnaturally occurring networks and pathways. Recent developments in synthetic biology have created economic and reliable options for designing and synthesizing genes, operons, and eventually complete genomes. Meanwhile, high-throughput design and synthesis of extremely comprehensive DNA sequences have evolved into an enabling technology already indispensable in various life science sectors today. Here, we describe the industrial perspective of modern gene synthesis and its relationship with synthetic biology. Gene synthesis contributed significantly to the emergence of synthetic biology by not only providing the genetic material in high quality and quantity but also enabling its assembly, according to engineering design principles, in a standardized format. Synthetic biology on the other hand, added the need for assembling complex circuits and large complexes, thus fostering the development of appropriate methods and expanding the scope of applications. Synthetic biology has also stimulated interdisciplinary collaboration as well as integration of the broader public by addressing socioeconomic, philosophical, ethical, political, and legal opportunities and concerns. The demand-driven technological achievements of gene synthesis and the implemented processes are exemplified by an industrial setting of large-scale gene synthesis, describing production from order to delivery. Copyright © 2011 Elsevier Inc. All rights reserved.

  16. Bankruptcy Prediction with Rough Sets

    NARCIS (Netherlands)

    J.C. Bioch (Cor); V. Popova (Viara)

    2001-01-01

    The bankruptcy prediction problem can be considered an ordinal classification problem. The classical theory of Rough Sets describes objects by discrete attributes, and does not take into account the ordering of the attribute values. This paper proposes a modification of the Rough Set

  17. Healthcare priority setting in Kenya

    DEFF Research Database (Denmark)

    Bukachi, Salome A.; Onyango-Ouma, Washington; Siso, Jared Maaka

    2014-01-01

    improves the priority setting decisions. This paper describes the healthcare priority setting processes in Malindi district, Kenya, prior to the implementation of A4R in 2008 and evaluates the process for its conformance with the conditions for A4R. In-depth interviews and focus group discussions with key...

  18. Development of detection method for novel fusion gene using GeneChip exon array.

    Science.gov (United States)

    Wada, Yusaku; Matsuura, Masaaki; Sugawara, Minoru; Ushijima, Masaru; Miyata, Satoshi; Nagasaki, Koichi; Noda, Tetsuo; Miki, Yoshio

    2014-02-18

    Fusion genes have been recognized to play key roles in oncogenesis. Although many techniques have been developed for genome-wide analysis of fusion genes, a more efficient method is desired. We introduce a new method for detecting novel fusion genes that combines the GeneChip Exon Array, which enables exon expression analysis on a whole-genome scale, with TAIL-PCR. To screen genes with abnormal exon expression profiles, we developed a computational program and confirmed that it was able to find the fusion partner gene using Exon Array data of T-cell acute lymphocytic leukemia (T-ALL) cell lines. The T-ALL cell lines ALL-SIL, BE13 and LOUCY have been reported to harbor the fusion genes NUP214-ABL1, NUP214-ABL1 and SET-NUP214, respectively. The program extracted the candidate genes with abnormal exon expression profiles: 1 gene in ALL-SIL, 1 gene in BE13, and 2 genes in LOUCY. The known fusion partner gene NUP214 was included in the genes in ALL-SIL and LOUCY. Thus, we applied the proposed program to the detection of fusion partner genes in other tumors. To discover novel fusion genes, we examined 24 breast cancer cell lines and 20 pancreatic cancer cell lines using the program. As a result, 20 and 23 candidate genes were obtained for the breast and pancreatic cancer cell lines, respectively, and seven genes were selected as the final candidate genes based on information from the EST database, comparison with normal cell samples and visual inspection of exon expression profiles. TAIL-PCR was used to search for fusion partners of the final candidate genes, and three novel fusion genes were identified. The usefulness of our detection method was confirmed. It is thought that applying this method to more samples will identify further fusion genes.

  19. Using RNA-seq data to select reference genes for normalizing gene expression in apple roots.

    Directory of Open Access Journals (Sweden)

    Zhe Zhou

    Full Text Available Gene expression in apple roots in response to various stress conditions is a less-explored research subject. Reliable reference genes for normalizing quantitative gene expression data have not been carefully investigated. In this study, the suitability of a set of 15 apple genes was evaluated for their potential use as reliable reference genes. These genes were selected based on their low variance of gene expression in apple root tissues from a recent RNA-seq data set, and a few previously reported apple reference genes for other tissue types. Four methods, Delta Ct, geNorm, NormFinder and BestKeeper, were used to evaluate their stability in apple root tissues of various genotypes and under different experimental conditions. A small panel of stably expressed genes, MDP0000095375, MDP0000147424, MDP0000233640, MDP0000326399 and MDP0000173025, was recommended for normalizing quantitative gene expression data in apple roots under various abiotic or biotic stresses. When the most stable and least stable reference genes were used for data normalization, significant differences were observed in the expression patterns of two target genes, MdLecRLK5 (MDP0000228426, a gene encoding a lectin receptor-like kinase) and MdMAPK3 (MDP0000187103, a gene encoding a mitogen-activated protein kinase). Our data also indicated that for those carefully validated reference genes, a single reference gene is sufficient for reliable normalization of the quantitative gene expression. Depending on the experimental conditions, the most suitable reference genes can be specific to the sample of interest for more reliable RT-qPCR data normalization.

  20. Multiobjective differential evolution-based multifactor dimensionality reduction for detecting gene-gene interactions.

    Science.gov (United States)

    Yang, Cheng-Hong; Chuang, Li-Yeh; Lin, Yu-Da

    2017-10-09

    Epistasis within disease-related genes (gene-gene interactions) was determined through contingency table measures based on multifactor dimensionality reduction (MDR) using single-nucleotide polymorphisms (SNPs). Most MDR-based methods use the single contingency table measure to detect gene-gene interactions; however, some gene-gene interactions may require identification through multiple contingency table measures. In this study, a multiobjective differential evolution method (called MODEMDR) was proposed to merge the various contingency table measures based on MDR to detect significant gene-gene interactions. Two contingency table measures, namely the correct classification rate and normalized mutual information, were selected to design the fitness functions in MODEMDR. The characteristics of multiobjective optimization enable MODEMDR to use multiple measures to efficiently and synchronously detect significant gene-gene interactions within a reasonable time frame. Epistatic models with and without marginal effects under various parameter settings (heritability and minor allele frequencies) were used to assess existing methods by comparing the detection success rates of gene-gene interactions. The results of the simulation datasets show that MODEMDR is superior to existing methods. Moreover, a large dataset obtained from the Wellcome Trust Case Control Consortium was used to assess MODEMDR. MODEMDR exhibited efficiency in identifying significant gene-gene interactions in genome-wide association studies.
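
To make one of the two measures concrete, the sketch below computes the correct classification rate of a basic MDR model for a single SNP combination. Each multilocus genotype cell is labelled high risk when its case-to-control ratio is at least the overall ratio, and every sample is then classified by its cell's label. It illustrates plain MDR only, not the multiobjective search in MODEMDR; the function name, the tie rule (ties count as high risk) and the toy data are assumptions.

```python
from collections import Counter


def mdr_correct_classification_rate(genotypes, labels):
    """Correct classification rate of an MDR model for one SNP combination.

    genotypes: one hashable multilocus genotype per sample, e.g. (0, 2)
    labels:    1 for case, 0 for control
    """
    cases, controls = Counter(), Counter()
    for g, y in zip(genotypes, labels):
        (cases if y == 1 else controls)[g] += 1

    n_cases = sum(cases.values())
    n_controls = sum(controls.values())
    if n_cases == 0 or n_controls == 0:
        raise ValueError("need at least one case and one control")

    correct = 0
    for g in set(cases) | set(controls):
        # High risk when cases/controls in the cell >= overall cases/controls
        # (cross-multiplied to avoid dividing by an empty control count).
        high_risk = cases[g] * n_controls >= controls[g] * n_cases
        correct += cases[g] if high_risk else controls[g]
    return correct / (n_cases + n_controls)


if __name__ == "__main__":
    # Hypothetical XOR-like interaction between two SNPs.
    genotypes = [(0, 0)] * 4 + [(0, 1)] * 4 + [(1, 0)] * 4 + [(1, 1)] * 4
    labels = [0, 0, 0, 1] + [1, 1, 1, 0] + [1, 1, 1, 0] + [0, 0, 0, 1]
    print(mdr_correct_classification_rate(genotypes, labels))
```

A second measure such as normalized mutual information could be computed from the same cell counts, which is what allows several objectives to be evaluated for one candidate SNP combination.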

  1. Interpreting lattice-valued set theory in fuzzy set theory

    Czech Academy of Sciences Publication Activity Database

    Hájek, Petr; Haniková, Zuzana

    2013-01-01

    Roč. 21, č. 1 (2013), s. 77-90 ISSN 1367-0751 R&D Projects: GA ČR GAP202/10/1826; GA MŠk ME09110 Institutional research plan: CEZ:AV0Z10300504 Keywords : lattice-valued logic * lattice-valued set theory * basic fuzzy logic * fuzzy set theory Subject RIV: BA - General Mathematics Impact factor: 0.530, year: 2013

  2. Application of Gene Shaving and Mixture Models to Cluster Microarray Gene Expression Data

    Directory of Open Access Journals (Sweden)

    S. Wen

    2007-01-01

    Full Text Available Researchers are frequently faced with the analysis of microarray data of a relatively large number of genes using a small number of tissue samples. We examine the application of two statistical methods for clustering such microarray expression data: EMMIX-GENE and GeneClust. EMMIX-GENE is a mixture-model based clustering approach, designed primarily to cluster tissue samples on the basis of the genes. GeneClust is an implementation of the gene shaving methodology, motivated by research to identify distinct sets of genes for which variation in expression could be related to a biological property of the tissue samples. We illustrate the use of these two methods in the analysis of Affymetrix oligonucleotide arrays of well-known data sets from colon tissue samples with and without tumors, and of tumor tissue samples from patients with leukemia. Although the two approaches have been developed from different perspectives, the results demonstrate a clear correspondence between gene clusters produced by GeneClust and EMMIX-GENE for the colon tissue data. It is demonstrated, for the case of ribosomal proteins and smooth muscle genes in the colon data set, that both methods can classify genes into co-regulated families. It is further demonstrated that tissue types (tumor and normal) can be separated on the basis of subtle distributed patterns of genes. Application to the leukemia tissue data produces a division of tissues corresponding closely to the external classification, acute myeloid leukemia (AML) and acute lymphoblastic leukaemia (ALL), for both methods. In addition, we also identify genes specific for the subgroup of ALL-T cell samples. Overall, we find that the gene shaving method produces gene clusters at great speed; allows variable cluster sizes and can incorporate partial or full supervision; and finds clusters of genes in which the gene expression varies greatly over the tissue samples while maintaining a high level of coherence between the

  3. First-Class Object Sets

    DEFF Research Database (Denmark)

    Ernst, Erik

    Typically, objects are monolithic entities with a fixed interface. To increase the flexibility in this area, this paper presents first-class object sets as a language construct. An object set offers an interface which is a disjoint union of the interfaces of its member objects. It may also be used...... for a special kind of method invocation involving multiple objects in a dynamic lookup process. With support for feature access and late-bound method calls object sets are similar to ordinary objects, only more flexible. The approach is made precise by means of a small calculus, and the soundness of its type...

  4. Path integrals on causal sets

    International Nuclear Information System (INIS)

    Johnston, Steven

    2009-01-01

    We describe a quantum mechanical model for particle propagation on a causal set. The model involves calculating a particle propagator by summing amplitudes assigned to trajectories within the causal set. This 'discrete path integral' is calculated using a matrix geometric series. Amplitudes are given which, when the causal set is generated by sprinkling points into 1+1 or 3+1 Minkowski spacetime, ensure the particle propagator agrees, in a suitable sense, with the retarded causal propagator for the Klein-Gordon equation.
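
The matrix geometric series can be illustrated with a simplified sketch. Assume a single hop amplitude `a` for every step between causally related elements, so a trajectory with n hops has amplitude a**n. With the causal matrix C (C[x][y] = 1 when x precedes y), entry (x, y) of (aC)**n sums the amplitudes of all n-hop trajectories from x to y. A finite causal set has no closed causal loops, so the series terminates. The specific amplitudes used in the paper are not given in the abstract, so `a` and the three-element example are hypothetical.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def path_sum(causal_matrix, a):
    """Sum of a**hops over all trajectories between each pair of elements.

    Computes A + A**2 + ... + A**(n-1) with A = a * C. For a causal set
    with n elements the longest trajectory has n - 1 hops, so higher
    powers of A vanish and the finite sum equals the full geometric series.
    """
    n = len(causal_matrix)
    A = [[a * causal_matrix[i][j] for j in range(n)] for i in range(n)]
    total = [[0] * n for _ in range(n)]
    power = A
    for _ in range(n - 1):
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
        power = matmul(power, A)
    return total


if __name__ == "__main__":
    # Three-element chain 0 < 1 < 2, including the transitive relation 0 < 2.
    C = [[0, 1, 1],
         [0, 0, 1],
         [0, 0, 0]]
    print(path_sum(C, 2))
```

The amplitude may be complex. The same sum can be written in closed form as A(I - A)^(-1), which is why the method is described as a geometric series.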

  5. DNA sequences required for regulated expression of the β-globin genes in murine erythroleukaemia cells.

    NARCIS (Netherlands)

    S. Wright; E. de Boer (Ernie); A. Rosenthal; R.A. Flavell (Richard); F.G. Grosveld (Frank)

    1984-01-01

    We have introduced into murine erythroleukaemia (MEL) cells a series of human globin gene cosmids and two sets of hybrid genes constructed from the human beta-globin gene and the human gamma-globin or murine H-2Kbm1 genes. S1-nuclease analysis of the mRNA products from these genes before

  6. Lactobacillus plantarum gene clusters encoding putative cell-surface protein complexes for carbohydrate utilization are conserved in specific gram-positive bacteria

    Directory of Open Access Journals (Sweden)

    Muscariello Lidia

    2006-05-01

    Full Text Available Abstract Background Genomes of gram-positive bacteria encode many putative cell-surface proteins, of which the majority has no known function. From the rapidly increasing number of available genome sequences it has become apparent that many cell-surface proteins are conserved, and frequently encoded in gene clusters or operons, suggesting common functions, and interactions of multiple components. Results A novel gene cluster encoding exclusively cell-surface proteins was identified, which is conserved in a subgroup of gram-positive bacteria. Each gene cluster generally has one copy of four new gene families called cscA, cscB, cscC and cscD. Clusters encoding these cell-surface proteins were found only in complete genomes of Lactobacillus plantarum, Lactobacillus sakei, Enterococcus faecalis, Listeria innocua, Listeria monocytogenes, Lactococcus lactis ssp lactis and Bacillus cereus and in incomplete genomes of L. lactis ssp cremoris, Lactobacillus casei, Enterococcus faecium, Pediococcus pentosaceus, Lactobacillus brevis, Oenococcus oeni, Leuconostoc mesenteroides, and Bacillus thuringiensis. These genes are neither present in the genomes of streptococci, staphylococci and clostridia, nor in the Lactobacillus acidophilus group, suggesting a niche-specific distribution, possibly relating to association with plants. All encoded proteins have a signal peptide for secretion by the Sec-dependent pathway, while some have cell-surface anchors, novel WxL domains, and putative domains for sugar binding and degradation. Transcriptome analysis in L. plantarum shows that the cscA-D genes are co-expressed, supporting their operon organization. Many gene clusters are significantly up-regulated in a glucose-grown, ccpA-mutant derivative of L. plantarum, suggesting catabolite control. This is supported by the presence of predicted CRE-sites upstream or inside the up-regulated cscA-D gene clusters. Conclusion We propose that the CscA, CscB, CscC and Csc

  7. Identification of Human Housekeeping Genes and Tissue-Selective Genes by Microarray Meta-Analysis

    Science.gov (United States)

    Chang, Cheng-Wei; Cheng, Wei-Chung; Chen, Chaang-Ray; Shu, Wun-Yi; Tsai, Min-Lung; Huang, Ching-Lung; Hsu, Ian C.

    2011-01-01

    Background Categorizing protein-encoding transcriptomes of normal tissues into housekeeping genes and tissue-selective genes is a fundamental step toward studies of genetic functions and genetic associations to tissue-specific diseases. Previous studies have been mainly based on a few data sets with limited samples in each tissue, which restrained the representativeness of their identified genes, and resulted in low consensus among them. Results This study compiled 1,431 samples in 43 normal human tissues from 104 microarray data sets. We developed a new method to improve gene expression assessment, and showed that more than ten samples are needed to robustly identify the protein-encoding transcriptome of a tissue. We identified 2,064 housekeeping genes and 2,293 tissue-selective genes, and analyzed gene lists by functional enrichment analysis. The housekeeping genes are mainly involved in fundamental cellular functions, and the tissue-selective genes are strikingly related to functions and diseases corresponding to tissue-origin. We also compared agreements and related functions among our housekeeping genes and those of previous studies, and pointed out some reasons for the low consensuses. Conclusions The results indicate that sufficient samples have improved the identification of protein-encoding transcriptome of a tissue. Comprehensive meta-analysis has proved the high quality of our identified HK and TS genes. These results could offer a useful resource for future research on functional and genomic features of HK and TS genes. PMID:21818400

  8. IGBT accelerated aging data set.

    Data.gov (United States)

    National Aeronautics and Space Administration — Preliminary data from thermal overstress accelerated aging using the aging and characterization system. The data set contains aging data from 6 devices, one device...

  9. Mycobacterium abscessus in Healthcare Settings

    Science.gov (United States)

    ... Mycobacterium abscessus Recommendations and Guidelines General information about Mycobacterium abscessus Mycobacterium abscessus [mī–kō–bak–tair–ee– ...

  10. First-Class Object Sets

    DEFF Research Database (Denmark)

    Ernst, Erik

    2009-01-01

    Typically, an object is a monolithic entity with a fixed interface.  To increase flexibility in this area, this paper presents first-class object sets as a language construct.  An object set offers an interface which is a disjoint union of the interfaces of its member objects.  It may also be used...... for a special kind of method invocation involving multiple objects in a dynamic lookup process.  With support for feature access and late-bound method calls, object sets are similar to ordinary objects, only more flexible.  Object sets are particularly convenient as a lightweight primitive which may be added...

  11. On Intuitionistic Fuzzy Sets Theory

    CERN Document Server

    Atanassov, Krassimir T

    2012-01-01

    This book aims to be a comprehensive and accurate survey of state-of-the-art research on intuitionistic fuzzy sets theory and could be considered a continuation and extension of the author's previous book on Intuitionistic Fuzzy Sets, published by Springer in 1999 (Atanassov, Krassimir T., Intuitionistic Fuzzy Sets, Studies in Fuzziness and Soft Computing, ISBN 978-3-7908-1228-2, 1999). Since the aforementioned book appeared, the research activity of the author within the area of intuitionistic fuzzy sets has been expanding in many directions. The results of the author's most recent work covering the past 12 years, as well as the newest general ideas and open problems in this field, have therefore been collected in this new book.

  12. Pseudomonas aeruginosa in Healthcare Settings

    Science.gov (United States)

    ... Pseudomonas aeruginosa in Healthcare Settings ... and/or help treat infections? What is a Pseudomonas infection? Pseudomonas infection is caused by strains of ...

  13. Test Program Set (TPS) Lab

    Data.gov (United States)

    Federal Laboratory Consortium — The ARDEC TPS Laboratory provides an organic Test Program Set (TPS) development, maintenance, and life cycle management capability for DoD LCMC materiel developers....

  14. Agenda-setting the unknown

    DEFF Research Database (Denmark)

    Dannevig, Halvor

    -setting theory, it is concluded that agenda-setting of climate change adaptation requires human agency in providing local legitimacy and salience for the issue. The thesis also finds that boundary arrangements are needed to bridge the gap between local knowledge and scientific knowledge for adaptation governance. Attempts at such boundary arrangements are already in place at the regional governance levels, but they must be strengthened if municipalities are to take further steps in implementing adaptation measures....

  15. Julia Sets of Orthogonal Polynomials

    DEFF Research Database (Denmark)

    Christiansen, Jacob Stordal; Henriksen, Christian; Petersen, Henrik Laurberg

    2018-01-01

    For a probability measure with compact and non-polar support in the complex plane we relate dynamical properties of the associated sequence of orthogonal polynomials {P_n} to properties of the support. More precisely we relate the Julia set of P_n to the outer boundary of the support, the filled Julia set to the polynomial convex hull K of the support, and the Green's function associated with P_n to the Green's function for the complement of K....

  16. Philosophical introduction to set theory

    CERN Document Server

    Pollard, Stephen

    2015-01-01

    The primary mechanism for ideological and theoretical unification in modern mathematics, set theory forms an essential element of any comprehensive treatment of the philosophy of mathematics. This unique approach to set theory offers a technically informed discussion that covers a variety of philosophical issues. Rather than focusing on intuitionist and constructive alternatives to the Cantorian/Zermelian tradition, the author examines the two most important aspects of the current philosophy of mathematics, mathematical structuralism and mathematical applications of plural reference and plural

  17. Introduction to Fuzzy Set Theory

    Science.gov (United States)

    Kosko, Bart

    1990-01-01

    An introduction to fuzzy set theory is described. Topics covered include: neural networks and fuzzy systems; the dynamical systems approach to machine intelligence; intelligent behavior as adaptive model-free estimation; fuzziness versus probability; fuzzy sets; the entropy-subsethood theorem; adaptive fuzzy systems for backing up a truck-and-trailer; product-space clustering with differential competitive learning; and adaptive fuzzy system for target tracking.

  18. Genes and Hearing Loss

    Science.gov (United States)

    ... Genes and Hearing Loss ... mutation may only have dystopia canthorum. How Do Genes Work? Genes are a road map for the ...

  19. Minimal output sets for identifiability.

    Science.gov (United States)

    Anguelova, Milena; Karlsson, Johan; Jirstrand, Mats

    2012-09-01

    Ordinary differential equation models in biology often contain a large number of parameters that must be determined from measurements by parameter estimation. For a parameter estimation procedure to be successful, there must be a unique set of parameters that can have produced the measured data. This is not the case if a model is not uniquely structurally identifiable with the given set of outputs selected as measurements. In designing an experiment for the purpose of parameter estimation, given a set of feasible but resource-consuming measurements, it is useful to know which ones must be included in order to obtain an identifiable system, or whether the system is unidentifiable from the feasible measurement set. We have developed an algorithm that, from a user-provided set of variables and parameters or functions of them assumed to be measurable or known, determines all subsets that when used as outputs give a locally structurally identifiable system and are such that any output set for which the system is structurally identifiable must contain at least one of the calculated subsets. The algorithm has been implemented in Mathematica and shown to be feasible and efficient. We have successfully applied it in the analysis of large signalling pathway models from the literature. Copyright © 2012 Elsevier Inc. All rights reserved.

  20. An introduction to random sets

    CERN Document Server

    Nguyen, Hung T

    2006-01-01

    The study of random sets is a large and rapidly growing area with connections to many areas of mathematics and applications in widely varying disciplines, from economics and decision theory to biostatistics and image analysis. The drawback to such diversity is that the research reports are scattered throughout the literature, with the result that in science and engineering, and even in the statistics community, the topic is not well known and much of the enormous potential of random sets remains untapped. An Introduction to Random Sets provides a friendly but solid initiation into the theory of random sets. It builds the foundation for studying random set data, which, viewed as imprecise or incomplete observations, are ubiquitous in today's technological society. The author, widely known for his best-selling A First Course in Fuzzy Logic text as well as his pioneering work in random sets, explores motivations, such as coarse data analysis and uncertainty analysis in intelligent systems, for studying random s...

  1. Setting MEPS for electronic products

    International Nuclear Information System (INIS)

    Siderius, Hans-Paul

    2014-01-01

    When analysing price, performance and efficiency data for 15 consumer electronic and information and communication technology products, we found that in general price did not relate to the efficiency of the product. Prices of electronic products with comparable performance decreased over time. For products where the data allowed fitting the relationship, we found an exponential decrease in price with an average time constant of −0.30 [1/year], meaning that every year the product became 26% cheaper on average. The results imply that the classical approach of setting minimum efficiency performance standards (MEPS) by means of life cycle cost calculations cannot be applied to electronic products. Therefore, an alternative approach based on the improvement of efficiency over time and the variation in efficiency of products on the market, is presented. The concept of a policy action window can provide guidance for the decision on whether setting MEPS for a certain product is appropriate. If the (formal) procedure for setting MEPS takes longer than the policy action window, this means that the efficiency improvement will also be achieved without setting MEPS. We found short, i.e. less than three years, policy action windows for graphic cards, network attached storage products, network switches and televisions. - Highlights: • For electronic consumer products price does not relate to efficiency. • Average price decrease of selected electronic products is 26 % per year. • We give an alternative approach to life cycle cost calculations for setting MEPS. • The policy action window indicates whether setting MEPS is appropriate
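
    The step from a time constant of −0.30 per year to "26% cheaper every year" can be checked with a short calculation. The sketch below assumes the fitted model has the form p(t) = p0 · exp(k · t) with k = −0.30 per year; the abstract does not state the exact functional form, and the function names are illustrative only.

```python
import math


def yearly_price_drop(time_constant_per_year: float) -> float:
    """Fraction by which the price falls over one year,
    assuming p(t) = p0 * exp(k * t) with k = time_constant_per_year."""
    return 1.0 - math.exp(time_constant_per_year)


def price_after(p0: float, time_constant_per_year: float, years: float) -> float:
    """Price after `years` under the same assumed exponential model."""
    return p0 * math.exp(time_constant_per_year * years)


if __name__ == "__main__":
    drop = yearly_price_drop(-0.30)
    print(f"yearly price drop: {drop:.1%}")
    print(f"price of a 100-unit product after 3 years: {price_after(100.0, -0.30, 3):.2f}")
```

    Under this assumption exp(−0.30) ≈ 0.741, so the price falls by roughly 26% per year, in line with the figure quoted in the abstract.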

  2. Analytical comparison of nine PCR primer sets designed to detect the presence of Escherichia coli/Shigella in water samples.

    Science.gov (United States)

    Maheux, Andrée F; Picard, François J; Boissinot, Maurice; Bissonnette, Luc; Paradis, Sonia; Bergeron, Michel G

    2009-07-01

    The analytical performance of 9 different PCR primer sets designed to detect Escherichia coli and Shigella in water has been evaluated in terms of ubiquity, specificity, and analytical detection limit. Of the 9 PCR primer sets tested, only 3 of the 5 primer sets targeting the uidA gene and the primer set targeting the tuf gene amplified DNA from all E. coli strains tested. However, of those 4 primer sets, only the primer set targeting the tuf gene also amplified DNA from all Shigella strains tested. Regarding specificity, only the primer sets targeting the uidA gene were 100% specific, although the primer sets targeting the 16S rRNA, phoE, and tuf genes amplified only Escherichia fergusonii as a non-specific target. Finally, the primer set targeting the 16S-ITS-23S gene region was not specific, as it amplified DNA from many other Enterobacteriaceae species. In summary, only the assay targeting the tuf gene detected all E. coli/Shigella strains tested in this study. However, if it becomes important to discriminate between E. coli and E. fergusonii, assays targeting the uidA gene would represent a good choice, although none of them was totally ubiquitous in detecting the presence of Shigella strains.

  3. The Descent Set and Connectivity Set of a Permutation

    Science.gov (United States)

    Stanley, Richard P.

    2005-08-01

    The descent set D(w) of a permutation w of 1,2,...,n is a standard and well-studied statistic. We introduce a new statistic, the connectivity set C(w), and show that it is a kind of dual object to D(w). The duality is stated in terms of the inverse of a matrix that records the joint distribution of D(w) and C(w). We also give a variation involving permutations of a multiset and a q-analogue that keeps track of the number of inversions of w.
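
    As an illustration of the two statistics, the sketch below computes both sets for a permutation given in one-line notation. The descent set uses the standard definition D(w) = {i : w(i) > w(i+1)}. The abstract does not define C(w); the code assumes the definition C(w) = {i : w(j) < w(k) for all j ≤ i < k}, which for a permutation of 1..n is the same as saying that the first i entries are exactly 1..i. Treat that definition as an assumption to be checked against the paper.

```python
def descent_set(w):
    """D(w) = {i : w(i) > w(i+1)}, with positions counted from 1."""
    return {i + 1 for i in range(len(w) - 1) if w[i] > w[i + 1]}


def connectivity_set(w):
    """Assumed definition: i is in C(w) when the first i entries of w
    are exactly {1, ..., i}, for 1 <= i < n."""
    result = set()
    running_max = 0
    for i in range(len(w) - 1):
        running_max = max(running_max, w[i])
        # For a permutation of 1..n, the prefix of length i+1 equals
        # {1, ..., i+1} exactly when its maximum is i+1.
        if running_max == i + 1:
            result.add(i + 1)
    return result


if __name__ == "__main__":
    w = [2, 1, 3, 5, 4]
    print(descent_set(w), connectivity_set(w))
```

    Under the assumed definition the two sets are always disjoint, since a descent at position i rules out every entry up to i being smaller than every entry after i.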

  4. Identification of Human HK Genes and Gene Expression Regulation Study in Cancer from Transcriptomics Data Analysis

    Science.gov (United States)

    Zhang, Zhang; Liu, Jingxing; Wu, Jiayan; Yu, Jun

    2013-01-01

    The regulation of gene expression is essential for eukaryotes, as it drives the processes of cellular differentiation and morphogenesis, leading to the creation of different cell types in multicellular organisms. RNA-Sequencing (RNA-Seq) provides researchers with a powerful toolbox for characterizing and quantifying the transcriptome. Many human tissue/cell transcriptome datasets generated with RNA-Seq technology are available in public data resources. The fundamental issue here is how to develop an effective analysis method to estimate expression pattern similarities between different tumor tissues and their corresponding normal tissues. We define the gene expression pattern from three directions: 1) expression breadth, which reflects gene expression on/off status, and mainly concerns ubiquitously expressed genes; 2) low/high or constant/variable expression genes, based on gene expression level and variation; and 3) the regulation of gene expression at the gene structure level. The cluster analysis indicates that gene expression pattern is more closely related to physiological condition than to tissue spatial distance. Two sets of human housekeeping (HK) genes are defined according to cell/tissue types, respectively. To characterize the gene expression pattern in terms of expression level and variation, we first apply an improved K-means algorithm and a gene expression variance model. We find that cancer-associated HK genes (an HK gene that is specific to the cancer group and not found in the normal group) are expressed at higher and more variable levels in the cancer condition than in the normal condition. Cancer-associated HK genes tend to be AT-rich, are enriched in functions related to cell cycle regulation, and constitute some cancer signatures. The expression of large genes is also avoided in the cancer group. These studies will help us understand how patterns of gene expression differ among cell types, particularly in cancer. PMID:23382867

  5. Rooted triple consensus and anomalous gene trees

    Directory of Open Access Journals (Sweden)

    Schmidt Heiko A

    2008-04-01

    Abstract Background Anomalous gene trees (AGTs) are gene trees with a topology different from a species tree that are more probable to observe than congruent gene trees. In this paper we propose a rooted triple approach to finding the correct species tree in the presence of AGTs. Results Based on simulated data we show that our method outperforms the extended majority rule consensus strategy, while still resolving the species tree. Applying both methods to a metazoan data set of 216 genes, we tested whether AGTs substantially interfere with the reconstruction of the metazoan phylogeny. Conclusion Evidence of AGTs was not found in this data set, suggesting that erroneously reconstructed gene trees are the most significant challenge in the reconstruction of phylogenetic relationships among species with current data. The new method does however rule out the erroneous reconstruction of deep or poorly resolved splits in the presence of lineage sorting.

  6. PhyloPat: phylogenetic pattern analysis of eukaryotic genes

    NARCIS (Netherlands)

    Hulsen, T.; Vlieg, J. de; Groenen, P.M.

    2006-01-01

    BACKGROUND: Phylogenetic patterns show the presence or absence of certain genes or proteins in a set of species. They can also be used to determine sets of genes or proteins that occur only in certain evolutionary branches. Phylogenetic patterns analysis has routinely been applied to protein

  7. Dimerization of a Viral SET Protein Endows its Function

    Energy Technology Data Exchange (ETDEWEB)

    H Wei; M Zhou

    2011-12-31

    Histone modifications are regarded as the most indispensable phenomena in epigenetics. Of these modifications, lysine methylation is of the greatest complexity and importance as site- and state-specific lysine methylation exerts a plethora of effects on chromatin structure and gene transcription. Notably, Paramecium bursaria chlorella viruses encode a conserved SET domain methyltransferase, termed vSET, that functions to suppress host transcription by methylating histone H3 at lysine 27 (H3K27), a mark for eukaryotic gene silencing. Unlike mammalian lysine methyltransferases (KMTs), vSET functions only as a dimer, but the underlying mechanism has remained elusive. In this study, we demonstrate that dimeric vSET operates with negative cooperativity between the two active sites and engages in H3K27 methylation one site at a time. New atomic structures of vSET in the free form and in a ternary complex with S-adenosyl homocysteine and a histone H3 peptide, together with biochemical analyses, reveal the molecular origin of the negative cooperativity and explain the substrate specificity of H3K27 methyltransferases. Our study suggests a 'walking' mechanism, by which vSET acts all by itself to globally methylate host H3K27, which is accomplished by the mammalian EZH2 KMT only in the context of the Polycomb repressive complex.

  8. Application of community phylogenetic approaches to understand gene expression: differential exploration of venom gene space in predatory marine gastropods.

    Science.gov (United States)

    Chang, Dan; Duda, Thomas F

    2014-06-05

    Predatory marine gastropods of the genus Conus exhibit substantial variation in venom composition both within and among species. Apart from mechanisms associated with extensive turnover of gene families and rapid evolution of genes that encode venom components ('conotoxins'), the evolution of distinct conotoxin expression patterns is an additional source of variation that may drive interspecific differences in the utilization of species' 'venom gene space'. To determine the evolution of expression patterns of venom genes of Conus species, we evaluated the expression of A-superfamily conotoxin genes of a set of closely related Conus species by comparing recovered transcripts of A-superfamily genes that were previously identified from the genomes of these species. We modified community phylogenetics approaches to incorporate phylogenetic history and disparity of genes and their expression profiles to determine patterns of venom gene space utilization. Less than half of the A-superfamily gene repertoire of these species is expressed, and only a few orthologous genes are coexpressed among species. Species exhibit substantially distinct expression strategies, with some expressing sets of closely related loci ('under-dispersed' expression of available genes) while others express sets of more disparate genes ('over-dispersed' expression). In addition, expressed genes show higher dN/dS values than either unexpressed or ancestral genes; this implies that expression exposes genes to selection and facilitates rapid evolution of these genes. Few recent lineage-specific gene duplicates are expressed simultaneously, suggesting that expression divergence among redundant gene copies may be established shortly after gene duplication. Our study demonstrates that venom gene space is explored differentially by Conus species, a process that effectively permits the independent and rapid evolution of venoms in these species.

  9. Gene expression and gene therapy imaging

    International Nuclear Information System (INIS)

    Rome, Claire; Couillaud, Franck; Moonen, Chrit T.W.

    2007-01-01

    The fast growing field of molecular imaging has achieved major advances in imaging gene expression, an important element of gene therapy. Gene expression imaging is based on specific probes or contrast agents that allow either direct or indirect spatio-temporal evaluation of gene expression. Direct evaluation is possible with, for example, contrast agents that bind directly to a specific target (e.g., receptor). Indirect evaluation may be achieved by using specific substrate probes for a target enzyme. The use of marker genes, also called reporter genes, is an essential element of MI approaches for gene expression in gene therapy. The marker gene may not have a therapeutic role itself, but by coupling the marker gene to a therapeutic gene, expression of the marker gene reports on the expression of the therapeutic gene. Nuclear medicine and optical approaches are highly sensitive (detection of probes in the picomolar range), whereas MRI and ultrasound imaging are less sensitive and require amplification techniques and/or accumulation of contrast agents in enlarged contrast particles. Recently developed MI techniques are particularly relevant for gene therapy. Amongst these are the possibility to track gene therapy vectors such as stem cells, and the techniques that allow spatiotemporal control of gene expression by non-invasive heating (with MRI guided focused ultrasound) and the use of temperature sensitive promoters. (orig.)

  10. Reference genes for gene expression studies in wheat flag leaves grown under different farming conditions

    Directory of Open Access Journals (Sweden)

    Cordeiro Raposo Fernando

    2011-09-01

    Abstract Background Internal control genes with highly uniform expression throughout the experimental conditions are required for accurate gene expression analysis, as no universal reference genes exist. In this study, the expression stability of 24 candidate genes from Triticum aestivum cv. Cubus flag leaves grown under organic and conventional farming systems was evaluated in two locations in order to select suitable genes that can be used for normalization of real-time quantitative reverse-transcription PCR (RT-qPCR) reactions. The genes were selected among the most commonly used reference genes as well as genes encoding proteins involved in several metabolic pathways. Findings Individual genes displayed different expression rates across all samples assayed. Applying geNorm, a set of three potential reference genes was found suitable for normalization of RT-qPCR reactions in winter wheat flag leaves cv. Cubus: TaFNRII (ferredoxin-NADP(H) oxidoreductase; AJ457980.1), ACT2 (actin 2; TC234027), and rrn26 (a putative homologue to RNA 26S gene; AL827977.1). In addition to these three genes, which were also top-ranked by NormFinder, two extra genes, CYP18-2 (Cyclophilin A, AY456122.1) and TaWIN1 (14-3-3 like protein, AB042193), were most consistently stably expressed. Furthermore, we showed that TaFNRII, ACT2, and CYP18-2 are suitable for gene expression normalization in two other winter wheat varieties (Tommi and Centenaire) grown under three treatments (organic, conventional and no nitrogen) and in a different environment than the one tested with cv. Cubus. Conclusions This study provides a new set of reference genes which should improve the accuracy of gene expression analyses when using wheat flag leaves, such as those related to the improvement of nitrogen use efficiency for cereal production.
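
    To make the notion of expression stability concrete, here is a minimal sketch of a geNorm-style stability measure M. It assumes the commonly described definition: for each gene, M is the mean, over all other candidate genes, of the standard deviation across samples of the log2 expression ratio between the two genes. The abstract does not spell out this computation, the geNorm tool additionally removes the least stable gene step by step, and the gene names and values below are made up for illustration.

```python
import math
from statistics import stdev


def genorm_m(expression):
    """Return a stability value M per gene (lower M = more stable).

    `expression` maps a gene name to a list of positive relative
    quantities, one per sample, in the same sample order for all genes.
    """
    genes = list(expression)
    m_values = {}
    for gene_j in genes:
        pairwise_variation = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            # Spread of the log2 ratio between the two genes across samples.
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene_j], expression[gene_k])
            ]
            pairwise_variation.append(stdev(log_ratios))
        m_values[gene_j] = sum(pairwise_variation) / len(pairwise_variation)
    return m_values


if __name__ == "__main__":
    # Hypothetical data: geneA and geneB move together, geneC does not.
    data = {
        "geneA": [1.0, 2.0, 4.0],
        "geneB": [2.0, 4.0, 8.0],
        "geneC": [1.0, 4.0, 1.0],
    }
    for gene, m in sorted(genorm_m(data).items(), key=lambda item: item[1]):
        print(gene, round(m, 3))
```

    A gene whose ratio to every other candidate stays constant across samples gets a low M, which is the sense in which the three reported genes were ranked as suitable for normalization.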

  11. Plant reference genes for development and stress response studies

    Indian Academy of Sciences (India)

    Many reference genes are used by different laboratories for gene expression analyses to indicate the relative amount of input RNA/DNA in the experiment. These reference genes are supposed to show least variation among the treatments and with the control sets in a given experiment. However, expression of reference ...

  12. Genomewide analysis of the lateral organ boundaries domain gene ...

    Indian Academy of Sciences (India)

    sion patterns of six LBD genes through quantitative real-time polymerase chain reaction analysis. The six LBD genes ... Keywords. genomewide analysis; lateral organ boundaries domain; gene family; stress; expression; Vitis vinifera. Journal of .... available from the NCBI were used with an e-value cut-off set to 1e-003 ...

  13. Maximal abelian sets of roots

    CERN Document Server

    Lawther, R

    2018-01-01

    In this work the author lets \Phi be an irreducible root system, with Coxeter group W. He considers subsets of \Phi which are abelian, meaning that no two roots in the set have sum in \Phi \cup \{ 0 \}. He classifies all maximal abelian sets (i.e., abelian sets properly contained in no other) up to the action of W: for each W-orbit of maximal abelian sets we provide an explicit representative X, identify the (setwise) stabilizer W_X of X in W, and decompose X into W_X-orbits. Abelian sets of roots are closely related to abelian unipotent subgroups of simple algebraic groups, and thus to abelian p-subgroups of finite groups of Lie type over fields of characteristic p. Parts of the work presented here have been used to confirm the p-rank of E_8(p^n), and (somewhat unexpectedly) to obtain for the first time the 2-ranks of the Monster and Baby Monster sporadic groups, together with the double cover of the latter. Root systems of classical type are dealt with quickly here; the vast majority of the present work con...

  15. Annotation-based feature extraction from sets of SBML models.

    Science.gov (United States)

    Alm, Rebekka; Waltemath, Dagmar; Wolfien, Markus; Wolkenhauer, Olaf; Henkel, Ron

    2015-01-01

    Model repositories such as BioModels Database provide computational models of biological systems for the scientific community. These models contain rich semantic annotations that link model entities to concepts in well-established bio-ontologies such as Gene Ontology. Consequently, thematically similar models are likely to share similar annotations. Based on this assumption, we argue that semantic annotations are a suitable tool to characterize sets of models. These characteristics improve model classification, allow the identification of additional features for model retrieval tasks, and enable the comparison of sets of models. In this paper we discuss four methods for annotation-based feature extraction from model sets. We tested all methods on sets of models in SBML format which were composed from BioModels Database. To characterize each of these sets, we analyzed and extracted concepts from three frequently used ontologies, namely Gene Ontology, ChEBI and SBO. We find that three out of the four methods are suitable to determine characteristic features for arbitrary sets of models: the selected features vary depending on the underlying model set, and they are also specific to the chosen model set. We show that the identified features map onto concepts that are higher up in the hierarchy of the ontologies than the concepts used for model annotations. Our analysis also reveals that the information content of concepts in ontologies and their usage for model annotation do not correlate. Annotation-based feature extraction enables the comparison of model sets, as opposed to existing methods for model-to-keyword comparison or model-to-model comparison.

  16. Variable-length haplotype construction for gene-gene interaction studies

    OpenAIRE

    Assawamakin, Anunchai; Chaiyaratana, Nachol; Limwongse, Chanin; Sinsomros, Saravudh; Yenchitsomanus, Pa-thai; Youngkong, Prakarnkiat

    2013-01-01

    This paper presents a non-parametric classification technique for identifying a candidate bi-allelic genetic marker set that best describes disease susceptibility in gene-gene interaction studies. The developed technique functions by creating a mapping between inferred haplotypes and case/control status. The technique cycles through all possible marker combination models generated from the available marker set where the best interaction model is determined from prediction accuracy and two aux...

  17. Imaging reporter gene for monitoring gene therapy

    International Nuclear Information System (INIS)

    Beco, V. de; Baillet, G.; Tamgac, F.; Tofighi, M.; Weinmann, P.; Vergote, J.; Moretti, J.L.; Tamgac, G.

    2002-01-01

    Scintigraphic images can be obtained to document gene function at the cellular level. This approach is presented here and the use of a reporter gene to monitor gene therapy is described. Two main ways are presented: either the use of a reporter gene coding for an enzyme, the action of which is monitored by a radiolabeled pro-drug, or a cellular receptor gene, the action of which is documented by a radiolabeled cognate receptor ligand. (author)

  18. Programming services with correlation sets

    DEFF Research Database (Denmark)

    Montesi, Fabrizio; Carbone, Marco

    2011-01-01

    Correlation sets define a powerful mechanism for routing incoming communications to the correct running session within a server, by inspecting the content of the received messages. We present a language for programming services based on correlation sets taking into account key aspects of service-oriented systems, such as distribution, loose coupling, open-endedness and integration. Distinguishing features of our approach are the notion of correlation aliases and an asynchronous communication model. Our language is equipped with formal syntax, semantics, and a typing system for ensuring desirable properties of programs with respect to correlation sets. We provide an implementation as an extension of the JOLIE language and apply it to a nontrivial real-world example of a fully-functional distributed user authentication system....
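
    The routing idea in the first sentence can be illustrated with a small sketch. This is not JOLIE syntax and not the authors' language; it is a hypothetical Python model in which a correlation set is a tuple of message field names, and a message is delivered to the session whose stored values for those fields match. The names `Router`, `Session`, `user` and `token` are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """A running session identified by the values of its correlation fields."""
    correlation: dict
    inbox: list = field(default_factory=list)


class Router:
    """Routes messages to sessions by inspecting message content."""

    def __init__(self, correlation_keys):
        # The correlation set: which message fields identify a session.
        self.correlation_keys = tuple(correlation_keys)
        self.sessions = {}

    def _key(self, message):
        # Raises KeyError if the message lacks a correlation field.
        return tuple(message[name] for name in self.correlation_keys)

    def start_session(self, message):
        """Create a session whose correlation values come from `message`."""
        key = self._key(message)
        session = Session(dict(zip(self.correlation_keys, key)))
        self.sessions[key] = session
        return session

    def route(self, message):
        """Deliver `message` to the session it correlates with."""
        session = self.sessions.get(self._key(message))
        if session is None:
            raise LookupError("no running session correlates with this message")
        session.inbox.append(message)
        return session


if __name__ == "__main__":
    router = Router(["user", "token"])
    router.start_session({"user": "alice", "token": "t1", "op": "login"})
    target = router.route({"user": "alice", "token": "t1", "op": "ping"})
    print(target.correlation, len(target.inbox))
```

    The sketch leaves out the features the abstract highlights, such as correlation aliases, asynchronous communication and the typing system.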

  19. Cosmic setting for chondrule formation

    International Nuclear Information System (INIS)

    Taylor, G.J.; Scott, E.R.D.; Keil, K.

    1983-01-01

    Chondrules are igneous-textured, millimeter-sized, spherical to irregularly-shaped silicate objects which constitute the major component of most chondrites. There is agreement that chondrules were once molten. Models for chondrule origin can be divided into two categories. One involves a planetary setting, which envisages chondrules forming on the surfaces of parent bodies. Melting mechanisms include impact and volcanism. The other category is concerned with a cosmic setting in the solar nebula, prior to nebula formation. Aspects regarding the impact on planetary surfaces are considered, taking into account chondrule abundances, the abundance of agglutinates on the moon, comminution, hypervelocity impact pits, questions of age, and chondrule compositions. Attention is also given to collisions during accretion, collisions between molten planetesimals, volcanism, and virtues of a nebular setting. 101 references

  20. Physical Quantities, Measurement Sets, Theories

    Science.gov (United States)

    Viallefond, F.

    2012-09-01

    A methodology is proposed to develop efficient, robust and expressive data models. The idea is to transform objects described using our human language into mathematical objects which can then be used efficiently in information systems. This is done using topological spaces and algebras to model data types. Technically it is implemented using parametric polymorphism. Two examples are shown, 1) a simple well known object, the physical quantities, and 2) a data-base object, the measurement sets which bind the measurements to their experimental contexts. This leads to theories. The result is high expressiveness by formulating equations and data base operations by means of λ calculi. The theory of the measurement set encapsulates the relational model. Using topoi it is a generalization, a category above the sets.

  1. Simple Comparative Analyses of Differentially Expressed Gene Lists May Overestimate Gene Overlap.

    Science.gov (United States)

    Lawhorn, Chelsea M; Schomaker, Rachel; Rowell, Jonathan T; Rueppell, Olav

    2018-04-16

    Comparing the overlap between sets of differentially expressed genes (DEGs) within or between transcriptome studies is regularly used to infer similarities between biological processes. Significant overlap between two sets of DEGs is usually determined by a simple test. The number of potentially overlapping genes is compared to the number of genes that actually occur in both lists, treating every gene as equal. However, gene expression is controlled by transcription factors that bind to a variable number of transcription factor binding sites, leading to variation among genes in general variability of their expression. Neglecting this variability could therefore lead to inflated estimates of significant overlap between DEG lists. With computer simulations, we demonstrate that such biases arise from variation in the control of gene expression. Significant overlap commonly arises between two lists of DEGs that are randomly generated, assuming that the control of gene expression is variable among genes but consistent between corresponding experiments. More overlap is observed when transcription factors are specific to their binding sites and when the number of genes is considerably higher than the number of different transcription factors. In contrast, overlap between two DEG lists is always lower than expected when the genetic architecture of expression is independent between the two experiments. Thus, the current methods for determining significant overlap between DEGs are potentially confounding biologically meaningful overlap with overlap that arises due to variability in control of expression among genes, and more sophisticated approaches are needed.
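
    The "simple test" described above is often implemented as a hypergeometric (one-sided Fisher) test that treats every gene as equally likely to appear in a list. The abstract does not name the test, so the sketch below is an assumption about the usual practice, and the numbers in the usage example are invented.

```python
from math import comb


def overlap_p_value(n_genes: int, size_a: int, size_b: int, overlap: int) -> float:
    """Probability of seeing at least `overlap` shared genes when list B
    (size_b genes) is drawn uniformly at random from n_genes genes, of
    which size_a belong to list A. Every gene is treated as equal."""
    max_overlap = min(size_a, size_b)
    total = comb(n_genes, size_b)
    tail = sum(
        comb(size_a, k) * comb(n_genes - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical study: 10,000 genes, two DEG lists of 500 and 400 genes
    # sharing 40 genes; 500 * 400 / 10000 = 20 would be expected by chance.
    print(overlap_p_value(10_000, 500, 400, 40))
```

    The equal-probability assumption built into this calculation is exactly what the abstract criticizes: if some genes are inherently more variable in expression, they enter both lists more often than the model expects, and the resulting p-value overstates the significance of the overlap.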

  2. Selection of Phototransduction Genes in Homo sapiens.

    Science.gov (United States)

    Christopher, Mark; Scheetz, Todd E; Mullins, Robert F; Abràmoff, Michael D

    2013-08-13

    We investigated the evidence of recent positive selection in the human phototransduction system at single nucleotide polymorphism (SNP) and gene level. SNP genotyping data from the International HapMap Project for European, Eastern Asian, and African populations was used to discover differences in haplotype length and allele frequency between these populations. Numeric selection metrics were computed for each SNP and aggregated into gene-level metrics to measure evidence of recent positive selection. The level of recent positive selection in phototransduction genes was evaluated and compared to a set of genes shown previously to be under recent selection, and a set of highly conserved genes as positive and negative controls, respectively. Six of 20 phototransduction genes evaluated had gene-level selection metrics above the 90th percentile: RGS9, GNB1, RHO, PDE6G, GNAT1, and SLC24A1. The selection signal across these genes was found to be of similar magnitude to the positive control genes and much greater than the negative control genes. There is evidence for selective pressure in the genes involved in retinal phototransduction, and traces of this selective pressure can be demonstrated using SNP-level and gene-level metrics of allelic variation. We hypothesize that the selective pressure on these genes was related to their role in low light vision and retinal adaptation to ambient light changes. Uncovering the underlying genetics of evolutionary adaptations in phototransduction not only allows greater understanding of vision and visual diseases, but also the development of patient-specific diagnostic and intervention strategies.

  3. SHAREPOINT SITE CREATING AND SETTING

    Directory of Open Access Journals (Sweden)

    Oleksandr V. Tebenko

    2011-02-01

    Full Text Available Tools for building sites that let users work together are a topical subject in the information society and in modern Web technologies. This article considers the SharePoint system, which makes it possible to create sites of any complexity, including large portals with a complex document structure. The purpose of the article is to cover the main points of creating and configuring a site with SharePoint tools, namely: creating and configuring a site template, setting up the web application environment used to create and configure Web applications, changing an existing site theme or creating a new one, and configuring a web part.

  4. SITE DESIGN SETTING IN SHAREPOINT

    Directory of Open Access Journals (Sweden)

    Oleksii V. Tebenko

    2010-10-01

    Full Text Available Creating and promoting a site is one way to implement ICT in education. To build modern sites and large portals, a dynamic page-building model is used, which avoids reworking content creation and management tools each time new content is added. The article deals with the means and methods of dynamic page templates in SharePoint. Its purpose is to analyze the key SharePoint components for dynamic pages, such as setting and changing master pages, the standard types of master page placeholders, configuring content aggregation, and the standard types of SharePoint placeholders.

  5. Setting to earth for computer

    International Nuclear Information System (INIS)

    Gallego V, Luis Eduardo; Montana Ch, Johny Hernan; Tovar P, Andres Fernando; Amortegui, Francisco

    2000-01-01

    The program GMT analyzes grounding (earthing) systems for DC and low-frequency AC voltages, for various configurations of interconnected cylindrical electrodes in homogeneous or stratified (two-layer) soil. Among other aspects, the analysis covers: calculation of the grounding resistance, the ground potential rise (GPR) of the system, current densities in the conductors, potentials at any point on the soil surface (profiles and surfaces), and step and touch voltages. It also interprets resistivity measurements made with the Wenner and Schlumberger methods, finding a two-layer model.
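
For the Wenner measurements mentioned above, the textbook apparent-resistivity relation for four equally spaced electrodes is rho_a = 2 * pi * a * R, valid when the electrode depth is small compared with the spacing a. The sketch below only evaluates that formula. It is not code from the GMT program; the function name and sample readings are hypothetical, and fitting the two-layer model is not shown.

```python
from math import pi


def wenner_apparent_resistivity(spacing_m: float, resistance_ohm: float) -> float:
    """Apparent soil resistivity in ohm-metres for a Wenner array."""
    if spacing_m <= 0:
        raise ValueError("electrode spacing must be positive")
    return 2.0 * pi * spacing_m * resistance_ohm


if __name__ == "__main__":
    # Hypothetical field readings: (electrode spacing in m, measured resistance in ohm).
    readings = [(1.0, 15.9), (2.0, 6.4), (4.0, 2.4)]
    for spacing, resistance in readings:
        rho = wenner_apparent_resistivity(spacing, resistance)
        print(f"a = {spacing} m -> apparent resistivity = {rho:.1f} ohm-m")
```

An apparent resistivity that changes with spacing is what suggests a layered soil and motivates fitting a two-layer model.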

  6. Recommendation Sets and Choice Queries

    DEFF Research Database (Denmark)

    Viappiani, Paolo Renato; Boutilier, Craig

    2011-01-01

    Utility elicitation is an important component of many applications, such as decision support systems and recommender systems. Such systems query users about their preferences and offer recommendations based on the system's belief about the user's utility function. We analyze the connection between...... the problem of generating optimal recommendation sets and the problem of generating optimal choice queries, considering both Bayesian and regret-based elicitation. Our results show that, somewhat surprisingly, under very general circumstances, the optimal recommendation set coincides with the optimal query....

  7. GeneBase 1.1: a tool to summarize data from NCBI gene datasets and its application to an update of human gene statistics.

    Science.gov (United States)

    Piovesan, Allison; Caracausi, Maria; Antonaros, Francesca; Pelleri, Maria Chiara; Vitale, Lorenza

    2016-01-01

    We release GeneBase 1.1, a local tool with a graphical interface useful for parsing, structuring and indexing data from the National Center for Biotechnology Information (NCBI) Gene data bank. Compared to its predecessor GeneBase (1.0), GeneBase 1.1 now allows dynamic calculation and summarization in terms of median, mean, standard deviation and total for many quantitative parameters associated with genes, gene transcripts and gene features (exons, introns, coding sequences, untranslated regions). GeneBase 1.1 thus offers the opportunity to perform analyses of the main gene structure parameters also following the search for any set of genes with the desired characteristics, allowing unique functionalities not provided by the NCBI Gene itself. In order to show the potential of our tool for local parsing, structuring and dynamic summarizing of publicly available databases for data retrieval, analysis and testing of biological hypotheses, we provide as a sample application a revised set of statistics for human nuclear genes, gene transcripts and gene features. In contrast with previous estimations strongly underestimating the length of human genes, a 'mean' human protein-coding gene is 67 kbp long, has eleven 309 bp long exons and ten 6355 bp long introns. Median, mean and extreme values are provided for many other features offering an updated reference source for human genome studies, data useful to set parameters for bioinformatic tools and interesting clues to the biomedical meaning of the gene features themselves. Database URL: http://apollo11.isto.unibo.it/software/. © The Author(s) 2016. Published by Oxford University Press.
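
The quoted "mean" gene figures can be cross-checked with simple arithmetic: eleven exons of 309 bp plus ten introns of 6355 bp should land near the stated 67 kbp. The sketch below does only that check with the numbers from the abstract; the variable and function names are illustrative and have nothing to do with GeneBase itself.

```python
# Figures quoted in the abstract for a 'mean' human protein-coding gene.
EXON_COUNT = 11
EXON_LENGTH_BP = 309
INTRON_COUNT = 10
INTRON_LENGTH_BP = 6355


def gene_length_bp(exons: int, exon_len: int, introns: int, intron_len: int) -> int:
    """Total length of a gene built from equally sized exons and introns."""
    return exons * exon_len + introns * intron_len


total_bp = gene_length_bp(EXON_COUNT, EXON_LENGTH_BP, INTRON_COUNT, INTRON_LENGTH_BP)
exonic_fraction = EXON_COUNT * EXON_LENGTH_BP / total_bp

print(f"total length: {total_bp} bp (about {round(total_bp / 1000)} kbp)")
print(f"exonic fraction: {exonic_fraction:.1%}")
```

The sum comes to 66,949 bp, consistent with the rounded 67 kbp, and shows that introns account for roughly 95% of that length.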

  8. Conversational Competence in Academic Settings

    Science.gov (United States)

    Bowman, Richard F.

    2014-01-01

    Conversational competence is a process, not a state. Ithaca does not exist, only the voyage to Ithaca. Vibrant campuses are a series of productive conversations. At its core, communicative competence in academic settings mirrors a collective search for meaning regarding the purpose and direction of a campus community. Communicative competence…

  9. Promoting Literacy in Multilingual Settings

    Science.gov (United States)

    Kosonen, Kimmo; Young, Catherine; Malone, Susan

    2006-01-01

    This compilation of resource papers and findings is from a regional workshop on mother-tongue/bilingual literacy programmes for ethnic and linguistic minorities in multilingual settings. It was organized by Asia-Pacific Programme of Education for All (APPEAL), United Nations Educational and Cultural Organization (UNESCO) Bangkok, 6-10 December…

  10. Repeated Interaction in Standard Setting

    NARCIS (Netherlands)

    Larouche, Pierre; Schütt, Florian

    2016-01-01

    As part of the standard-setting process, certain patents become essential. This may allow the owners of these standard-essential patents to hold up implementers of the standard, who can no longer turn to substitute technologies. However, many real-world standards evolve over time, with several

  11. Some remarks on good sets

    Indian Academy of Sciences (India)


    Nadkarni M G and Bhaskar Rao K P S, When is f(x1, ..., xn) = u1(x1) + ... + un(xn)?, Proc. Indian Acad. Sci. (Math. Sci.) 113 (2003) 77–86. [3] Klopotowski A, Nadkarni M G and Bhaskar Rao K P S, Geometry of good sets in n-fold Cartesian products, Proc. Indian Acad. Sci. (Math.

  12. The Value of Value Sets

    DEFF Research Database (Denmark)

    Sløk-Madsen, Stefan Kirkegaard; Christensen, Jesper

    The world over, business school classes are taught that corporate values can impact performance. The argument is typically that culture matters more than strategy plans, and that culture can be influenced and indeed changed by a shared corporate value set. While the claim seems intuitively...

  13. Neuropsychological Counseling in Hospital Settings.

    Science.gov (United States)

    Larson, Paul C.

    1992-01-01

    Explores integration of counseling psychology and neuropsychology in hospital setting. Sees example of such interchange occurring in rehabilitation unit or hospital where psychologist has responsibilities for helping patients, families, and staff to understand implications of central nervous system dysfunction and to adapt to changes. Discusses…

  14. Behavior Management in Afterschool Settings

    Science.gov (United States)

    Mahoney, Joseph L.

    2014-01-01

    Although behavioral management is one of the most challenging aspects of working in an afterschool setting, staff do not typically receive formal training in evidence-based approaches to handling children's behavior problems. Common approaches to behavioral management such as punishment or time-out are temporary solutions because they do not…

  15. False set in aireated cements

    Directory of Open Access Journals (Sweden)

    Vázquez, T.

    1986-06-01

    Full Text Available The influence of aeration on the appearance or elimination of false setting in industrial portland cements is studied by means of infrared spectroscopy.


  16. Test set for IVP solvers

    NARCIS (Netherlands)

    W.M. Lioen (Walter); J.J.B. de Swart (Jacques); W.A. van der Veen

    1996-01-01

    In this paper a collection of Initial Value test Problems for systems of Ordinary Differential Equations, Implicit Differential Equations and Differential-Algebraic Equations is presented. This test set is maintained by the project group for Parallel IVP Solvers of CWI, department of

  17. The IRI marketing data set

    NARCIS (Netherlands)

    Bronnenberg, B.J.; Kruger, M.W.; Mela, C.

    2008-01-01

    This paper describes a new data set available to academic researchers (at the following website: http://mktsci.pubs.informs.org). These data are comprised of store sales and consumer panel data for 30 product categories. The store sales data contain 5 years of product sales, pricing, and promotion

  18. Word graphs: The second set

    NARCIS (Netherlands)

    Hoede, C.; Liu, X

    1998-01-01

    In continuation of the paper of Hoede and Li on word graphs for a set of prepositions, word graphs are given for adjectives, adverbs and Chinese classifier words. It is argued that these three classes of words belong to a general class of words that may be called adwords. These words express the

  19. Health promotion and prison settings.

    Science.gov (United States)

    Santora, Lidia; Arild Espnes, Geir; Lillefjell, Monica

    2014-01-01

    The purpose of this paper is to examine the contribution of modern correctional service in health promotion exemplified by the case study of Norwegian health promotion policies in prison settings. This paper applies a two-fold methodology. First a narrative systematic literature review based on the Norwegian policy documents relevant for correctional settings is conducted. This is followed by a general review of the literature on the principles of humane service delivery in offender rehabilitation. Alongside the contribution of the Risk-Need-Responsivity Model in corrections and prevention of reoffending, the findings demonstrate an evident involvement of Norway in health promotion through authentic health promoting actions applied in prison settings. The actions are anchored in health policy's overarching goals of equity and "health in all public policy" aiming to reduce social inequalities in population health. In order to achieve a potential success of promoting health in correctional settings, policy makers have much to gain from endorsing a dialogue that respects the unique contributions of correctional research and health promotion. Focussing on inter-agency partnership and interdisciplinary collaboration between humane services may result in promising outcomes for individual, community and public health gain. The organizational factors and community involvement may be a significant aspect in prisoner rehabilitation, reentry and reintegration.

  20. Guidelines for setting speed limits

    CSIR Research Space (South Africa)

    Wium, DJW

    1986-02-01

    Full Text Available A method is described for setting the speed limit for a particular road section. Several speed limits based on different criteria are described for each of nine traffic and road factors. The most appropriate speed limit for each relevant factor...

  1. Abstraction by Set-Membership

    DEFF Research Database (Denmark)

    Mödersheim, Sebastian Alexander

    2010-01-01

    that the set of true facts does not monotonically grow with the transitions. We extend the scope of these over-approximation methods by defining a new way of abstraction that can handle such databases, and we formally prove that the abstraction is sound. We realize a translator from a convenient specification...

  2. Fuzzy-Set Case Studies

    Science.gov (United States)

    Mikkelsen, Kim Sass

    2017-01-01

    Contemporary case studies rely on verbal arguments and set theory to build or evaluate theoretical claims. While existing procedures excel in the use of qualitative information (information about kind), they ignore quantitative information (information about degree) at central points of the analysis. Effectively, contemporary case studies rely on…

  3. Parameter setting and input reduction

    NARCIS (Netherlands)

    Evers, A.; van Kampen, N.J.

    2008-01-01

    The language acquisition procedure identifies certain properties of the target grammar before others. The evidence from the input is processed in a stepwise order. Section 1 equates that order and its typical effects with an order of parameter setting. The question is how the acquisition procedure

  4. Multiclass gene selection using Pareto-fronts.

    Science.gov (United States)

    Rajapakse, Jagath C; Mundra, Piyushkumar A

    2013-01-01

    Filter methods are often used for selection of genes in multiclass sample classification by using microarray data. Such techniques usually tend to bias toward a few classes that are easily distinguishable from other classes due to imbalances of strong features and sample sizes of different classes. It could therefore lead to selection of redundant genes while missing the relevant genes, leading to poor classification of tissue samples. In this manuscript, we propose to decompose multiclass ranking statistics into class-specific statistics and then use Pareto-front analysis for selection of genes. This alleviates the bias induced by class intrinsic characteristics of dominating classes. The use of Pareto-front analysis is demonstrated on two filter criteria commonly used for gene selection: F-score and KW-score. A significant improvement in classification performance and reduction in redundancy among top-ranked genes were achieved in experiments with both synthetic and real-benchmark data sets.
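
The Pareto-front idea can be sketched as follows: each gene gets one score per class, and a gene is kept if no other gene is at least as good for every class and strictly better for at least one. This is a minimal illustration of non-dominated selection only. It does not reproduce the paper's decomposition of the F-score or KW-score, and the gene names and scores are invented.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if score vector a is no worse than b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Genes whose class-specific score vectors are not dominated by any other gene."""
    return [
        gene
        for gene, vec in scores.items()
        if not any(dominates(other, vec) for name, other in scores.items() if name != gene)
    ]


if __name__ == "__main__":
    # Hypothetical class-specific scores for a three-class problem (higher is better).
    class_scores = {
        "geneA": (9.0, 1.0, 1.0),  # strong for class 1 only
        "geneB": (1.0, 8.0, 2.0),  # strong for class 2 only
        "geneC": (2.0, 2.0, 7.0),  # strong for class 3 only
        "geneD": (8.0, 0.5, 0.5),  # redundant with geneA
        "geneE": (1.0, 1.0, 1.0),  # weak everywhere
    }
    print(pareto_front(class_scores))
```

A single aggregated ranking would likely place geneA and geneD at the top because class 1 is easy to separate; the front instead keeps one strong gene per class and drops the redundant one.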

  5. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.
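
As a rough illustration of rate covariation, the sketch below computes a Pearson correlation between the branch-specific relative rates of two genes. Treating ERC as a plain correlation coefficient over per-branch rates is an assumption made here for illustration; the abstract does not give the formula, and the rate values are invented.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical relative evolutionary rates on the same five tree branches.
    gene_a = [0.8, 1.2, 1.0, 1.5, 0.6]
    gene_b = [0.9, 1.1, 1.0, 1.4, 0.7]  # speeds up and slows down with gene_a
    gene_c = [1.3, 0.7, 1.0, 0.6, 1.4]  # moves in the opposite direction
    print("A vs B:", round(pearson(gene_a, gene_b), 3))
    print("A vs C:", round(pearson(gene_a, gene_c), 3))
```

Under this reading, candidate genes in a disease region would be ranked by how strongly their rates correlate with those of known disease genes.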

  6. Bioconductor's EnrichmentBrowser: seamless navigation through combined results of set- & network-based enrichment analysis.

    Science.gov (United States)

    Geistlinger, Ludwig; Csaba, Gergely; Zimmer, Ralf

    2016-01-20

    Enrichment analysis of gene expression data is essential to find functional groups of genes whose interplay can explain experimental observations. Numerous methods have been published that either ignore (set-based) or incorporate (network-based) known interactions between genes. However, the often subtle benefits and disadvantages of the individual methods are confusing for most biological end users and there is currently no convenient way to combine methods for an enhanced result interpretation. We present the EnrichmentBrowser package as an easily applicable software that enables (1) the application of the most frequently used set-based and network-based enrichment methods, (2) their straightforward combination, and (3) a detailed and interactive visualization and exploration of the results. The package is available from the Bioconductor repository and implements additional support for standardized expression data preprocessing, differential expression analysis, and definition of suitable input gene sets and networks. The EnrichmentBrowser package implements essential functionality for the enrichment analysis of gene expression data. It combines the advantages of set-based and network-based enrichment analysis in order to derive high-confidence gene sets and biological pathways that are differentially regulated in the expression data under investigation. Besides, the package facilitates the visualization and exploration of such sets and pathways.

  7. How Genes Evolve

    Indian Academy of Sciences (India)

    which they are found, e.g. the evolution of the gene coding for the protein cytochrome-C, which is part of the respiratory apparatus. On the contrary, paralogous genes are descendants of a duplicated gene. Paralogous genes therefore evolve within the same species as well as in different species, e.g. genes coding for alpha ...

  8. Gene Expression Profile Analysis is Directly Affected by the Selected Reference Gene: The Case of Leaf-Cutting Atta Sexdens

    Directory of Open Access Journals (Sweden)

    Kalynka G. do Livramento

    2018-02-01

    Full Text Available Although several ant species are important targets for the development of molecular control strategies, only a few studies focus on identifying and validating reference genes for quantitative reverse transcription polymerase chain reaction (RT-qPCR) data normalization. We provide here an extensive study to identify and validate suitable reference genes for gene expression analysis in the ant Atta sexdens, a threatening agricultural pest in South America. The optimal number of reference genes varied with each sample, and the results generated by RefFinder differed as to which reference gene is the most suitable. Results suggest that the RPS16, NADH and SDHB genes were the best reference genes in the sample pool according to stability values. The SNF7 gene expression pattern was stable in all evaluated sample sets. In contrast, when less stable reference genes were used for normalization, a large variability in SNF7 gene expression was recorded. There is no universal reference gene suitable for all conditions under analysis, since these genes can also participate in different cellular functions, thus requiring a systematic validation of possible reference genes for each specific condition. The effect of reference gene choice on SNF7 normalization confirmed that unstable reference genes might drastically change the expression profile analysis of target candidate genes.
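
To illustrate why the reference gene matters, the sketch below applies the widely used 2^-ΔΔCt calculation, assuming 100% amplification efficiency. The Ct values are invented for illustration and are not data from the study, and the abstract does not state which quantification formula the authors used.

```python
def fold_change(
    ct_target_control: float,
    ct_ref_control: float,
    ct_target_treated: float,
    ct_ref_treated: float,
) -> float:
    """Relative expression of the target gene (treated vs control) by 2^-ddCt."""
    delta_control = ct_target_control - ct_ref_control
    delta_treated = ct_target_treated - ct_ref_treated
    return 2.0 ** -(delta_treated - delta_control)


# Same hypothetical target-gene readings in both cases: Ct 25 (control), Ct 23 (treated).
# Stable reference gene: its Ct stays at 18 in both conditions.
stable = fold_change(25.0, 18.0, 23.0, 18.0)
# Unstable reference gene: its own Ct drifts from 18 to 16 under treatment.
unstable = fold_change(25.0, 18.0, 23.0, 16.0)

print("fold change with stable reference:", stable)
print("fold change with unstable reference:", unstable)
```

With the stable reference the target appears 4-fold up-regulated; with the drifting reference the same readings give a fold change of 1, so the apparent regulation disappears.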

  9. Gene Ontology based housekeeping gene selection for RNA-seq normalization.

    Science.gov (United States)

    Chen, Chien-Ming; Lu, Yu-Lun; Sio, Chi-Pong; Wu, Guan-Chung; Tzou, Wen-Shyong; Pai, Tun-Wen

    2014-06-01

    RNA-seq analysis provides a powerful tool for revealing relationships between gene expression level and biological function of proteins. In order to identify differentially expressed genes among various RNA-seq datasets obtained from different experimental designs, an appropriate normalization method for calibrating multiple experimental datasets is the first challenging problem. We propose a novel method to help biologists select a set of suitable housekeeping genes for inter-sample normalization. The approach adopts user-defined, experimentally related keywords, GO annotations, GO term distance matrices, orthologous housekeeping gene candidates, and stability ranking of housekeeping genes. By identifying the GO terms most distant from the query keywords and selecting housekeeping gene candidates with low coefficients of variation among different spatio-temporal datasets, the proposed method can automatically enumerate a set of functionally irrelevant housekeeping genes for practical normalization. Novel and benchmark RNA-seq datasets were used to demonstrate that different selections of housekeeping genes have a strong impact on differential gene expression analysis, and the compared results show that our proposed method outperformed other traditional approaches in terms of both sensitivity and specificity. The proposed mechanism of selecting appropriate housekeeping genes for inter-dataset normalization is robust and accurate for differential expression analyses. Copyright © 2014 Elsevier Inc. All rights reserved.
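
The stability-ranking step can be illustrated by ordering candidate genes by their coefficient of variation (standard deviation divided by mean) across datasets. The sketch below covers only that step, not the keyword and GO-distance filtering, and the gene names and expression values are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Population standard deviation divided by the mean."""
    centre = mean(values)
    if centre == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(values) / centre


def rank_by_stability(expression: Dict[str, Sequence[float]]) -> List[str]:
    """Gene names ordered from most stable (lowest CV) to least stable."""
    return sorted(expression, key=lambda gene: coefficient_of_variation(expression[gene]))


if __name__ == "__main__":
    # Hypothetical normalized expression of three candidates across four datasets.
    candidates = {
        "hk1": [100.0, 102.0, 98.0, 101.0],
        "hk2": [50.0, 80.0, 20.0, 60.0],
        "hk3": [10.0, 10.0, 10.0, 10.0],
    }
    for gene in rank_by_stability(candidates):
        print(gene, round(coefficient_of_variation(candidates[gene]), 4))
```

Dividing by the mean makes genes with very different expression levels comparable, which a raw standard deviation would not.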

  10. Active set support vector regression.

    Science.gov (United States)

    Musicant, David R; Feinberg, Alexander

    2004-03-01

    This paper presents active set support vector regression (ASVR), a new active set strategy to solve a straightforward reformulation of the standard support vector regression problem. This new algorithm is based on the successful ASVM algorithm for classification problems, and consists of solving a finite number of linear equations with a typically large dimensionality equal to the number of points to be approximated. However, by making use of the Sherman-Morrison-Woodbury formula, a much smaller matrix of the order of the original input space is inverted at each step. The algorithm requires no specialized quadratic or linear programming code, but merely a linear equation solver which is publicly available. ASVR is extremely fast, produces comparable generalization error to other popular algorithms, and is available on the web for download.
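
The saving attributed to the Sherman-Morrison-Woodbury formula can be made concrete with the identity itself. The general form is standard; the specialization below assumes a system matrix of the form I/ν + H Hᵀ, with H an m × k matrix (m data points, k on the order of the input dimension) and ν > 0. These symbols are assumptions chosen for illustration and do not appear in the abstract.

```latex
% General Sherman-Morrison-Woodbury identity
(A + U C V)^{-1} = A^{-1} - A^{-1} U \left( C^{-1} + V A^{-1} U \right)^{-1} V A^{-1}

% Special case A = I_m/\nu, U = H, C = I_k, V = H^{\top}, with H of size m \times k and k \ll m
\left( \tfrac{1}{\nu} I_m + H H^{\top} \right)^{-1}
  = \nu \left( I_m - H \left( \tfrac{1}{\nu} I_k + H^{\top} H \right)^{-1} H^{\top} \right)
```

The left-hand side inverts an m × m matrix, while the right-hand side only inverts a k × k matrix, which is the reduction from "number of points" to "order of the input space" described above.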

  11. Hypergraphs combinatorics of finite sets

    CERN Document Server

    Berge, C

    1989-01-01

    Graph Theory has proved to be an extremely useful tool for solving combinatorial problems in such diverse areas as Geometry, Algebra, Number Theory, Topology, Operations Research and Optimization. It is natural to attempt to generalise the concept of a graph, in order to attack additional combinatorial problems. The idea of looking at a family of sets from this standpoint took shape around 1960. In regarding each set as a "generalised edge" and in calling the family itself a "hypergraph", the initial idea was to try to extend certain classical results of Graph Theory such as the theorems of Turán and König. It was noticed that this generalisation often led to simplification; moreover, one single statement, sometimes remarkably simple, could unify several theorems on graphs. This book presents what seems to be the most significant work on hypergraphs.

  12. Network Attack Reference Data Set

    Science.gov (United States)

    2004-12-01

    fingerprinting tools include QueSO [10] (literally translates to “what OS”) and nmap [11]; however, there are a number of additional tools available for... Network Attack Reference Data Set. J. McKenna and J. Treurniet, Defence R&D Canada - Ottawa. TECHNICAL...

  13. Less than a Class Set

    Science.gov (United States)

    Bennett, Kristin Redington

    2012-01-01

    The iPad holds amazing potential for classroom use. Just a few--or even only one--is enough to get results. Having a class set promotes traditional, whole-class instruction, but fewer iPads facilitate individualized and tailored instruction. In this article, the author discusses the potential of the iPad and suggests ways to put the iPad to use in…

  14. Food systems in correctional settings

    DEFF Research Database (Denmark)

    Smoyer, Amy; Kjær Minke, Linda

    management of food systems may improve outcomes for incarcerated people and help correctional administrators to maximize their health and safety. This report summarizes existing research on food systems in correctional settings and provides examples of food programmes in prison and remand facilities......, including a case study of food-related innovation in the Danish correctional system. It offers specific conclusions for policy-makers, administrators of correctional institutions and prison-food-service professionals, and makes proposals for future research....

  15. Set Reordering for Paletted Data

    KAUST Repository

    Schneider, Jens

    2011-03-01

    We present a novel method to recycle bits of paletted data sets. We exploit the fact that the codebook of such data can be reordered without affecting the content. Enumerating all possible permutations of N codebook entries yields an additional O(N log2 N) bits that can be used without storage overhead for the lossless encoding of a limited amount of tags, meta-information, or part of the actual data. © 2011 IEEE.
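
One way to picture the idea: a palette of N distinct entries has N! possible orderings, so choosing a particular ordering can carry about log2(N!) bits. The sketch below hides an integer tag in the palette order using the factorial number system and remaps the pixel indices so the decoded image is unchanged. It illustrates the principle only and is not the authors' algorithm; all function names are hypothetical.

```python
from math import factorial
from typing import List, Sequence, Tuple

Color = Tuple[int, int, int]


def capacity_bits(n: int) -> int:
    """Whole bits that an ordering of n distinct entries can carry: floor(log2(n!))."""
    return factorial(n).bit_length() - 1


def encode(palette: Sequence[Color], tag: int) -> List[Color]:
    """Reorder the palette so that its ordering encodes tag (0 <= tag < n!)."""
    pool = sorted(palette)
    n = len(pool)
    if len(set(pool)) != n or not 0 <= tag < factorial(n):
        raise ValueError("palette entries must be distinct and tag must be below n!")
    ordered = []
    for remaining in range(n, 0, -1):
        index, tag = divmod(tag, factorial(remaining - 1))
        ordered.append(pool.pop(index))
    return ordered


def decode(ordered: Sequence[Color]) -> int:
    """Recover the tag from the palette ordering."""
    pool = sorted(ordered)
    tag = 0
    for position, entry in enumerate(ordered):
        index = pool.index(entry)
        tag += index * factorial(len(ordered) - 1 - position)
        pool.pop(index)
    return tag


def remap(pixels: Sequence[int], old: Sequence[Color], new: Sequence[Color]) -> List[int]:
    """Rewrite pixel indices so they point at the same colors in the new palette."""
    position = {color: i for i, color in enumerate(new)}
    return [position[old[p]] for p in pixels]


if __name__ == "__main__":
    palette = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (0, 0, 0)]
    pixels = [0, 1, 2, 3, 0]
    tagged = encode(palette, 13)
    print("capacity:", capacity_bits(len(palette)), "bits")
    print("tag read back:", decode(tagged))
    print("new indices:", remap(pixels, palette, tagged))
```

Because only the order changes, the palette and index data take the same space as before. The capacity grows as O(N log2 N) since log2(N!) is approximately N log2 N for large N.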

  16. Compositional models for credal sets

    Czech Academy of Sciences Publication Activity Database

    Vejnarová, Jiřina

    2017-01-01

    Vol. 90, No. 1 (2017), pp. 359-373. ISSN 0888-613X. R&D Projects: GA ČR(CZ) GA16-12010S. Institutional support: RVO:67985556. Keywords: Imprecise probabilities * Credal sets * Multidimensional models * Conditional independence. Subject RIV: BA - General Mathematics. OECD field: Pure mathematics. Impact factor: 2.845, year: 2016. http://library.utia.cas.cz/separaty/2017/MTR/vejnarova-0483288.pdf

  17. Scale setting in lattice QCD

    Energy Technology Data Exchange (ETDEWEB)

    Sommer, Rainer [DESY, Zeuthen (Germany). John von Neumann-Inst. fuer Computing NIC

    2014-02-15

    The principles of scale setting in lattice QCD as well as the advantages and disadvantages of various commonly used scales are discussed. After listing criteria for good scales, I concentrate on the main presently used ones with an emphasis on scales derived from the Yang-Mills gradient flow. For these I discuss discretisation errors, statistical precision and mass effects. A short review on numerical results also brings me to an unpleasant disagreement which remains to be explained.

  18. Sequencing and Gene Expression Analysis of Leishmania tropica LACK Gene.

    Science.gov (United States)

    Hammoudeh, Nour; Kweider, Mahmoud; Abbady, Abdul-Qader; Soukkarieh, Chadi

    2014-01-01

    Leishmania Homologue of receptors for Activated C Kinase (LACK) antigen is a 36-kDa protein, which provokes a very early immune response against Leishmania infection. There are several reports on the expression of LACK through different life-cycle stages of genus Leishmania, but only a few of them have focused on L. tropica. The present study provides details of the cloning, DNA sequencing and gene expression of LACK in this parasite species. First, several local isolates of Leishmania parasites were typed in our laboratory using PCR to verify the Leishmania parasite species. After that, the LACK gene was amplified and cloned into a vector for sequencing. Finally, the expression of this molecule in logarithmic and stationary growth phase promastigotes, as well as in amastigotes, was evaluated by Reverse Transcription-PCR (RT-PCR). The typing result confirmed that all our local isolates belong to L. tropica. The LACK gene sequence was determined and high similarity was observed with the sequences of other Leishmania species. Furthermore, the expression of the LACK gene in both promastigote and amastigote forms was confirmed. Overall, the data set the stage for future studies of the properties and immune role of LACK gene products.

  19. A course on Borel sets

    CERN Document Server

    Srivastava, S M

    1998-01-01

    The roots of Borel sets go back to the work of Baire [8]. He was trying to come to grips with the abstract notion of a function introduced by Dirichlet and Riemann. According to them, a function was to be an arbitrary correspondence between objects without giving any method or procedure by which the correspondence could be established. Since all the specific functions that one studied were determined by simple analytic expressions, Baire delineated those functions that can be constructed starting from continuous functions and iterating the operation of pointwise limit on a sequence of functions. These functions are now known as Baire functions. Lebesgue [65] and Borel [19] continued this work. In [19], Borel sets were defined for the first time. In his paper, Lebesgue made a systematic study of Baire functions and introduced many tools and techniques that are used even today. Among other results, he showed that Borel functions coincide with Baire functions. The study of Borel sets got an impetus from...

  20. Introduction to axiomatic set theory

    CERN Document Server

    Takeuti, Gaisi

    1971-01-01

    In 1963, the first author introduced a course in set theory at the University of Illinois whose main objectives were to cover Gödel's work on the consistency of the axiom of choice (AC) and the generalized continuum hypothesis (GCH), and Cohen's work on the independence of AC and the GCH. Notes taken in 1963 by the second author were then taught by him in 1966, revised extensively, and are presented here as an introduction to axiomatic set theory. Texts in set theory frequently develop the subject rapidly moving from key result to key result and suppressing many details. Advocates of the fast development claim at least two advantages. First, key results are highlighted, and second, the student who wishes to master the subject is compelled to develop the details on his own. However, an instructor using a "fast development" text must devote much class time to assisting his students in their efforts to bridge gaps in the text. We have chosen instead a development that is quite detailed and complete. F...

  1. EasyGene – a prokaryotic gene finder that ranks ORFs by statistical significance

    Directory of Open Access Journals (Sweden)

    Larsen Thomas

    2003-06-01

    Full Text Available Abstract Background: Contrary to other areas of sequence analysis, a measure of statistical significance of a putative gene has not been devised to help in discriminating real genes from the masses of random Open Reading Frames (ORFs) in prokaryotic genomes. Therefore, many genomes have too many short ORFs annotated as genes. Results: In this paper, we present a new automated gene-finding method, EasyGene, which estimates the statistical significance of a predicted gene. The gene finder is based on a hidden Markov model (HMM) that is automatically estimated for a new genome. Using extensions of similarities in Swiss-Prot, a high quality training set of genes is automatically extracted from the genome and used to estimate the HMM. Putative genes are then scored with the HMM, and based on score and length of an ORF, the statistical significance is calculated. The measure of statistical significance for an ORF is the expected number of ORFs in one megabase of random sequence at the same significance level or better, where the random sequence has the same statistics as the genome in the sense of a third order Markov chain. Conclusions: The result is a flexible gene finder whose overall performance matches or exceeds other methods. The entire pipeline of computer processing from the raw input of a genome or set of contigs to a list of putative genes with significance is automated, making it easy to apply EasyGene to newly sequenced organisms. EasyGene with pre-trained models can be accessed at http://www.cbs.dtu.dk/services/EasyGene.
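
    The idea of counting ORFs in random sequence that follows a third order Markov chain can be illustrated with a small Monte Carlo sketch. This is not EasyGene's procedure: the abstract bases significance on both HMM score and ORF length, whereas this sketch uses length only, scans the forward strand only, and simulates rather than computes the expectation. All function names and the `min_codons`, `n_bases` and `seed` parameters are assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict

STOPS = {"TAA", "TAG", "TGA"}


def fit_third_order(seq):
    """Count which base follows each 3-base context in seq."""
    counts = defaultdict(Counter)
    for i in range(len(seq) - 3):
        counts[seq[i:i + 3]][seq[i + 3]] += 1
    return counts


def sample_sequence(counts, length, rng):
    """Draw a random sequence from the fitted third order chain."""
    contexts = list(counts)
    out = list(rng.choice(contexts))
    while len(out) < length:
        nxt = counts.get("".join(out[-3:]))
        if not nxt:
            # Context never seen in the input: restart from a known context.
            out.extend(rng.choice(contexts))
            continue
        bases, weights = zip(*nxt.items())
        out.append(rng.choices(bases, weights=weights)[0])
    return "".join(out[:length])


def count_orfs(seq, min_codons):
    """Count forward-strand ORFs (first ATG to the next in-frame stop)
    with at least min_codons codons before the stop."""
    total = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and start is None:
                start = i
            elif codon in STOPS:
                if start is not None and (i - start) // 3 >= min_codons:
                    total += 1
                start = None
    return total


def expected_orfs_per_mb(genome, min_codons, n_bases=200_000, seed=0):
    """Single-run Monte Carlo estimate, scaled to one megabase
    (simplified: ORF length only, no HMM score)."""
    rng = random.Random(seed)
    random_seq = sample_sequence(fit_third_order(genome), n_bases, rng)
    return count_orfs(random_seq, min_codons) * 1_000_000 / n_bases
```

    A small value returned for a given length threshold would mean that ORFs of that length rarely arise by chance in sequence with the genome's local composition, which is the intuition behind the significance measure described above.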

  2. Evaluation of suitable reference genes for gene expression studies in bovine muscular tissue

    Directory of Open Access Journals (Sweden)

    Dunner Susana

    2008-09-01

    Full Text Available Abstract Background: Real-time reverse transcriptase quantitative polymerase chain reaction (real-time RTqPCR) is a technique used to measure mRNA species copy number as a way to determine key genes involved in different biological processes. However, the expression level of these key genes may vary among tissues or cells not only as a consequence of differential expression but also due to different factors, including the choice of reference genes to normalize the expression levels of the target genes; thus the selection of reference genes is critical for expression studies. For this purpose, ten candidate reference genes were investigated in bovine muscular tissue. Results: The stability of ten candidate reference genes, divided into three groups, was estimated: the so-called 'classical housekeeping' genes (18S, GAPDH and ACTB), a second set of genes used in expression studies conducted on other tissues (B2M, RPII, UBC and HMBS) and a third set of novel genes (SF3A1, EEF1A2 and CASC3). Three different statistical algorithms were used to rank the genes by their stability measures as produced by geNorm, NormFinder and Bestkeeper. The three methods tend to agree on the most and the least stably expressed genes in muscular tissue. EEF1A2 and HMBS followed by SF3A1, ACTB, and CASC3 can be considered as stable reference genes, and B2M, RPII, UBC and GAPDH would not be appropriate. Although the rRNA-18S stability measure seems to be within the range of acceptance, its use is not recommended because its synthesis regulation is not representative of mRNA levels. Conclusion: Based on the geNorm algorithm, we propose the use of three genes, SF3A1, EEF1A2 and HMBS, as references for normalization of real-time RTqPCR in muscle expression studies.
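
    As a rough illustration of how a geNorm-style stability ranking works, the sketch below computes a stability value M per gene as the mean, over all other candidate genes, of the standard deviation of the log2 expression ratios across samples; a lower M indicates more stable expression. It assumes relative quantities on a linear scale and uses the sample standard deviation. The function name and the toy data are hypothetical, and the study's own geNorm settings are not given in the abstract.

```python
import math
from statistics import mean, stdev


def genorm_m(expr):
    """expr maps gene name -> list of relative quantities (linear scale),
    one value per sample. Returns gene name -> stability value M."""
    genes = list(expr)
    m = {}
    for j in genes:
        variations = []
        for k in genes:
            if k == j:
                continue
            # Log2 ratio of gene j to gene k in every sample.
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            # Pairwise variation: spread of that ratio across samples.
            variations.append(stdev(log_ratios))
        m[j] = mean(variations)
    return m


# Toy data (hypothetical): geneA and geneB keep a constant ratio, geneC does not.
toy = {"geneA": [1, 2, 4], "geneB": [2, 4, 8], "geneC": [1, 4, 1]}
ranking = sorted(genorm_m(toy).items(), key=lambda item: item[1])
```

    In the toy data, `geneC` ends up last in `ranking` because its ratio to the other two genes changes from sample to sample.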

  3. Subclass mapping: identifying common subtypes in independent disease data sets.

    Directory of Open Access Journals (Sweden)

    Yujin Hoshida

    Full Text Available Whole genome expression profiles are widely used to discover molecular subtypes of diseases. A remaining challenge is to identify the correspondence or commonality of subtypes found in multiple, independent data sets generated on various platforms. While model-based supervised learning is often used to make these connections, the models can be biased to the training data set and thus miss inherent, relevant substructure in the test data. Here we describe an unsupervised subclass mapping method (SubMap), which reveals common subtypes between independent data sets. The subtypes within a data set can be determined by unsupervised clustering or given by predetermined phenotypes before applying SubMap. We define a measure of correspondence for subtypes and evaluate its significance building on our previous work on gene set enrichment analysis. The strength of the SubMap method is that it does not impose the structure of one data set upon another, but rather uses a bi-directional approach to highlight the common substructures in both. We show how this method can reveal the correspondence between several cancer-related data sets. Notably, it identifies common subtypes of breast cancer associated with estrogen receptor status, and a subgroup of lymphoma patients who share similar survival patterns, thus improving the accuracy of a clinical outcome predictor.

  4. Gene Ontology Consortium: going forward.

    Science.gov (United States)

    2015-01-01

    The Gene Ontology (GO; http://www.geneontology.org) is a community-based bioinformatics resource that supplies information about gene product function using ontologies to represent biological knowledge. Here we describe improvements and expansions to several branches of the ontology, as well as updates that have allowed us to more efficiently disseminate the GO and capture feedback from the research community. The Gene Ontology Consortium (GOC) has expanded areas of the ontology such as cilia-related terms, cell-cycle terms and multicellular organism processes. We have also implemented new tools for generating ontology terms based on a set of logical rules making use of templates, and we have made efforts to increase our use of logical definitions. The GOC has a new and improved web site summarizing new developments and documentation, serving as a portal to GO data. Users can perform GO enrichment analysis, and search the GO for terms, annotations to gene products, and associated metadata across multiple species using the all-new AmiGO 2 browser. We encourage and welcome the input of the research community in all biological areas in our continued effort to improve the Gene Ontology. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
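
    For readers unfamiliar with GO enrichment analysis, one common way to test whether a term is over-represented in a gene list is a one-sided hypergeometric test. The sketch below shows that calculation only; it is an assumption for illustration and not necessarily the test applied by the GO web tools, and the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """P(X >= hits) when `sample` genes are drawn without replacement
    from `population` genes, `annotated` of which carry the GO term."""
    upper = min(annotated, sample)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return favourable / comb(population, sample)


# Example: 10 genes in total, 5 annotated to the term, a list of 2 genes, both annotated.
p = hypergeom_enrichment_p(population=10, annotated=5, sample=2, hits=2)
```

    In practice the p-values for many terms would also need a multiple-testing correction, which this sketch leaves out.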

  5. Superior cross-species reference genes: a blueberry case study.

    Directory of Open Access Journals (Sweden)

    Jose V Die

    Full Text Available The advent of affordable Next Generation Sequencing technologies has had major impact on studies of many crop species, where access to genomic technologies and genome-scale data sets has been extremely limited until now. The recent development of genomic resources in blueberry will enable the application of high throughput gene expression approaches that should relatively quickly increase our understanding of blueberry physiology. These studies, however, require a highly accurate and robust workflow and make necessary the identification of reference genes with high expression stability for correct target gene normalization. To create a set of superior reference genes for blueberry expression analyses, we mined a publicly available transcriptome data set from blueberry for orthologs to a set of Arabidopsis genes that showed the most stable expression in a developmental series. In total, the expression stability of 13 putative reference genes was evaluated by qPCR and a set of new references with high stability values across a developmental series in fruits and floral buds of blueberry were identified. We also demonstrated the need to use at least two, preferably three, reference genes to avoid inconsistencies in results, even when superior reference genes are used. The new references identified here provide a valuable resource for accurate normalization of gene expression in Vaccinium spp. and may be useful for other members of the Ericaceae family as well.
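
    The recommendation to use two or three reference genes is usually put into practice by dividing the target quantity by a normalization factor built from all of them. The sketch below assumes the geometric mean as that factor, which is a common convention but is not stated in the abstract; function names and numbers are hypothetical.

```python
from math import prod


def normalization_factor(ref_quantities):
    """Geometric mean of the reference-gene relative quantities for one sample."""
    return prod(ref_quantities) ** (1.0 / len(ref_quantities))


def normalize(target_quantity, ref_quantities):
    """Target-gene quantity divided by the sample's normalization factor."""
    return target_quantity / normalization_factor(ref_quantities)


# One sample with two reference genes (hypothetical relative quantities).
normalized = normalize(8.0, [2.0, 8.0])
```

    With several references, an outlying measurement in one gene shifts the geometric mean less than it would shift a single-gene normalizer.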

  6. Superior Cross-Species Reference Genes: A Blueberry Case Study

    Science.gov (United States)

    Die, Jose V.; Rowland, Lisa J.

    2013-01-01

    The advent of affordable Next Generation Sequencing technologies has had major impact on studies of many crop species, where access to genomic technologies and genome-scale data sets has been extremely limited until now. The recent development of genomic resources in blueberry will enable the application of high throughput gene expression approaches that should relatively quickly increase our understanding of blueberry physiology. These studies, however, require a highly accurate and robust workflow and make necessary the identification of reference genes with high expression stability for correct target gene normalization. To create a set of superior reference genes for blueberry expression analyses, we mined a publicly available transcriptome data set from blueberry for orthologs to a set of Arabidopsis genes that showed the most stable expression in a developmental series. In total, the expression stability of 13 putative reference genes was evaluated by qPCR and a set of new references with high stability values across a developmental series in fruits and floral buds of blueberry were identified. We also demonstrated the need to use at least two, preferably three, reference genes to avoid inconsistencies in results, even when superior reference genes are used. The new references identified here provide a valuable resource for accurate normalization of gene expression in Vaccinium spp. and may be useful for other members of the Ericaceae family as well. PMID:24058469

  7. Fusion gene microarray reveals cancer type-specificity among fusion genes.

    Science.gov (United States)

    Løvf, Marthe; Thomassen, Gard O S; Bakken, Anne Cathrine; Celestino, Ricardo; Fioretos, Thoas; Lind, Guro E; Lothe, Ragnhild A; Skotheim, Rolf I

    2011-05-01

    Detection of fusion genes for diagnostic purposes and as a guide to treatment is well-established in hematological malignancies, and the prevalence of fusion genes in epithelial cancers is also increasingly appreciated. To study whether established fusion genes are present within additional cancer types, we have used an updated version of our fusion gene microarray in a systematic survey of reported fusion genes in multiple cancer types. We assembled a comprehensive database of published fusion genes, including those reported only in individual studies and samples, and fusion genes resulting from deep sequencing of cancer genomes and transcriptomes. From the total set of 548 fusion genes, we designed 599,839 oligonucleotides, targeting both chimeric transcript junctions as well as sequences internal to each of the fusion gene partners. We investigated the presence of fusion genes in a series of 67 cell lines representing 15 different cancer types. Data from ten leukemia cell lines with known fusion gene status were used to develop an automated scoring algorithm, and in five cell lines the correct fusion gene was the top scoring hit, and one came second. Two additional fusion genes, BCAS4-BCAS3 in the MCF-7 breast cancer cell line and CCDC6-RET in the TPC-1 thyroid cancer cell line were validated as true positive fusion transcripts. However, these fusion genes were not new to these cancer types, and none of 548 fusion genes were identified from a novel cancer type. We therefore find it unlikely that the assayed fusion genes are commonly present across multiple cancer types. 2011 Wiley-Liss, Inc.

  8. Imaging gene expression in gene therapy

    International Nuclear Information System (INIS)

    Wiebe, Leonard I.

    1997-01-01

    Full text. Gene therapy can be used to introduce new genes, or to supplement the function of indigenous genes. At the present time, however, there is no non-invasive test to demonstrate efficacy of the gene transfer and expression processes. It has been postulated that scintigraphic imaging can offer unique information on both the site at which the transferred gene is expressed, and the degree of expression, both of which are critical issues for safety and clinical efficacy. Many current studies are based on 'suicide gene therapy' of cancer. Cells modified to express these genes commit metabolic suicide in the presence of an enzyme encoded by the transferred gene and a specifically-convertible prodrug. Prodrug metabolism can lead to selective metabolic trapping, required for scintigraphy. Herpes simplex virus type-1 thymidine kinase (HSV-1 tk+) has been used for 'suicide' in vivo tumor gene therapy. It has been proposed that radiolabelled nucleosides can be used as radiopharmaceuticals to detect HSV-1 tk+ gene expression where the HSV-1 tk+ gene serves a reporter or therapeutic function. Animal gene therapy models have been studied using purine ([18F]FHPG; [18F]-ACV) and pyrimidine ([123/131I]IVRFU; [124/131I]) antiviral nucleosides. Principles of gene therapy and gene therapy imaging will be reviewed and experimental data for [123/131I]IVRFU imaging with the HSV-1 tk+ reporter gene will be presented.

  9. Slender-Set Differential Cryptanalysis

    DEFF Research Database (Denmark)

    Borghoff, Julia; Knudsen, Lars Ramkilde; Leander, Gregor

    2013-01-01

    This paper considers PRESENT-like ciphers with key-dependent S-boxes. We focus on the setting where the same selection of S-boxes is used in every round. One particular variant with 16 rounds, proposed in 2009, is broken in practice in a chosen plaintext/chosen ciphertext scenario. Extrapolating ...... these results suggests that up to 28 rounds of such ciphers can be broken. Furthermore, we outline how our attack strategy can be applied to an extreme case where the S-boxes are chosen uniformly at random for each round, and where the bit permutation is key-dependent as well....

  10. Set Your Creative Forces Free!

    DEFF Research Database (Denmark)

    Meier Sørensen, Bent; Villadsen, Kaspar

    eccentric Managing Director. At first glance the essential message to the employees may be read as ‘set Your creative forces and potentials free!; a statement which activates a semantics of liberation of artistic creativeness and rebellious transgression of conventions. It is suggested, however......, that the manager’s bodily comportment activate and oscillates between a more complex web of managerial rationalities including sovereignty, discipline and pastoral care. It is further suggested that this managerial hybridity renders difficult, or even closes off, conventional forms of contestation and resistance...

  11. Set Your Creative Forces Free!

    DEFF Research Database (Denmark)

    Meier Sørensen, Bent; Villadsen, Kaspar

    eccentric Managing Director. At first glance the essential message to the employees may be read as ‘set Your creative forces and potentials free!; a statement which activates a semantics of liberation of artistic creativeness and rebellious transgression of conventions. It is suggested, however......-hierarchical’ and aestheticized managerial practice reconfigures power relations within a creative industry. The key problematic is ‘governmental’ in the sense suggested by Michel Foucault in as far as the manager’s ethical self-practice—which involves expressive and ‘liberated’ bodily comportment—is used as a means for culture...

  12. Setting up the QDDD spectrometer

    International Nuclear Information System (INIS)

    Bianchi, L.; Daronian, D.; David, M.; Gastebois, J.; Jusczak, F.; Lemaire, M.C.

    The whole set was received in Saclay in March 74. The field homogeneity is better than 1/2000 in the trajectory region. The effective field boundaries have been determined in the 3 dipoles using fringing field measurements performed with Hall probes. They are within ±0.5 mm when compared to the theoretical design. The multipole components in the quadrupole lens have been measured and additive shims have been carefully adjusted to get an octupole intensity of 2.35% of the quadrupole one with the correct relative phase. The complete mounting must begin in October, and the search of focal position is planned for January 1975 [fr]

  13. Identification of a large set of rare complete human knockouts.

    Science.gov (United States)

    Sulem, Patrick; Helgason, Hannes; Oddson, Asmundur; Stefansson, Hreinn; Gudjonsson, Sigurjon A; Zink, Florian; Hjartarson, Eirikur; Sigurdsson, Gunnar Th; Jonasdottir, Adalbjorg; Jonasdottir, Aslaug; Sigurdsson, Asgeir; Magnusson, Olafur Th; Kong, Augustine; Helgason, Agnar; Holm, Hilma; Thorsteinsdottir, Unnur; Masson, Gisli; Gudbjartsson, Daniel F; Stefansson, Kari

    2015-05-01

    Loss-of-function mutations cause many mendelian diseases. Here we aimed to create a catalog of autosomal genes that are completely knocked out in humans by rare loss-of-function mutations. We sequenced the whole genomes of 2,636 Icelanders and imputed the sequence variants identified in this set into 101,584 additional chip-genotyped and phased Icelanders. We found a total of 6,795 autosomal loss-of-function SNPs and indels in 4,924 genes. Of the genotyped Icelanders, 7.7% are homozygotes or compound heterozygotes for loss-of-function mutations with a minor allele frequency (MAF) below 2% in 1,171 genes (complete knockouts). Genes that are highly expressed in the brain are less often completely knocked out than other genes. Homozygous loss-of-function offspring of two heterozygous parents occurred less frequently than expected (deficit of 136 per 10,000 transmissions for variants with MAF <2%, 95% confidence interval (CI) = 10-261).

  14. A Probabilistic Genome-Wide Gene Reading Frame Sequence Model

    DEFF Research Database (Denmark)

    Have, Christian Theil; Mørk, Søren

    as output. The model can be used to obtain the most probable genome annotation based on a combination of i: a gene finder score of each gene candidate and ii: the sequence of the reading frames of gene candidates through a genome. The model --- as well as a higher order variant --- is developed and tested......We introduce a new type of probabilistic sequence model, that model the sequential composition of reading frames of genes in a genome. Our approach extends gene finders with a model of the sequential composition of genes at the genome-level -- effectively producing a sequential genome annotation...... using the probabilistic logic programming language and machine learning system PRISM - a fast and efficient model prototyping environment, using bacterial gene finding performance as a benchmark of signal strength. The model is used to prune a set of gene predictions from an underlying gene finder...

  15. Inter-species Inference of Gene Set Enrichment in Lung Epithelial Cells from Proteomic and Large Transcriptomic Data Sets

    NARCIS (Netherlands)

    Hormoz, Sahand; Bhanot, Gyan; Biehl, Michael; Bilal, Erhan; Meyer, Pablo; Norel, Raquel; Rhrissorrakrai, Kahn; Dayarian, Adel

    2014-01-01

    MOTIVATION: Translating findings in rodent models to humans has been a corner-stone of modern biology and drug development. However, in many cases a naive 'extrapolation' between the two species has not succeeded. As a result, clinical trials of new drugs sometimes fail even after considerable

  16. Setting Parameters for Biological Models With ANIMO

    Directory of Open Access Journals (Sweden)

    Stefano Schivo

    2014-03-01

    Full Text Available ANIMO (Analysis of Networks with Interactive MOdeling) is a software tool for modeling biological networks, such as signaling, metabolic or gene networks. An ANIMO model is essentially the sum of a network topology and a number of interaction parameters. The topology describes the interactions between biological entities in the form of a graph, while the parameters determine the speed of occurrence of such interactions. When a mismatch is observed between the behavior of an ANIMO model and experimental data, we want to update the model so that it explains the new data. In general, the topology of a model can be expanded with new (known or hypothetical) nodes, which enables it to match experimental data. However, the unrestrained addition of new parts to a model causes two problems: models can become too complex too fast, to the point of being intractable, and too many parts marked as "hypothetical" or "not known" make a model unrealistic. Even if changing the topology is normally the easier task, these problems push us to try a better parameter fit as a first step, and resort to modifying the model topology only as a last resort. In this paper we show the support added in ANIMO to ease the task of expanding the knowledge on biological networks, concentrating in particular on the parameter settings.

  17. Maximizing biomarker discovery by minimizing gene signatures

    Directory of Open Access Journals (Sweden)

    Chang Chang

    2011-12-01

    Full Text Available Abstract Background: The use of gene signatures can potentially be of considerable value in the field of clinical diagnosis. However, gene signatures defined with different methods can differ considerably even when applied to the same disease and the same endpoint. Previous studies have shown that the correct selection of subsets of genes from microarray data is key for the accurate classification of disease phenotypes, and a number of methods have been proposed for the purpose. However, these methods refine the subsets by only considering each single feature, and they do not confirm the association between the genes identified in each gene signature and the phenotype of the disease. We proposed a new method termed Minimize Feature's Size (MFS), based on multiple level similarity analyses and association between the genes and disease for breast cancer endpoints, by comparing classifier models generated from the second phase of MicroArray Quality Control (MAQC-II), trying to develop effective meta-analysis strategies to transform the MAQC-II signatures into a robust and reliable set of biomarkers for clinical applications. Results: We analyzed the similarity of the multiple gene signatures in an endpoint and between the two endpoints of breast cancer at probe and gene levels; the results indicate that disease-related genes can be preferably selected as the components of a gene signature, and that the gene signatures for the two endpoints could be interchangeable. The minimized signatures were built at probe level by using MFS for each endpoint. By applying the approach, we generated a much smaller set of gene signatures with similar predictive power compared with those gene signatures from MAQC-II. Conclusions: Our results indicate that gene signatures of both large and small sizes could perform equally well in clinical applications. 
Besides, consistency and biological significances can be detected among different gene signatures, reflecting the

  18. Setting up crowd science projects.

    Science.gov (United States)

    Scheliga, Kaja; Friesike, Sascha; Puschmann, Cornelius; Fecher, Benedikt

    2016-11-29

    Crowd science is scientific research that is conducted with the participation of volunteers who are not professional scientists. Thanks to the Internet and online platforms, project initiators can draw on a potentially large number of volunteers. This crowd can be involved to support data-rich or labour-intensive projects that would otherwise be unfeasible. So far, research on crowd science has mainly focused on analysing individual crowd science projects. In our research, we focus on the perspective of project initiators and explore how crowd science projects are set up. Based on multiple case study research, we discuss the objectives of crowd science projects and the strategies of their initiators for accessing volunteers. We also categorise the tasks allocated to volunteers and reflect on the issue of quality assurance as well as feedback mechanisms. With this article, we contribute to a better understanding of how crowd science projects are set up and how volunteers can contribute to science. We suggest that our findings are of practical relevance for initiators of crowd science projects, for science communication as well as for informed science policy making. © The Author(s) 2016.

  19. Wound management in disaster settings.

    Science.gov (United States)

    Wuthisuthimethawee, Prasit; Lindquist, Samuel J; Sandler, Nicola; Clavisi, Ornella; Korin, Stephanie; Watters, David; Gruen, Russell L

    2015-04-01

    Few guidelines exist for the initial management of wounds in disaster settings. As wounds sustained are often contaminated, there is a high risk of further complications from infection, both local and systemic. Healthcare workers with little to no surgical training often provide early wound care, and resources and facilities are also often limited, so clear, appropriate guidance is needed for early wound management. We undertook a systematic review focusing on the nature of wounds in disaster situations, and the outcomes of wound management in recent disasters. We then presented the findings to an international consensus panel with a view to formulating a guideline for the initial management of wounds by first responders and subsequent healthcare personnel as they deploy. We included 62 studies in the review that described wound care challenges in a diverse range of disasters, and reported high rates of wound infection with multiple causative organisms. The panel defined a guideline in which the emphasis is on not closing wounds primarily but rather directing efforts toward cleaning, debridement, and dressing wounds in preparation for delayed primary closure, or further exploration and management by skilled surgeons. Good wound care in disaster settings, as outlined in this article, can be achieved with relatively simple measures, and has important mortality and morbidity benefits.

  20. ON NANO Λg-CLOSED SETS

    OpenAIRE

    Rajasekaran, Ilangovan; Nethaji, Ochanan

    2017-01-01

    Abstract: In this paper, we introduce nano ∧g-closed sets in nano topological spaces. Some properties of nano ∧g-closed sets and nano ∧g-open sets are weaker forms of nano closed sets and nano open sets.

  1. MorphDB: Prioritizing Genes for Specialized Metabolism Pathways and Gene Ontology Categories in Plants

    Directory of Open Access Journals (Sweden)

    Arthur Zwaenepoel

    2018-03-01

    Full Text Available Recent times have seen an enormous growth of “omics” data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene expression based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). A previously built guilt-by-association method is MORPH, which was proven to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating efficient gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named “MORPH bulk” (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest.

  2. Improved precision and accuracy for microarrays using updated probe set definitions

    Directory of Open Access Journals (Sweden)

    Larsson Ola

    2007-02-01

    Full Text Available Abstract Background: Microarrays enable high throughput detection of transcript expression levels. Different investigators have recently introduced updated probe set definitions to more accurately map probes to our current knowledge of genes and transcripts. Results: We demonstrate that updated probe set definitions provide both better precision and accuracy in probe set estimates compared to the original Affymetrix definitions. We show that the improved precision mainly depends on the increased number of probes that are integrated into each probe set, but we also demonstrate an improvement when the same number of probes is used. Conclusion: Updated probe set definitions not only offer expression levels that are more accurately associated with genes and transcripts but also improve the estimated transcript expression levels. These results give support for the use of updated probe set definitions for analysis and meta-analysis of microarray data.

  3. Evaluation of Appropriate Reference Genes for Gene Expression Normalization during Watermelon Fruit Development.

    Directory of Open Access Journals (Sweden)

    Qiusheng Kong

    Full Text Available Gene expression analysis in watermelon (Citrullus lanatus) fruit has drawn considerable attention with the availability of genome sequences to understand the regulatory mechanism of fruit development and to improve its quality. Real-time quantitative reverse-transcription PCR (qRT-PCR) is a routine technique for gene expression analysis. However, appropriate reference genes for transcript normalization in watermelon fruits have not been well characterized. The aim of this study was to evaluate the appropriateness of 12 genes for their potential use as reference genes in watermelon fruits. Expression variations of these genes were measured in 48 samples obtained from 12 successive developmental stages of parthenocarpic and fertilized fruits of two watermelon genotypes by using qRT-PCR analysis. Considering the effects of genotype, fruit setting method, and developmental stage, geNorm determined clathrin adaptor complex subunit (ClCAC), β-actin (ClACT), and alpha tubulin 5 (ClTUA5) as the multiple reference genes in watermelon fruit. Furthermore, ClCAC alone or together with SAND family protein (ClSAND) was ranked as the single or two best reference genes by NormFinder. By using the top-ranked reference genes to normalize the transcript abundance of phytoene synthase (ClPSY1), a good correlation between lycopene accumulation and ClPSY1 expression pattern was observed in ripening watermelon fruit. These validated reference genes will facilitate the accurate measurement of gene expression in the studies on watermelon fruit biology.

  4. Evaluation of Appropriate Reference Genes for Gene Expression Normalization during Watermelon Fruit Development.

    Science.gov (United States)

    Kong, Qiusheng; Yuan, Jingxian; Gao, Lingyun; Zhao, Liqiang; Cheng, Fei; Huang, Yuan; Bie, Zhilong

    2015-01-01

    Gene expression analysis in watermelon (Citrullus lanatus) fruit has drawn considerable attention with the availability of genome sequences to understand the regulatory mechanism of fruit development and to improve its quality. Real-time quantitative reverse-transcription PCR (qRT-PCR) is a routine technique for gene expression analysis. However, appropriate reference genes for transcript normalization in watermelon fruits have not been well characterized. The aim of this study was to evaluate the appropriateness of 12 genes for their potential use as reference genes in watermelon fruits. Expression variations of these genes were measured in 48 samples obtained from 12 successive developmental stages of parthenocarpic and fertilized fruits of two watermelon genotypes by using qRT-PCR analysis. Considering the effects of genotype, fruit setting method, and developmental stage, geNorm determined clathrin adaptor complex subunit (ClCAC), β-actin (ClACT), and alpha tubulin 5 (ClTUA5) as the multiple reference genes in watermelon fruit. Furthermore, ClCAC alone or together with SAND family protein (ClSAND) was ranked as the single or two best reference genes by NormFinder. By using the top-ranked reference genes to normalize the transcript abundance of phytoene synthase (ClPSY1), a good correlation between lycopene accumulation and ClPSY1 expression pattern was observed in ripening watermelon fruit. These validated reference genes will facilitate the accurate measurement of gene expression in the studies on watermelon fruit biology.
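The abstract above relies on geNorm to rank candidate reference genes. As a rough illustration only, the sketch below computes a geNorm-style stability measure M: for each gene, the average, over all other candidates, of the standard deviation of the log2 expression ratio across samples. Lower M is read as more stable. The gene names and quantities are made up for the example and are not data from the study, and the function name `genorm_m` is hypothetical; this is not the geNorm software itself.

```python
import math
from statistics import stdev


def genorm_m(expr):
    """Return a geNorm-style stability value M for each gene.

    expr maps gene name -> list of relative expression quantities,
    one per sample, all strictly positive and in the same sample order.
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_sds = []
        for k in genes:
            if k == j:
                continue
            # Log2 ratio of gene j to gene k in every sample.
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            # Pairwise variation: spread of that ratio across samples.
            pairwise_sds.append(stdev(log_ratios))
        # M is the mean pairwise variation against all other candidates.
        m_values[j] = sum(pairwise_sds) / len(pairwise_sds)
    return m_values


# Made-up quantities for three hypothetical genes in four samples.
toy = {
    "geneA": [1.0, 2.0, 4.0, 8.0],
    "geneB": [2.0, 4.0, 8.0, 16.0],  # constant ratio to geneA
    "geneC": [1.0, 8.0, 2.0, 16.0],  # ratio to the others fluctuates
}
m = genorm_m(toy)
ranking = sorted(m, key=m.get)  # most stable first
```

In this toy input, geneA and geneB keep a constant ratio to each other, so they receive the lowest M, while geneC ranks last.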

  5. Sex hormones and gene expression signatures in peripheral blood from postmenopausal women - the NOWAC postgenome study

    Directory of Open Access Journals (Sweden)

    Rylander Charlotta

    2011-03-01

    Background: Postmenopausal hormone therapy (HT) influences endogenous hormone concentrations and increases the risk of breast cancer. Gene expression profiling may reveal the mechanisms behind this relationship. Our objective was to explore potential associations between sex hormones and gene expression in whole blood from a population-based, random sample of postmenopausal women. Methods: Gene expression, as measured by the Applied Biosystems microarray platform, was compared between hormone therapy (HT) users and non-users and between high and low hormone plasma concentrations using both gene-wise analysis and gene set analysis. Gene sets found to be associated with HT use were further analysed for enrichment in functional clusters and network predictions. The gene expression matrix included 285 samples and 16185 probes and was adjusted for significant technical variables. Results: Gene-wise analysis revealed several genes significantly associated with different types of HT use. The functional cluster analyses provided limited information on these genes. Gene set analysis revealed 22 gene sets that were enriched between high and low estradiol concentration (HT-users excluded). Among these were seven oestrogen related gene sets, including our gene list associated with systemic estradiol use, which thereby represents a novel oestrogen signature. Seven gene sets were related to immune response. Among the 15 gene sets enriched for progesterone, 11 overlapped with estradiol. No significant gene expression patterns were found for testosterone, follicle stimulating hormone (FSH) or sex hormone binding globulin (SHBG). Conclusions: Distinct gene expression patterns associated with sex hormones are detectable in a random group of postmenopausal women, as demonstrated by the finding of a novel oestrogen signature.
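The abstract does not say which gene set analysis method was used. Purely as an illustration of the general idea, the sketch below shows one common approach, over-representation testing with a hypergeometric upper tail: given N genes on the platform, K of them in a gene set, and n genes flagged as associated with an exposure, it gives the probability of seeing at least k set members among the flagged genes by chance. All numbers and the function name are assumptions for the example, not values or methods from the study.

```python
from math import comb


def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) for a hypergeometric draw.

    N: total genes considered
    K: genes belonging to the gene set
    n: genes flagged (drawn without replacement)
    k: flagged genes that fall in the gene set
    """
    total = comb(N, n)
    upper = min(K, n)
    # Sum the probability of every overlap size from k up to the maximum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Made-up example: 1000 genes, a 50-gene set, 40 flagged genes, 8 in the set.
p = hypergeom_upper_tail(N=1000, K=50, n=40, k=8)
```

With these made-up inputs the expected overlap is 2 genes (40 × 50 / 1000), so an overlap of 8 gives a small tail probability. In practice such p-values would still need correction for the number of gene sets tested.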

  6. Small RNA regulation of rice homeobox genes.

    Science.gov (United States)

    Jain, Mukesh; Khurana, Jitendra P

    2008-11-01

    Recently, we reported the genome-wide identification of 107 homeobox genes in rice and classified them into ten distinct subfamilies based upon their domain composition and phylogenetic analysis. Microarray analysis revealed the tissue-specific and overlapping expression profiles of these genes during various stages of floral transition, panicle development and seed set. Several homeobox genes were also found to be differentially expressed under abiotic stress conditions. Based on massively parallel signature sequencing (MPSS) data analysis, we report here that a large number of small RNA signatures are associated with rice homeobox genes, which may be involved in their tissue-specific/developmental regulation and stress responses. The association of a very large number of small RNA signatures suggested an unusually high degree of regulation of homeobox genes by small RNAs during inflorescence development.

  7. A gene map of the human genome.

    Science.gov (United States)

    Schuler, G D; Boguski, M S; Stewart, E A; Stein, L D; Gyapay, G; Rice, K; White, R E; Rodriguez-Tomé, P; Aggarwal, A; Bajorek, E; Bentolila, S; Birren, B B; Butler, A; Castle, A B; Chiannilkulchai, N; Chu, A; Clee, C; Cowles, S; Day, P J; Dibling, T; Drouot, N; Dunham, I; Duprat, S; East, C; Edwards, C; Fan, J B; Fang, N; Fizames, C; Garrett, C; Green, L; Hadley, D; Harris, M; Harrison, P; Brady, S; Hicks, A; Holloway, E; Hui, L; Hussain, S; Louis-Dit-Sully, C; Ma, J; MacGilvery, A; Mader, C; Maratukulam, A; Matise, T C; McKusick, K B; Morissette, J; Mungall, A; Muselet, D; Nusbaum, H C; Page, D C; Peck, A; Perkins, S; Piercy, M; Qin, F; Quackenbush, J; Ranby, S; Reif, T; Rozen, S; Sanders, C; She, X; Silva, J; Slonim, D K; Soderlund, C; Sun, W L; Tabar, P; Thangarajah, T; Vega-Czarny, N; Vollrath, D; Voyticky, S; Wilmer, T; Wu, X; Adams, M D; Auffray, C; Walter, N A; Brandon, R; Dehejia, A; Goodfellow, P N; Houlgatte, R; Hudson, J R; Ide, S E; Iorio, K R; Lee, W Y; Seki, N; Nagase, T; Ishikawa, K; Nomura, N; Phillips, C; Polymeropoulos, M H; Sandusky, M; Schmitt, K; Berry, R; Swanson, K; Torres, R; Venter, J C; Sikela, J M; Beckmann, J S; Weissenbach, J; Myers, R M; Cox, D R; James, M R; Bentley, D; Deloukas, P; Lander, E S; Hudson, T J

    1996-10-25

    The human genome is thought to harbor 50,000 to 100,000 genes, of which about half have been sampled to date in the form of expressed sequence tags. An international consortium was organized to develop and map gene-based sequence tagged site markers on a set of two radiation hybrid panels and a yeast artificial chromosome library. More than 16,000 human genes have been mapped relative to a framework map that contains about 1000 polymorphic genetic markers. The gene map unifies the existing genetic and physical maps with the nucleotide and protein sequence databases in a fashion that should speed the discovery of genes underlying inherited human disease. The integrated resource is available through a site on the World Wide Web at http://www.ncbi.nlm.nih.gov/SCIENCE96/.

  8. GOseek: a gene ontology search engine using enhanced keywords.

    Science.gov (United States)

    Taha, Kamal

    2013-01-01

    We propose in this paper a biological search engine called GOseek, which overcomes the limitation of current gene similarity tools. Given a set of genes, GOseek returns the most significant genes that are semantically related to the given genes. These returned genes are usually annotated to one of the Lowest Common Ancestors (LCA) of the Gene Ontology (GO) terms annotating the given genes. Most genes have several annotation GO terms. Therefore, there may be more than one LCA for the GO terms annotating the given genes. The LCA annotating the genes that are most semantically related to the given gene is the one that receives the most aggregate semantic contribution from the GO terms annotating the given genes. To identify this LCA, GOseek quantifies the contribution of the GO terms annotating the given genes to the semantics of their LCAs. That is, it encodes the semantic contribution into a numeric format. GOseek uses microarray experiment data to rank result genes based on their significance. We evaluated GOseek experimentally and compared it with a comparable gene prediction tool. Results showed marked improvement over the tool.
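The abstract notes that a set of GO terms can have more than one Lowest Common Ancestor. The sketch below illustrates that point on a toy directed acyclic graph: it collects the ancestors of each term, intersects them, and keeps only the common ancestors that are not themselves ancestors of another common ancestor. The term identifiers and the `toy_parents` structure are hypothetical, and GOseek's scoring of semantic contribution is not reproduced here.

```python
def ancestors(term, parents):
    """Return the term itself plus every ancestor reachable via parent links."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def lowest_common_ancestors(terms, parents):
    """Return all lowest common ancestors of the given terms in a DAG."""
    common = set.intersection(*(ancestors(t, parents) for t in terms))
    # Keep a common ancestor only if no other common ancestor lies below it.
    return {
        c for c in common
        if not any(c != d and c in ancestors(d, parents) for d in common)
    }


# Hypothetical toy ontology: child term -> list of parent terms.
toy_parents = {
    "T_root": [],
    "T_a": ["T_root"],
    "T_b": ["T_root"],
    "T_c": ["T_a", "T_b"],
    "T_d": ["T_a", "T_b"],
}
lcas = lowest_common_ancestors(["T_c", "T_d"], toy_parents)
```

Here `T_c` and `T_d` share both `T_a` and `T_b` as lowest common ancestors, which is the situation in which a tool would need some additional criterion to choose among them.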

  9. Identification and Analysis of the SET-Domain Family in Silkworm, Bombyx mori

    Directory of Open Access Journals (Sweden)

    Hailong Zhao

    2015-01-01

    As an important economic insect, Bombyx mori is also a useful model organism for lepidopteran insects. SET-domain-containing proteins belong to a group of enzymes named after a common domain that utilizes the cofactor S-adenosyl-L-methionine (SAM) to achieve methylation of its substrates. Many SET-domain-containing proteins have been shown to display catalytic activity towards particular lysine residues on histones, but emerging evidence also indicates that various nonhistone proteins are specifically targeted by this clade of enzymes. To explore the diverse functions of the SET-domain superfamily in insects, we identified, cloned, and analyzed the SET-domain proteins in silkworm, Bombyx mori. Firstly, 24 genes containing a SET domain from the silkworm genome were characterized, and 17 of them belonged to six subfamilies: SUV39, SET1, SET2, SUV4-20, EZ, and SMYD. Secondly, SET domains of the silkworm SET-domain family were intraspecifically and interspecifically conserved, especially for the catalytic core “NHSC” motif, substrate binding site, and catalytic site in the SET domain. Lastly, further analyses indicated that the silkworm SET-domain gene BmSu(var)3-9 had different characteristics and expression profiles compared to other invertebrates. Overall, our results provide new insight into the functional and evolutionary features of the SET-domain family.

  10. Identification and validation of suitable endogenous reference genes for gene expression studies in human peripheral blood.

    Science.gov (United States)

    Stamova, Boryana S; Apperson, Michelle; Walker, Wynn L; Tian, Yingfang; Xu, Huichun; Adamczy, Peter; Zhan, Xinhua; Liu, Da-Zhi; Ander, Bradley P; Liao, Isaac H; Gregg, Jeffrey P; Turner, Renee J; Jickling, Glen; Lit, Lisa; Sharp, Frank R

    2009-08-05

    Gene expression studies require appropriate normalization methods. One such method uses stably expressed reference genes. Since suitable reference genes appear to be unique for each tissue, we have identified an optimal set of the most stably expressed genes in human blood that can be used for normalization. Whole-genome Affymetrix Human 2.0 Plus arrays were examined from 526 samples of males and females ages 2 to 78, including control subjects and patients with Tourette syndrome, stroke, migraine, muscular dystrophy, and autism. The top 100 most stably expressed genes with a broad range of expression levels were identified. To validate the best candidate genes, we performed quantitative RT-PCR on a subset of 10 genes (TRAP1, DECR1, FPGS, FARP1, MAPRE2, PEX16, GINS2, CRY2, CSNK1G2 and A4GALT), 4 commonly employed reference genes (GAPDH, ACTB, B2M and HMBS) and PPIB, previously reported to be stably expressed in blood. Expression stability and ranking analysis were performed using GeNorm and NormFinder algorithms. Reference genes were ranked based on their expression stability, and the minimum number of genes needed for normalization, as calculated using GeNorm, showed that the fewest, most stably expressed genes needed for accurate normalization in RNA expression studies of human whole blood are a combination of TRAP1, FPGS, DECR1 and PPIB. We confirmed the ranking of the best candidate control genes by using an alternative algorithm (NormFinder). The reference genes identified in this study are stably expressed in whole blood of humans of both genders with multiple disease conditions and ages 2 to 78. Importantly, they also have different functions within cells and thus should be expressed independently of each other. These genes should be useful as normalization genes for microarray and RT-PCR whole blood studies of human physiology, metabolism and disease.
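Once a set of reference genes has been chosen, a common way to use several of them together is to take their geometric mean in each sample as a normalization factor and divide the target gene's quantity by it. The sketch below shows that step only. The gene names are those reported in the abstract, but every number is made up for illustration, the target gene is hypothetical, and the function names are assumptions rather than part of GeNorm or NormFinder.

```python
from statistics import geometric_mean


def normalization_factors(ref_expr):
    """Per-sample geometric mean of the reference genes' relative quantities.

    ref_expr maps reference gene name -> list of positive quantities,
    one per sample, all in the same sample order.
    """
    genes = list(ref_expr)
    n_samples = len(ref_expr[genes[0]])
    return [
        geometric_mean([ref_expr[g][s] for g in genes])
        for s in range(n_samples)
    ]


def normalize(target, factors):
    """Divide each target quantity by the matching sample's factor."""
    return [t / f for t, f in zip(target, factors)]


# Made-up relative quantities for two samples (not data from the study).
refs = {
    "TRAP1": [1.0, 2.0],
    "FPGS": [4.0, 8.0],
    "DECR1": [2.0, 4.0],
    "PPIB": [2.0, 4.0],
}
factors = normalization_factors(refs)
target_raw = [10.0, 20.0]  # hypothetical target gene
target_norm = normalize(target_raw, factors)
```

In this toy input the second sample has twice as much of every transcript, so the normalized target comes out equal in both samples, which is the kind of loading difference the reference genes are meant to cancel.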

  11. Identification and validation of suitable endogenous reference genes for gene expression studies in human peripheral blood

    Directory of Open Access Journals (Sweden)

    Turner Renee J

    2009-08-01

    Background: Gene expression studies require appropriate normalization methods. One such method uses stably expressed reference genes. Since suitable reference genes appear to be unique for each tissue, we have identified an optimal set of the most stably expressed genes in human blood that can be used for normalization. Methods: Whole-genome Affymetrix Human 2.0 Plus arrays were examined from 526 samples of males and females ages 2 to 78, including control subjects and patients with Tourette syndrome, stroke, migraine, muscular dystrophy, and autism. The top 100 most stably expressed genes with a broad range of expression levels were identified. To validate the best candidate genes, we performed quantitative RT-PCR on a subset of 10 genes (TRAP1, DECR1, FPGS, FARP1, MAPRE2, PEX16, GINS2, CRY2, CSNK1G2 and A4GALT), 4 commonly employed reference genes (GAPDH, ACTB, B2M and HMBS) and PPIB, previously reported to be stably expressed in blood. Expression stability and ranking analysis were performed using GeNorm and NormFinder algorithms. Results: Reference genes were ranked based on their expression stability, and the minimum number of genes needed for normalization, as calculated using GeNorm, showed that the fewest, most stably expressed genes needed for accurate normalization in RNA expression studies of human whole blood are a combination of TRAP1, FPGS, DECR1 and PPIB. We confirmed the ranking of the best candidate control genes by using an alternative algorithm (NormFinder). Conclusion: The reference genes identified in this study are stably expressed in whole blood of humans of both genders with multiple disease conditions and ages 2 to 78. Importantly, they also have different functions within cells and thus should be expressed independently of each other. These genes should be useful as normalization genes for microarray and RT-PCR whole blood studies of human physiology, metabolism and disease.

  12. Data Sets from Major NCI Initiatives

    Science.gov (United States)

    The NCI Data Catalog includes links to data collections produced by major NCI initiatives and other widely used data sets, including animal models, human tumor cell lines, epidemiology data sets, genomics data sets from TCGA, TARGET, COSMIC, GSK, NCI60.

  13. Communicating science in social settings

    Science.gov (United States)

    Scheufele, Dietram A.

    2013-01-01

    This essay examines the societal dynamics surrounding modern science. It first discusses a number of challenges facing any effort to communicate science in social environments: lay publics with varying levels of preparedness for fully understanding new scientific breakthroughs; the deterioration of traditional media infrastructures; and an increasingly complex set of emerging technologies that are surrounded by a host of ethical, legal, and social considerations. Based on this overview, I discuss four areas in which empirical social science helps clarify intuitive but sometimes faulty assumptions about the social-level mechanisms of science communication and outline an agenda for bench and social scientists—driven by current social-scientific research in the field of science communication—to guide more effective communication efforts at the societal level in the future. PMID:23940341

  14. Setting a personal career direction.

    Science.gov (United States)

    McCurdy, Fredrick A; Marcdante, Karen

    2003-01-01

    In summary, we believe that both you and your organization should have a set of core values, a well-defined mission (core purpose), and a vision of the future. Ideally, your projects and activities should be congruent with your mission and values, you should be pursuing your vision, and all of this should be congruent with the organization mission and values. Practically speaking, most individuals we have worked with over the years find themselves in two different groups at this point in the exercise. The minority find that their personal mission is not at all similar to the mission of their current organization and they find it necessary to seriously reevaluate their personal career direction. Sometimes, this results in them finding some other place to work. On the other hand, the majority discover their personal mission is in reasonable agreement with that of their organization. For both, this exercise has helped them clarify and better manage their personal career direction.

  15. Communicating science in social settings.

    Science.gov (United States)

    Scheufele, Dietram A

    2013-08-20

    This essay examines the societal dynamics surrounding modern science. It first discusses a number of challenges facing any effort to communicate science in social environments: lay publics with varying levels of preparedness for fully understanding new scientific breakthroughs; the deterioration of traditional media infrastructures; and an increasingly complex set of emerging technologies that are surrounded by a host of ethical, legal, and social considerations. Based on this overview, I discuss four areas in which empirical social science helps clarify intuitive but sometimes faulty assumptions about the social-level mechanisms of science communication and outline an agenda for bench and social scientists--driven by current social-scientific research in the field of science communication--to guide more effective communication efforts at the societal level in the future.

  16. Immunisation in a curative setting

    DEFF Research Database (Denmark)

    Kofoed, Poul-Erik; Nielsen, B; Rahman, A K

    1990-01-01

    OBJECTIVE: To study the uptake of vaccination offered to women and children attending a curative health facility. DESIGN: Prospective survey over eight months of the uptake of vaccination offered to unimmunised women and children attending a diarrhoeal treatment centre as patients or attendants. SETTING: The International Centre for Diarrhoeal Disease Research, Dhaka, Bangladesh. SUBJECTS: An estimated 19,349 unimmunised women aged 15 to 45 and 17,372 children attending the centre for treatment or accompanying patients between 1 January and 31 August 1989. MAIN OUTCOME MEASURES: The number of women and children who were unimmunised or incompletely immunised was calculated and the percentage of this target population accepting vaccination was recorded. RESULTS: 7530 (84.2%) of 8944 eligible children and 7730 (40.4%) of 19,138 eligible women were vaccinated. Of the children, 63.8% were boys...

  17. Gene expression identifies heterogeneity of metastatic propensity in high-grade soft tissue sarcomas

    DEFF Research Database (Denmark)

    Skubitz, Keith M; Francis, Princy; Skubitz, Amy P N

    2012-01-01

    Metastatic propensity of soft tissue sarcoma (STS) is heterogeneous and may be determined by gene expression patterns that do not correlate well with morphology. The authors have reported gene expression patterns that distinguish 2 broad classes of clear cell renal carcinoma (ccRCC-gene set), and other patterns that can distinguish heterogeneity of serous ovarian carcinoma (OVCA-gene set) and aggressive fibromatosis (AF-gene set); however, clinical follow-up data were not available for these samples...

  18. Gene cluster statistics with gene families.

    Science.gov (United States)

    Raghupathy, Narayanan; Durand, Dannie

    2009-05-01

    Identifying genomic regions that descended from a common ancestor is important for understanding the function and evolution of genomes. In distantly related genomes, clusters of homologous gene pairs are evidence of candidate homologous regions. Demonstrating the statistical significance of such "gene clusters" is an essential component of comparative genomic analyses. However, currently there are no practical statistical tests for gene clusters that model the influence of the number of homologs in each gene family on cluster significance. In this work, we demonstrate empirically that failure to incorporate gene family size in gene cluster statistics results in overestimation of significance, leading to incorrect conclusions. We further present novel analytical methods for estimating gene cluster significance that take gene family size into account. Our methods do not require complete genome data and are suitable for testing individual clusters found in local regions, such as contigs in an unfinished assembly. We consider pairs of regions drawn from the same genome (paralogous clusters), as well as regions drawn from two different genomes (orthologous clusters). Determining cluster significance under general models of gene family size is computationally intractable. By assuming that all gene families are of equal size, we obtain analytical expressions that allow fast approximation of cluster probabilities. We evaluate the accuracy of this approximation by comparing the resulting gene cluster probabilities with cluster probabilities obtained by simulating a realistic, power-law distributed model of gene family size, with parameters inferred from genomic data. Surprisingly, despite the simplicity of the underlying assumption, our method accurately approximates the true cluster probabilities. It slightly overestimates these probabilities, yielding a conservative test. We present additional simulation results indicating the best choice of parameter values for data

  19. Silence of the Genes

    Indian Academy of Sciences (India)

    Srimath

    The human genome codes for ~35,000 genes, and not all of these genes are expressed in every cell. The time and site of gene expression are very precisely regulated. In any cell, only ... Silence of the Genes: 2006 Nobel Prize in Physiology or Medicine. Utpal Nath and Saumitra Das. Keywords: RNA interference, siRNA.

  20. Renal Gene Expression Database (RGED): a relational database of gene expression profiles in kidney disease

    Science.gov (United States)

    Zhang, Qingzhou; Yang, Bo; Chen, Xujiao; Xu, Jing; Mei, Changlin; Mao, Zhiguo

    2014-01-01

    We present a bioinformatics database named Renal Gene Expression Database (RGED), which contains comprehensive gene expression data sets from renal disease research. The web-based interface of RGED allows users to query the gene expression profiles in various kidney-related samples, including renal cell lines, human kidney tissues and murine model kidneys. Researchers can explore particular gene profiles, examine the relationships between genes of interest, and identify biomarkers or even drug targets in kidney diseases. The aim of this work is to provide a user-friendly utility for the renal disease research community to query expression profiles of genes of their own interest without the requirement of advanced computational skills. Availability and implementation: The website is implemented in PHP, R, MySQL and Nginx and is freely available from http://rged.wall-eva.net. Database URL: http://rged.wall-eva.net PMID:25252782