WorldWideScience

Sample records for gene set information

  1. Automated Detection of Cancer Associated Genes Using a Combined Fuzzy-Rough-Set-Based F-Information and Water Swirl Algorithm of Human Gene Expression Data.

    Science.gov (United States)

    Ganesh Kumar, Pugalendhi; Kavitha, Muthu Subash; Ahn, Byeong-Cheol

    2016-01-01

    This study describes a novel approach to reducing the challenges of highly nonlinear multiclass gene expression values for cancer diagnosis. To build an effective system for cancer diagnosis, we introduced two levels of gene selection, filtering and embedding, for selecting potential genes and the most relevant genes associated with cancer, respectively. The filter procedure was implemented by developing a fuzzy rough set (FR)-based method for redefining the criterion function of f-information (FI) to identify the potential genes without discretizing the continuous gene expression values. The embedded procedure was implemented by means of a water swirl algorithm (WSA), which attempts to optimize the rule set and membership function required to classify samples using a fuzzy-rule-based multiclassification system (FRBMS). Two novel update equations are proposed in WSA, which have better exploration and exploitation abilities while designing a self-learning FRBMS. The efficiency of our new approach was evaluated on 13 multicategory and 9 binary datasets of cancer gene expression. Additionally, the performance of the proposed FRFI-WSA method in designing an FRBMS was compared with existing methods for gene selection and optimization such as genetic algorithm (GA), particle swarm optimization (PSO), and artificial bee colony algorithm (ABC) on all the datasets. In the global cancer map with repeated measurements (GCM_RM) dataset, the FRFI-WSA selected the smallest number of most relevant cancer-associated genes (16), using a minimal set of 26 compact rules, with the highest classification accuracy (96.45%). In addition, the statistical validation used in this study revealed that the biological relevance of the most relevant genes associated with cancer and their linguistics detected by the proposed FRFI-WSA approach are better than those in the other methods. The simple interpretable rules with most relevant genes and effectively classified

  2. Gene set analysis for GWAS

    DEFF Research Database (Denmark)

    Debrabant, Birgit; Soerensen, Mette

    2014-01-01

    Abstract We discuss the use of modified Kolmogorov-Smirnov (KS) statistics in the context of gene set analysis and review corresponding null and alternative hypotheses. Especially, we show that, when enhancing the impact of highly significant genes in the calculation of the test statistic...... parameter and the genesis and distribution of the gene-level statistics, and illustrate the effects of differential weighting in a real-life example....
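
    As an illustration of how a weighted KS-type statistic can emphasise highly significant genes, the sketch below computes a GSEA-style running-sum score. It is a minimal example under stated assumptions, not the exact statistic studied in the record above: the function name, the `weight` exponent, and the toy gene-level statistics are all hypothetical.

```python
def weighted_ks_score(gene_stats, gene_set, weight=1.0):
    """GSEA-style weighted KS running-sum score (illustrative assumption).

    gene_stats: dict mapping gene -> gene-level statistic (larger = more significant).
    gene_set:   set of gene identifiers.
    weight:     exponent applied to |statistic|; 0 gives the unweighted KS form.
    """
    ranked = sorted(gene_stats, key=gene_stats.get, reverse=True)
    hits = [g for g in ranked if g in gene_set]
    n_miss = len(ranked) - len(hits)
    if not hits or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit_norm = sum(abs(gene_stats[g]) ** weight for g in hits)
    if hit_norm == 0:
        raise ValueError("all weighted statistics in the gene set are zero")

    running, best = 0.0, 0.0
    for g in ranked:
        if g in gene_set:
            # Step up, proportionally to the weighted gene-level statistic.
            running += abs(gene_stats[g]) ** weight / hit_norm
        else:
            # Step down by a constant amount for genes outside the set.
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running  # keep the largest deviation from zero
    return best


# Hypothetical gene-level statistics for six genes.
stats = {"a": 5.0, "b": 4.0, "c": 3.0, "d": 2.0, "e": 1.0, "f": 0.5}
unweighted = weighted_ks_score(stats, {"a", "f"}, weight=0.0)
weighted = weighted_ks_score(stats, {"a", "f"}, weight=1.0)
```

    With `weight=1.0` the top-ranked gene contributes most of the upward step, so the weighted score for the set {a, f} exceeds the unweighted one. Significance would still have to be assessed against a suitable null distribution, for example by permutation.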

  3. Gene set analysis for longitudinal gene expression data

    Directory of Open Access Journals (Sweden)

    Piepho Hans-Peter

    2011-07-01

    Full Text Available. Abstract Background: Gene set analysis (GSA) has become a successful tool to interpret gene expression profiles in terms of biological functions, molecular pathways, or genomic locations. GSA performs statistical tests for independent microarray samples at the level of gene sets rather than individual genes. Nowadays, an increasing number of microarray studies are conducted to explore the dynamic changes of gene expression in a variety of species and biological scenarios. In these longitudinal studies, gene expression is repeatedly measured over time such that a GSA needs to take into account the within-gene correlations in addition to possible between-gene correlations. Results: We provide a robust nonparametric approach to compare the expressions of longitudinally measured sets of genes under multiple treatments or experimental conditions. The limiting distributions of our statistics are derived when the number of genes goes to infinity while the number of replications can be small. When the number of genes in a gene set is small, we recommend permutation tests based on our nonparametric test statistics to achieve reliable type I error and better power while incorporating unknown correlations between and within genes. Simulation results demonstrate that the proposed method has greater power than other methods for various data distributions and heteroscedastic correlation structures. This method was used for an IL-2 stimulation study and significantly altered gene sets were identified. Conclusions: The simulation study and the real data application showed that the proposed gene set analysis provides a promising tool for longitudinal microarray analysis. R scripts for simulating longitudinal data and calculating the nonparametric statistics are posted on the North Dakota INBRE website http://ndinbre.org/programs/bioinformatics.php. Raw microarray data is available in Gene Expression Omnibus (National Center for Biotechnology Information with

  4. A Novel Hybrid Dimension Reduction Technique for Undersized High Dimensional Gene Expression Data Sets Using Information Complexity Criterion for Cancer Classification

    Directory of Open Access Journals (Sweden)

    Esra Pamukçu

    2015-01-01

    Full Text Available. Gene expression data typically are large, complex, and highly noisy. Their dimension is high, with several thousand genes (i.e., features) but only a limited number of observations (i.e., samples). Although the classical principal component analysis (PCA) method is widely used as a first standard step in dimension reduction and in supervised and unsupervised classification, it suffers from several shortcomings in the case of data sets involving undersized samples, since the sample covariance matrix degenerates and becomes singular. In this paper we address these limitations within the context of probabilistic PCA (PPCA) by introducing and developing a novel approach using a maximum entropy covariance matrix and its hybridized smoothed covariance estimators. To reduce the dimensionality of the data and to choose the number of probabilistic PCs (PPCs) to be retained, we further introduce and develop the celebrated Akaike’s information criterion (AIC), the consistent Akaike’s information criterion (CAIC), and the information theoretic measure of complexity (ICOMP) criterion of Bozdogan. Six publicly available undersized benchmark data sets were analyzed to show the utility, flexibility, and versatility of our approach with hybridized smoothed covariance matrix estimators, which do not degenerate, to perform the PPCA, to reduce the dimension, and to carry out supervised classification of cancer groups in high dimensions.

  5. Multidimensional gene set analysis of genomic data.

    Directory of Open Access Journals (Sweden)

    David Montaner

    Full Text Available. Understanding the functional implications of changes in gene expression, mutations, etc., is the aim of most genomic experiments. To achieve this, several functional profiling methods have been proposed. Such methods study the behaviour of different gene modules (e.g. gene ontology terms) in response to one particular variable (e.g. differential gene expression). In spite of the wealth of information provided by functional profiling methods, a common limitation to all of them is their inherent unidimensional nature. In order to overcome this restriction we present a multidimensional logistic model that allows studying the relationship of gene modules with different genome-scale measurements (e.g. differential expression, genotyping association, methylation, copy number alterations, heterozygosity, etc.) simultaneously. Moreover, the relationship of such functional modules with the interactions among the variables can also be studied, which produces novel results that cannot be derived from the conventional unidimensional functional profiling methods. We report sound results of gene set associations that remained undetected by the conventional one-dimensional gene set analysis in several examples. Our findings demonstrate the potential of the proposed approach for the discovery of new cell functionalities with complex dependences on more than one variable.

  6. Gene set analysis of the EADGENE chicken data-set

    DEFF Research Database (Denmark)

    Skarman, Axel; Jiang, Li; Hornshøj, Henrik

    2009-01-01

    Abstract Background: Gene set analysis is considered to be a way of improving our biological interpretation of the observed expression patterns. This paper describes different methods applied to analyse expression data from a chicken DNA microarray dataset. Results: Applying different gene set...... analyses to the chicken expression data led to different ranking of the Gene Ontology terms tested. A method for prediction of possible annotations was applied. Conclusion: Biological interpretation based on gene set analyses is dependent on the statistical method used. Methods for predicting the possible...

  7. Gene set analysis using variance component tests

    Science.gov (United States)

    2013-01-01

    Background: Gene set analyses have become increasingly important in genomic research, as many complex diseases are jointly driven by alterations of numerous genes. Genes often coordinate together as a functional repertoire, e.g., a biological pathway/network, and are highly correlated. However, most of the existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to tackle this important feature of a gene set to improve statistical power in gene set analyses. Results: We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects by assuming a common distribution for regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that type I error is protected under different choices of working covariance matrices and power is improved as the working covariance approaches the true covariance. The global test is a special case of TEGS when correlation among genes in a gene set is ignored. Using both simulation data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). Conclusion: We develop a gene set analysis method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and the global test, in both simulation and diabetes microarray data. PMID:23806107

  8. Information sets from defining sets in abelian codes

    CERN Document Server

    Bernal, José Joaquín

    2011-01-01

    We describe a technique to construct a set of check positions (and hence an information set) for every abelian code solely in terms of its defining set. This generalizes that given by Imai in the case of binary TDC codes.

  9. Semantic particularity measure for functional characterization of gene sets using gene ontology.

    Science.gov (United States)

    Bettembourg, Charles; Diot, Christian; Dameron, Olivier

    2014-01-01

    Genetic and genomic data analyses are outputting large sets of genes. Functional comparison of these gene sets is a key part of the analysis, as it identifies their shared functions, and the functions that distinguish each set. The Gene Ontology (GO) initiative provides a unified reference for analyzing the genes' molecular functions, biological processes and cellular components. Numerous semantic similarity measures have been developed to systematically quantify the weight of the GO terms shared by two genes. We studied how gene set comparisons can be improved by considering gene set particularity in addition to gene set similarity. We propose a new approach to compute gene set particularities based on the information conveyed by GO terms. A GO term's informativeness can be computed using either its information content based on the term frequency in a corpus, or a function of the term's distance to the root. We defined the semantic particularity of a set of GO terms Sg1 compared to another set of GO terms Sg2. We combined our particularity measure with a similarity measure to compare gene sets. We demonstrated that the combination of semantic similarity and semantic particularity measures was able to identify genes with particular functions from among similar genes. This differentiation was not recognized using only a semantic similarity measure. Semantic particularity should be used in conjunction with semantic similarity to perform functional analysis of GO-annotated gene sets. The principle is generalizable to other ontologies.
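
    The corpus-based informativeness mentioned in the record above is commonly expressed as an information content, IC(t) = -log(freq(t)). The sketch below shows only that standard quantity, not the particularity measure itself, which the abstract does not define. The term identifiers and annotation counts are hypothetical, and the counts are assumed to already include annotations to descendant terms.

```python
import math


def information_content(term_counts, root):
    """Information content of each term from corpus annotation counts.

    term_counts: dict mapping term -> number of annotations to the term
                 or any of its descendants (assumed pre-propagated).
    root:        identifier of the ontology root; its count is the corpus size.
    """
    total = term_counts[root]
    # log(total / count) equals -log(count / total); rare terms score higher.
    return {t: math.log(total / c) for t, c in term_counts.items() if c > 0}


# Hypothetical counts: the root covers the whole corpus, so its IC is 0.
counts = {"GO:root": 100, "GO:general": 50, "GO:specific": 1}
ic = information_content(counts, "GO:root")
```

    A term annotated to half of the corpus gets IC = log 2, while a term used only once gets the largest value, reflecting that specific terms convey more information than general ones.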

  10. Gene set analyses for interpreting microarray experiments on prokaryotic organisms.

    Energy Technology Data Exchange (ETDEWEB)

    Tintle, Nathan; Best, Aaron; Dejongh, Matthew; VanBruggen, Dirk; Heffron, Fred; Porwollik, Steffen; Taylor, Ronald C.

    2008-11-05

    Background: Recent advances in microarray technology have brought with them the need for enhanced methods of biologically interpreting gene expression data. Recently, methods like Gene Set Enrichment Analysis (GSEA) and variants of Fisher’s exact test have been proposed which utilize a priori biological information. Typically, these methods are demonstrated with a priori biological information from the Gene Ontology. Results: Alternative gene set definitions are presented based on gene sets inferred from the SEED: open-source software environment for comparative genome annotation and analysis of microbial organisms. Many of these gene sets are then shown to provide consistent expression across a series of experiments involving Salmonella Typhimurium. Implementation of the gene sets in an analysis of microarray data is then presented for the Salmonella Typhimurium data. Conclusions: SEED inferred gene sets can be naturally defined based on subsystems in the SEED. The consistent expression values of these SEED inferred gene sets suggest their utility for statistical analyses of gene expression data based on a priori biological information
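
    For readers unfamiliar with the Fisher's exact test variants mentioned above, the following sketch computes a one-sided over-representation p-value from the hypergeometric distribution. It is a generic illustration, not the specific variant used in the record; the function and argument names are assumptions.

```python
from math import comb


def overrepresentation_pvalue(n_universe, n_in_set, n_selected, n_overlap):
    """One-sided Fisher's exact (hypergeometric upper tail) p-value.

    n_universe: number of genes measured in the experiment
    n_in_set:   number of those genes belonging to the gene set
    n_selected: number of genes called differentially expressed
    n_overlap:  number of selected genes that are in the gene set
    Returns P(X >= n_overlap) when n_selected genes are drawn at random.
    """
    upper = min(n_in_set, n_selected)
    denom = comb(n_universe, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom


# Toy example: all 3 selected genes fall in a 3-gene set out of 10 genes.
p = overrepresentation_pvalue(10, 3, 3, 3)
```

    In the toy example only one of the 120 possible selections of three genes matches the set exactly, so the p-value is 1/120.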

  11. Vocabulary Mining for Information Retrieval: Rough Sets and Fuzzy Sets.

    Science.gov (United States)

    Srinivasan, Padmini; Ruiz, Miguel E.; Kraft, Donald H.; Chen, Jianhua

    2001-01-01

    Explains vocabulary mining in information retrieval and describes a framework for vocabulary mining that allows the use of rough set-based approximations even when documents and queries are described using weighted, or fuzzy, representations. Examines coordination between multiple vocabulary views and applies the framework to the Unified Medical…

  12. Information Flow Settings in Building Rehabilitation

    Science.gov (United States)

    Elwazani, S.; Gandikota, S.

    2017-08-01

    Apart from the usual field technical survey information for establishing the building configurations and fabric conditions, information flow for a rehabilitation project begins earlier with the need for authenticating the building as a significant heritage item and ends subsequently with validating the rehabilitation of the building. These three genres of the information are recognized under three information settings. This study investigates the first, the setting associated with authenticating the significance of the building. The discussion is structured around the process of evaluating building significance for the purpose of listing the building on the National Register of Historic Places (NRHP) and, accordingly, recognizes the NRHP framework for nominating properties to the Register. With due consideration to the concomitant information and documentation along the nomination process, and with the "historic context" as a core significance assessment strategy, the study aims at: a) explaining the configuration of the historic context; b) clarifying the role of the building itself in developing the historic context; and, c) identifying the attributes of information flow. The study arrived at the following conclusions. Investigating "the information associated with authenticating the significance of the building," the focus of this study, as an information setting of a spectrum of three helps define the global information flow in building rehabilitation. Steeped in research, the historic context configuration and development steps regulate the information flow of this setting. The knowledge and dexterity of the researcher in configuring and developing the historic context enhances the clarity and characteristics of information flow.

  13. Evaluating the consistency of gene sets used in the analysis of bacterial gene expression data

    Directory of Open Access Journals (Sweden)

    Tintle Nathan L

    2012-08-01

    Full Text Available. Abstract Background: Statistical analyses of whole genome expression data require functional information about genes in order to yield meaningful biological conclusions. The Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) are common sources of functionally grouped gene sets. For bacteria, the SEED and MicrobesOnline provide alternative, complementary sources of gene sets. To date, no comprehensive evaluation of the data obtained from these resources has been performed. Results: We define a series of gene set consistency metrics directly related to the most common classes of statistical analyses for gene expression data, and then perform a comprehensive analysis of 3581 Affymetrix® gene expression arrays across 17 diverse bacteria. We find that gene sets obtained from GO and KEGG demonstrate lower consistency than those obtained from the SEED and MicrobesOnline, regardless of gene set size. Conclusions: Despite the widespread use of GO and KEGG gene sets in bacterial gene expression data analysis, the SEED and MicrobesOnline provide more consistent sets for a wide variety of statistical analyses. Increased use of the SEED and MicrobesOnline gene sets in the analysis of bacterial gene expression data may improve statistical power and utility of expression data.

  14. Multi-edge gene set networks reveal novel insights into global relationships between biological themes.

    Directory of Open Access Journals (Sweden)

    Jignesh R Parikh

    Full Text Available. Curated gene sets from databases such as KEGG Pathway and Gene Ontology are often used to systematically organize lists of genes or proteins derived from high-throughput data. However, the information content inherent to some relationships between the interrogated gene sets, such as pathway crosstalk, is often underutilized. A gene set network, where nodes representing individual gene sets such as KEGG pathways are connected to indicate a functional dependency, is well suited to visualize and analyze global gene set relationships. Here we introduce a novel gene set network construction algorithm that integrates gene lists derived from high-throughput experiments with curated gene sets to construct co-enrichment gene set networks. Along with previously described co-membership and linkage algorithms, we apply the co-enrichment algorithm to eight gene set collections to construct integrated multi-evidence gene set networks with multiple edge types connecting gene sets. We demonstrate the utility of the approach through examples of novel gene set networks such as the chromosome map co-differential expression gene set network. A total of twenty-four gene set networks are exposed via a web tool called MetaNet, where context-specific multi-edge gene set networks are constructed from enriched gene sets within user-defined gene lists. MetaNet is freely available at http://blaispathways.dfci.harvard.edu/metanet/.

  15. Multi-edge gene set networks reveal novel insights into global relationships between biological themes.

    Science.gov (United States)

    Parikh, Jignesh R; Xia, Yu; Marto, Jarrod A

    2012-01-01

    Curated gene sets from databases such as KEGG Pathway and Gene Ontology are often used to systematically organize lists of genes or proteins derived from high-throughput data. However, the information content inherent to some relationships between the interrogated gene sets, such as pathway crosstalk, is often underutilized. A gene set network, where nodes representing individual gene sets such as KEGG pathways are connected to indicate a functional dependency, is well suited to visualize and analyze global gene set relationships. Here we introduce a novel gene set network construction algorithm that integrates gene lists derived from high-throughput experiments with curated gene sets to construct co-enrichment gene set networks. Along with previously described co-membership and linkage algorithms, we apply the co-enrichment algorithm to eight gene set collections to construct integrated multi-evidence gene set networks with multiple edge types connecting gene sets. We demonstrate the utility of the approach through examples of novel gene set networks such as the chromosome map co-differential expression gene set network. A total of twenty-four gene set networks are exposed via a web tool called MetaNet, where context-specific multi-edge gene set networks are constructed from enriched gene sets within user-defined gene lists. MetaNet is freely available at http://blaispathways.dfci.harvard.edu/metanet/.
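
    One simple way to picture a co-membership edge is to connect two gene sets whenever they share enough genes. The sketch below does this with a Jaccard index and a threshold; both choices are assumptions made for illustration and are not taken from the record above, and this is not the MetaNet implementation. The gene set names and gene identifiers are hypothetical.

```python
from itertools import combinations


def co_membership_edges(gene_sets, min_jaccard=0.1):
    """Connect gene sets whose shared membership reaches a Jaccard threshold.

    gene_sets:   dict mapping gene set name -> set of gene identifiers.
    min_jaccard: assumed cut-off for drawing an edge.
    Returns a list of (set_a, set_b, jaccard) tuples.
    """
    edges = []
    for a, b in combinations(sorted(gene_sets), 2):
        union = gene_sets[a] | gene_sets[b]
        if not union:
            continue  # two empty sets carry no membership evidence
        jaccard = len(gene_sets[a] & gene_sets[b]) / len(union)
        if jaccard >= min_jaccard:
            edges.append((a, b, jaccard))
    return edges


# Hypothetical gene sets: A and B share two of four genes, C is unrelated.
toy_sets = {
    "A": {"g1", "g2", "g3"},
    "B": {"g2", "g3", "g4"},
    "C": {"g9"},
}
edges = co_membership_edges(toy_sets)
```

    Here only A and B are linked, with a Jaccard index of 0.5. Additional edge types, such as the co-enrichment edges the record describes, would be stored alongside these to form a multi-edge network.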

  16. Zebrafish Expression Ontology of Gene Sets (ZEOGS): a tool to analyze enrichment of zebrafish anatomical terms in large gene sets.

    Science.gov (United States)

    Prykhozhij, Sergey V; Marsico, Annalisa; Meijsing, Sebastiaan H

    2013-09-01

    The zebrafish (Danio rerio) is an established model organism for developmental and biomedical research. It is frequently used for high-throughput functional genomics experiments, such as genome-wide gene expression measurements, to systematically analyze molecular mechanisms. However, the use of whole embryos or larvae in such experiments leads to a loss of the spatial information. To address this problem, we have developed a tool called Zebrafish Expression Ontology of Gene Sets (ZEOGS) to assess the enrichment of anatomical terms in large gene sets. ZEOGS uses gene expression pattern data from several sources: first, in situ hybridization experiments from the Zebrafish Model Organism Database (ZFIN); second, it uses the Zebrafish Anatomical Ontology, a controlled vocabulary that describes connected anatomical structures; and third, the available connections between expression patterns and anatomical terms contained in ZFIN. Upon input of a gene set, ZEOGS determines which anatomical structures are overrepresented in the input gene set. ZEOGS allows one for the first time to look at groups of genes and to describe them in terms of shared anatomical structures. To establish ZEOGS, we first tested it on random gene selections and on two public microarray datasets with known tissue-specific gene expression changes. These tests showed that ZEOGS could reliably identify the tissues affected, whereas only very few enriched terms to none were found in the random gene sets. Next we applied ZEOGS to microarray datasets of 24 and 72 h postfertilization zebrafish embryos treated with beclomethasone, a potent glucocorticoid. This analysis resulted in the identification of several anatomical terms related to glucocorticoid-responsive tissues, some of which were stage-specific. Our studies highlight the ability of ZEOGS to extract spatial information from datasets derived from whole embryos, indicating that ZEOGS could be a useful tool to automatically analyze gene expression

  17. Genes2GO: A web application for querying gene sets for specific GO terms.

    Science.gov (United States)

    Chawla, Konika; Kuiper, Martin

    2016-01-01

    Gene ontology annotations have become an essential resource for biological interpretations of experimental findings. The process of gathering basic annotation information in tables that link gene sets with specific gene ontology terms can be cumbersome, in particular if it requires above average computer skills or bioinformatics expertise. We have therefore developed Genes2GO, an intuitive R-based web application. Genes2GO uses the biomaRt package of Bioconductor in order to retrieve custom sets of gene ontology annotations for any list of genes from organisms covered by the Ensembl database. Genes2GO produces a binary matrix file, indicating for each gene the presence or absence of specific annotations for a gene. It should be noted that other GO tools do not offer this user-friendly access to annotations. Genes2GO is freely available and listed under http://www.semantic-systems-biology.org/tools/externaltools/.
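
    The binary matrix described above can be pictured with a few lines of code. This is a minimal sketch of the output format only: it does not call biomaRt or Ensembl, and the gene names, term identifiers, and `annotations` mapping are hypothetical placeholders.

```python
def annotation_matrix(annotations, genes, terms):
    """Build a genes x terms presence/absence matrix.

    annotations: dict mapping gene -> set of GO term identifiers.
    genes:       row order of the matrix.
    terms:       column order of the matrix.
    Returns a list of rows containing 1 (annotated) or 0 (not annotated).
    """
    return [
        [1 if term in annotations.get(gene, set()) else 0 for term in terms]
        for gene in genes
    ]


# Hypothetical annotations; geneC has no annotation at all.
toy_annotations = {
    "geneA": {"GO:term1", "GO:term2"},
    "geneB": {"GO:term2"},
}
matrix = annotation_matrix(
    toy_annotations,
    genes=["geneA", "geneB", "geneC"],
    terms=["GO:term1", "GO:term2"],
)
```

    Each row corresponds to a gene and each column to a queried term, so `matrix` here is [[1, 1], [0, 1], [0, 0]].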

  18. Gene set analysis for interpreting genetic studies

    DEFF Research Database (Denmark)

    Pers, Tune H

    2016-01-01

    Interpretation of genome-wide association study (GWAS) results is lagging behind the discovery of new genetic associations. Consequently, there is an urgent need for data-driven methods for interpreting genetic association studies. Gene set analysis (GSA) can identify aetiologic pathways and func......

  19. Informed consent comprehension in African research settings.

    Science.gov (United States)

    Afolabi, Muhammed O; Okebe, Joseph U; McGrath, Nuala; Larson, Heidi J; Bojang, Kalifa; Chandramohan, Daniel

    2014-06-01

    Previous reviews on participants' comprehension of informed consent information have focused on developed countries. Experience has shown that ethical standards developed on Western values may not be appropriate for African settings where research concepts are unfamiliar. We undertook this review to describe how informed consent comprehension is defined and measured in African research settings. We conducted a comprehensive search involving five electronic databases: Medline, Embase, Global Health, EthxWeb and Bioethics Literature Database (BELIT). We also examined African Index Medicus and Google Scholar for relevant publications on informed consent comprehension in clinical studies conducted in sub-Saharan Africa. 29 studies satisfied the inclusion criteria; meta-analysis was possible in 21 studies. We further conducted a direct comparison of participants' comprehension on domains of informed consent in all eligible studies. Comprehension of key concepts of informed consent varies considerably from country to country and depends on the nature and complexity of the study. Meta-analysis showed that 47% of a total of 1633 participants across four studies demonstrated comprehension about randomisation (95% CI 13.9-80.9%). Similarly, 48% of 3946 participants in six studies had understanding about placebo (95% CI 19.0-77.5%), while only 30% of 753 participants in five studies understood the concept of therapeutic misconception (95% CI 4.6-66.7%). Measurement tools for informed consent comprehension were developed with little or no validation. Assessment of comprehension was carried out at variable times after disclosure of study information. No uniform definition of informed consent comprehension exists to form the basis for development of an appropriate tool to measure comprehension in African participants. Comprehension of key concepts of informed consent is poor among study participants across Africa. There is a vital need to develop a uniform definition for

  20. A DSRPCL-SVM Approach to Informative Gene Analysis

    Institute of Scientific and Technical Information of China (English)

    Wei Xiong; Zhibin Cai; Jinwen Ma

    2008-01-01

    Microarray data based tumor diagnosis is a very interesting topic in bioinformatics. One of the key problems is the discovery and analysis of informative genes of a tumor. Although there are many elaborate approaches to this problem, it is still difficult to select a reasonable set of informative genes for tumor diagnosis only with microarray data. In this paper, we classify the genes expressed through microarray data into a number of clusters via the distance sensitive rival penalized competitive learning (DSRPCL) algorithm and then detect the informative gene cluster or set with the help of support vector machine (SVM). Moreover, the critical or powerful informative genes can be found through further classifications and detections on the obtained informative gene clusters. It is well demonstrated by experiments on the colon, leukemia, and breast cancer datasets that our proposed DSRPCL-SVM approach leads to a reasonable selection of informative genes for tumor diagnosis.

  1. A gene pattern mining algorithm using interchangeable gene sets for prokaryotes

    Directory of Open Access Journals (Sweden)

    Kim Sun

    2008-02-01

    Full Text Available. Abstract Background: Mining gene patterns that are common to multiple genomes is an important biological problem, which can lead us to novel biological insights. When family classification of genes is available, this problem is similar to the pattern mining problem in the data mining community. However, when family classification information is not available, mining gene patterns is a challenging problem. There are several well developed algorithms for predicting gene patterns in a pair of genomes, such as FISH and DAGchainer. These algorithms use the optimization problem formulation which is solved using the dynamic programming technique. Unfortunately, extending these algorithms to multiple genome cases is not trivial due to the rapid increase in time and space complexity. Results: In this paper, we propose a novel algorithm for mining gene patterns in more than two prokaryote genomes using interchangeable sets. The basic idea is to extend the pattern mining technique from the data mining community to handle the situation where family classification information is not available using interchangeable sets. In an experiment with four newly sequenced genomes (where the gene annotation is unavailable), we show that the gene pattern can capture important biological information. To examine the effectiveness of gene patterns further, we propose an ortholog prediction method based on our gene pattern mining algorithm and compare our method to the bi-directional best hit (BBH) technique in terms of COG orthologous gene classification information. The experiment shows that our algorithm achieves a 3% increase in recall compared to BBH without sacrificing the precision of ortholog detection. Conclusion: The discovered gene patterns can be used for detecting orthologs and genes that collaborate for a common biological function.

  2. Gene set analyses for interpreting microarray experiments on prokaryotic organisms

    OpenAIRE

    Heffron Fred; Van Bruggen Dirk; DeJongh Matthew; Best Aaron A; Tintle Nathan L; Porwollik Steffen; Taylor Ronald C

    2008-01-01

    Abstract Background: Despite the widespread usage of DNA microarrays, questions remain about how best to interpret the wealth of gene-by-gene transcriptional levels that they measure. Recently, methods have been proposed which use biologically defined sets of genes in interpretation, instead of examining results gene-by-gene. Despite a serious limitation, a method based on Fisher's exact test remains one of the few plausible options for gene set analysis when an experiment has few replicates, ...

  3. Studying the Complex Expression Dependences between Sets of Coexpressed Genes

    Directory of Open Access Journals (Sweden)

    Mario Huerta

    2014-01-01

    Full Text Available Organisms simplify the orchestration of gene expression by coregulating genes whose products function together in the cell. The use of clustering methods to obtain sets of coexpressed genes from expression arrays is very common; nevertheless, there are no appropriate tools to study the expression networks among these sets of coexpressed genes. The aim of the developed tools is to allow studying the complex expression dependences that exist between sets of coexpressed genes. For this purpose, we start by detecting the nonlinear expression relationships between pairs of genes, plus the coexpressed genes. Next, we form networks among sets of coexpressed genes that maintain nonlinear expression dependences between all of them. The expression relationship between the sets of coexpressed genes is defined by the expression relationship between the skeletons of these sets, where this skeleton represents the coexpressed genes with a well-defined nonlinear expression relationship with the skeleton of the other sets. As a result, we can study the nonlinear expression relationships between a target gene and other sets of coexpressed genes, or start the study from the skeleton of the sets, to study the complex relationships of activation and deactivation between the sets of coexpressed genes that carry out the different cellular processes present in the expression experiments.

  4. Self-Contained Statistical Analysis of Gene Sets

    Science.gov (United States)

    Cannon, Judy L.; Ricoy, Ulises M.; Johnson, Christopher

    2016-01-01

    Microarrays are a powerful tool for studying differential gene expression. However, lists of many differentially expressed genes are often generated, and unraveling meaningful biological processes from the lists can be challenging. For this reason, investigators have sought to quantify the statistical probability of compiled gene sets rather than individual genes. The gene sets typically are organized around a biological theme or pathway. We compute correlations between different gene set tests and elect to use Fisher’s self-contained method for gene set analysis. We improve Fisher’s differential expression analysis of a gene set by limiting the p-value of an individual gene within the gene set to prevent a small percentage of genes from determining the statistical significance of the entire set. In addition, we also compute dependencies among genes within the set to determine which genes are statistically linked. The method is applied to T-ALL (T-lineage Acute Lymphoblastic Leukemia) to identify differentially expressed gene sets between T-ALL and normal patients and T-ALL and AML (Acute Myeloid Leukemia) patients. PMID:27711232
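
    The idea of limiting individual p-values before combining them with Fisher's method can be sketched as follows. This is a minimal illustration, not the authors' implementation: the floor value and the function names are assumptions, and the chi-square reference distribution holds exactly only for independent, unfloored p-values (flooring makes the combined p-value conservative).

```python
import math


def chi2_sf_even_df(x, k):
    """Survival function of a chi-square variable with 2k degrees of freedom.

    For even degrees of freedom the tail has the closed form
    exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined(p_values, p_floor=0.0):
    """Fisher's statistic -2 * sum(ln p) with each p limited from below.

    `p_floor` is a hypothetical parameter; p-values must lie in (0, 1].
    """
    floored = [max(p, p_floor) for p in p_values]
    stat = -2.0 * sum(math.log(p) for p in floored)
    return stat, chi2_sf_even_df(stat, len(floored))


# One extreme gene dominates the set unless its p-value is limited.
gene_p = [1e-12, 0.5, 0.6]
_, p_raw = fisher_combined(gene_p)
_, p_limited = fisher_combined(gene_p, p_floor=0.01)
print(p_raw, p_limited)
```

    With the floor in place, the single extreme gene no longer drives the set-level result on its own.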

  5. GSMA: Gene Set Matrix Analysis, An Automated Method for Rapid Hypothesis Testing of Gene Expression Data

    Directory of Open Access Journals (Sweden)

    Chris Cheadle

    2007-01-01

    Full Text Available Background: Microarray technology has become highly valuable for identifying complex global changes in gene expression patterns. The assignment of functional information to these complex patterns remains a challenging task in effectively interpreting data and correlating results from across experiments, projects and laboratories. Methods which allow the rapid and robust evaluation of multiple functional hypotheses increase the power of individual researchers to data mine gene expression data more efficiently. Results: We have developed gene set matrix analysis (GSMA) as a useful method for the rapid testing of group-wise up- or downregulation of gene expression simultaneously for multiple lists of genes (gene sets) against entire distributions of gene expression changes (datasets) for single or multiple experiments. The utility of GSMA lies in its flexibility to rapidly poll gene sets related by known biological function or as designated solely by the end-user against large numbers of datasets simultaneously. Conclusions: GSMA provides a simple and straightforward method for hypothesis testing in which genes are tested by groups across multiple datasets for patterns of expression enrichment.

  6. Textrous!: extracting semantic textual meaning from gene sets.

    Directory of Open Access Journals (Sweden)

    Hongyu Chen

    Full Text Available The unbiased and reproducible interpretation of high-content gene sets from large-scale genomic experiments is crucial to the understanding of biological themes, validation of experimental data, and the eventual development of plans for future experimentation. To derive biomedically relevant information from simple gene lists, a mathematical association to scientific language and meaningful words or sentences is crucial. Unfortunately, existing software for deriving meaningful and easily appreciable scientific textual 'tokens' from large gene sets either relies on controlled vocabularies (Medical Subject Headings, Gene Ontology, BioCarta) or employs Boolean text searching and co-occurrence models that are incapable of detecting indirect links in the literature. As an improvement to existing web-based informatic tools, we have developed Textrous!, a web-based framework for the extraction of biomedical semantic meaning from a given input gene set of arbitrary length. Textrous! employs natural language processing techniques, including latent semantic indexing (LSI), sentence splitting, word tokenization, parts-of-speech tagging, and noun-phrase chunking, to mine MEDLINE abstracts, PubMed Central articles, articles from the Online Mendelian Inheritance in Man (OMIM), and Mammalian Phenotype annotation obtained from Jackson Laboratories. Textrous! has the ability to generate meaningful output data with even very small input datasets, using two different text extraction methodologies (collective and individual) for the selecting, ranking, clustering, and visualization of English words obtained from the user data. Textrous!, therefore, is able to facilitate the output of quantitatively significant and easily appreciable semantic words and phrases linked to both individual gene and batch genomic data.

  7. General Information about MRSA in Healthcare Settings

    Science.gov (United States)

    ... patient threat, a CDC study published in the Journal of the American Medical Association Internal Medicine showed that invasive (life-threatening) MRSA infections in healthcare settings are declining. ...

  8. GO-based Functional Dissimilarity of Gene Sets

    Directory of Open Access Journals (Sweden)

    Aguilar-Ruiz Jesús S

    2011-09-01

    Full Text Available Abstract Background The Gene Ontology (GO) provides a controlled vocabulary for describing the functions of genes and can be used to evaluate the functional coherence of gene sets. Many functional coherence measures consider each pair of gene functions in a set and produce an output based on all pairwise distances. A single gene can encode multiple proteins that may differ in function. For each functionality, other proteins that exhibit the same activity may also participate. Therefore, identifying the most common function for all of the genes involved in a biological process is important in evaluating the functional similarity of groups of genes, and a quantification of functional coherence can help to clarify the role of a group of genes working together. Results To implement this approach to functional assessment, we present GFD (GO-based Functional Dissimilarity), a novel dissimilarity measure for evaluating groups of genes based on the most relevant functions of the whole set. The measure assigns a numerical value to the gene set for each of the three GO sub-ontologies. Conclusions Results show that GFD performs robustly when applied to gene sets of known functionality (extracted from KEGG). It performs particularly well on randomly generated gene sets. An ROC analysis reveals that the performance of GFD in evaluating the functional dissimilarity of gene sets is very satisfactory. A comparative analysis against other functional measures, such as GS2 and those presented by Resnik and Wang, also demonstrates the robustness of GFD.

  9. Minimum Data Set Active Resident Information Report

    Data.gov (United States)

    U.S. Department of Health & Human Services — The MDS Active Resident Report summarizes information for residents currently in nursing homes. The source of these counts is the residents MDS assessment record....

  10. Entrez Gene: gene-centered information at NCBI

    OpenAIRE

    Maglott, Donna; Ostell, Jim; Pruitt, Kim D; Tatusova, Tatiana

    2006-01-01

    Entrez Gene is NCBI's database for gene-specific information. Entrez Gene includes records from genomes that have been completely sequenced, that have an active research community to contribute gene-specific information or that are scheduled for intense sequence analysis. The content of Entrez Gene represents the result of both curation and automated integration of data from NCBI's Reference Sequence project (RefSeq), from collaborating model organism databases and from other databases wit...

  11. A general modular framework for gene set enrichment analysis

    Directory of Open Access Journals (Sweden)

    Strimmer Korbinian

    2009-02-01

    Full Text Available Abstract Background Analysis of microarray and other high-throughput data on the basis of gene sets, rather than individual genes, is becoming more important in genomic studies. Correspondingly, a large number of statistical approaches for detecting gene set enrichment have been proposed, but both the interrelations and the relative performance of the various methods are still very much unclear. Results We conduct an extensive survey of statistical approaches for gene set analysis and identify a common modular structure underlying most published methods. Based on this finding we propose a general framework for detecting gene set enrichment. This framework provides a meta-theory of gene set analysis that not only helps to gain a better understanding of the relative merits of each embedded approach but also facilitates a principled comparison and offers insights into the relative interplay of the methods. Conclusion We use this framework to conduct a computer simulation comparing 261 different variants of gene set enrichment procedures and to analyze two experimental data sets. Based on the results we offer recommendations for best practices regarding the choice of effective procedures for gene set enrichment analysis.

  12. LEGO: a novel method for gene set over-representation analysis by incorporating network-based gene weights.

    Science.gov (United States)

    Dong, Xinran; Hao, Yun; Wang, Xiao; Tian, Weidong

    2016-01-11

    Pathway or gene set over-representation analysis (ORA) has become a routine task in functional genomics studies. However, currently widely used ORA tools employ statistical methods such as Fisher's exact test that reduce a pathway into a list of genes, ignoring the constitutive functional non-equivalent roles of genes and the complex gene-gene interactions. Here, we develop a novel method named LEGO (functional Link Enrichment of Gene Ontology or gene sets) that takes into consideration these two types of information by incorporating network-based gene weights in ORA. In three benchmarks, LEGO achieves better performance than Fisher and three other network-based methods. To further evaluate LEGO's usefulness, we compare LEGO with five gene expression-based and three pathway topology-based methods using a benchmark of 34 disease gene expression datasets compiled by a recent publication, and show that LEGO is among the top-ranked methods in terms of both sensitivity and prioritization for detecting target KEGG pathways. In addition, we develop a cluster-and-filter approach to reduce the redundancy among the enriched gene sets, making the results more interpretable to biologists. Finally, we apply LEGO to two lists of autism genes, and identify gene sets relevant to autism that could not be found by Fisher.
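
    For context, the unweighted Fisher-style ORA that this abstract takes as its baseline reduces to a hypergeometric upper-tail probability. The sketch below shows that baseline only, not LEGO's network-based weighting; the function and parameter names are illustrative assumptions.

```python
from math import comb


def ora_p_value(universe_size, set_size, hits_size, overlap):
    """One-sided over-representation p-value (hypergeometric upper tail).

    universe_size: number of background genes
    set_size:      genes in the pathway or gene set
    hits_size:     genes in the list of interest
    overlap:       genes shared by the set and the list
    Returns P(X >= overlap) when the list is drawn at random.
    """
    upper = min(set_size, hits_size)
    total = comb(universe_size, hits_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 background genes, a 100-gene set,
# a 200-gene list, 8 genes in common.
print(ora_p_value(20000, 100, 200, 8))
```

    Every gene in the set contributes equally here, which is the simplification the abstract criticizes.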

  13. Core set approach to reduce uncertainty of gene trees

    Directory of Open Access Journals (Sweden)

    Okuhara Yoshiyasu

    2006-05-01

    Full Text Available Abstract Background A genealogy based on gene sequences within a species plays an essential role in the estimation of the character, structure, and evolutionary history of that species. Because intraspecific sequences are more closely related than interspecific ones, detailed information on the evolutionary process may be available by determining all the node sequences of trees, providing insight into functional constraints and adaptations. However, strong evolutionary correlations on a few lineages make this determination difficult as a whole, and the maximum parsimony (MP) method frequently allows a number of topologies with the same total branching length. Results Kitazoe et al. developed a multidimensional vector-space representation of phylogeny. It converts additivity of evolutionary distances to orthogonality among the vectors expressing branches, and provides a unified index to measure deviations from the orthogonality. In this paper, this index is used to detect and exclude sequences with large deviations from orthogonality, and then to select a maximum subset ("core set") of sequences for which MP generates a single solution. Once the core set tree, all of whose node sequences are given, is formed, the excluded sequences are found to have basically two phylogenetic positions on this tree, respectively. Fortunately, since multiple substitutions are rare in intra-species sequences, the variance of nucleotide transitions is confined to a small range. By applying the core set approach to 38 partial env sequences of HIV-1 in a single patient and also 198 mitochondrial COI and COII DNA sequences of Anopheles dirus, we demonstrate how consistently this approach constructs the tree. Conclusion In the HIV dataset, we confirmed that the obtained core set tree is the unique maximum set for which MP proposes a single tree. In the mosquito data set, the fluctuation of nucleotide transitions caused by the sequences excluded from the core set was very small.

  14. Irreducible descriptive sets of attributes for information systems

    KAUST Repository

    Moshkov, Mikhail

    2010-01-01

    The maximal consistent extension Ext(S) of a given information system S consists of all objects corresponding to attribute values from S which are consistent with all true and realizable rules extracted from the original information system S. An irreducible descriptive set for the considered information system S is a minimal (relative to the inclusion) set B of attributes which defines exactly the set Ext(S) by means of true and realizable rules constructed over attributes from the considered set B. We show that there exists only one irreducible descriptive set of attributes. We present a polynomial algorithm for this set construction. We also study relationships between the cardinality of irreducible descriptive set of attributes and the number of attributes in S. The obtained results will be useful for the design of concurrent data models from experimental data. © 2010 Springer-Verlag.

  15. Entrez Gene: gene-centered information at NCBI.

    Science.gov (United States)

    Maglott, Donna; Ostell, Jim; Pruitt, Kim D; Tatusova, Tatiana

    2011-01-01

    Entrez Gene (http://www.ncbi.nlm.nih.gov/gene) is National Center for Biotechnology Information (NCBI)'s database for gene-specific information. Entrez Gene maintains records from genomes which have been completely sequenced, which have an active research community to submit gene-specific information, or which are scheduled for intense sequence analysis. The content represents the integration of curation and automated processing from NCBI's Reference Sequence project (RefSeq), collaborating model organism databases, consortia such as Gene Ontology and other databases within NCBI. Records in Entrez Gene are assigned unique, stable and tracked integers as identifiers. The content (nomenclature, genomic location, gene products and their attributes, markers, phenotypes and links to citations, sequences, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities) and for bulk transfer by FTP.
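
    Programmatic access through E-Utilities, as mentioned above, amounts to issuing HTTP queries against NCBI's servers. The sketch below only builds a search URL and sends nothing; the E-utilities base URL and the `db`, `term`, `retmode` and `retmax` parameters come from NCBI's public E-utilities documentation rather than from this abstract, and the helper name is hypothetical.

```python
from urllib.parse import urlencode

# Base URL of the ESearch utility (from NCBI's E-utilities documentation,
# not from the abstract above).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def gene_search_url(term, retmax=20):
    """Build an ESearch URL that queries the Gene database for `term`."""
    params = {"db": "gene", "term": term, "retmode": "json", "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Example query: records matching BRCA1, restricted to human by organism.
url = gene_search_url("BRCA1 AND human[orgn]")
print(url)
```

    Fetching the URL with any HTTP client would return matching Gene identifiers, the stable integers described in the abstract.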

  16. A Bayesian variable selection procedure to rank overlapping gene sets

    Directory of Open Access Journals (Sweden)

    Skarman Axel

    2012-05-01

    Full Text Available Abstract Background Genome-wide expression profiling using microarrays or sequence-based technologies allows us to identify genes and genetic pathways whose expression patterns influence complex traits. Different methods to prioritize gene sets, such as the genes in a given molecular pathway, have been described. In many cases, these methods test one gene set at a time, and therefore do not consider overlaps among the pathways. Here, we present a Bayesian variable selection method to prioritize gene sets that overcomes this limitation by considering all gene sets simultaneously. We applied Bayesian variable selection to differential expression to prioritize the molecular and genetic pathways involved in the responses to Escherichia coli infection in Danish Holstein cows. Results We used a Bayesian variable selection method to prioritize Kyoto Encyclopedia of Genes and Genomes pathways. We used our data to study how the variable selection method was affected by overlaps among the pathways. In addition, we compared our approach to another that ignores the overlaps, and studied the differences in the prioritization. The variable selection method was robust to a change in prior probability and stable given a limited number of observations. Conclusions Bayesian variable selection is a useful way to prioritize gene sets while considering their overlaps. Ignoring the overlaps gives different and possibly misleading results. Additional procedures may be needed in cases of highly overlapping pathways that are hard to prioritize.

  17. Identifying the optimal gene and gene set in hepatocellular carcinoma based on differential expression and differential co-expression algorithm.

    Science.gov (United States)

    Dong, Li-Yang; Zhou, Wei-Zhong; Ni, Jun-Wei; Xiang, Wei; Hu, Wen-Hao; Yu, Chang; Li, Hai-Yan

    2017-02-01

    The objective of this study was to identify the optimal gene and gene set for hepatocellular carcinoma (HCC) utilizing the differential expression and differential co-expression (DEDC) algorithm. The DEDC algorithm consisted of four parts: calculating differential expression (DE) by absolute t-value in t-statistics; computing differential co-expression (DC) based on Z-test; determining optimal thresholds on the basis of Chi-squared (χ2) maximization, with the corresponding gene being the optimal gene; and evaluating functional relevance of genes categorized into different partitions to determine the optimal gene set with highest mean minimum functional information (FI) gain (Δ*G). The optimal thresholds divided genes into four partitions: high DE and high DC (HDE-HDC), high DE and low DC (HDE-LDC), low DE and high DC (LDE-HDC), and low DE and low DC (LDE-LDC). In addition, the optimal gene was validated by conducting reverse transcription-polymerase chain reaction (RT-PCR) assay. The optimal thresholds for DC and DE were 1.032 and 1.911, respectively. Using the optimal gene, the genes were divided into four partitions: HDE-HDC (2,053 genes), HDE-LDC (2,822 genes), LDE-HDC (2,622 genes), and LDE-LDC (6,169 genes). The optimal gene was microtubule-associated protein RP/EB family member 1 (MAPRE1), and RT-PCR assay validated the significant difference between the HCC and normal state. The optimal gene set was nucleoside metabolic process (GO:0009116) with Δ*G = 18.681 and 24 HDE-HDC partitions in total. In conclusion, we successfully investigated the optimal gene, MAPRE1, and gene set, nucleoside metabolic process, which may be potential biomarkers for targeted therapy and provide significant insight for revealing the pathological mechanism underlying HCC.
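
    The four-way partition described in this abstract can be sketched as a pair of threshold comparisons. The thresholds are the values reported above; treating a score equal to the threshold as "high" is an assumption, and the gene names and scores are hypothetical toy data rather than results from the study.

```python
# Thresholds reported in the abstract.
DE_THRESHOLD = 1.911  # differential expression (absolute t-value)
DC_THRESHOLD = 1.032  # differential co-expression


def partition_genes(scores):
    """Assign each gene to one of the four DE/DC partitions.

    `scores` maps a gene name to a (de, dc) pair. Scores at or above a
    threshold count as "high" (an assumption of this sketch).
    """
    parts = {"HDE-HDC": [], "HDE-LDC": [], "LDE-HDC": [], "LDE-LDC": []}
    for gene, (de, dc) in scores.items():
        de_label = "HDE" if de >= DE_THRESHOLD else "LDE"
        dc_label = "HDC" if dc >= DC_THRESHOLD else "LDC"
        parts[de_label + "-" + dc_label].append(gene)
    return parts


# Hypothetical toy scores.
toy_scores = {
    "geneA": (3.20, 1.50),
    "geneB": (2.40, 0.30),
    "geneC": (0.80, 1.10),
    "geneD": (0.10, 0.20),
}
partitions = partition_genes(toy_scores)
print(partitions)
```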

  18. Involvement of astrocyte and oligodendrocyte gene sets in migraine.

    Science.gov (United States)

    Eising, Else; de Leeuw, Christiaan; Min, Josine L; Anttila, Verneri; Verheijen, Mark Hg; Terwindt, Gisela M; Dichgans, Martin; Freilinger, Tobias; Kubisch, Christian; Ferrari, Michel D; Smit, August B; de Vries, Boukje; Palotie, Aarno; van den Maagdenberg, Arn Mjm; Posthuma, Danielle

    2016-06-01

    Migraine is a common episodic brain disorder characterized by recurrent attacks of severe unilateral headache and additional neurological symptoms. Two main migraine types can be distinguished based on the presence of aura symptoms that can accompany the headache: migraine with aura and migraine without aura. Multiple genetic and environmental factors confer disease susceptibility. Recent genome-wide association studies (GWAS) indicate that migraine susceptibility genes are involved in various pathways, including neurotransmission, which have already been implicated in genetic studies of monogenic familial hemiplegic migraine, a subtype of migraine with aura. To further explore the genetic background of migraine, we performed a gene set analysis of migraine GWAS data of 4954 clinic-based patients with migraine, as well as 13,390 controls. Curated sets of synaptic genes and sets of genes predominantly expressed in three glial cell types (astrocytes, microglia and oligodendrocytes) were investigated. Our results show that gene sets containing astrocyte- and oligodendrocyte-related genes are associated with migraine, which is especially true for gene sets involved in protein modification and signal transduction. Observed differences between migraine with aura and migraine without aura indicate that both migraine types, at least in part, seem to have a different genetic background. © International Headache Society 2015.

  19. Discovering highly informative feature set over high dimensions

    KAUST Repository

    Zhang, Chongsheng

    2012-11-01

    For many textual collections, the number of features is often overly large. These features can be very redundant; it is therefore desirable to have a small, succinct, yet highly informative collection of features that describes the key characteristics of a dataset. Information theory is one such tool for us to obtain this feature collection. With this paper, we mainly contribute to the improvement of efficiency for the process of selecting the most informative feature set over high-dimensional unlabeled data. We propose a heuristic theory for informative feature set selection from high dimensional data. Moreover, we design data structures that enable us to compute the entropies of the candidate feature sets efficiently. We also develop a simple pruning strategy that eliminates the hopeless candidates at each forward selection step. We test our method through experiments on real-world data sets, showing that our proposal is very efficient. © 2012 IEEE.
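
    Forward selection of an informative feature set can be illustrated with a naive greedy sketch. It assumes that informativeness is measured by the joint entropy of the selected features over unlabeled rows; the abstract's efficient data structures and pruning strategy are not reproduced, and all names and data below are hypothetical.

```python
import math
from collections import Counter


def joint_entropy(rows, features):
    """Empirical joint entropy (in bits) of the given feature columns."""
    counts = Counter(tuple(row[f] for f in features) for row in rows)
    n = len(rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def greedy_select(rows, k):
    """Pick k feature indices, each step adding the one that maximizes
    the joint entropy of the selected set (ties go to the lowest index)."""
    selected = []
    remaining = list(range(len(rows[0])))
    for _ in range(min(k, len(remaining))):
        best = max(remaining, key=lambda f: joint_entropy(rows, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: feature 1 duplicates feature 0, feature 2 is independent.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
chosen = greedy_select(rows, 2)
print(chosen)
```

    The redundant copy is skipped because adding it leaves the joint entropy unchanged, while the independent feature adds a full bit.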

  20. Information sets as permutation cycles for quadratic residue codes

    Directory of Open Access Journals (Sweden)

    Richard A. Jenson

    1982-01-01

    Full Text Available The two cases p=7 and p=23 are the only known cases where the automorphism group of the [p+1, (p+1)/2] extended binary quadratic residue code, Q(p), properly contains PSL(2,p). These codes have some of their information sets represented as permutation cycles from Aut(Q(p)). Analysis proves that all information sets of Q(7) are so represented but those of Q(23) are not.

  1. A Bayesian variable selection procedure for ranking overlapping gene sets

    DEFF Research Database (Denmark)

    Skarman, Axel; Mahdi Shariati, Mohammad; Janss, Luc

    2012-01-01

    described. In many cases, these methods test one gene set at a time, and therefore do not consider overlaps among the pathways. Here, we present a Bayesian variable selection method to prioritize gene sets that overcomes this limitation by considering all gene sets simultaneously. We applied Bayesian...... variable selection to differential expression to prioritize the molecular and genetic pathways involved in the responses to Escherichia coli infection in Danish Holstein cows. Results We used a Bayesian variable selection method to prioritize Kyoto Encyclopedia of Genes and Genomes pathways. We used our...... data to study how the variable selection method was affected by overlaps among the pathways. In addition, we compared our approach to another that ignores the overlaps, and studied the differences in the prioritization. The variable selection method was robust to a change in prior probability...

  2. Gene set enrichment analysis for non-monotone association and multiple experimental categories

    OpenAIRE

    Heinloth Alexandra N; Irwin Richard D; Dai Shuangshuang; Lin Rongheng; Boorman Gary A; Li Leping

    2008-01-01

    Abstract Background Recently, microarray data analyses using functional pathway information, e.g., gene set enrichment analysis (GSEA) and significance analysis of function and expression (SAFE), have gained recognition as a way to identify biological pathways/processes associated with a phenotypic endpoint. In these analyses, a local statistic is used to assess the association between the expression level of a gene and the value of a phenotypic endpoint. Then these gene-specific local statis...

  3. Time-Course Gene Set Analysis for Longitudinal Gene Expression Data.

    Directory of Open Access Journals (Sweden)

    Boris P Hejblum

    2015-06-01

    Full Text Available Gene set analysis methods, which consider predefined groups of genes in the analysis of genomic data, have been successfully applied for analyzing gene expression data in cross-sectional studies. The time-course gene set analysis (TcGSA) introduced here is an extension of gene set analysis to longitudinal data. The proposed method relies on random effects modeling with maximum likelihood estimates. It allows the use of all available repeated measurements while dealing with unbalanced data due to missing at random (MAR) measurements. TcGSA is a hypothesis-driven method that identifies a priori defined gene sets with significant expression variations over time, taking into account the potential heterogeneity of expression within gene sets. When biological conditions are compared, the method indicates if the time patterns of gene sets significantly differ according to these conditions. The interest of the method is illustrated by its application to two real-life datasets: an HIV therapeutic vaccine trial (DALIA-1 trial), and data from a recent study on influenza and pneumococcal vaccines. In the DALIA-1 trial TcGSA revealed a significant change in gene expression over time within 69 gene sets during vaccination, while a standard univariate individual gene analysis corrected for multiple testing as well as a standard Gene Set Enrichment Analysis (GSEA) for time series both failed to detect any significant pattern change over time. When applied to the second illustrative data set, TcGSA allowed the identification of 4 gene sets finally found to be linked with the influenza vaccine too, although they were found to be associated with the pneumococcal vaccine only in previous analyses. In our simulation study TcGSA exhibits good statistical properties, and an increased power compared to other approaches for analyzing time-course expression patterns of gene sets. The method is made available for the community through an R package.

  4. Modeling Multisource-heterogeneous Information Based on Random Set and Fuzzy Set Theory

    Institute of Scientific and Technical Information of China (English)

    WEN Cheng-lin; XU Xiao-bin

    2006-01-01

    This paper presents a new idea, termed modeling of multisensor-heterogeneous information, to incorporate fuzzy logic methodologies into multisensor-multitarget systems under the framework of random set theory. Firstly, based on strong random sets and weak random sets, a unified form to describe both data (unambiguous information) and fuzzy evidence (uncertain information) is introduced. Secondly, according to signatures of fuzzy evidence, two Bayesian-Markov nonlinear measurement models are proposed to effectively fuse data and fuzzy evidence. Thirdly, by use of "the models-based signature-matching scheme", the operation of the statistics of fuzzy evidence defined as random sets can be translated into that of the membership functions of relative point state variables. These works are the basis for constructing qualitative measurement models and for fusing data and fuzzy evidence.

  5. GeneBrowser 2: an application to explore and identify common biological traits in a set of genes

    Directory of Open Access Journals (Sweden)

    Oliveira José

    2010-07-01

    Full Text Available Abstract Background The development of high-throughput laboratory techniques created a demand for computer-assisted result analysis tools. Many of these techniques return lists of genes whose interpretation requires finding relevant biological roles for the problem at hand. The required information is typically available in public databases, and usually, this information must be manually retrieved to complement the analysis. This process is a very time-consuming task that should be automated as much as possible. Results GeneBrowser is a web-based tool that, for a given list of genes, combines data from several public databases with visualisation and analysis methods to help identify the most relevant and common biological characteristics. The functionalities provided include the following: a central point with the most relevant biological information for each inserted gene; a list of the most related papers in PubMed and gene expression studies in ArrayExpress; and an extended approach to functional analysis applied to Gene Ontology, homologies, gene chromosomal localisation and pathways. Conclusions GeneBrowser provides a unique entry point to several visualisation and analysis methods, providing fast and easy analysis of a set of genes. GeneBrowser fills the gap between Web portals that analyse one gene at a time and functional analysis tools that are limited in scope and usually desktop-based.

  6. The limitations of simple gene set enrichment analysis assuming gene independence.

    Science.gov (United States)

    Tamayo, Pablo; Steinhardt, George; Liberzon, Arthur; Mesirov, Jill P

    2016-02-01

    Since its first publication in 2003, the Gene Set Enrichment Analysis method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently a simplified approach using a one-sample t-test score to assess enrichment and ignoring gene-gene correlations was proposed by Irizarry et al. 2009 as a serious contender. The argument criticizes Gene Set Enrichment Analysis's nonparametric nature and its use of an empirical null distribution as unnecessary and hard to compute. We refute these claims by careful consideration of the assumptions of the simplified method and its results, including a comparison with Gene Set Enrichment Analysis on a large benchmark set of 50 datasets. Our results provide strong empirical evidence that gene-gene correlations cannot be ignored, because of the significant variance inflation they produce in the enrichment scores, and should be taken into account when estimating gene set enrichment significance. In addition, we discuss the challenges that the complex correlation structure and multi-modality of gene sets pose more generally for gene set enrichment methods.
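
    The variance inflation mentioned in this abstract can be made concrete with a standard identity: for n gene scores with equal variance and a common pairwise correlation rho, the variance of their mean is (1 + (n - 1) * rho) times the value obtained under independence. The sketch below assumes this equicorrelated model purely for illustration; the abstract does not state that real gene sets follow it, and the numbers are hypothetical.

```python
def variance_inflation(n_genes, mean_correlation):
    """Factor by which the variance of a set-level mean score grows when
    the n_genes scores share a common pairwise correlation (assumed model)."""
    return 1.0 + (n_genes - 1) * mean_correlation


# A 50-gene set with a modest average correlation of 0.1.
factor = variance_inflation(50, 0.1)
print(factor)
```

    Even a small average correlation inflates the variance several-fold for a set of this size, so a test that assumes independence overstates significance.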

  7. WhichGenes: a web-based tool for gathering, building, storing and exporting gene sets with application in gene set enrichment analysis.

    Science.gov (United States)

    Glez-Peña, Daniel; Gómez-López, Gonzalo; Pisano, David G; Fdez-Riverola, Florentino

    2009-07-01

    WhichGenes is a web-based interactive gene set building tool offering a very simple interface to extract always-updated gene lists from multiple databases and unstructured biological data sources. While the user can specify new gene sets of interest by following a simple four-step wizard, the tool is able to run several queries in parallel. Every time a new set is generated, it is automatically added to the private gene-set cart and the user is notified by an e-mail containing a direct link to the new set stored on the server. WhichGenes provides functionalities to edit, delete and rename existing sets, as well as the capability of generating new ones by combining previously existing sets (intersection, union and difference operators). Users can export their sets, configuring the output format and selecting among multiple gene identifiers. In addition to the user-friendly environment, WhichGenes allows programmers to access its functionalities programmatically through a Representational State Transfer web service. The WhichGenes front end is freely available at http://www.whichgenes.org/, and the WhichGenes API is accessible at http://www.whichgenes.org/api/.
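
    The three combination operators named in this abstract map directly onto ordinary set algebra. The sketch below is not the WhichGenes API; the function name `combine` and the example gene lists are hypothetical and serve only to show what each operator returns.

```python
from __future__ import annotations


def combine(a: set[str], b: set[str], op: str) -> set[str]:
    """Combine two gene sets with one of the three operators from the abstract."""
    if op == "union":
        return a | b
    if op == "intersection":
        return a & b
    if op == "difference":
        return a - b
    raise ValueError(f"unknown operator: {op}")


if __name__ == "__main__":
    # Illustrative gene lists only, not results from any database query.
    set_a = {"TP53", "BRCA1", "MYC"}
    set_b = {"BRCA1", "EGFR"}
    for op in ("union", "intersection", "difference"):
        print(op, sorted(combine(set_a, set_b, op)))
```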

  8. Genes with minimal phylogenetic information are problematic for coalescent analyses when gene tree estimation is biased.

    Science.gov (United States)

    Xi, Zhenxiang; Liu, Liang; Davis, Charles C

    2015-11-01

    The development and application of coalescent methods are undergoing rapid changes. One little explored area that bears on the application of gene-tree-based coalescent methods to species tree estimation is gene informativeness. Here, we investigate the accuracy of these coalescent methods when genes have minimal phylogenetic information, including the implementation of the multilocus bootstrap approach. Using simulated DNA sequences, we demonstrate that genes with minimal phylogenetic information can produce unreliable gene trees (i.e., high error in gene tree estimation), which may in turn reduce the accuracy of species tree estimation using gene-tree-based coalescent methods. We demonstrate that this problem can be alleviated by sampling more genes, as is commonly done in large-scale phylogenomic analyses. This applies even when these genes are minimally informative. If gene tree estimation is biased, however, gene-tree-based coalescent analyses will produce inconsistent results, which cannot be remedied by increasing the number of genes. In this case, it is not the gene-tree-based coalescent methods that are flawed, but rather the input data (i.e., estimated gene trees). Along these lines, the commonly used program PhyML has a tendency to infer one particular bifurcating topology even though it is best represented as a polytomy. We additionally corroborate these findings by analyzing the 183-locus mammal data set assembled by McCormack et al. (2012) using ultra-conserved elements (UCEs) and flanking DNA. Lastly, we demonstrate that when employing the multilocus bootstrap approach on this 183-locus data set, there is no strong conflict between species trees estimated from concatenation and gene-tree-based coalescent analyses, as has been previously suggested by Gatesy and Springer (2014).

  9. New cyt b gene universal primer set for forensic analysis.

    Science.gov (United States)

    Lopez-Oceja, A; Gamarra, D; Borragan, S; Jiménez-Moreno, S; de Pancorbo, M M

    2016-07-01

    Analysis of mitochondrial DNA, and in particular the cytochrome b gene (cyt b), has become an essential tool for species identification in routine forensic practice. In cases of degraded samples, where the DNA is fragmented, universal primers that are highly efficient for the amplification of the target region are necessary. Therefore, in the present study a new universal cyt b primer set with high species identification capabilities, even in samples with highly degraded DNA, has been developed. To achieve this objective, the primers were designed following the alignment of complete cyt b sequences from 751 species of the class Mammalia listed in GenBank. A highly variable region of 148 bp flanked by highly conserved sequences was chosen for placing the primers. The effectiveness of the new pair of primers was examined in 63 animal species belonging to 38 Families from 14 Orders and 5 Classes (Mammalia, Aves, Reptilia, Actinopterygii, and Malacostraca). Species determination was possible in all cases, which shows that the fragment analyzed provided a high capability for species identification. Furthermore, to ensure the efficiency of the 148 bp fragment, the intraspecific variability was analyzed by calculating the concordance between individuals with the BLAST tool from the NCBI (National Center for Biotechnology Information). The intraspecific concordance levels were above 97% in all species. Likewise, the phylogenetic information from the selected fragment was confirmed by obtaining the phylogenetic tree from the sequences of the species analyzed. Evidence of the high power of phylogenetic discrimination of the analyzed fragment of the cyt b was obtained, as 93.75% of the species were grouped within their corresponding Orders. Finally, the analysis of 40 degraded samples with small-size DNA fragments showed that the new pair of primers permits identifying the species, even when the DNA is highly degraded as it is very common in

  10. Gene: a gene-centered information resource at NCBI.

    Science.gov (United States)

    Brown, Garth R; Hem, Vichet; Katz, Kenneth S; Ovetsky, Michael; Wallin, Craig; Ermolaeva, Olga; Tolstoy, Igor; Tatusova, Tatiana; Pruitt, Kim D; Maglott, Donna R; Murphy, Terence D

    2015-01-01

    The National Center for Biotechnology Information's (NCBI) Gene database (www.ncbi.nlm.nih.gov/gene) integrates gene-specific information from multiple data sources. NCBI Reference Sequence (RefSeq) genomes for viruses, prokaryotes and eukaryotes are the primary foundation for Gene records in that they form the critical association between sequence and a tracked gene upon which additional functional and descriptive content is anchored. Additional content is integrated based on the genomic location and RefSeq transcript and protein sequence data. The content of a Gene record represents the integration of curation and automated processing from RefSeq, collaborating model organism databases, consortia such as Gene Ontology, and other databases within NCBI. Records in Gene are assigned unique, tracked integers as identifiers. The content (citations, nomenclature, genomic location, gene products and their attributes, phenotypes, sequences, interactions, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities and Entrez Direct) and for bulk transfer by FTP.
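
    As a hedged illustration of the E-Utilities access route mentioned above, the sketch below only builds an ESummary request URL for the `gene` database and makes no network call. The base URL and the `db`, `id` and `retmode` parameters are part of NCBI's public E-utilities interface; the helper name `gene_esummary_url` and the choice of example identifiers are assumptions made for this example.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_esummary_url(gene_ids, retmode="json"):
    """Build an ESummary URL for one or more integer Gene identifiers.

    Hypothetical helper: it constructs the request but does not send it.
    """
    params = {
        "db": "gene",
        "id": ",".join(str(g) for g in gene_ids),
        "retmode": retmode,
    }
    return f"{EUTILS_BASE}/esummary.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    # Two Gene identifiers chosen purely as examples.
    print(gene_esummary_url([672, 7157]))
```

    Fetching the URL (for instance with `urllib.request.urlopen`) is left out deliberately; NCBI's usage guidelines on request rates and API keys apply to real queries.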

  11. Analysis of gene set using shrinkage covariance matrix approach

    Science.gov (United States)

    Karjanto, Suryaefiza; Aripin, Rasimah

    2013-09-01

    Microarray methodology has been exploited for different applications such as gene discovery and disease diagnosis. This technology is also used for quantitative and highly parallel measurements of gene expression. Recently, microarrays have been one of the main interests of statisticians because they provide a perfect example of the paradigms of modern statistics. In this study, an alternative approach to estimating the covariance matrix is proposed to solve the high dimensionality problem in microarrays. An extension of the traditional Hotelling's T^2 statistic is constructed, using a shrinkage approach, to determine the significant gene sets across experimental conditions. Real data sets were used as illustrations to compare the performance of the proposed methods with other methods. The results across the methods are consistent, implying that this approach provides an alternative to existing techniques.
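
    A minimal sketch of the general idea, not the authors' estimator: the pooled sample covariance S is replaced by (1 - lam) * S off the diagonal (shrinking toward a diagonal target) before it is used in the two-sample Hotelling's T^2 statistic. The diagonal target, the hand-picked shrinkage intensity `lam`, and all function names are assumptions; published shrinkage estimators usually estimate `lam` from the data.

```python
def mean_vec(rows):
    """Column means of a list of equal-length rows."""
    n, p = len(rows), len(rows[0])
    return [sum(r[j] for r in rows) / n for j in range(p)]


def pooled_cov(x, y):
    """Pooled sample covariance of two groups (rows = samples, columns = genes)."""
    p = len(x[0])
    s = [[0.0] * p for _ in range(p)]
    for rows in (x, y):
        m = mean_vec(rows)
        for r in rows:
            d = [r[j] - m[j] for j in range(p)]
            for i in range(p):
                for j in range(p):
                    s[i][j] += d[i] * d[j]
    dof = len(x) + len(y) - 2
    return [[v / dof for v in row] for row in s]


def shrink(s, lam):
    """Shrink off-diagonal entries toward zero; lam=1 gives a diagonal matrix."""
    p = len(s)
    return [[s[i][j] if i == j else (1.0 - lam) * s[i][j] for j in range(p)]
            for i in range(p)]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def hotelling_t2(x, y, lam):
    """Two-sample Hotelling's T^2 using the shrunken pooled covariance."""
    n1, n2 = len(x), len(y)
    d = [a - b for a, b in zip(mean_vec(x), mean_vec(y))]
    w = solve(shrink(pooled_cov(x, y), lam), d)
    return n1 * n2 / (n1 + n2) * sum(di * wi for di, wi in zip(d, w))


if __name__ == "__main__":
    # Toy data whose unshrunk pooled covariance is singular.
    group_a = [[0.0, 0.0], [2.0, 2.0]]
    group_b = [[2.0, 1.0], [4.0, 3.0]]
    print(hotelling_t2(group_a, group_b, lam=0.5))
```

    In the toy data the unshrunk covariance (lam = 0) cannot be inverted, which is the small-sample, high-dimension situation that motivates shrinkage in the first place.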

  12. Informal Language Learning Setting: Technology or Social Interaction?

    Science.gov (United States)

    Bahrani, Taher; Sim, Tam Shu

    2012-01-01

    Based on the informal language learning theory, language learning can occur outside the classroom setting unconsciously and incidentally through interaction with the native speakers or exposure to authentic language input through technology. However, an EFL context lacks the social interaction which naturally occurs in an ESL context. To explore…

  13. ROUGH SET BASED CLUSTERING OF GENE EXPRESSION DATA: A SURVEY

    Directory of Open Access Journals (Sweden)

    J.JEBA EMILYN

    2010-12-01

    Microarray technology has made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. However, the high dimensionality of gene expression data makes it difficult to analyze. Many clustering algorithms are available. In this paper we first briefly introduce the concepts of microarray technology and discuss the basic elements of clustering on gene expression data. Then we introduce rough clustering and explore its advantage over strict and fuzzy clustering. We also explain why rough clustering is preferred over other conventional methods by presenting a survey of a few clustering algorithms based on rough set theory for gene expression data. We conclude that this area is a promising research field for the research community.

  14. Information Measures of Roughness of Knowledge and Rough Sets for Incomplete Information Systems

    Institute of Scientific and Technical Information of China (English)

    LIANG Ji-ye; QU Kai-she

    2001-01-01

    In this paper we address information measures of roughness of knowledge and rough sets for incomplete information systems. The definition of rough entropy of knowledge and its important properties are given. In particular, the relationship between rough entropy of knowledge and the Hartley measure of uncertainty is established. We show that rough entropy of knowledge decreases monotonically as the granularity of information becomes smaller. This gives an information interpretation for roughness of knowledge. Based on rough entropy of knowledge and roughness of rough set, a definition of rough entropy of rough set is proposed, and we show that rough entropy of rough set decreases monotonically as the granularity of information becomes smaller. This gives a more accurate measure for roughness of rough set.
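
    The monotonic behaviour described above can be checked numerically with one formulation found in the rough-set literature, assumed here rather than quoted from the paper: E(A) = -(1/|U|) * sum over objects u of log2(1 / |S_A(u)|), where S_A(u) is the class of objects indistinguishable from u. For the coarsest knowledge (a single class) this equals log2|U|, the Hartley measure. The function names are hypothetical.

```python
import math


def rough_entropy(classes):
    """Rough entropy under the assumed form; classes[i] is the set of
    objects indistinguishable from object i (its tolerance class)."""
    n = len(classes)
    return sum(math.log2(len(c)) for c in classes) / n


def classes_from_partition(partition):
    """Per-object classes for the special case where knowledge is a partition."""
    return [set(block) for block in partition for _ in block]


if __name__ == "__main__":
    coarse = [[1, 2, 3, 4]]
    medium = [[1, 2], [3, 4]]
    fine = [[1], [2], [3], [4]]
    for partition in (coarse, medium, fine):
        print(partition, rough_entropy(classes_from_partition(partition)))
```

    Refining the partition from one block to singletons drives the value from log2(4) down to zero, which matches the claim that the measure decreases as granules become smaller.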

  15. A Rough Set based Gene Expression Clustering Algorithm

    Directory of Open Access Journals (Sweden)

    J. J. Emilyn

    2011-01-01

    Problem statement: Microarray technology helps in monitoring the expression levels of thousands of genes across collections of related samples. Approach: The main goal in the analysis of large and heterogeneous gene expression datasets was to identify groups of genes that are expressed in a set of experimental conditions. Results: Several clustering techniques have been proposed for identifying gene signatures and understanding their role, and many of them have been applied to gene expression data, but with partial success. The main aim of this work was to develop a clustering algorithm that would successfully identify gene patterns. The proposed novel clustering technique (RCGED) provides an efficient way of finding the hidden and unique gene expression patterns. It overcomes the restriction of one object being placed in only one cluster. Conclusion/Recommendations: The proposed algorithm is termed intelligent because it automatically determines the optimum number of clusters. The proposed algorithm was tested on a colon cancer dataset and the results were compared with the Rough Fuzzy K Means algorithm.

  16. Transcriptome profiling of Set5 and Set1 methyltransferases: Tools for visualization of gene expression

    Directory of Open Access Journals (Sweden)

    Glòria Mas Martín

    2014-12-01

    Cells regulate transcription by coordinating the activities of multiple histone modifying complexes. We recently identified the yeast histone H4 methyltransferase Set5 and discovered functional overlap with the histone H3 methyltransferase Set1 in gene expression. Specifically, using next-generation RNA sequencing (RNA-Seq), we found that Set5 and Set1 function synergistically to regulate specific transcriptional programs at subtelomeres and transposable elements. Here we provide a comprehensive description of the methodology and analysis tools corresponding to the data deposited in NCBI's Gene Expression Omnibus (GEO) under the accession number GSE52086. This data complements the experimental methods described in Mas Martín G et al. (2014) and provides the means to explore the cooperative functions of histone H3 and H4 methyltransferases in the regulation of transcription. Furthermore, a fully annotated R code is included to enable researchers to use the following computational tools: comparison of significant differential expression (SDE) profiles; gene ontology enrichment of SDE; and enrichment of SDE relative to chromosomal features, such as centromeres, telomeres, and transposable elements. Overall, we present a bioinformatics platform that can be generally implemented for similar analyses with different datasets and in different organisms.

  17. Rough Set Approach to Incomplete Multiscale Information System

    Science.gov (United States)

    Yang, Xibei; Qi, Yong; Yu, Dongjun; Yu, Hualong; Song, Xiaoning; Yang, Jingyu

    2014-01-01

    Multiscale information system is a new knowledge representation system for expressing the knowledge with different levels of granulations. In this paper, by considering the unknown values, which can be seen everywhere in real world applications, the incomplete multiscale information system is firstly investigated. The descriptor technique is employed to construct rough sets at different scales for analyzing the hierarchically structured data. The problem of unravelling decision rules at different scales is also addressed. Finally, the reduct descriptors are formulated to simplify decision rules, which can be derived from different scales. Some numerical examples are employed to substantiate the conceptual arguments. PMID:25276852

  18. Minimum Information Dominating Set for Critical Sampling over Graphs

    Science.gov (United States)

    2015-04-01

    information on IDS uniquely determines the information in the rest of the graph. The minimum vertex cover (MVC) [10, 11] and the minimum dominating set (MDS...12, 13] are related to the IDS problem. The MVC asks for a minimum subset of vertices such that each edge in the original graph is adjacent to at...vertex in this subset. The minimum IDS problem is inherently more complex than MVC and MDS. For instance, as shown in this paper, it is co-NP-complete to

  19. Causal Information Approach to Partial Conditioning in Multivariate Data Sets

    Directory of Open Access Journals (Sweden)

    D. Marinazzo

    2012-01-01

    When evaluating causal influence from one time series to another in a multivariate data set it is necessary to take into account the conditioning effect of the other variables. In the presence of many variables and possibly of a reduced number of samples, full conditioning can lead to computational and numerical problems. In this paper, we address the problem of partial conditioning to a limited subset of variables, in the framework of information theory. The proposed approach is tested on simulated data sets and on an example of intracranial EEG recording from an epileptic subject. We show that, in many instances, conditioning on a small number of variables, chosen as the most informative ones for the driver node, leads to results very close to those obtained with a fully multivariate analysis and even better in the presence of a small number of samples. This is particularly relevant when the pattern of causalities is sparse.

  20. Argudas: arguing with gene expression information

    CERN Document Server

    McLeod, Kenneth; Burger, Albert

    2010-01-01

    In situ hybridisation gene expression information helps biologists identify where a gene is expressed. However, the databases that republish the experimental information are often both incomplete and inconsistent. This paper examines a system, Argudas, designed to help tackle these issues. Argudas is an evolution of an existing system, and so that system is reviewed as a means of both explaining and justifying the behaviour of Argudas. Throughout the discussion of Argudas a number of issues will be raised including the appropriateness of argumentation in biology and the challenges faced when integrating apparently similar online biological databases.

  1. Information behavior versus communication: application models in multidisciplinary settings

    Directory of Open Access Journals (Sweden)

    Cecília Morena Maria da Silva

    2015-05-01

    This paper deals with information behavior as support for models of communication design in the areas of Information Science, Library Science and Music. The proposed communication models are based on the models of Tubbs and Moss (2003), Garvey and Griffith (1972), adapted by Hurd (1996), and Wilson (1999). Accordingly, the following questions arose: (i) what informational skills are required of librarians who act as mediators in the scholarly communication process and in user information behavior in the educational environment?; (ii) what are the needs of music researchers, and how do they produce, seek, use and access the scientific knowledge of their area?; and (iii) how do the contexts involved in scientific collaboration processes influence the scientific production of the information science field in Brazil? The article includes a literature review on information behavior and its place in scientific communication, considering the influence of the context and/or situation of the objects involved in the motivating questions. The hypothesis is that user information behavior in different contexts and situations influences the definition of a scientific communication model. Finally, it is concluded that the same concept or set of concepts can be used from different perspectives, thus reaching different results.

  2. Constructing Minimal Spanning Tree Based on Rough Set Theory for Gene Selection

    Directory of Open Access Journals (Sweden)

    Soumen Kumar Pati

    2012-11-01

    Microarray gene datasets often have high dimensionality, which causes difficulty in clustering and classification. Datasets containing a huge number of genes lead to increased complexity and, therefore, degraded dataset handling performance. Often, not all the measured features of these high-dimensional datasets are relevant for understanding the underlying phenomena of interest. Dimensionality reduction by reduct generation is hence performed as an important step before clustering and classification. The reduced attribute set has the same characteristics as the entire set of attributes in the information system. In this paper, a new attribute reduction technique based on directed minimal spanning trees and rough set theory is proposed for unsupervised learning. The method first computes a similarity factor between each pair of attributes using the indiscernibility relation, a concept of rough set theory. Based on the similarity factors, an attribute similarity set is formed, from which a directed weighted graph is constructed with attributes as vertices and the inverse of the similarity factor as edge weights. Then, all possible minimal spanning trees of the graph are generated. From each tree, iteratively, the most important vertex is included in the reduct set and all its outgoing edges are removed. The process stops when the edge set is empty, thus producing multiple reducts. The proposed method and some well-known attribute reduction techniques have been applied to several microarray gene datasets for gene selection. The results obtained show the effectiveness of the method.
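
    The indiscernibility relation that the method starts from can be sketched as follows. This shows only the standard rough-set building block (grouping objects that agree on a chosen attribute subset); the abstract does not define the similarity factor, so none is computed here. The table contents, attribute names and function name are hypothetical, and expression values are assumed to be already discretized.

```python
from collections import defaultdict


def indiscernibility_classes(table, attrs):
    """Partition objects into classes that agree on every attribute in attrs.

    table maps an object name to a dict of attribute -> (discretized) value.
    """
    groups = defaultdict(list)
    for obj, row in table.items():
        groups[tuple(row[a] for a in attrs)].append(obj)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    # Hypothetical samples with two discretized gene attributes.
    samples = {
        "s1": {"geneA": "high", "geneB": "low"},
        "s2": {"geneA": "high", "geneB": "high"},
        "s3": {"geneA": "low", "geneB": "low"},
        "s4": {"geneA": "high", "geneB": "low"},
    }
    print(indiscernibility_classes(samples, ["geneA"]))
    print(indiscernibility_classes(samples, ["geneA", "geneB"]))
```

    Adding attributes can only split classes further, which is why a reduct is sought: a smaller attribute subset that induces the same partition as the full set.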

  3. CONSTRUCTING MINIMAL SPANNING TREE BASED ON ROUGH SET THEORY FOR GENE SELECTION

    Directory of Open Access Journals (Sweden)

    Soumen Kumar Pati

    2013-01-01

    Microarray gene datasets often have high dimensionality, which causes difficulty in clustering and classification. Datasets containing a huge number of genes lead to increased complexity and, therefore, degraded dataset handling performance. Often, not all the measured features of these high-dimensional datasets are relevant for understanding the underlying phenomena of interest. Dimensionality reduction by reduct generation is hence performed as an important step before clustering and classification. The reduced attribute set has the same characteristics as the entire set of attributes in the information system. In this paper, a new attribute reduction technique based on directed minimal spanning trees and rough set theory is proposed for unsupervised learning. The method first computes a similarity factor between each pair of attributes using the indiscernibility relation, a concept of rough set theory. Based on the similarity factors, an attribute similarity set is formed, from which a directed weighted graph is constructed with attributes as vertices and the inverse of the similarity factor as edge weights. Then, all possible minimal spanning trees of the graph are generated. From each tree, iteratively, the most important vertex is included in the reduct set and all its outgoing edges are removed. The process stops when the edge set is empty, thus producing multiple reducts. The proposed method and some well-known attribute reduction techniques have been applied to several microarray gene datasets for gene selection. The results obtained show the effectiveness of the method.

  4. Minimum Information about a Biosynthetic Gene cluster

    NARCIS (Netherlands)

    Medema, M.H.; Kottmann, Renzo; Yilmaz, Pelin; Cummings, Matthew; Biggins, J.B.; Blin, Kai; Bruijn, De Irene; Chooi, Yit Heng; Claesen, Jan; Coates, R.C.; Cruz-Morales, Pablo; Duddela, Srikanth; Düsterhus, Stephanie; Edwards, Daniel J.; Fewer, David P.; Garg, Neha; Geiger, Christoph; Gomez-Escribano, Juan Pablo; Greule, Anja; Hadjithomas, Michalis; Haines, Anthony S.; Helfrich, Eric J.N.; Hillwig, Matthew L.; Ishida, Keishi; Jones, Adam C.; Jones, Carla S.; Jungmann, Katrin; Kegler, Carsten; Kim, Hyun Uk; Kötter, Peter; Krug, Daniel; Masschelein, Joleen; Melnik, Alexey V.; Mantovani, Simone M.; Monroe, Emily A.; Moore, Marcus; Moss, Nathan; Nützmann, Hans Wilhelm; Pan, Guohui; Pati, Amrita; Petras, Daniel; Reen, F.J.; Rosconi, Federico; Rui, Zhe; Tian, Zhenhua; Tobias, Nicholas J.; Tsunematsu, Yuta; Wiemann, Philipp; Wyckoff, Elizabeth; Yan, Xiaohui; Yim, Grace; Yu, Fengan; Xie, Yunchang; Aigle, Bertrand; Apel, Alexander K.; Balibar, Carl J.; Balskus, Emily P.; Barona-Gómez, Francisco; Bechthold, Andreas; Bode, Helge B.; Borriss, Rainer; Brady, Sean F.; Brakhage, Axel A.; Caffrey, Patrick; Cheng, Yi Qiang; Clardy, Jon; Cox, Russell J.; Mot, De René; Donadio, Stefano; Donia, Mohamed S.; Donk, Van Der Wilfred A.; Dorrestein, Pieter C.; Doyle, Sean; Driessen, Arnold J.M.; Ehling-Schulz, Monika; Entian, Karl Dieter; Fischbach, Michael A.; Gerwick, Lena; Gerwick, William H.; Gross, Harald; Gust, Bertolt; Hertweck, Christian; Höfte, Monica; Jensen, Susan E.; Ju, Jianhua; Katz, Leonard; Kaysser, Leonard; Klassen, Jonathan L.; Keller, Nancy P.; Kormanec, Jan; Kuipers, Oscar P.; Kuzuyama, Tomohisa; Kyrpides, Nikos C.; Kwon, Hyung Jin; Lautru, Sylvie; Lavigne, Rob; Lee, Chia Y.; Linquan, Bai; Liu, Xinyu; Liu, Wen; Luzhetskyy, Andriy; Mahmud, Taifo; Mast, Yvonne; Méndez, Carmen; Metsä-Ketelä, Mikko; Micklefield, Jason; Mitchell, Douglas A.; Moore, Bradley S.; Moreira, Leonilde M.; Müller, Rolf; Neilan, Brett A.; Nett, Markus; Nielsen, Jens; O'Gara, Fergal; Oikawa, Hideaki; 
Osbourn, Anne; Osburne, Marcia S.; Ostash, Bohdan; Payne, Shelley M.; Pernodet, Jean Luc; Petricek, Miroslav; Piel, Jörn; Ploux, Olivier; Raaijmakers, Jos M.; Salas, José A.; Schmitt, Esther K.; Scott, Barry; Seipke, Ryan F.; Shen, Ben; Sherman, David H.; Sivonen, Kaarina; Smanski, Michael J.; Sosio, Margherita; Stegmann, Evi; Süssmuth, Roderich D.; Tahlan, Kapil; Thomas, Christopher M.; Tang, Yi; Truman, Andrew W.; Viaud, Muriel; Walton, Jonathan D.; Walsh, Christopher T.; Weber, Tilmann; Wezel, Van Gilles P.; Wilkinson, Barrie; Willey, Joanne M.; Wohlleben, Wolfgang; Wright, Gerard D.; Ziemert, Nadine; Zhang, Changsheng; Zotchev, Sergey B.; Breitling, Rainer; Takano, Eriko; Glöckner, Frank Oliver

    2015-01-01

    A wide variety of enzymatic pathways that produce specialized metabolites in bacteria, fungi and plants are known to be encoded in biosynthetic gene clusters. Information about these clusters, pathways and metabolites is currently dispersed throughout the literature, making it difficult to exploi

  5. Generating route choice sets with operation information on metro networks

    Directory of Open Access Journals (Sweden)

    Wei Zhu

    2016-06-01

    In recent years, the metro system has advanced into an efficient transport system and become the mainstay of urban passenger transport in many mega-cities. Passenger flow is the foundation for making and coordinating operation plans for the metro system, and therefore a variety of studies have been conducted on transit assignment models. Nevertheless, the route choice sets of passengers also play a paramount role in flow estimation and demand prediction. This paper first discusses the main route constraints that distinguish rail networks from road networks, of which the train schedule is the most important. Then, a two-step approach to generating route choice sets in a metro network is proposed. In particular, the improved approach adds route filtering with train operational information to the conventional method. An initial numerical test shows that the proposed approach gives more reasonable route choice sets for scheduled metro networks and, consequently, obtains more accurate results from passenger flow assignment. Recommendations for possible opportunities to apply this approach to metro operations are also provided, including its integration into a metro passenger flow assignment and simulation system in practice to help metro authorities provide more precise guidance information for passengers.
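
    The two-step idea can be sketched under simplifying assumptions that are not taken from the paper: step one enumerates simple paths whose travel time is within a tolerance of the fastest route (standing in for "the conventional method"), and step two discards routes on which the passenger would reach a station after the last train on the next link has left. The network, times, tolerance and function names are all hypothetical.

```python
def simple_paths(graph, src, dst, path=None):
    """Yield every loop-free path from src to dst (depth-first)."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph.get(src, {}):
        if nxt not in path:
            yield from simple_paths(graph, nxt, dst, path)


def route_time(graph, path):
    """Total in-vehicle time of a route, in minutes."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def candidate_routes(graph, src, dst, tolerance=1.5):
    """Step 1: keep routes no slower than tolerance times the fastest one."""
    routes = list(simple_paths(graph, src, dst))
    best = min(route_time(graph, r) for r in routes)
    return [r for r in routes if route_time(graph, r) <= tolerance * best]


def feasible(graph, path, depart, last_departure):
    """Step 2: every link must be boarded no later than its last train."""
    t = depart
    for a, b in zip(path, path[1:]):
        if t > last_departure[(a, b)]:
            return False
        t += graph[a][b]
    return True


if __name__ == "__main__":
    # Link travel times in minutes; last train departures in minutes after midnight.
    metro = {"A": {"B": 10, "C": 12}, "B": {"D": 10}, "C": {"D": 12}}
    last = {("A", "B"): 1400, ("B", "D"): 1405, ("A", "C"): 1400, ("C", "D"): 1420}
    step1 = candidate_routes(metro, "A", "D")
    step2 = [r for r in step1 if feasible(metro, r, depart=1398, last_departure=last)]
    print(step1)
    print(step2)
```

    In this toy network the nominally fastest route drops out late in the evening because its second link has already closed, which is the kind of case a schedule-unaware choice set would get wrong.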

  6. A brain region-specific predictive gene map for autism derived by profiling a reference gene set.

    Directory of Open Access Journals (Sweden)

    Ajay Kumar

    Molecular underpinnings of complex psychiatric disorders such as autism spectrum disorders (ASD) remain largely unresolved. Increasingly, structural variations in discrete chromosomal loci are implicated in ASD, expanding the search space for its disease etiology. We exploited the high genetic heterogeneity of ASD to derive a predictive map of candidate genes by an integrated bioinformatics approach. Using a reference set of 84 Rare and Syndromic candidate ASD genes (AutRef84), we built a composite reference profile based on both functional and expression analyses. First, we created a functional profile of AutRef84 by performing Gene Ontology (GO) enrichment analysis which encompassed three main areas: 1) neurogenesis/projection, 2) cell adhesion, and 3) ion channel activity. Second, we constructed an expression profile of AutRef84 by conducting DAVID analysis which found enrichment in brain regions critical for sensory information processing (olfactory bulb, occipital lobe), executive function (prefrontal cortex), and hormone secretion (pituitary). Disease specificity of this dual AutRef84 profile was demonstrated by comparative analysis with control, diabetes, and non-specific gene sets. We then screened the human genome with the dual AutRef84 profile to derive a set of 460 potential ASD candidate genes. Importantly, the power of our predictive gene map was demonstrated by capturing 18 existing ASD-associated genes which were not part of the AutRef84 input dataset. The remaining 442 genes are entirely novel putative ASD risk genes. Together, we used a composite ASD reference profile to generate a predictive map of novel ASD candidate genes which should be prioritized for future research.

  7. Querying Large Physics Data Sets Over an Information Grid

    Institute of Scientific and Technical Information of China (English)

    Nigel Baker; Zsolt Kovacs; et al.

    2001-01-01

    Optimising use of the Web (WWW) for LHC data analysis is a complex problem and illustrates the challenges arising from the integration of and computation across massive amounts of information distributed worldwide. Finding the right piece of information can, at times, be extremely time-consuming, if not impossible. So-called Grids have been proposed to facilitate LHC computing, and many groups have embarked on studies of data replication, data migration and networking philosophies. Other aspects, such as the role of 'middleware' for Grids, are emerging as requiring research. This paper positions the need for appropriate middleware that enables users to resolve physics queries across massive data sets. It identifies the role of meta-data for query resolution and the importance of Information Grids for high-energy physics analysis rather than just computational or Data Grids. This paper identifies software that is being implemented at CERN to enable the querying of very large collaborating HEP data-sets, initially being employed for the construction of CMS detectors.

  8. Constructing Minimal Spanning Tree Based on Rough Set Theory for Gene Selection

    Directory of Open Access Journals (Sweden)

    Soumen Kumar Pati

    2013-02-01

    Microarray gene datasets often have high dimensionality, which causes difficulty in clustering and classification. Datasets containing a huge number of genes lead to increased complexity and, therefore, degraded dataset handling performance. Often, not all the measured features of these high-dimensional datasets are relevant for understanding the underlying phenomena of interest. Dimensionality reduction by reduct generation is hence performed as an important step before clustering and classification. The reduced attribute set has the same characteristics as the entire set of attributes in the information system. In this paper, a new attribute reduction technique based on directed minimal spanning trees and rough set theory is proposed for unsupervised learning. The method first computes a similarity factor between each pair of attributes using the indiscernibility relation, a concept of rough set theory. Based on the similarity factors, an attribute similarity set is formed, from which a directed weighted graph is constructed with attributes as vertices and the inverse of the similarity factor as edge weights. Then, all possible minimal spanning trees of the graph are generated. From each tree, iteratively, the most important vertex is included in the reduct set and all its outgoing edges are removed. The process stops when the edge set is empty, thus producing multiple reducts. The proposed method and some well-known attribute reduction techniques have been applied to several microarray gene datasets for gene selection. The results obtained show the effectiveness of the method.

  9. Algorithm for Finding Optimal Gene Sets in Microarray Prediction

    CERN Document Server

    Deutsch, J M

    2001-01-01

    Motivation: Microarray data has recently been shown to be efficacious in distinguishing closely related cell types that often appear in the diagnosis of cancer. It is useful to determine the minimum number of genes needed to do such a diagnosis both for clinical use and to determine the importance of specific genes for cancer. Here a replication algorithm is used for this purpose. It evolves an ensemble of predictors, all using different combinations of genes to generate a set of optimal predictors. Results: We apply this method to the leukemia data of the Whitehead/MIT group that attempts to differentially diagnose two kinds of leukemia, and also to data of Khan et al. to distinguish four different kinds of childhood cancers. In the latter case we were able to reduce the number of genes needed from 96 down to 15, while at the same time being able to perfectly classify all of their test data. Availability: http://stravinsky.ucsc.edu/josh/gesses/ Contact: josh@physics.ucsc.edu

  10. Coverage and characteristics of the Affymetrix GeneChip Human Mapping 100K SNP set.

    Directory of Open Access Journals (Sweden)

    2006-05-01

    Full Text Available Improvements in technology have made it possible to conduct genome-wide association mapping at costs within reach of academic investigators, and experiments are currently being conducted with a variety of high-throughput platforms. To provide an appropriate context for interpreting results of such studies, we summarize here results of an investigation of one of the first of these technologies to be publicly available, the Affymetrix GeneChip Human Mapping 100K set of single nucleotide polymorphisms (SNPs). In a systematic analysis of the pattern and distribution of SNPs in the Mapping 100K set, we find that SNPs in this set are undersampled from coding regions (both nonsynonymous and synonymous) and oversampled from regions outside genes, relative to SNPs in the overall HapMap database. In addition, we utilize a novel multilocus linkage disequilibrium (LD) coefficient based on information content (analogous to the information content scores commonly used for linkage mapping) that is equivalent to the familiar measure r² in the special case of two loci. Using this approach, we are able to summarize for any subset of markers, such as the Affymetrix Mapping 100K set, the information available for association mapping in that subset, relative to the information available in the full set of markers included in the HapMap, and highlight circumstances in which this multilocus measure of LD provides substantial additional insight about the haplotype structure in a region over pairwise measures of LD.
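
The abstract does not define its multilocus coefficient, so the sketch below covers only the two-locus special case it mentions, using the standard textbook definition of pairwise r². The function name and arguments are illustrative assumptions, not taken from the paper.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Pairwise LD r^2 for two biallelic loci (textbook definition).

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    # D measures the departure of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")
    return d * d / denom


# Alleles always co-occur: complete LD.
print(ld_r2(0.5, 0.5, 0.5))   # 1.0
# Haplotype frequency equals the product of allele frequencies: no LD.
print(ld_r2(0.25, 0.5, 0.5))  # 0.0
```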

  11. Theoretical perspectives on learning in an informal setting

    Science.gov (United States)

    Anderson, David; Lucas, Keith B.; Ginns, Ian S.

    2003-02-01

    Research into learning in informal settings such as museums has been in a formative state during the past decade, and much of that research has been descriptive and lacking a theory base. In this article, it is proposed that the human constructivist view of learning can guide research and assist the interpretation of research data because it recognizes an individual's prior knowledge and active involvement in knowledge construction during a museum visit. This proposal is supported by reference to the findings of a previously reported interpretive case study, which included concept mapping and semistructured interviews, of the knowledge transformations of three Year 7 students who had participated in a class visit to a science museum and associated postvisit activities. The findings from that study are shown in this report to be consistent with the human constructivist view of learning in that for all three students, learning was found to be at times incremental and at other times to involve substantial restructuring of knowledge. Thus, we regard that the human constructivist view of learning has much merit and utility for researchers investigating the development of knowledge and understanding emergent from experiences in informal settings. The theoretical and practical implications of these findings for teachers and staff of museums and similar institutions are also discussed.

  12. Information retrieval pathways for health information exchange in multiple care settings

    DEFF Research Database (Denmark)

    Kierkegaard, Patrick; Kaushal, Rainu; Vest, Joshua R.

    2014-01-01

    Objectives To determine which health information exchange (HIE) technologies and information retrieval pathways healthcare professionals relied on to meet their information needs in the context of laboratory test results, radiological images and reports, and medication histories. Study Design...... Primary data was collected over a 2-month period across 3 emergency departments, 7 primary care practices, and 2 public health clinics in New York state. Methods Qualitative research methods were used to collect and analyze data from semi-structured interviews and participant observation. Results...... The study reveals that healthcare professionals used a complex combination of information retrieval pathways for HIE to obtain clinical information from external organizations. The choice for each approach was setting- and information-specific, but was also highly dynamic across users and their information...

  13. Gene Selection Integrated with Biological Knowledge for Plant Stress Response Using Neighborhood System and Rough Set Theory.

    Science.gov (United States)

    Meng, Jun; Zhang, Jing; Luan, Yushi

    2015-01-01

    Mining knowledge from gene expression data is a hot research topic and direction of bioinformatics. Gene selection and sample classification are significant research trends, due to the large number of genes and small number of samples in gene expression data. Rough set theory has been successfully applied to gene selection, as it can select attributes without redundancy. To improve the interpretability of the selected genes, some researchers introduced biological knowledge. In this paper, we first employ a neighborhood system to deal directly with the new information table formed by integrating gene expression data with biological knowledge, which can simultaneously present the information from multiple perspectives and does not weaken the information of individual genes for selection and classification. Then, we give a novel framework for gene selection and propose a significant gene selection method based on this framework by employing a reduction algorithm from rough set theory. The proposed method is applied to the analysis of plant stress response. Experimental results on three data sets show that the proposed method is effective, as it can select significant gene subsets without redundancy and achieve high classification accuracy. Biological analysis of the results shows that the interpretability is good.

  14. For your information. Management in the school setting: position statement.

    Science.gov (United States)

    Zacharski, Susan; DeSisto, Marie; Pontius, Deborah; Sheets, Jodi; Richesin, Cynthia

    2013-09-01

    It is the position of the National Association of School Nurses (NASN) that the safe and effective management of allergies and anaphylaxis in schools requires a collaborative, multidisciplinary team approach. The registered professional school nurse (hereinafter referred to as the school nurse) is the leader in a comprehensive management approach that includes planning and coordination of care, educating staff, providing a safe environment, and ensuring prompt emergency response should exposure to a life-threatening allergen occur. Furthermore, NASN supports, in states where laws and regulations allow, the maintenance of stock nonpatient-specific epinephrine and physician-standing orders for school nurses to administer epinephrine in life-threatening situations in the school setting. School districts must have a clear, concise, all-inclusive policy in place to address the management of allergies in the school setting that should be reviewed annually (National School Boards Association [NSBA], 2012). This policy shall be consistent with federal and state laws, nursing practice standards, and established safe practices in accordance with evidence-based information and include development of a developmentally appropriate Individualized Healthcare Plan (IHP) and Emergency Care Plan (ECP).

  15. Grouping Gene Ontology terms to improve the assessment of gene set enrichment in microarray data.

    Science.gov (United States)

    Lewin, Alex; Grieve, Ian C

    2006-10-03

    Gene Ontology (GO) terms are often used to assess the results of microarray experiments. The most common way to do this is to perform Fisher's exact tests to find GO terms which are over-represented amongst the genes declared to be differentially expressed in the analysis of the microarray experiment. However, due to the high degree of dependence between GO terms, statistical testing is conservative, and interpretation is difficult. We propose testing groups of GO terms rather than individual terms, to increase statistical power, reduce dependence between tests and improve the interpretation of results. We use the publicly available package POSOC to group the terms. Our method finds groups of GO terms significantly over-represented amongst differentially expressed genes which are not found by Fisher's tests on individual GO terms. Grouping Gene Ontology terms improves the interpretation of gene set enrichment for microarray data.
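
A minimal sketch of the per-term test the abstract describes as the common baseline: a one-sided Fisher's exact test, computed as the upper tail of a hypergeometric distribution. It does not implement the grouping done by POSOC. The function name and the gene counts in the example are illustrative assumptions.

```python
from math import comb


def over_representation_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher's exact test p-value for a single GO term.

    N: genes on the array
    K: genes annotated with the GO term
    n: genes declared differentially expressed (DE)
    k: DE genes annotated with the GO term
    Returns P(X >= k) for X ~ Hypergeometric(N, K, n).
    """
    upper = min(n, K)
    # Count the ways to draw i annotated and n - i unannotated genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Toy example: 10 genes, 5 annotated, 5 DE, all 5 DE genes annotated.
print(over_representation_p(k=5, n=5, K=5, N=10))  # 1/252, about 0.00397
```

Because GO terms overlap heavily, p-values from many such tests are strongly dependent, which is the motivation the abstract gives for testing groups of terms instead.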

  16. Grouping Gene Ontology terms to improve the assessment of gene set enrichment in microarray data

    Directory of Open Access Journals (Sweden)

    Grieve Ian C

    2006-10-01

    Full Text Available Abstract Background Gene Ontology (GO) terms are often used to assess the results of microarray experiments. The most common way to do this is to perform Fisher's exact tests to find GO terms which are over-represented amongst the genes declared to be differentially expressed in the analysis of the microarray experiment. However, due to the high degree of dependence between GO terms, statistical testing is conservative, and interpretation is difficult. Results We propose testing groups of GO terms rather than individual terms, to increase statistical power, reduce dependence between tests and improve the interpretation of results. We use the publicly available package POSOC to group the terms. Our method finds groups of GO terms significantly over-represented amongst differentially expressed genes which are not found by Fisher's tests on individual GO terms. Conclusion Grouping Gene Ontology terms improves the interpretation of gene set enrichment for microarray data.

  17. Transcriptomic Analysis Identifies Candidate Genes and Gene Sets Controlling the Response of Porcine Peripheral Blood Mononuclear Cells to Poly I:C Stimulation

    Directory of Open Access Journals (Sweden)

    Jiying Wang

    2016-05-01

    Full Text Available Polyinosinic-polycytidylic acid (poly I:C), a synthetic dsRNA analog, has been demonstrated to have stimulatory effects similar to viral dsRNA. To gain deep knowledge of the host transcriptional response of pigs to poly I:C stimulation, in the present study, we cultured and stimulated peripheral blood mononuclear cells (PBMC) of piglets of one Chinese indigenous breed (Dapulian) and one modern commercial breed (Landrace) with poly I:C, and compared their transcriptional profiles using RNA-sequencing (RNA-seq). Our results indicated that poly I:C stimulation can elicit significantly differentially expressed (DE) genes in Dapulian (g = 290) as well as Landrace (g = 85). We also performed gene set analysis using the Gene Set Enrichment Analysis (GSEA) package, and identified some significantly enriched gene sets in Dapulian (g = 18) and Landrace (g = 21). Most of the shared DE genes and gene sets were immune-related, and may play crucial roles in the immune response to poly I:C stimulation. In addition, we detected large sets of significantly DE genes and enriched gene sets when comparing the gene expression profile between the two breeds, including control and poly I:C stimulation groups. Besides immune-related functions, some of the DE genes and gene sets between the two breeds were involved in development and growth of various tissues, which may be correlated with the different characteristics of the two breeds. The DE genes and gene sets detected herein provide crucial information towards understanding the immune regulation of antiviral responses, and the molecular mechanisms of different genetic resistance to viral infection, in modern and indigenous pigs.

  18. Evolutionary signatures amongst disease genes permit novel methods for gene prioritization and construction of informative gene-based networks.

    Directory of Open Access Journals (Sweden)

    Nolan Priedigkeit

    2015-02-01

    Full Text Available Genes involved in the same function tend to have similar evolutionary histories, in that their rates of evolution covary over time. This coevolutionary signature, termed Evolutionary Rate Covariation (ERC), is calculated using only gene sequences from a set of closely related species and has demonstrated potential as a computational tool for inferring functional relationships between genes. To further define applications of ERC, we first established that roughly 55% of genetic diseases possess an ERC signature between their contributing genes. At a false discovery rate of 5% we report 40 such diseases including cancers, developmental disorders and mitochondrial diseases. Given these coevolutionary signatures between disease genes, we then assessed ERC's ability to prioritize known disease genes out of a list of unrelated candidates. We found that in the presence of an ERC signature, the true disease gene is effectively prioritized to the top 6% of candidates on average. We then apply this strategy to a melanoma-associated region on chromosome 1 and identify MCL1 as a potential causative gene. Furthermore, to gain global insight into disease mechanisms, we used ERC to predict molecular connections between 310 nominally distinct diseases. The resulting "disease map" network associates several diseases with related pathogenic mechanisms and unveils many novel relationships between clinically distinct diseases, such as between Hirschsprung's disease and melanoma. Taken together, these results demonstrate the utility of molecular evolution as a gene discovery platform and show that evolutionary signatures can be used to build informative gene-based networks.

  19. Annotating novel genes by integrating synthetic lethals and genomic information

    Directory of Open Access Journals (Sweden)

    Faty Mahamadou

    2008-01-01

    Full Text Available Abstract Background Large scale screening for synthetic lethality serves as a common tool in yeast genetics to systematically search for genes that play a role in specific biological processes. Often the amounts of data resulting from a single large scale screen far exceed the capacities of experimental characterization of every identified target. Thus, there is need for computational tools that select promising candidate genes in order to reduce the number of follow-up experiments to a manageable size. Results We analyze synthetic lethality data for arp1 and jnm1, two spindle migration genes, in order to identify novel members in this process. To this end, we use an unsupervised statistical method that integrates additional information from biological data sources, such as gene expression, phenotypic profiling, RNA degradation and sequence similarity. Different from existing methods that require large amounts of synthetic lethal data, our method merely relies on synthetic lethality information from two single screens. Using a Multivariate Gaussian Mixture Model, we determine the best subset of features that assign the target genes to two groups. The approach identifies a small group of genes as candidates involved in spindle migration. Experimental testing confirms the majority of our candidates and we present she1 (YBL031W) as a novel gene involved in spindle migration. We applied the statistical methodology also to TOR2 signaling as another example. Conclusion We demonstrate the general use of Multivariate Gaussian Mixture Modeling for selecting candidate genes for experimental characterization from synthetic lethality data sets. For the given example, integration of different data sources contributes to the identification of genetic interaction partners of arp1 and jnm1 that play a role in the same biological process.

  20. Jetset: selecting the optimal microarray probe set to represent a gene

    DEFF Research Database (Denmark)

    Li, Qiyuan; Birkbak, Nicolai Juul; Gyorffy, Balazs

    2011-01-01

    Background: Interpretation of gene expression microarrays requires a mapping from probe set to gene. On many Affymetrix gene expression microarrays, a given gene may be detected by multiple probe sets, which may deliver inconsistent or even contradictory measurements. Therefore, obtaining...... an unambiguous expression estimate of a pre-specified gene can be a nontrivial but essential task. Results: We developed scoring methods to assess each probe set for specificity, splice isoform coverage, and robustness against transcript degradation. We used these scores to select a single representative probe...... set for each gene, thus creating a simple one-to-one mapping between gene and probe set. To test this method, we evaluated concordance between protein measurements and gene expression values, and between sets of genes whose expression is known to be correlated. For both test cases, we identified genes...

  1. SEARCHING PROXIES OF INVESTMENT OPPORTUNITY SETS AND IDENTIFYING INFORMATION CONTENT

    Directory of Open Access Journals (Sweden)

    Hermeindito Kaaro

    2002-01-01

    Full Text Available The concept of the investment opportunity set (IOS), which was first noted by Myers (1977), plays an important role in the capital market because it implies future growth, which is relevant in predicting the shareholder's expected wealth. Unfortunately, IOS cannot be observed directly. Because IOS is an unobservable construct, the researcher must find appropriate proxies for IOS to capture Myers' idea. A number of studies have been done to obtain appropriate proxies of IOS. One major finding is presented by Kallapur and Trombley (1999), who suggest appropriate proxies of IOS and attempt to identify whether the IOS of firms credibly represents future (realized) growth. This paper attempts to review the IOS literature, especially Kallapur and Trombley's study. Some limitations of their study are noted here. This paper proposes a method to confirm the constructs of IOS and develops a new model for finding appropriate proxies for IOS. The present model also allows researchers not only to determine appropriate proxies for IOS but also to identify which proxies of IOS have good or bad information content. Abstract in Bahasa Indonesia (translated): The concept of the investment opportunity set (IOS), first put forward by Myers (1977), plays an important role in the capital market because it implies future growth, which is relevant for forecasting shareholders' expected wealth. Its weakness, however, is that IOS cannot be observed directly. Researchers must therefore look for appropriate data that can represent the IOS variable in keeping with Myers' idea. Several studies have been carried out to that end, including the findings of Kallapur and Trombley (1999), who propose several appropriate variables to represent IOS and also try to determine whether the IOS of firms is suited to describing actual future growth. This study aims to review the IOS literature, especially the study by Kallapur and

  2. Information content of partially rank-ordered set samples

    OpenAIRE

    Hatefi, Armin; Jozani, Mohammad Jafari

    2015-01-01

    Partially rank-ordered set (PROS) sampling is a generalization of ranked set sampling in which rankers are not required to fully rank the sampling units in each set, hence having more flexibility to perform the necessary judgemental ranking process. The PROS sampling has a wide range of applications in different fields ranging from environmental and ecological studies to medical research and it has been shown to be superior over ranked set sampling and simple random sampling for estimating th...

  3. goSTAG: gene ontology subtrees to tag and annotate genes within a set.

    Science.gov (United States)

    Bennett, Brian D; Bushel, Pierre R

    2017-01-01

    Over-representation analysis (ORA) detects enrichment of genes within biological categories. Gene Ontology (GO) domains are commonly used for gene/gene-product annotation. When ORA is employed, often times there are hundreds of statistically significant GO terms per gene set. Comparing enriched categories between a large number of analyses and identifying the term within the GO hierarchy with the most connections is challenging. Furthermore, ascertaining biological themes representative of the samples can be highly subjective from the interpretation of the enriched categories. We developed goSTAG for utilizing GO Subtrees to Tag and Annotate Genes that are part of a set. Given gene lists from microarray, RNA sequencing (RNA-Seq) or other genomic high-throughput technologies, goSTAG performs GO enrichment analysis and clusters the GO terms based on the p-values from the significance tests. GO subtrees are constructed for each cluster, and the term that has the most paths to the root within the subtree is used to tag and annotate the cluster as the biological theme. We tested goSTAG on a microarray gene expression data set of samples acquired from the bone marrow of rats exposed to cancer therapeutic drugs to determine whether the combination or the order of administration influenced bone marrow toxicity at the level of gene expression. Several clusters were labeled with GO biological processes (BPs) from the subtrees that are indicative of some of the prominent pathways modulated in bone marrow from animals treated with an oxaliplatin/topotecan combination. In particular, negative regulation of MAP kinase activity was the biological theme exclusively in the cluster associated with enrichment at 6 h after treatment with oxaliplatin followed by control. However, nucleoside triphosphate catabolic process was the GO BP labeled exclusively at 6 h after treatment with topotecan followed by control. goSTAG converts gene lists from genomic analyses into biological themes

  4. Core gene set as the basis of multilocus sequence analysis of the subclass Actinobacteridae.

    Directory of Open Access Journals (Sweden)

    Toïdi Adékambi

    Full Text Available Comparative genomic sequencing is shedding new light on bacterial identification, taxonomy and phylogeny. An in silico assessment of a core gene set necessary for cellular functioning was made to determine a consensus set of genes that would be useful for the identification, taxonomy and phylogeny of the species belonging to the subclass Actinobacteridae, which contains two orders, Actinomycetales and Bifidobacteriales. The subclass Actinobacteridae comprises about 85% of the actinobacteria families. The following recommended criteria were used to establish a comprehensive gene set; the gene should (i) be long enough to contain phylogenetically useful information, (ii) not be subject to horizontal gene transfer, (iii) be a single copy, (iv) have at least two regions sufficiently conserved to allow the design of amplification and sequencing primers and (v) predict whole-genome relationships. We applied these constraints to 50 different Actinobacteridae genomes and made 1,224 pairwise comparisons of the genome conserved regions and gene fragments obtained by using the Sequence VARiability Analysis Program (SVARAP), which allows primer design. Following a comparative statistical modeling phase, 3 gene fragments were selected, ychF, rpoB, and secY, with R² > 0.85. Selected sets of broad range primers were tested from the 3 gene fragments and were demonstrated to be useful for amplification and sequencing of 25 species belonging to 9 genera of Actinobacteridae. The intraspecies similarities were 96.3-100% for ychF, 97.8-100% for rpoB and 96.9-100% for secY among 73 strains belonging to 15 species of the subclass Actinobacteridae, compared to 99.4-100% for 16S rRNA. The phylogenetic topology obtained from the combined datasets ychF+rpoB+secY was globally similar to that inferred from the 16S rRNA but with higher confidence. It was concluded that multi-locus sequence analysis using core gene set might represent the first consensus and valid approach for

  5. An Ethnographically Informed Participatory Design of Primary Healthcare Information Technology in a Developing Country Setting.

    Science.gov (United States)

    Shidende, Nima Herman; Igira, Faraja Teddy; Mörtberg, Christina Margaret

    2017-01-01

    Ethnography, with its emphasis on understanding activities where they occur, and its use of qualitative data gathering techniques rich in description, has a long tradition in Participatory Design (PD). Yet there are limited methodological insights into its application in developing countries. This paper proposes an ethnographically informed PD approach, which can be applied when designing Primary Healthcare Information Technology (PHIT). We use findings from a larger multidisciplinary project, the Health Information Systems Project (HISP), to elaborate how ethnography can be used to facilitate participation of health practitioners in developing-country settings, and to indicate the importance of an ethnographic approach to designers of participatory Health Information Technology (HIT). Furthermore, the paper discusses the pros and cons of using an ethnographic approach in designing HIT.

  6. Risk and information evaluation of prioritized genes for complex traits: application to bipolar disorder.

    Science.gov (United States)

    Kao, Chung-Feng; Chuang, Li-Chung; Kuo, Po-Hsiu

    2014-10-01

    Many susceptibility genes for complex traits were identified without conclusive findings. There is a strong need to integrate rapidly accumulated genomic data from multi-dimensional platforms, and to conduct risk evaluation for potential therapeutic and diagnostic usages. We set up an algorithm to computationally search for the optimal weight-vector for various data sources, while minimizing potential noise. Through a gene-prioritization framework, combined scores for the resulting prioritized gene-set were calculated using a genome-wide association (GWA) dataset, followed by evaluation using weighted genetic risk score and risk-attributed information using an independent GWA dataset. The significance of association of GWA data was corrected for gene length. Enriched functional pathways were identified for the prioritized gene-set using Gene Ontology analysis. We illustrated our framework with bipolar disorder. 233 prioritized genes were identified from 10,830 candidates that were curated from six platforms. The prioritized genes were significantly enriched (P(adjusted) evaluation demonstrated higher weighted genetic risk score in bipolar patients than controls (P-values ranged from 0.002 to 3.8 × 10^-6). More risk information (71%) was extracted from prioritized genes for bipolar illness than from other candidate-gene sets. Our evidence-based prioritized gene-set provides an opportunity to explore the complex network and to conduct follow-up basic and clinical studies for complex traits.
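
The abstract does not give the formula for its weighted genetic risk score, so the sketch below assumes the commonly used form: each variant's risk-allele count (0, 1 or 2) multiplied by an effect-size weight such as a log odds ratio, summed over variants. The variant identifiers and weights are hypothetical.

```python
def weighted_genetic_risk_score(allele_counts: dict, weights: dict) -> float:
    """Sum of risk-allele counts weighted by per-variant effect sizes.

    allele_counts: variant id -> number of risk alleles carried (0, 1 or 2)
    weights:       variant id -> effect-size weight (e.g. log odds ratio)
    """
    if allele_counts.keys() != weights.keys():
        raise ValueError("allele counts and weights must cover the same variants")
    return sum(weights[v] * allele_counts[v] for v in weights)


# Hypothetical individual and weights, for illustration only.
counts = {"snp1": 2, "snp2": 1, "snp3": 0}
weights = {"snp1": 0.5, "snp2": 0.25, "snp3": 1.0}
print(weighted_genetic_risk_score(counts, weights))  # 1.25
```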

  7. Beyond main effects of gene-sets: harsh parenting moderates the association between a dopamine gene-set and child externalizing behavior

    NARCIS (Netherlands)

    J. Windhorst (Judith); V. Mileva-Seitz; R.C.A. Rippe (Ralph C.A.); H.W. Tiemeier (Henning); V.W.V. Jaddoe (Vincent); F.C. Verhulst (Frank); M.H. van IJzendoorn (Marinus); M.J. Bakermans-Kranenburg (Marian)

    2016-01-01

    Background: In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene-based and gene-se

  8. Beyond main effects of gene-sets: Harsh parenting moderates the association between a dopamine gene-set and child externalizing behavior

    OpenAIRE

    Windhorst, D.A.; Mileva, V.R.; Rippe, R.C.A.; Tiemeier, H; Jaddoe, V. W. V.; Verhulst, F. C.; IJzendoorn, van, M.H.; Bakermans, M.J.

    2016-01-01

    Abstract Background In a longitudinal cohort study, we investigated the interplay of harsh parenting and genetic variation across a set of functionally related dopamine genes, in association with children's externalizing behavior. This is one of the first studies to employ gene‐based and gene‐set approaches in tests of Gene by Environment (G × E) effects on complex behavior. This approach can offer an important alternative or complement to candidate gene and genome‐wide environmental interact...

  9. Informative Gene Selection and Direct Classification of Tumor Based on Chi-Square Test of Pairwise Gene Interactions

    Directory of Open Access Journals (Sweden)

    Hongyan Zhang

    2014-01-01

    Full Text Available In efforts to discover disease mechanisms and improve clinical diagnosis of tumors, it is useful to mine profiles for informative genes with definite biological meanings and to build robust classifiers with high precision. In this study, we developed a new method for tumor-gene selection, the Chi-square test-based integrated rank gene and direct classifier (χ2-IRG-DC. First, we obtained the weighted integrated rank of gene importance from chi-square tests of single and pairwise gene interactions. Then, we sequentially introduced the ranked genes and removed redundant genes by using leave-one-out cross-validation of the chi-square test-based Direct Classifier (χ2-DC within the training set to obtain informative genes. Finally, we determined the accuracy of independent test data by utilizing the genes obtained above with χ2-DC. Furthermore, we analyzed the robustness of χ2-IRG-DC by comparing the generalization performance of different models, the efficiency of different feature-selection methods, and the accuracy of different classifiers. An independent test of ten multiclass tumor gene-expression datasets showed that χ2-IRG-DC could efficiently control overfitting and had higher generalization performance. The informative genes selected by χ2-IRG-DC could dramatically improve the independent test precision of other classifiers; meanwhile, the informative genes selected by other feature selection methods also had good performance in χ2-DC.
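
As a rough illustration of the single-gene building block only, the sketch below computes a Pearson chi-square statistic for a contingency table of discretized expression state against tumor class. It does not reproduce the integrated ranking or the direct classifier of χ2-IRG-DC, and the table layout is an assumption.

```python
def chi_square_statistic(table: list) -> float:
    """Pearson chi-square statistic for an r x c contingency table.

    table[i][j] is the number of samples with expression state i
    (e.g. low/high) that belong to tumor class j.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0 or 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one sample")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of expression state and class.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# A gene whose expression state separates the two classes perfectly.
print(chi_square_statistic([[10, 0], [0, 10]]))  # 20.0
# A gene carrying no class information.
print(chi_square_statistic([[5, 5], [5, 5]]))    # 0.0
```

Ranking genes by this statistic gives a simple single-gene ordering; the pairwise interaction tests mentioned in the abstract would need a table built from the joint states of two genes.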

  10. Combining evidence, biomedical literature and statistical dependence: new insights for functional annotation of gene sets

    Directory of Open Access Journals (Sweden)

    Burgun Anita

    2006-05-01

    Full Text Available Abstract Background Large-scale genomic studies based on transcriptome technologies provide clusters of genes that need to be functionally annotated. The Gene Ontology (GO) implements a controlled vocabulary organised into three hierarchies: cellular components, molecular functions and biological processes. This terminology allows a coherent and consistent description of the knowledge about gene functions. The GO terms related to genes come primarily from semi-automatic annotations made by trained biologists (annotation based on evidence) or text-mining of the published scientific literature (literature profiling). Results We report an original functional annotation method based on a combination of evidence and literature that overcomes the weaknesses and the limitations of each approach. It relies on the Gene Ontology Annotation database (GOA Human) and the PubGene biomedical literature index. We support these annotations with statistically associated GO terms and retrieve associative relations across the three GO hierarchies to emphasise the major pathways involved by a gene cluster. Both annotation methods and associative relations were quantitatively evaluated with a reference set of 7397 genes and a multi-cluster study of 14 clusters. We also validated the biological appropriateness of our hybrid method with the annotation of a single gene (cdc2) and that of a down-regulated cluster of 37 genes identified by a transcriptome study of an in vitro enterocyte differentiation model (CaCo-2 cells). Conclusion The combination of both approaches is more informative than either separate approach: literature mining can enrich an annotation based only on evidence. Text-mining of the literature can also find valuable associated MEDLINE references that confirm the relevance of the annotation. Eventually, GO terms networks can be built with associative relations in order to highlight cooperative and competitive pathways and their connected molecular functions.

  11. Goal setting and action planning in the rehabilitation setting: development of a theoretically informed practice framework.

    Science.gov (United States)

    Scobbie, Lesley; Dixon, Diane; Wyke, Sally

    2011-05-01

    Setting and achieving goals is fundamental to rehabilitation practice but has been criticized for being a-theoretical and the key components of replicable goal-setting interventions are not well established. To describe the development of a theory-based goal setting practice framework for use in rehabilitation settings and to detail its component parts. Causal modelling was used to map theories of behaviour change onto the process of setting and achieving rehabilitation goals, and to suggest the mechanisms through which patient outcomes are likely to be affected. A multidisciplinary task group developed the causal model into a practice framework for use in rehabilitation settings through iterative discussion and implementation with six patients. Four components of a goal-setting and action-planning practice framework were identified: (i) goal negotiation, (ii) goal identification, (iii) planning, and (iv) appraisal and feedback. The variables hypothesized to effect change in patient outcomes were self-efficacy and action plan attainment. A theory-based goal setting practice framework for use in rehabilitation settings is described. The framework requires further development and systematic evaluation in a range of rehabilitation settings.

  12. Human Effector / Initiator Gene Sets That Regulate Myometrial Contractility During Term and Preterm Labor

    Science.gov (United States)

    WEINER, Carl P.; MASON, Clifford W.; DONG, Yafeng; BUHIMSCHI, Irina A.; SWAAN, Peter W.; BUHIMSCHI, Catalin S.

    2010-01-01

    Objective Distinct processes govern transition from quiescence to activation during term (TL) and preterm labor (PTL). We sought gene sets responsible for TL and PTL, along with the effector genes necessary for labor independent of gestation and underlying trigger. Methods Expression was analyzed in term and preterm +/− labor (n =6 subjects/group). Gene sets were generated using logic operations. Results 34 genes were similarly expressed in PTL/TL but absent from nonlabor samples (Effector Set). 49 genes were specific to PTL (Preterm Initiator Set) and 174 to TL (Term Initiator Set). The gene ontogeny processes comprising Term Initiator and Effector Sets were diverse, though inflammation was represented in 4 of the top 10; inflammation dominated the Preterm Initiator Set. Comments TL and PTL differ dramatically in initiator profiles. Though inflammation is part of the Term Initiator and the Effector Sets, it is an overwhelming part of PTL associated with intraamniotic inflammation. PMID:20452493
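    The abstract above says the gene sets were generated using logic operations. A minimal sketch of that idea with Python sets follows. The gene identifiers are hypothetical placeholders (the abstract lists no genes), and the exact definitions of "specific to" used by the authors are an assumption here.

```python
# Hypothetical gene identifiers per group; not taken from the study.
term_labor = {"G1", "G2", "G3", "G5"}
preterm_labor = {"G1", "G2", "G4", "G6"}
term_no_labor = {"G5", "G7"}
preterm_no_labor = {"G6", "G7"}


def labor_gene_sets(tl, ptl, tnl, ptnl):
    """Derive effector and initiator sets with plain set algebra.

    Assumed definitions (illustrative only):
      effector          = expressed in both TL and PTL, absent from nonlabor
      preterm initiator = expressed in PTL only, absent from nonlabor
      term initiator    = expressed in TL only, absent from nonlabor
    """
    nonlabor = tnl | ptnl                      # union of both nonlabor groups
    effector = (tl & ptl) - nonlabor           # shared by both labor groups
    preterm_initiator = ptl - tl - nonlabor    # PTL-specific
    term_initiator = tl - ptl - nonlabor       # TL-specific
    return effector, preterm_initiator, term_initiator


effector, preterm_initiator, term_initiator = labor_gene_sets(
    term_labor, preterm_labor, term_no_labor, preterm_no_labor
)
```

    Under these assumed definitions the three resulting sets are disjoint by construction, which matches the way the abstract reports them as separate counts.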

  13. Functional cohesion of gene sets determined by latent semantic indexing of PubMed abstracts.

    Directory of Open Access Journals (Sweden)

    Lijing Xu

    Full Text Available High-throughput genomic technologies enable researchers to identify genes that are co-regulated with respect to specific experimental conditions. Numerous statistical approaches have been developed to identify differentially expressed genes. Because each approach can produce distinct gene sets, it is difficult for biologists to determine which statistical approach yields biologically relevant gene sets and is appropriate for their study. To address this issue, we implemented Latent Semantic Indexing (LSI) to determine the functional coherence of gene sets. An LSI model was built using over 1 million Medline abstracts for over 20,000 mouse and human genes annotated in Entrez Gene. The gene-to-gene LSI-derived similarities were used to calculate a literature cohesion p-value (LPv) for a given gene set using a Fisher's exact test. We tested this method against genes in more than 6,000 functional pathways annotated in Gene Ontology (GO) and found that approximately 75% of gene sets in GO biological process category and 90% of the gene sets in GO molecular function and cellular component categories were functionally cohesive (LPv<0.05). These results indicate that the LPv methodology is both robust and accurate. Application of this method to previously published microarray datasets demonstrated that LPv can be helpful in selecting the appropriate feature extraction methods. To enable real-time calculation of LPv for mouse or human gene sets, we developed a web tool called Gene-set Cohesion Analysis Tool (GCAT). GCAT can complement other gene set enrichment approaches by determining the overall functional cohesion of data sets, taking into account both explicit and implicit gene interactions reported in the biomedical literature. GCAT is freely available at http://binf1.memphis.edu/gcat.
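    The abstract above states that the cohesion p-value is obtained with a Fisher's exact test but does not say how the contingency table is built. The sketch below shows only the test itself, as a one-sided Fisher's exact test on a 2x2 table. The table layout (gene pairs above or below a similarity threshold, inside or outside the gene set) is an assumption made for illustration and is not the documented GCAT procedure.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with fixed margins.
    Assumed (hypothetical) reading of the cells:
      a = pairs inside the set with similarity above a threshold
      b = pairs inside the set below the threshold
      c = pairs outside the set above the threshold
      d = pairs outside the set below the threshold
    """
    row1 = a + b            # all pairs inside the set
    col1 = a + c            # all pairs above the threshold
    n = a + b + c + d       # all pairs considered
    upper = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


# Toy table: every within-set pair is above the threshold.
p_value = fisher_exact_greater(3, 0, 0, 3)
```

    A small p-value here would indicate that high-similarity pairs are over-represented inside the gene set relative to the background.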

  14. JAG: A Computational Tool to Evaluate the Role of Gene-Sets in Complex Traits.

    Science.gov (United States)

    Lips, Esther S; Kooyman, Maarten; de Leeuw, Christiaan; Posthuma, Danielle

    2015-05-14

    Gene-set analysis has been proposed as a powerful tool to deal with the highly polygenic architecture of complex traits, as well as with the small effect sizes typically found in GWAS studies for complex traits. We developed a tool, Joint Association of Genetic variants (JAG), which can be applied to Genome Wide Association (GWA) data and tests for the joint effect of all single nucleotide polymorphisms (SNPs) located in a user-specified set of genes or biological pathway. JAG assigns SNPs to genes and incorporates self-contained and/or competitive tests for gene-set analysis. JAG uses permutation to evaluate gene-set significance, which implicitly controls for linkage disequilibrium, sample size, gene size, the number of SNPs per gene and the number of genes in the gene-set. We conducted a power analysis using the Wellcome Trust Case Control Consortium (WTCCC) Crohn's disease data set and show that JAG correctly identifies validated gene-sets for Crohn's disease and has more power than currently available tools for gene-set analysis. JAG is a powerful, novel tool for gene-set analysis, and can be freely downloaded from the CTG Lab website.
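    The abstract above says gene-set significance is evaluated by permutation. A minimal sketch of a self-contained permutation test follows. The set statistic used here (sum of squared case/control differences in mean allele count) is a stand-in chosen for illustration and is not claimed to be the statistic JAG uses; all function and variable names are hypothetical.

```python
import random


def set_statistic(genotypes, phenotype, snp_indices):
    """Sum over the set's SNPs of the squared case/control difference
    in mean allele count (an illustrative statistic, not JAG's)."""
    cases = [g for g, y in zip(genotypes, phenotype) if y == 1]
    controls = [g for g, y in zip(genotypes, phenotype) if y == 0]
    total = 0.0
    for j in snp_indices:
        mean_case = sum(row[j] for row in cases) / len(cases)
        mean_ctrl = sum(row[j] for row in controls) / len(controls)
        total += (mean_case - mean_ctrl) ** 2
    return total


def permutation_p_value(genotypes, phenotype, snp_indices, n_perm=1000, seed=0):
    """Empirical p-value: shuffle phenotype labels and recompute the statistic.

    Genotype rows are left intact, so correlation between SNPs is the same
    in every permuted data set.
    """
    rng = random.Random(seed)
    observed = set_statistic(genotypes, phenotype, snp_indices)
    labels = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if set_statistic(genotypes, labels, snp_indices) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data: 10 cases carry two copies at SNPs 0 and 1; SNP 2 is uninformative.
genotypes = [[2, 2, 0]] * 10 + [[0, 0, 0]] * 10
phenotype = [1] * 10 + [0] * 10
p_associated = permutation_p_value(genotypes, phenotype, [0, 1])
p_null = permutation_p_value(genotypes, phenotype, [2])
```

    Because only the labels are shuffled, the number of SNPs, the set size, and the sample size are identical in the observed and permuted data, which is one way to read the abstract's remark that permutation implicitly controls for these factors.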

  15. JAG: A Computational Tool to Evaluate the Role of Gene-Sets in Complex Traits

    Directory of Open Access Journals (Sweden)

    Esther S. Lips

    2015-05-01

    Full Text Available Gene-set analysis has been proposed as a powerful tool to deal with the highly polygenic architecture of complex traits, as well as with the small effect sizes typically found in GWAS studies for complex traits. We developed a tool, Joint Association of Genetic variants (JAG), which can be applied to Genome Wide Association (GWA) data and tests for the joint effect of all single nucleotide polymorphisms (SNPs) located in a user-specified set of genes or biological pathway. JAG assigns SNPs to genes and incorporates self-contained and/or competitive tests for gene-set analysis. JAG uses permutation to evaluate gene-set significance, which implicitly controls for linkage disequilibrium, sample size, gene size, the number of SNPs per gene and the number of genes in the gene-set. We conducted a power analysis using the Wellcome Trust Case Control Consortium (WTCCC) Crohn’s disease data set and show that JAG correctly identifies validated gene-sets for Crohn’s disease and has more power than currently available tools for gene-set analysis. JAG is a powerful, novel tool for gene-set analysis, and can be freely downloaded from the CTG Lab website.

  16. Annotation of gene function in citrus using gene expression information and co-expression networks.

    Science.gov (United States)

    Wong, Darren C J; Sweetman, Crystal; Ford, Christopher M

    2014-07-15

    The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world's most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a "guilt-by-association" principle whereby genes encoding proteins involved in similar and/or related biological processes may exhibit similar expression patterns across diverse sets of experimental conditions. While bioinformatics resources such as GCN analysis are widely available for efficient gene function prediction in model plant species including Arabidopsis, soybean and rice, in citrus these tools are not yet developed. We have constructed a comprehensive GCN for citrus inferred from 297 publicly available Affymetrix Genechip Citrus Genome microarray datasets, providing gene co-expression relationships at a genome-wide scale (33,000 transcripts). The comprehensive citrus GCN consists of a global GCN (condition-independent) and four condition-dependent GCNs that survey the sweet orange species only, all citrus fruit tissues, all citrus leaf tissues, or stress-exposed plants. All of these GCNs are clustered using genome-wide, gene-centric (guide) and graph clustering algorithms for flexibility of gene function prediction. For each putative cluster, gene ontology (GO) enrichment and gene expression specificity analyses were performed to enhance gene function, expression and regulation pattern prediction. The guide-gene approach was used to infer novel roles of genes involved in disease susceptibility and vitamin C metabolism, and graph-clustering approaches were used to investigate isoprenoid/phenylpropanoid metabolism in citrus peel, and citric acid catabolism via the GABA shunt in citrus fruit. Integration of citrus gene co-expression networks, functional enrichment analysis and gene

  17. Electronic Health Information Legal Epidemiology Data Set 2014

    Data.gov (United States)

    U.S. Department of Health & Human Services — Authors: Cason Schmit, JD, Gregory Sunshine, JD, Dawn Pepin, JD, MPH, Tara Ramanathan, JD, MPH, Akshara Menon, JD, MPH, Matthew Penn, JD, MLIS This legal data set...

  18. Gene-set meta-analysis of lung cancer identifies pathway related to systemic lupus erythematosus.

    Science.gov (United States)

    Rosenberger, Albert; Sohns, Melanie; Friedrichs, Stefanie; Hung, Rayjean J; Fehringer, Gord; McLaughlin, John; Amos, Christopher I; Brennan, Paul; Risch, Angela; Brüske, Irene; Caporaso, Neil E; Landi, Maria Teresa; Christiani, David C; Wei, Yongyue; Bickeböller, Heike

    2017-01-01

    Gene-set analysis (GSA) is an approach using the results of single-marker genome-wide association studies when investigating pathways as a whole with respect to the genetic basis of a disease. We performed a meta-analysis of seven GSAs for lung cancer, applying the method META-GSA. Overall, the information taken from 11,365 cases and 22,505 controls from within the TRICL/ILCCO consortia was used to investigate a total of 234 pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. META-GSA reveals the systemic lupus erythematosus KEGG pathway hsa05322, driven by the gene region 6p21-22, as also implicated in lung cancer (p = 0.0306). This gene region is known to be associated with squamous cell lung carcinoma. The most important genes driving the significance of this pathway belong to the genomic areas HIST1-H4L, -1BN, -2BN, -H2AK, -H4K and C2/C4A/C4B. Within these areas, the markers most significantly associated with LC are rs13194781 (located within HIST12BN) and rs1270942 (located between C2 and C4A). We have discovered a pathway currently marked as specific to systemic lupus erythematosus as being significantly implicated in lung cancer. The gene region 6p21-22 in this pathway appears to be more extensively associated with lung cancer than previously assumed. Given wide-stretched linkage disequilibrium to the area APOM/BAG6/MSH5, there is currently simply not enough information or evidence to conclude whether the potential pleiotropy of lung cancer and systemic lupus erythematosus is spurious, biological, or mediated. Further research into this pathway and gene region will be necessary.

  19. A method for developing regulatory gene set networks to characterize complex biological systems.

    Science.gov (United States)

    Suphavilai, Chayaporn; Zhu, Liugen; Chen, Jake Y

    2015-01-01

    Traditional approaches to studying molecular networks are based on linking genes or proteins. Higher-level networks linking gene sets or pathways have been proposed recently. Several types of gene set networks have been used to study complex molecular networks such as co-membership gene set networks (M-GSNs) and co-enrichment gene set networks (E-GSNs). Gene set networks are useful for studying biological mechanism of diseases and drug perturbations. In this study, we proposed a new approach for constructing directed, regulatory gene set networks (R-GSNs) to reveal novel relationships among gene sets or pathways. We collected several gene set collections and high-quality gene regulation data in order to construct R-GSNs in a comparative study with co-membership gene set networks (M-GSNs). We described a method for constructing both global and disease-specific R-GSNs and determining their significance. To demonstrate the potential applications to disease biology studies, we constructed and analysed an R-GSN specifically built for Alzheimer's disease. R-GSNs can provide new biological insights complementary to those derived at the protein regulatory network level or M-GSNs. When integrated properly to functional genomics data, R-GSNs can help enable future research on systems biology and translational bioinformatics.

  20. Privacy and Information Sharing in Judicial Setting : A Wicked Problem

    NARCIS (Netherlands)

    Bargh, M.S.; Choenni, R.; Meijer, R.

    2015-01-01

    Information sharing has become a means of gaining public trust for institutions such as governmental and scientific organizations. The transparency sought through information sharing contributes to the trust of various stakeholders such as citizens, other organizations and enterprises in such instit

  1. Gene set-based module discovery in the breast cancer transcriptome

    Directory of Open Access Journals (Sweden)

    Zhang Michael Q

    2009-02-01

    Full Text Available Abstract Background Although microarray-based studies have revealed a global view of gene expression in cancer cells, we still have little knowledge about regulatory mechanisms underlying the transcriptome. Several computational methods applied to yeast data have recently succeeded in identifying expression modules, which are defined as co-expressed gene sets under common regulatory mechanisms. However, such module discovery methods have not been applied to cancer transcriptome data. Results In order to decode oncogenic regulatory programs in cancer cells, we developed a novel module discovery method termed EEM by extending a previously reported module discovery method, and applied it to breast cancer expression data. Starting from seed gene sets prepared based on cis-regulatory elements, ChIP-chip data, and gene locus information, EEM identified 10 principal expression modules in breast cancer based on their expression coherence. Moreover, EEM depicted their activity profiles, which predict regulatory programs in each subtype of breast tumors. For example, our analysis revealed that the expression module regulated by the Polycomb repressive complex 2 (PRC2) is downregulated in triple negative breast cancers, suggesting similarity of transcriptional programs between stem cells and aggressive breast cancer cells. We also found that the activity of the PRC2 expression module is negatively correlated to the expression of EZH2, a component of PRC2 which belongs to the E2F expression module. E2F-driven EZH2 overexpression may be responsible for the repression of the PRC2 expression modules in triple negative tumors. Furthermore, our network analysis predicts regulatory circuits in breast cancer cells. Conclusion These results demonstrate that the gene set-based module discovery approach is a powerful tool to decode regulatory programs in cancer cells.

  2. Identifying the genetic variation of gene expression using gene sets: application of novel gene Set eQTL approach to PharmGKB and KEGG.

    Directory of Open Access Journals (Sweden)

    Ryan Abo

    Full Text Available Genetic variation underlying the regulation of mRNA gene expression in humans may provide key insights into the molecular mechanisms of human traits and complex diseases. Current statistical methods to map genetic variation associated with mRNA gene expression have typically applied standard linkage and/or association methods; however, when genome-wide SNP and mRNA expression data are available, performing all pairwise comparisons is computationally burdensome and may not provide optimal power to detect associations. Consideration of different approaches to account for the high dimensionality and multiple testing issues may provide increased efficiency and statistical power. Here we present a novel approach to model and test the association between genetic variation and mRNA gene expression levels in the context of gene sets (GSs) and pathways, referred to as gene set - expression quantitative trait loci analysis (GS-eQTL). The method uses GSs to initially group SNPs and mRNA expression, followed by the application of principal components analysis (PCA) to collapse the variation and reduce the dimensionality within the GSs. We applied GS-eQTL to assess the association between SNP and mRNA expression level data collected from a cell-based model system using PharmGKB and KEGG defined GSs. We observed a large number of significant GS-eQTL associations, in which the most significant associations arose between genetic variation and mRNA expression from the same GS. However, a number of associations involving genetic variation and mRNA expression from different GSs were also identified. Our proposed GS-eQTL method effectively addresses the multiple testing limitations in eQTL studies and provides biological context for SNP-expression associations.

  3. A novel fuzzy set based multifactor dimensionality reduction method for detecting gene-gene interaction.

    Science.gov (United States)

    Jung, Hye-Young; Leem, Sangseob; Lee, Sungyoung; Park, Taesung

    2016-12-01

    Gene-gene interaction (GGI) is one of the most popular approaches for finding the missing heritability of common complex traits in genetic association studies. The multifactor dimensionality reduction (MDR) method has been widely studied for detecting GGIs. In order to identify the best interaction model associated with disease susceptibility, MDR compares all possible genotype combinations in terms of their predictability of disease status from a simple binary high (H) and low (L) risk classification. However, this simple binary classification does not reflect the uncertainty of H/L classification. We regard classifying H/L as equivalent to defining the degree of membership of two risk groups H/L. By adopting the fuzzy set theory, we propose Fuzzy MDR which takes into account the uncertainty of H/L classification. Fuzzy MDR allows the possibility of partial membership of H/L through a membership function which transforms the degree of uncertainty into a [0,1] scale. The best genotype combinations are those that maximize a new fuzzy set based accuracy measure. Two simulation studies are conducted to compare the power of the proposed Fuzzy MDR with that of MDR. Our results show that Fuzzy MDR has higher power than MDR. We illustrate the proposed Fuzzy MDR by analysing bipolar disorder (BD) trait of the WTCCC dataset to detect GGI associated with BD. We propose a novel Fuzzy MDR method to detect gene-gene interaction by taking into account the uncertainty of H/L classification and show that it has higher power than MDR. Fuzzy MDR can be easily extended to handle continuous phenotypes as well. The program written in R for the proposed Fuzzy MDR is available at https://statgen.snu.ac.kr/software/FuzzyMDR.

  4. Test Data Sets and Evaluation of Gene Prediction Programs on the Rice Genome

    Institute of Scientific and Technical Information of China (English)

    Heng Li; Tao Liu; Hai-Hong Li; Yan Li; Li-Jun Fang; Hui-Min Xie; Wei-Mou Zheng; Bai-Lin Hao; Jin-Song Liu; Zhao Xu; Jiao Jin; Lin Fang; Lei Gao; Yu-Dong Li; Zi-Xing Xing; Shao-Gen Gao

    2005-01-01

    With several rice genome projects approaching completion, gene prediction/finding by computer algorithms has become an urgent task. Two test sets were constructed by mapping the newly published 28,469 full-length KOME rice cDNA to the RGP BAC clone sequences of Oryza sativa ssp. japonica: a single-gene set of 550 sequences and a multi-gene set of 62 sequences with 271 genes. These data sets were used to evaluate five ab initio gene prediction programs: RiceHMM, GlimmerR, GeneMark, FGENSH and BGF. The predictions were compared on nucleotide, exon and whole gene structure levels using commonly accepted measures and several new measures. The test results show progress in performance in chronological order. At the same time, the complementarity of the programs hints at the possibility of further improvement and at the feasibility of reaching better performance by combining several gene-finders.

  5. Comparison of gene sets for expression profiling: prediction of metastasis from low-malignant breast cancer

    DEFF Research Database (Denmark)

    Thomassen, Mads; Tan, Qihua; Eiriksdottir, Freyja;

    2007-01-01

    -six tumors from low-risk patients and 34 low-malignant T2 tumors from patients with slightly higher risk have been examined by genome-wide gene expression analysis. Nine prognostic gene sets were tested in this data set. RESULTS: A 32-gene profile (HUMAC32) that accurately predicts metastasis has previously...... sets, mainly developed in high-risk cancers, predict metastasis from low-malignant cancer....

  6. TSG: a new algorithm for binary and multi-class cancer classification and informative genes selection

    Directory of Open Access Journals (Sweden)

    Wang Haiyan

    2013-01-01

    Full Text Available Abstract Background One of the challenges in classification of cancer tissue samples based on gene expression data is to establish an effective method that can select a parsimonious set of informative genes. The Top Scoring Pair (TSP), k-Top Scoring Pairs (k-TSP), Support Vector Machines (SVM), and prediction analysis of microarrays (PAM) are four popular classifiers that have comparable performance on multiple cancer datasets. SVM and PAM tend to use a large number of genes, and TSP and k-TSP always use an even number of genes. In addition, the selection of distinct gene pairs in k-TSP simply combines the pairs of top-ranking genes without considering the fact that the gene set with the best discrimination power may not be the combined pairs. The k-TSP algorithm also needs the user to specify an upper bound for the number of gene pairs. Here we introduce a computational algorithm to address these problems. The algorithm is named the Chisquare-statistic-based Top Scoring Genes (Chi-TSG) classifier, simplified as TSG. Results The TSG classifier starts with the top two genes and sequentially adds additional genes into the candidate gene set to perform informative gene selection. The algorithm automatically reports the total number of informative genes selected with cross validation. We provide the algorithm for both binary and multi-class cancer classification. The algorithm was applied to 9 binary and 10 multi-class gene expression datasets involving human cancers. The TSG classifier outperforms TSP family classifiers by a big margin in most of the 19 datasets. In addition to improved accuracy, our classifier shares all the advantages of the TSP family classifiers, including easy interpretation, invariance to monotone transformation, selection of a small number of informative genes that allows follow-up studies, and resistance to sampling variations due to within-sample operations. Conclusions Redefining the scores for gene set and the classification rules in TSP family

  7. A reference gene set for chemosensory receptor genes of Manduca sexta.

    Science.gov (United States)

    Koenig, Christopher; Hirsh, Ariana; Bucks, Sascha; Klinner, Christian; Vogel, Heiko; Shukla, Aditi; Mansfield, Jennifer H; Morton, Brian; Hansson, Bill S; Grosse-Wilde, Ewald

    2015-11-01

    The order of Lepidoptera has historically been crucial for chemosensory research, with many important advances coming from the analysis of species like Bombyx mori or the tobacco hornworm, Manduca sexta. Specifically M. sexta has long been a major model species in the field, especially regarding the importance of olfaction in an ecological context, mainly the interaction with its host plants. In recent years transcriptomic data has led to the discovery of members of all major chemosensory receptor families in the species, but the data was fragmentary and incomplete. Here we present the analysis of the newly available high-quality genome data for the species, supplemented by additional transcriptome data to generate a high quality reference gene set for the three major chemosensory receptor gene families, the gustatory (GR), olfactory (OR) and antennal ionotropic receptors (IR). Coupled with gene expression analysis our approach allows association of specific receptor types and behaviors, like pheromone and host detection. The dataset will provide valuable support for future analysis of these essential chemosensory modalities in this species and in Lepidoptera in general.

  8. The Generalization of Mutual Information as the Information between a Set of Variables: The Information Correlation Function Hierarchy and the Information Structure of Multi-Agent Systems

    Science.gov (United States)

    Wolf, David R.

    2004-01-01

    The topic of this paper is a hierarchy of information-like functions, here named the information correlation functions, where each function of the hierarchy may be thought of as the information between the variables it depends upon. The information correlation functions are particularly suited to the description of the emergence of complex behaviors due to many-body or many-agent processes. They are particularly well suited to the quantification of the decomposition of the information carried among a set of variables or agents, and its subsets. In more graphical language, they provide the information theoretic basis for understanding the synergistic and non-synergistic components of a system, and as such should serve as a forceful toolkit for the analysis of the complexity structure of complex many-agent systems. The information correlation functions are the natural generalization to an arbitrary number of sets of variables of the sequence starting with the entropy function (one set of variables) and the mutual information function (two sets). We start by describing the traditional measures of information (entropy) and mutual information.
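    The abstract above names entropy (one set of variables) and mutual information (two sets) as the first two members of the hierarchy. A minimal sketch of those two standard quantities, estimated from paired samples, is given below. The plug-in frequency estimator and the function names are illustrative assumptions; the higher-order functions defined in the paper are not reproduced here.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Shannon entropy in bits of the empirical distribution of `samples`."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    joint = list(zip(xs, ys))
    return entropy(xs) + entropy(ys) - entropy(joint)


xs = [0, 0, 1, 1]
mi_identical = mutual_information(xs, xs)              # Y is a copy of X
mi_independent = mutual_information(xs, [0, 1, 0, 1])  # all four pairs equally likely
```

    For the copied variable the estimate equals the entropy of X, and for the uniformly crossed pairs it is zero, the two extremes between which mutual information ranges.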

  9. Information overload or search-amplified risk? Set size and order effects on decisions from experience.

    Science.gov (United States)

    Hills, Thomas T; Noguchi, Takao; Gibbert, Michael

    2013-10-01

    How do changes in choice-set size influence information search and subsequent decisions? Moreover, does information overload influence information processing with larger choice sets? We investigated these questions by letting people freely explore sets of gambles before choosing one of them, with the choice sets either increasing or decreasing in number for each participant (from two to 32 gambles). Set size influenced information search, with participants taking more samples overall, but sampling a smaller proportion of gambles and taking fewer samples per gamble, when set sizes were larger. The order of choice sets also influenced search, with participants sampling from more gambles and taking more samples overall if they started with smaller as opposed to larger choice sets. Inconsistent with information overload, information processing appeared consistent across set sizes and choice order conditions, reliably favoring gambles with higher sample means. Despite the lack of evidence for information overload, changes in information search did lead to systematic changes in choice: People who started with smaller choice sets were more likely to choose gambles with the highest expected values, but only for small set sizes. For large set sizes, the increase in total samples increased the likelihood of encountering rare events at the same time that the reduction in samples per gamble amplified the effect of these rare events when they occurred, which we call search-amplified risk. This led to riskier choices for individuals whose choices most closely followed the sample mean.

  10. Solving Problems in Library and Information Science Using Fuzzy Set Theory.

    Science.gov (United States)

    Hood, William W.; Wilson, Concepcion S.

    2002-01-01

    Discussion of the use of mathematical tools in library and information science focuses on information retrieval applications of fuzzy set theory. Topics include fuzzy set theory in informetrics and bibliometrics; and the literature of fuzzy set theory. (Contains 87 references.) (LRW)

  11. Setting a research agenda to inform intensive comprehensive aphasia programs.

    Science.gov (United States)

    Hula, William D; Cherney, Leora R; Worrall, Linda E

    2013-01-01

    Research into intensive comprehensive aphasia programs (ICAPs) has yet to show that this service delivery model is efficacious, effective, has cost utility, or can be broadly implemented. This article describes a phased research approach to the study of ICAPs and sets out a research agenda that considers not only the specific issues surrounding ICAPs, but also the phase of the research. Current ICAP research is in the early phases, with dosing and outcome measurement as prime considerations as well as refinement of the best treatment protocol. Later phases of ICAP research are outlined, and the need for larger scale collaborative funded research is recognized. The need for more rapid translation into practice is also acknowledged, and the use of hybrid models of phased research is encouraged within the ICAP research agenda.

  12. Construction and expression of SET gene and siRNA recombinant adenovirus vectors

    Institute of Scientific and Technical Information of China (English)

    Xu Bo-qun; Lu Pin-hong; Li Ying; Xue Kai; Li Mei; Ma Xiang; Diao Fei-yan; Cui Yu-gui; Liu Jia-yin

    2010-01-01

    Objective: To construct SET gene recombinant adenovirus vector and SET gene small interfering RNA (SiRNA) recombinant adenovirus vector for over-expression or knock-down of SET levels. Methods: The cDNA sequence of SET was cloned by reverse transcription polymerase chain reaction (RT-PCR) and the SET gene fragment was subcloned into adenovirus shuttle plasmid pAdTrack-CMV to construct the shuttle plasmid pAdTrack-SET. The shuttle plasmid pAdtrack-SET was transformed into BJ5183 cells with the adenoviral backbone pAdEasy-1 to obtain the homologous recombinant Ad-CMV-SET and the recombinant Ad-CMV-SET was packaged and amplified in the AD293 cells. The expression of SET in AD293 cells was detected by Western blot. In addition, we constructed SET gene SiRNA recombinant adenovirus vector (Ad-H1-SiRNA/SET) and its efficacy of knockdown of SET protein was detected in infected GC-2spd(ts) cells by Western blot. Results: The recombinant adenovirus vectors, both SET gene recombinant adenovirus vector Ad-CMV-SET and SET gene SiRNA recombinant adenovirus vector Ad-H1-SiRNA/SET, were proven to be constructed successfully by the evidence of endonuclease digestion and sequencing. AD293 cells infected with either recombinant adenovirus vector of Ad-CMV-SET or Ad-H1-SiRNA/SET were observed to express GFP. The expression of SET protein was up-regulated significantly in AD293 cells infected with SET gene recombinant adenovirus vector. In contrast, SET protein was significantly down-regulated in the GC-2spd(ts) cells infected with Ad-H1-SiRNA/SET (P<0.05) and the knockdown efficiency was approximately 50%-70%. Conclusion: The recombinant adenovirus vector Ad-CMV-SET and Ad-H1-SiRNA/SET were successfully constructed and effectively expressed in germ cells and somatic cells. It provides an experimental tool for further study of SET gene in the physiological and pathophysiological mechanism of reproduction-related diseases.

  13. Gene set of nuclear-encoded mitochondrial regulators is enriched for common inherited variation in obesity.

    Directory of Open Access Journals (Sweden)

    Nadja Knoll

    Full Text Available There are hints of an altered mitochondrial function in obesity. Nuclear-encoded genes are relevant for mitochondrial function (3 gene sets of known relevant pathways: (1) 16 nuclear regulators of mitochondrial genes, (2) 91 genes for oxidative phosphorylation and (3) 966 nuclear-encoded mitochondrial genes). Gene set enrichment analysis (GSEA) showed no association with type 2 diabetes mellitus in these gene sets. Here we performed a GSEA for the same gene sets for obesity. Genome wide association study (GWAS) data from a case-control approach on 453 extremely obese children and adolescents and 435 lean adult controls were used for GSEA. For independent confirmation, we analyzed 705 obesity GWAS trios (extremely obese child and both biological parents) and a population-based GWAS sample (KORA F4, n = 1,743). A meta-analysis was performed on all three samples. In each sample, the distribution of significance levels between the respective gene set and those of all genes was compared using the leading-edge-fraction-comparison test (cut-offs between the 50(th) and 95(th) percentile of the set of all gene-wise corrected p-values) as implemented in the MAGENTA software. In the case-control sample, significant enrichment of associations with obesity was observed above the 50(th) percentile for the set of the 16 nuclear regulators of mitochondrial genes (p(GSEA,50) = 0.0103). This finding was not confirmed in the trios (p(GSEA,50) = 0.5991), but in KORA (p(GSEA,50) = 0.0398). The meta-analysis again indicated a trend for enrichment (p(MAGENTA,50) = 0.1052, p(MAGENTA,75) = 0.0251). The GSEA revealed that weak association signals for obesity might be enriched in the gene set of 16 nuclear regulators of mitochondrial genes.

  14. Gene set of nuclear-encoded mitochondrial regulators is enriched for common inherited variation in obesity.

    Science.gov (United States)

    Knoll, Nadja; Jarick, Ivonne; Volckmar, Anna-Lena; Klingenspor, Martin; Illig, Thomas; Grallert, Harald; Gieger, Christian; Wichmann, Heinz-Erich; Peters, Annette; Hebebrand, Johannes; Scherag, André; Hinney, Anke

    2013-01-01

    There are hints of an altered mitochondrial function in obesity. Nuclear-encoded genes are relevant for mitochondrial function (3 gene sets of known relevant pathways: (1) 16 nuclear regulators of mitochondrial genes, (2) 91 genes for oxidative phosphorylation and (3) 966 nuclear-encoded mitochondrial genes). Gene set enrichment analysis (GSEA) showed no association with type 2 diabetes mellitus in these gene sets. Here we performed a GSEA for the same gene sets for obesity. Genome wide association study (GWAS) data from a case-control approach on 453 extremely obese children and adolescents and 435 lean adult controls were used for GSEA. For independent confirmation, we analyzed 705 obesity GWAS trios (extremely obese child and both biological parents) and a population-based GWAS sample (KORA F4, n = 1,743). A meta-analysis was performed on all three samples. In each sample, the distribution of significance levels between the respective gene set and those of all genes was compared using the leading-edge-fraction-comparison test (cut-offs between the 50(th) and 95(th) percentile of the set of all gene-wise corrected p-values) as implemented in the MAGENTA software. In the case-control sample, significant enrichment of associations with obesity was observed above the 50(th) percentile for the set of the 16 nuclear regulators of mitochondrial genes (p(GSEA,50) = 0.0103). This finding was not confirmed in the trios (p(GSEA,50) = 0.5991), but in KORA (p(GSEA,50) = 0.0398). The meta-analysis again indicated a trend for enrichment (p(MAGENTA,50) = 0.1052, p(MAGENTA,75) = 0.0251). The GSEA revealed that weak association signals for obesity might be enriched in the gene set of 16 nuclear regulators of mitochondrial genes.
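
    The leading-edge-fraction comparison described above can be illustrated with a small permutation sketch. This is a simplified illustration, not MAGENTA's implementation: the function names, the random-gene-set null model, and the way the percentile cut-off is computed are all assumptions made for this example, and real analyses also correct gene scores for confounders such as gene size.

```python
import random

def percentile_cutoff(all_pvalues, percentile):
    """Return the p-value threshold at the given significance percentile.

    A percentile of 95 means that roughly 95% of genes have a larger
    (less significant) p-value than the returned cut-off.
    """
    ranked = sorted(all_pvalues, reverse=True)  # least significant first
    index = min(len(ranked) - 1, int(len(ranked) * percentile / 100))
    return ranked[index]

def leading_edge_fraction(pvalues, cutoff):
    """Fraction of genes whose p-value is at or below the cut-off."""
    return sum(p <= cutoff for p in pvalues) / len(pvalues)

def enrichment_pvalue(gene_pvalues, gene_set, percentile=95, n_perm=2000, seed=0):
    """Compare the gene set's leading-edge fraction with random gene sets.

    gene_pvalues: dict mapping gene name -> gene-wise p-value (hypothetical input).
    Returns (observed fraction, empirical enrichment p-value).
    """
    rng = random.Random(seed)
    genes = list(gene_pvalues)
    cutoff = percentile_cutoff(list(gene_pvalues.values()), percentile)
    observed = leading_edge_fraction([gene_pvalues[g] for g in gene_set], cutoff)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, len(gene_set))  # random set of equal size
        frac = leading_edge_fraction([gene_pvalues[g] for g in sample], cutoff)
        if frac >= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

    Running the function once per cut-off (for example percentile=50 and percentile=75) mirrors the way the abstract reports separate enrichment p-values for different percentiles.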

  15. Gene-Set Local Hierarchical Clustering (GSLHC)--A Gene Set-Based Approach for Characterizing Bioactive Compounds in Terms of Biological Functional Groups.

    Directory of Open Access Journals (Sweden)

    Feng-Hsiang Chung

    Full Text Available Gene-set-based analysis (GSA), which uses the relative importance of functional gene-sets, or molecular signatures, as units for analysis of genome-wide gene expression data, has exhibited major advantages with respect to greater accuracy, robustness, and biological relevance, over individual gene analysis (IGA), which uses log-ratios of individual genes for analysis. Yet IGA remains the dominant mode of analysis of gene expression data. The Connectivity Map (CMap), an extensive database on genomic profiles of effects of drugs and small molecules and widely used for studies related to repurposed drug discovery, has been mostly employed in IGA mode. Here, we constructed a GSA-based version of CMap, Gene-Set Connectivity Map (GSCMap), in which all the genomic profiles in CMap are converted, using gene-sets from the Molecular Signatures Database, to functional profiles. We showed that GSCMap essentially eliminated cell-type dependence, a weakness of CMap in IGA mode, and yielded significantly better performance on sample clustering and drug-target association. As a first application of GSCMap we constructed the platform Gene-Set Local Hierarchical Clustering (GSLHC) for discovering insights on coordinated actions of biological functions and facilitating classification of heterogeneous subtypes on drug-driven responses. GSLHC was shown to tightly cluster drugs of known similar properties. We used GSLHC to identify the therapeutic properties and putative targets of 18 compounds of previously unknown characteristics listed in CMap, eight of which suggest anti-cancer activities. The GSLHC website http://cloudr.ncu.edu.tw/gslhc/ contains 1,857 local hierarchical clusters accessible by querying 555 of the 1,309 drugs and small molecules listed in CMap. We expect GSCMap and GSLHC to be widely useful in providing new insights in the biological effect of bioactive compounds, in drug repurposing, and in function-based classification of complex diseases.

  16. Construction of a Bacterial Cell that Contains Only the Set of Essential Genes Necessary to Impart Life

    Science.gov (United States)

    2014-11-11

    information gleaned from these transposon studies was used to inform our next set of designs by predicting genes switching from N to E or I as paralogous ...remaining in RGD and homologs found in other organisms. A BLASTp score of 1e-5 was used as the similarity cutoff. Functional classifications... homologs to RGD in that organism. Inside the dashed circle is for prokaryotes and archaea. Those outside are for eukaryotes.

  17. Sustainable mobile information infrastructures in low resource settings.

    Science.gov (United States)

    Braa, Kristin; Purkayastha, Saptarshi

    2010-01-01

    Developing countries represent the fastest growing mobile markets in the world. For people with no computing access, a mobile will be their first computing device. Mobile technologies offer a significant potential to strengthen health systems in developing countries with respect to community based monitoring, reporting, feedback to service providers, and strengthening communication and coordination between different health functionaries, medical officers and the community. However, there are various challenges in realizing this potential, including technological challenges such as lack of power, as well as social, institutional and use issues. In this paper a case study from India on mobile health implementation and use will be reported. An underlying principle guiding this paper is to see mobile technology not as a "stand alone device" but potentially an integral component of an integrated mobile supported health information infrastructure.

  18. Setting risk-informed environmental standards for Bacillus anthracis spores.

    Science.gov (United States)

    Hong, Tao; Gurian, Patrick L; Ward, Nicholas F Dudley

    2010-10-01

    In many cases, human health risk from biological agents is associated with aerosol exposures. Because air concentrations decline rapidly after a release, it may be necessary to use concentrations found in other environmental media to infer future or past aerosol exposures. This article presents an approach for linking environmental concentrations of Bacillus anthracis (B. anthracis) spores on walls, floors, ventilation system filters, and in human nasal passages with human health risk from exposure to B. anthracis spores. This approach is then used to calculate example values of risk-informed concentration standards for both retrospective risk mitigation (e.g., prophylactic antibiotics) and prospective risk mitigation (e.g., environmental cleanup and reoccupancy). A large number of assumptions are required to calculate these values, and the resulting values have large uncertainties associated with them. The values calculated here suggest that documenting compliance with risks in the range of 10^-4 to 10^-6 would be challenging for small diameter (respirable) spore particles. For less stringent risk targets and for releases of larger diameter particles (which are less respirable and hence less hazardous), environmental sampling would be more promising.

  19. SNP sets selection under mutual information criterion, application to F7/FVII dataset.

    Science.gov (United States)

    Brunel, H; Perera, A; Buil, A; Sabater-Lleal, M; Souto, J C; Fontcuberta, J; Vallverdu, M; Soria, J M; Caminal, P

    2008-01-01

    One of the main goals of human genetics is to find genetic markers related to complex diseases. In the blood coagulation process, it is known that genetic variability in the F7 gene is chiefly responsible for observed variations in FVII levels in blood. In this work, we propose a method for selecting sets of Single Nucleotide Polymorphisms (SNPs) significantly correlated with a phenotype (FVII levels). This method employs a feature selection algorithm (a variant of Sequential Forward Selection, SFS) based on a criterion of statistical significance of a mutual information functional. This algorithm is applied to a sample of independent individuals from the GAIT project. The main SNPs found by the algorithm agree with previously published results obtained using family-based techniques.
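
    A minimal sketch of sequential forward selection driven by mutual information is given below. It is not the authors' algorithm: the abstract describes a statistical significance criterion, which is replaced here by a fixed minimum-gain threshold (`min_gain`), and the phenotype is assumed to be already discretized. All names and parameters are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

def forward_select(snps, phenotype, max_snps=3, min_gain=1e-3):
    """Greedily add the SNP that most increases I(selected SNPs; phenotype).

    snps: dict mapping SNP name -> list of genotypes (e.g. 0/1/2 per individual).
    phenotype: list of discrete phenotype classes, one per individual.
    Stops when no candidate improves the score by at least min_gain
    (an assumed stand-in for a significance test).
    """
    selected, best_mi = [], 0.0
    while len(selected) < max_snps:
        candidates = [name for name in snps if name not in selected]
        if not candidates:
            break
        scored = []
        for name in candidates:
            columns = [snps[s] for s in selected + [name]]
            joint_genotype = list(zip(*columns))  # one tuple per individual
            scored.append((mutual_information(joint_genotype, phenotype), name))
        mi, name = max(scored)
        if mi - best_mi < min_gain:
            break
        selected.append(name)
        best_mi = mi
    return selected, best_mi
```

    Because the joint genotype of the selected SNPs is scored as a whole, the sketch can pick up SNPs that are informative only in combination, at the cost of sparse counts when many SNPs are selected from a small sample.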

  20. GeneLibrarian: an effective gene-information summarization and visualization system

    Directory of Open Access Journals (Sweden)

    Liu Heng-Hui

    2006-08-01

    Full Text Available Background: Abundant information about gene products is stored in online searchable databases such as annotation or literature. To efficiently obtain and digest such information, there is a pressing need for automated information-summarization and functional-similarity clustering of genes. Results: We have developed a novel method for semantic measurement of annotation and integrated it with a biomedical literature summarization system to establish a platform, GeneLibrarian, to provide users well-organized information about any specific group of genes (e.g. one cluster of genes from a microarray chip they might be interested in). The GeneLibrarian generates a summarized viewgraph of candidate genes for a user based on his/her preference and delivers the desired background information effectively to the user. The summarization technique involves optimizing the text mining algorithm and Gene Ontology-based clustering method to enable the discovery of gene relations. Conclusion: GeneLibrarian is a Java-based web application that automates the process of retrieving critical information from the literature and expanding the number of potential genes for further analysis. This study concentrates on providing well-organized information to users, and we believe it will be useful in their research. GeneLibrarian is available at http://gen.csie.ncku.edu.tw/GeneLibrarian/

  1. MeInfoText: associated gene methylation and cancer information from text mining

    Directory of Open Access Journals (Sweden)

    Juan Hsueh-Fen

    2008-01-01

    Full Text Available Background: DNA methylation is an important epigenetic modification of the genome. Abnormal DNA methylation may result in silencing of tumor suppressor genes and is common in a variety of human cancer cells. As more epigenetics research is published electronically, it is desirable to extract relevant information from biological literature. To facilitate epigenetics research, we have developed a database called MeInfoText to provide gene methylation information from text mining. Description: MeInfoText presents comprehensive association information about gene methylation and cancer, the profile of gene methylation among human cancer types and the gene methylation profile of a specific cancer type, based on association mining from large amounts of literature. In addition, MeInfoText offers integrated protein-protein interaction and biological pathway information collected from the Internet. MeInfoText also provides pathway cluster information regarding a set of genes which may contribute to the development of cancer due to aberrant methylation. The extracted evidence with highlighted keywords and the gene names identified from each methylation-related abstract is also retrieved. The database is now available at http://mit.lifescience.ntu.edu.tw/. Conclusion: MeInfoText is a unique database that provides comprehensive gene methylation and cancer association information. It will complement existing DNA methylation information and will be useful in epigenetics research and the prevention of cancer.

  2. Evaluating Phylogenetic Informativeness as a Predictor of Phylogenetic Signal for Metazoan, Fungal, and Mammalian Phylogenomic Data Sets

    Directory of Open Access Journals (Sweden)

    Francesc López-Giráldez

    2013-01-01

    Full Text Available Phylogenetic research is often stymied by selection of a marker that leads to poor phylogenetic resolution despite considerable cost and effort. Profiles of phylogenetic informativeness provide a quantitative measure for prioritizing gene sampling to resolve branching order in a particular epoch. To evaluate the utility of these profiles, we analyzed phylogenomic data sets from metazoans, fungi, and mammals, thus encompassing diverse time scales and taxonomic groups. We also evaluated the utility of profiles created based on simulated data sets. We found that genes selected via their informativeness dramatically outperformed haphazard sampling of markers. Furthermore, our analyses demonstrate that the original phylogenetic informativeness method can be extended to trees with more than four taxa. Thus, although the method currently predicts phylogenetic signal without specifically accounting for the misleading effects of stochastic noise, it is robust to the effects of homoplasy. The phylogenetic informativeness rankings obtained will allow other researchers to select advantageous genes for future studies within these clades, maximizing return on effort and investment. Genes identified might also yield efficient experimental designs for phylogenetic inference for many sister clades and outgroup taxa that are closely related to the diverse groups of organisms analyzed.

  3. Understanding Nurses’ Information Needs and Searching Behavior in Acute Care Settings

    OpenAIRE

    2005-01-01

    We report the results of a pilot study designed to describe nurses’ information needs and searching behavior in acute care settings. Several studies have indicated that nurses have unmet information needs while delivering care to patients. AIM: Identify the information needs of nurses in acute care settings. METHODS: Nurses at three hospitals were asked to use an information retrieval tool (CPG Viewer). A detailed log of their interactions with the tool was generated. RESULT...

  4. Information retrieval pathways for health information exchange in multiple care settings

    DEFF Research Database (Denmark)

    Kierkegaard, Patrick; Kaushal, Rainu; Vest, Joshua R.

    2014-01-01

    Objectives To determine which health information exchange (HIE) technologies and information retrieval pathways healthcare professionals relied on to meet their information needs in the context of laboratory test results, radiological images and reports, and medication histories. Study Design...

  5. Transcriptome Analysis Reveals Candidate Genes Involved in Gibberellin-Induced Fruit Setting in Triploid Loquat (Eriobotrya japonica)

    Science.gov (United States)

    Jiang, Shuang; Luo, Jun; Xu, Fanjie; Zhang, Xueying

    2016-01-01

    The triploid loquat (Eriobotrya japonica) is a new germplasm with a high edible fruit rate. Under natural conditions, the triploid loquat has a low fruit setting ratio (not more than 10 fruits in a tree), reflecting fertilization failure. To unravel the molecular mechanism by which gibberellin (GA) treatment induces parthenocarpy in triploid loquats, the transcriptome of fruit setting induced by GA3 was analyzed using RNA-seq at four different stages during the development of young fruit. Approximately 344 million high quality reads in seven libraries were de novo assembled, yielding 153,900 unique transcripts with more than 79.9% functionally annotated transcripts. A total of 2,220, 2,974, and 1,614 differentially expressed genes (DEGs) were observed at 3, 7, and 14 days after GA treatment, respectively. The weighted gene co-expression network and Venn diagram analysis of DEGs revealed that sixteen candidate genes may play critical roles in the fruit setting after GA treatment. Five genes were related to auxin, in which one auxin synthesis gene of yucca was upregulated, suggesting that auxin may act as a signal for fruit setting. Furthermore, ABA 8′-hydroxylase was upregulated, while ethylene-forming enzyme was downregulated, suggesting that multiple hormones may be involved in GA signaling. Four transcription factors, NAC7, NAC23, bHLH35, and HD16, were potentially negatively regulated in fruit setting, and two cell division-related genes, arr9 and CYCA3, were upregulated. In addition, the expression of the GA receptor gid1 was downregulated by GA treatment, suggesting that the negative feedback mechanism in GA signaling may be regulated by gid1. Altogether, the results of the present study provide information from a comprehensive gene expression analysis and insight into the molecular mechanism underlying fruit setting under GA treatment in E. japonica. PMID:28066478

  6. Can Power Laws Help Us Understand Gene and Proteome Information?

    Directory of Open Access Journals (Sweden)

    J. A. Tenreiro Machado

    2013-01-01

    Full Text Available Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together. The amino acid sequence in a protein is defined by the sequence of a gene or several genes encoded in the DNA-based genetic code. This genetic code typically uses twenty amino acids, but in certain organisms the genetic code can also include two other amino acids. After linking the amino acids during protein synthesis, each amino acid becomes a residue in a protein, which is then chemically modified, ultimately changing and defining the protein function. In this study, the authors analyze the amino acid sequence using alignment-free methods, aiming to identify structural patterns in sets of proteins and in the proteome, without any other previous assumptions. The paper starts by analyzing amino acid sequence data by means of histograms using fixed length amino acid words (tuples). After creating the initial relative frequency histograms, they are transformed and processed in order to generate quantitative results for information extraction and graphical visualization. Selected samples from two reference datasets are used, and results reveal that the proposed method is able to generate relevant outputs in accordance with current scientific knowledge in domains like protein sequence/proteome analysis.
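
    The first step described above, a relative frequency histogram of fixed-length amino acid words, can be sketched as follows. The overlapping sliding window and the function name `word_histogram` are assumptions for illustration; the abstract does not say whether the words overlap or how the histograms are subsequently transformed.

```python
from collections import Counter

def word_histogram(sequence, k=2):
    """Relative frequencies of overlapping length-k words in a sequence.

    sequence: amino acid string in one-letter code.
    Returns a dict mapping each observed word to its relative frequency;
    the dict is empty if the sequence is shorter than k.
    """
    words = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}
```

    For example, `word_histogram("AAAB", k=2)` sees the words AA, AA and AB, so AA receives a relative frequency of 2/3 and AB receives 1/3.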

  7. STUDY OF SETTING UP THE FOREST RESOURCES MANAGEMENT INFORMATION SYSTEM BASED ON WEBGIS

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    Based on an analysis of the characteristics of the Forest Resources Management Information System of each development phase, and consideration of the technical trend in Geographic Information System (GIS) in the Internet Age, this paper explores the importance and the feasibility of setting up Forest Resources Management Information System based on the WEBGIS. At the same time, based on the experience of our study, the paper explores the function, structure and method of developing the Forest Resources Management Information System based on WEBGIS. With the technology of WEBGIS, the Forest Resources Management Information System with data from Huoditang Farm was set up, which makes a great impact on forest resources management. So setting up the Forest Resources Management Information System based on WEBGIS is a trend of forest resources management. In the course of setting up this system, we must pay attention to following questions: 1) unify data standard and information encoding; 2) change mind.

  8. The Use of Fuzzy Set Theory in Information Retrieval and Databases: A Survey.

    Science.gov (United States)

    Kerre, Etienne E.; And Others

    1986-01-01

    Briefly surveys the numerous applications of fuzzy set theory on data representation and information retrieval. The importance of fuzzy set theory with respect to information systems is illustrated with a bibliography of 86 papers that describe data systems that are somehow "fuzzy." (Author/EM)

  9. Strong Similarity Measures for Ordered Sets of Documents in Information Retrieval.

    Science.gov (United States)

    Egghe, L.; Michel, Christine

    2002-01-01

    Presents a general method to construct ordered similarity measures in information retrieval based on classical similarity measures for ordinary sets. Describes a test of some of these measures in an information retrieval system that extracted ranked document sets and discusses the practical usability of the ordered similarity measures. (Author/LRW)

  10. Gene-based analysis of regionally enriched cortical genes in GWAS data sets of cognitive traits and psychiatric disorders.

    Directory of Open Access Journals (Sweden)

    Kari M Ersland

    Full Text Available BACKGROUND: Despite its estimated high heritability, the genetic architecture leading to differences in cognitive performance remains poorly understood. Different cortical regions play important roles in normal cognitive functioning and impairment. Recently, we reported on sets of regionally enriched genes in three different cortical areas (frontomedial, temporal and occipital cortices) of the adult rat brain. It has been suggested that genes preferentially, or specifically, expressed in one region or organ reflect functional specialisation. Employing a gene-based approach to the analysis, we used the regionally enriched cortical genes to mine a genome-wide association study (GWAS) of the Norwegian Cognitive NeuroGenetics (NCNG) sample of healthy adults for association to nine psychometric test measures. In addition, we explored GWAS data sets for the serious psychiatric disorders schizophrenia (SCZ) (n = 3 samples) and bipolar affective disorder (BP) (n = 3 samples), to which cognitive impairment is linked. PRINCIPAL FINDINGS: At the single gene level, the temporal cortex enriched gene RAR-related orphan receptor B (RORB) showed the strongest overall association, namely to a test of verbal intelligence (Vocabulary, P = 7.7E-04). We also applied gene set enrichment analysis (GSEA) to test the candidate genes, as gene sets, for enrichment of association signal in the NCNG GWAS and in GWASs of BP and of SCZ. We found that genes differentially expressed in the temporal cortex showed a significant enrichment of association signal in a test measure of non-verbal intelligence (Reasoning) in the NCNG sample. CONCLUSION: Our gene-based approach suggests that RORB could be involved in verbal intelligence differences, while the genes enriched in the temporal cortex might be important to intellectual functions as measured by a test of reasoning in the healthy population. These findings warrant further replication in independent samples on cognitive traits.

  11. Identification of a core set of genes that signifies pathways underlying cardiac hypertrophy

    DEFF Research Database (Denmark)

    Strom, C.C.; Kruhoffer, M.; Knudsen, Steen

    2004-01-01

    Although the molecular signals underlying cardiac hypertrophy have been the subject of intense investigation, the extent of common and distinct gene regulation between different forms of cardiac hypertrophy remains unclear. We hypothesized that a general and comparative analysis of hypertrophic...... gene expression, using microarray technology in multiple models of cardiac hypertrophy, including aortic banding, myocardial infarction, an arteriovenous shunt and pharmacologically induced hypertrophy, would uncover networks of conserved hypertrophy-specific genes and identify novel genes involved...... in hypertrophic signalling. From gene expression analyses (8740 probe sets, n = 46) of rat ventricular RNA, we identified a core set of 139 genes with consistent differential expression in all hypertrophy models as compared to their controls, including 78 genes not previously associated with hypertrophy and 61...

  12. Identification of a core set of genes that signifies pathways underlying cardiac hypertrophy

    DEFF Research Database (Denmark)

    Strøm, Claes C; Kruhøffer, Mogens; Knudsen, Steen

    2004-01-01

    Although the molecular signals underlying cardiac hypertrophy have been the subject of intense investigation, the extent of common and distinct gene regulation between different forms of cardiac hypertrophy remains unclear. We hypothesized that a general and comparative analysis of hypertrophic...... gene expression, using microarray technology in multiple models of cardiac hypertrophy, including aortic banding, myocardial infarction, an arteriovenous shunt and pharmacologically induced hypertrophy, would uncover networks of conserved hypertrophy-specific genes and identify novel genes involved...... in hypertrophic signalling. From gene expression analyses (8740 probe sets, n = 46) of rat ventricular RNA, we identified a core set of 139 genes with consistent differential expression in all hypertrophy models as compared to their controls, including 78 genes not previously associated with hypertrophy and 61...

  13. The information content of rules and rule sets and its application

    Institute of Scientific and Technical Information of China (English)

    HU Dan; LI HongXing; YU XianChuan

    2008-01-01

    The information content of rules is categorized into inner mutual information content and outer impartation information content. Actually, the conventional objective interestingness measures based on information theory are all inner mutual information, which represents the confidence of rules and the mutual information between the antecedent and consequent. Moreover, almost all of these measures lose sight of the outer impartation information, which is conveyed to the user and helps the user to make decisions. We put forward the viewpoint that the outer impartation information content of rules and rule sets can be represented by the relations from input universe to output universe. By binary relations, the interaction of rules in a rule set can be easily represented by operators: union and intersection. Based on the entropy of relations, the outer impartation information content of rules and rule sets is well measured. Then, the conditional information content of rules and rule sets, the independence of rules and rule sets and the inconsistent knowledge of rule sets are defined and measured. The properties of these new measures are discussed and some interesting results are proven, such as that the information content of a rule set may be bigger than the sum of the information content of rules in the rule set, and that the conditional information content of rules may be negative. At last, the applications of these new measures are discussed. A new method for the appraisement of rule mining algorithms, and two rule pruning algorithms, λ-choice and RPCIC, are put forward. These new methods and algorithms have predominance in satisfying the need for more efficient decision information.

  14. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    NARCIS (Netherlands)

    Hettne, K.M.; Boorsma, A.; Dartel, van D.A.M.; Goeman, J.J.; Jong, de E.; Piersma, A.H.; Stierum, R.H.; Kleinjans, J.C.; Kors, J.A.

    2013-01-01

    Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set

  15. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    NARCIS (Netherlands)

    Hettne, K.M.; Boorsma, A.; Dartel, D.A. van; Goeman, J.J.; Jong, Esther de; Piersma, A.H.; Stierum, R.H.; Kleinjans, J.C.; Kors, J.A.

    2013-01-01

    BACKGROUND: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set

  16. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    NARCIS (Netherlands)

    Hettne, K.M.; Boorsma, A.; Dartel, D.A. van; Goeman, J.J.; Jong, Esther de; Piersma, A.H.; Stierum, R.H.; Kleinjans, J.C.; Kors, J.A.

    2013-01-01

    BACKGROUND: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set anal

  17. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    NARCIS (Netherlands)

    Hettne, K.M.; Boorsma, A.; Dartel, van D.A.M.; Goeman, J.J.; Jong, de E.; Piersma, A.H.; Stierum, R.H.; Kleinjans, J.C.; Kors, J.A.

    2013-01-01

    Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set anal

  18. Evaluation of a gene information summarization system by users during the analysis process of microarray datasets.

    Science.gov (United States)

    Yang, Jianji; Cohen, Aaron; Hersh, William

    2009-02-05

    Summarization of gene information in the literature has the potential to help genomics researchers translate basic research into clinical benefits. Gene expression microarrays have been used to study biomarkers for disease and discover novel types of therapeutics, and the task of finding information in journal articles on sets of genes is common for translational researchers working with microarray data. However, manually searching and scanning the literature references returned from PubMed is a time-consuming task for scientists. We built and evaluated an automatic summarizer of information on genes studied in microarray experiments. The Gene Information Clustering and Summarization System (GICSS) is a system that integrates two related steps of the microarray data analysis process: functional gene clustering and gene information gathering. The system evaluation was conducted during the process of genomic researchers analyzing their own experimental microarray datasets. The clusters generated by GICSS were validated by scientists during their microarray analysis process. In addition, presenting sentences in the abstract provided significantly more important information to the users than just showing the title in the default PubMed format. The evaluation results suggest that GICSS can be useful for researchers in the genomics area. In addition, the hybrid evaluation method, partway between intrinsic and extrinsic system evaluation, may enable researchers to gauge the true usefulness of the tool for the scientists in their natural analysis workflow and also elicit suggestions for future enhancements. GICSS can be accessed online at: http://ir.ohsu.edu/jianji/index.html.

  19. The Current Mind-Set of Federal Information Security Decision-Makers on the Value of Governance: An Informative Study

    Science.gov (United States)

    Stroup, Jay Walter

    2014-01-01

    Understanding the mind-set or perceptions of organizational leaders and decision-makers is important to ascertaining the trends and priorities in policy and governance of the organization. This study finds that a significant shift in the mind-set of government IT and information security leaders has started and will likely result in placing a…

  20. GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data.

    Science.gov (United States)

    Ben-Ari Fuchs, Shani; Lieder, Iris; Stelzer, Gil; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit

    2016-03-01

    Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from "data-to-knowledge-to-innovation," a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ (geneanalytics.genecards.org), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including the GeneCards®--the human gene database; the MalaCards--the human diseases database; and the PathCards--the biological pathways database. Expression-based analysis in GeneAnalytics relies on the LifeMap Discovery®--the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling advanced matching algorithm for gene-tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell "cards" in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics, pharmacogenomics, vaccinomics

  1. ErmineJ: Tool for functional analysis of gene expression data sets

    Directory of Open Access Journals (Sweden)

    Braynen William

    2005-11-01

    Background: It is common for the results of a microarray study to be analyzed in the context of biologically-motivated groups of genes such as pathways or Gene Ontology categories. The most common method for such analysis uses the hypergeometric distribution (or a related technique) to look for "over-representation" of groups among genes selected as being differentially expressed or otherwise of interest based on a gene-by-gene analysis. However, this method suffers from some limitations, and biologist-friendly tools that implement alternatives have not been reported. Results: We introduce ErmineJ, a multiplatform user-friendly stand-alone software tool for the analysis of functionally-relevant sets of genes in the context of microarray gene expression data. ErmineJ implements multiple algorithms for gene set analysis, including over-representation and resampling-based methods that focus on gene scores or correlation of gene expression profiles. In addition to a graphical user interface, ErmineJ has a command line interface and an application programming interface that can be used to automate analyses. The graphical user interface includes tools for creating and modifying gene sets, visualizing the Gene Ontology as a table or tree, and visualizing gene expression data. ErmineJ comes with a complete user manual, and is open-source software licensed under the GNU Public License. Conclusion: The availability of multiple analysis algorithms, together with a rich feature set and simple graphical interface, should make ErmineJ a useful addition to the biologist's informatics toolbox. ErmineJ is available from http://microarray.cu.genome.org.
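
    The hypergeometric over-representation test mentioned in this abstract can be illustrated with a minimal sketch. This is not ErmineJ's implementation; the function name `hypergeom_overrep_pvalue` and the example counts are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_overrep_pvalue(n_universe, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_universe: number of genes measured (the background)
    n_in_set:   number of background genes that belong to the gene set
    n_selected: number of genes selected as "of interest"
    n_overlap:  number of selected genes that belong to the gene set
    """
    max_overlap = min(n_in_set, n_selected)
    total = comb(n_universe, n_selected)
    # Sum the probabilities of seeing n_overlap or more set members
    # among the selected genes when sampling without replacement.
    tail = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes measured, 5 in the gene set,
# 4 genes selected, 3 of which fall in the set.
p_value = hypergeom_overrep_pvalue(20, 5, 4, 3)
```

    A small p-value suggests that the gene set is over-represented among the selected genes. The result depends on the cut-off used to select genes, which may be one of the limitations the abstract alludes to.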

  2. Exploration of data partitioning in an eight-gene data set

    DEFF Research Database (Denmark)

    Rota, Jadranka; Wahlberg, Niklas

    2012-01-01

    Molecular data sets for phylogenetic inference continue to increase in size, especially with respect to the number of genes sampled. As more and more genes are included in analyses, the importance of partitioning the data to avoid problems that can arise from underparameterization becomes more apparent. With an eight-gene data set from 38 metalmark moth species (12 genera represented) and three outgroups, we explored different data partitioning strategies and their influence on convergence and mixing of Markov Chains Monte Carlo in a Bayesian setting. We found that in larger data sets, with an increase in the number of partitions that are made a priori (e.g. by gene and codon position), convergence and mixing become poor. This problem can be overcome by using a recently published algorithm in which homologous sites are grouped into blocks with similar evolutionary rates that can then be modelled...

  3. Fault Diagnosis of a Rotary Machine Based on Information Entropy and Rough Set

    Institute of Scientific and Technical Information of China (English)

    LI Jian-lan; HUANG Shu-hong

    2007-01-01

    Discordant or contradictory information often arises during fault diagnosis of a rotary machine, and traditional diagnostic methods cannot handle such information. This paper presents a fault diagnosis model for a rotary machine based on information entropy theory and rough set theory. The model has a clear mathematical definition and can handle both completely consistent and completely inconsistent information about vibration faults. Using the model, decision rules for six typical vibration faults of a steam turbine and electric generating set are deduced from experimental samples. Finally, the decision rules are validated on selected samples, with good identification results.

  4. Information Provision in Emergency Settings: The Experience of Refugee Communities in Zambia

    Science.gov (United States)

    Kanyengo, Brendah Kakulwa; Kanyengo, Christine Wamunyima

    2011-01-01

    This article identifies information provision services in emergency settings using Zambia as a case study by identifying innovative ways of providing library and information services. The thrust of the article is to analyze information management practices of organizations that work within refugee camps and how they take specific cognizance of the…

  5. Intelligent Information Retrieval: Part IV. Testing the Timing of Two Information Retrieval Devices in a Naturalistic Setting.

    Science.gov (United States)

    Cole, Charles

    2001-01-01

    Reports the results of two studies of undergraduates that tested an uncertainty expansion information retrieval device and an uncertainty reduction device in naturalistic settings, designed to be given at different stages of Kuhlthau's information search process. Concludes that the timing of the device interventions is crucial to their potential…

  6. GeneAnalytics: An Integrative Gene Set Analysis Tool for Next Generation Sequencing, RNAseq and Microarray Data

    Science.gov (United States)

    Ben-Ari Fuchs, Shani; Lieder, Iris; Mazor, Yaron; Buzhor, Ella; Kaplan, Sergey; Bogoch, Yoel; Plaschkes, Inbar; Shitrit, Alina; Rappaport, Noa; Kohn, Asher; Edgar, Ron; Shenhav, Liraz; Safran, Marilyn; Lancet, Doron; Guan-Golan, Yaron; Warshawsky, David; Shtrichman, Ronit

    2016-01-01

    Abstract Postgenomics data are produced in large volumes by life sciences and clinical applications of novel omics diagnostics and therapeutics for precision medicine. To move from “data-to-knowledge-to-innovation,” a crucial missing step in the current era is, however, our limited understanding of biological and clinical contexts associated with data. Prominent among the emerging remedies to this challenge are the gene set enrichment tools. This study reports on GeneAnalytics™ (geneanalytics.genecards.org), a comprehensive and easy-to-apply gene set analysis tool for rapid contextualization of expression patterns and functional signatures embedded in the postgenomics Big Data domains, such as Next Generation Sequencing (NGS), RNAseq, and microarray experiments. GeneAnalytics' differentiating features include in-depth evidence-based scoring algorithms, an intuitive user interface and proprietary unified data. GeneAnalytics employs the LifeMap Science's GeneCards suite, including the GeneCards®—the human gene database; the MalaCards—the human diseases database; and the PathCards—the biological pathways database. Expression-based analysis in GeneAnalytics relies on the LifeMap Discovery®—the embryonic development and stem cells database, which includes manually curated expression data for normal and diseased tissues, enabling advanced matching algorithm for gene–tissue association. This assists in evaluating differentiation protocols and discovering biomarkers for tissues and cells. Results are directly linked to gene, disease, or cell “cards” in the GeneCards suite. Future developments aim to enhance the GeneAnalytics algorithm as well as visualizations, employing varied graphical display items. Such attributes make GeneAnalytics a broadly applicable postgenomics data analyses and interpretation tool for translation of data to knowledge-based innovation in various Big Data fields such as precision medicine, ecogenomics, nutrigenomics

  7. Combining distance matrices on identical taxon sets for multi-gene analysis with singular value decomposition.

    Directory of Open Access Journals (Sweden)

    Melanie Abeysundera

    We present a simple and effective method for combining distance matrices from multiple genes on identical taxon sets to obtain a single representative distance matrix from which to derive a combined-gene phylogenetic tree. The method applies singular value decomposition (SVD) to extract the greatest common signal present in the distances obtained from each gene. The first right eigenvector of the SVD, which corresponds to a weighted average of the distance matrices of all genes, can thus be used to derive a representative tree from multiple genes. We apply our method to three well known data sets and estimate the uncertainty using bootstrap methods. Our results show that this method works well for these three data sets and that the uncertainty in these estimates is small. A simulation study is conducted to compare the performance of our method with several other distance based approaches (namely SDM, SDM* and ACS97), and we find the performances of all these approaches are comparable in the consensus setting. The computational complexity of our method is similar to that of SDM. Besides constructing a representative tree from multiple genes, we also demonstrate how the subsequent eigenvalues and eigenvectors may be used to identify if there are conflicting signals in the data and which genes might be influential or outliers for the estimated combined-gene tree.
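
    As a rough illustration of the idea in this abstract, the sketch below stacks one vector of pairwise distances per gene into a matrix and approximates its first right singular vector by power iteration. It is a simplified sketch, not the authors' code: the function names, the example distances, and the final rescaling to the mean input distance are all assumptions made here for illustration.

```python
import math


def first_right_singular_vector(rows, iterations=200):
    """Approximate the first right singular vector of a matrix
    (given as a list of rows) by power iteration on A^T A."""
    n_cols = len(rows[0])
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    for _ in range(iterations):
        # u = A v  (one entry per gene)
        u = [sum(a * b for a, b in zip(row, v)) for row in rows]
        # w = A^T u  (one entry per taxon pair)
        w = [sum(rows[i][j] * u[i] for i in range(len(rows)))
             for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def combine_distances(gene_distances):
    """Combine per-gene distance vectors into one representative vector.

    Each row holds the pairwise distances for one gene, with the taxon
    pairs listed in the same order in every row. The rescaling to the
    mean input distance is an assumption of this sketch.
    """
    v = first_right_singular_vector(gene_distances)
    n_values = len(gene_distances) * len(v)
    mean_input = sum(sum(row) for row in gene_distances) / n_values
    mean_v = sum(v) / len(v)
    return [x * mean_input / mean_v for x in v]


# Hypothetical distances for 3 genes over the 6 pairs of 4 taxa.
gene_distances = [
    [0.10, 0.40, 0.50, 0.30, 0.60, 0.20],
    [0.12, 0.38, 0.55, 0.28, 0.58, 0.25],
    [0.09, 0.45, 0.48, 0.33, 0.65, 0.18],
]
combined = combine_distances(gene_distances)
```

    The combined vector can be reshaped into a symmetric distance matrix and passed to any distance-based tree-building method. Because the input distances are non-negative, the leading singular vector found from a positive starting vector is also non-negative.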

  8. Primer Sets Developed for Functional Genes Reveal Shifts in Functionality of Fungal Community in Soils

    NARCIS (Netherlands)

    Hannula, S.E.; van Veen, J.A.

    2016-01-01

    Phylogenetic diversity of soil microbes is a hot topic at the moment. However, the molecular tools for the assessment of functional diversity in the fungal community are less developed than tools based on genes encoding the ribosomal operon. Here 20 sets of primers targeting genes involved mainly in

  9. Identification of self-consistent modulons from bacterial microarray expression data with the help of structured regulon gene sets

    KAUST Repository

    Permina, Elizaveta A.

    2013-01-01

    Identification of bacterial modulons from series of gene expression measurements on microarrays is a principal problem, especially relevant for inadequately studied but practically important species. Usage of a priori information on regulatory interactions helps to evaluate parameters for regulatory subnetwork inference. We suggest a procedure for modulon construction where a seed regulon is iteratively updated with genes having expression patterns similar to those for regulon member genes. A set of genes essential for a regulon is used to control modulon updating. Essential genes for a regulon were selected as a subset of regulon genes highly related by different measures to each other. Using Escherichia coli as a model, we studied how modulon identification depends on the data, including the microarray experiments set, the adopted relevance measure and the regulon itself. We have found that results of modulon identification are highly dependent on all parameters studied and thus the resulting modulon varies substantially depending on the identification procedure. Yet, modulons that were identified correctly displayed higher stability during iterations, which allows developing a procedure for reliable modulon identification in the case of less studied species where the known regulatory interactions are sparse. Copyright © 2013 Taylor & Francis.

  10. A comparison of gene set analysis methods in terms of sensitivity, prioritization and specificity.

    Directory of Open Access Journals (Sweden)

    Adi L Tarca

    Identification of functional sets of genes associated with conditions of interest from omics data was first reported in 1999, and since then a plethora of enrichment methods have been published for systematic analysis of gene set collections, including Gene Ontology and biological pathways. Despite their widespread usage in reducing the complexity of omics experiment results, their performance is poorly understood. Leveraging the existence of disease-specific gene sets in KEGG and Metacore® databases, we compared the performance of sixteen methods under relaxed assumptions while using 42 real datasets (over 1,400 samples). Most of the methods ranked high the gene sets designed for specific diseases whenever samples from affected individuals were compared against controls via microarrays. The top methods for gene set prioritization were different from the top ones in terms of sensitivity, and four of the sixteen methods had large false positive rates, assessed by permuting the phenotype of the samples. The best overall methods among those that generated reasonably low false positive rates, when permuting phenotypes, were PLAGE, GLOBALTEST, and PADOG. The best method in the category that generated higher than expected false positives was MRGSE.

  11. Gene-Based Analysis of Regionally Enriched Cortical Genes in GWAS Data Sets of Cognitive Traits and Psychiatric Disorders

    DEFF Research Database (Denmark)

    Ersland, Kari M; Christoforou, Andrea; Stansberg, Christine;

    2012-01-01

    Despite its estimated high heritability, the genetic architecture leading to differences in cognitive performance remains poorly understood. Different cortical regions play important roles in normal cognitive functioning and impairment. Recently, we reported on sets of regionally enriched genes in three different cortical areas (frontomedial, temporal and occipital cortices) of the adult rat brain. It has been suggested that genes preferentially, or specifically, expressed in one region or organ reflect functional specialisation. Employing a gene-based approach to the analysis, we used the regionally enriched cortical genes to mine a genome-wide association study (GWAS) of the Norwegian Cognitive NeuroGenetics (NCNG) sample of healthy adults for association to nine psychometric test measures. In addition, we explored GWAS data sets for the serious psychiatric disorders schizophrenia (SCZ) (n...

  12. Interests-in-motion in an informal, media-rich learning setting

    National Research Council Canada - National Science Library

    Ty Hollett

    2016-01-01

    .... Through a micro-ethnographic analysis of two youth’s Minecraft-centered gameplay in a public library, this article makes two primary contributions to research on learning within, and the design of, informal, media-rich settings...

  13. Incomplete information system and rough set theory models and attribute reductions

    CERN Document Server

    Yang, Xibei

    2012-01-01

    This study of the theory of generalizations of rough-set models in incomplete information systems discusses not only the regular attributes but also the criteria in these systems, and presents practical approaches to computing a number of reducts.

  14. Allele diversity for abiotic stress responsive candidate genes in chickpea reference set using gene based SNP markers

    Directory of Open Access Journals (Sweden)

    Manish Roorkiwal

    2014-06-01

    Chickpea is an important food legume crop for the semi-arid regions; however, its productivity is adversely affected by various biotic and abiotic stresses. Identification of candidate genes associated with abiotic stress response will help breeding efforts aiming to enhance its productivity. With this objective, 10 abiotic stress responsive candidate genes were selected on the basis of prior knowledge of this complex trait. These 10 genes were subjected to allele specific sequencing across a chickpea reference set comprising 300 genotypes including 211 accessions of chickpea mini core collection. A total of 1.3 Mbp sequence data were generated. Multiple sequence alignment revealed 79 SNPs and 41 indels in nine genes while the CAP2 gene was found to be conserved across all the genotypes. Among ten candidate genes, the maximum number of SNPs (34) was observed in abscisic acid stress and ripening (ASR) gene including 22 transitions, 11 transversions and one tri-allelic SNP. Nucleotide diversity varied from 0.0004 to 0.0029 while PIC values ranged from 0.01 (AKIN gene) to 0.43 (CAP2 promoter). Haplotype analysis revealed that alleles were represented by more than two haplotype blocks, except alleles of the CAP2 and sucrose synthase (SuSy) gene, where only one haplotype was identified. These genes can be used for association analysis and if validated, may be useful for enhancing abiotic stress, including drought tolerance, through molecular breeding.

  15. Identification of a conserved set of upregulated genes in mouse skeletal muscle hypertrophy and regrowth

    Science.gov (United States)

    Chaillou, Thomas; Jackson, Janna R.; England, Jonathan H.; Kirby, Tyler J.; Richards-White, Jena; Esser, Karyn A.; Dupont-Versteegden, Esther E.

    2014-01-01

    The purpose of this study was to compare the gene expression profile of mouse skeletal muscle undergoing two forms of growth (hypertrophy and regrowth) with the goal of identifying a conserved set of differentially expressed genes. Expression profiling by microarray was performed on the plantaris muscle subjected to 1, 3, 5, 7, 10, and 14 days of hypertrophy or regrowth following 2 wk of hind-limb suspension. We identified 97 differentially expressed genes (≥2-fold increase or ≥50% decrease compared with control muscle) that were conserved during the two forms of muscle growth. The vast majority (∼90%) of the differentially expressed genes was upregulated and occurred at a single time point (64 out of 86 genes), which most often was on the first day of the time course. Microarray analysis from the conserved upregulated genes showed a set of genes related to contractile apparatus and stress response at day 1, including three genes involved in mechanotransduction and four genes encoding heat shock proteins. Our analysis further identified three cell cycle-related genes at day and several genes associated with extracellular matrix (ECM) at both days 3 and 10. In conclusion, we have identified a core set of genes commonly upregulated in two forms of muscle growth that could play a role in the maintenance of sarcomere stability, ECM remodeling, cell proliferation, fast-to-slow fiber type transition, and the regulation of skeletal muscle growth. These findings suggest conserved regulatory mechanisms involved in the adaptation of skeletal muscle to increased mechanical loading. PMID:25554798

  16. Identification of a conserved set of upregulated genes in mouse skeletal muscle hypertrophy and regrowth.

    Science.gov (United States)

    Chaillou, Thomas; Jackson, Janna R; England, Jonathan H; Kirby, Tyler J; Richards-White, Jena; Esser, Karyn A; Dupont-Versteegden, Esther E; McCarthy, John J

    2015-01-01

    The purpose of this study was to compare the gene expression profile of mouse skeletal muscle undergoing two forms of growth (hypertrophy and regrowth) with the goal of identifying a conserved set of differentially expressed genes. Expression profiling by microarray was performed on the plantaris muscle subjected to 1, 3, 5, 7, 10, and 14 days of hypertrophy or regrowth following 2 wk of hind-limb suspension. We identified 97 differentially expressed genes (≥2-fold increase or ≥50% decrease compared with control muscle) that were conserved during the two forms of muscle growth. The vast majority (∼90%) of the differentially expressed genes was upregulated and occurred at a single time point (64 out of 86 genes), which most often was on the first day of the time course. Microarray analysis from the conserved upregulated genes showed a set of genes related to contractile apparatus and stress response at day 1, including three genes involved in mechanotransduction and four genes encoding heat shock proteins. Our analysis further identified three cell cycle-related genes at day and several genes associated with extracellular matrix (ECM) at both days 3 and 10. In conclusion, we have identified a core set of genes commonly upregulated in two forms of muscle growth that could play a role in the maintenance of sarcomere stability, ECM remodeling, cell proliferation, fast-to-slow fiber type transition, and the regulation of skeletal muscle growth. These findings suggest conserved regulatory mechanisms involved in the adaptation of skeletal muscle to increased mechanical loading. Copyright © 2015 the American Physiological Society.

  17. ChIP-Enrich: gene set enrichment testing for ChIP-seq data.

    Science.gov (United States)

    Welch, Ryan P; Lee, Chee; Imbriano, Paul M; Patil, Snehal; Weymouth, Terry E; Smith, R Alex; Scott, Laura J; Sartor, Maureen A

    2014-07-01

    Gene set enrichment testing can enhance the biological interpretation of ChIP-seq data. Here, we develop a method, ChIP-Enrich, for this analysis which empirically adjusts for gene locus length (the length of the gene body and its surrounding non-coding sequence). Adjustment for gene locus length is necessary because it is often positively associated with the presence of one or more peaks and because many biologically defined gene sets have an excess of genes with longer or shorter gene locus lengths. Unlike alternative methods, ChIP-Enrich can account for the wide range of gene locus length-to-peak presence relationships (observed in ENCODE ChIP-seq data sets). We show that ChIP-Enrich has a well-calibrated type I error rate using permuted ENCODE ChIP-seq data sets; in contrast, two commonly used gene set enrichment methods, Fisher's exact test and the binomial test implemented in Genomic Regions Enrichment of Annotations Tool (GREAT), can have highly inflated type I error rates and biases in ranking. We identify DNA-binding proteins, including CTCF, JunD and glucocorticoid receptor α (GRα), that show different enrichment patterns for peaks closer to versus further from transcription start sites. We also identify known and potential new biological functions of GRα. ChIP-Enrich is available as a web interface (http://chip-enrich.med.umich.edu) and Bioconductor package. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  18. Integrated analysis of DNA copy number and gene expression microarray data using gene sets

    NARCIS (Netherlands)

    R.X. de Menezes (Renee); M. Boetzer (Marten); M. Sieswerda (Melle); G.J.B. van Ommen; J.M. Boer (Judith)

    2009-01-01

    Background: Genes that play an important role in tumorigenesis are expected to show association between DNA copy number and RNA expression. Optimal power to find such associations can only be achieved if analysing copy number and gene expression jointly. Furthermore, some copy number

  19. Bayesian Network Structure Learning Based On Rough Set and Mutual Information

    Directory of Open Access Journals (Sweden)

    Zuhong Feng

    2013-09-01

    In Bayesian network structure learning for incomplete data sets, a common problem is that too many attributes cause low efficiency and high computational complexity. In this paper, an algorithm of attribute reduction based on rough set is introduced. The algorithm can effectively reduce the dimension of attributes and quickly determine the network structure using mutual information for Bayesian network structure learning.

  20. Associations between DNA methylation and schizophrenia-related intermediate phenotypes - a gene set enrichment analysis.

    Science.gov (United States)

    Hass, Johanna; Walton, Esther; Wright, Carrie; Beyer, Andreas; Scholz, Markus; Turner, Jessica; Liu, Jingyu; Smolka, Michael N; Roessner, Veit; Sponheim, Scott R; Gollub, Randy L; Calhoun, Vince D; Ehrlich, Stefan

    2015-06-03

    Multiple genetic approaches have identified microRNAs as key effectors in psychiatric disorders as they post-transcriptionally regulate expression of thousands of target genes. However, their role in specific psychiatric diseases remains poorly understood. In addition, epigenetic mechanisms such as DNA methylation, which affect the expression of both microRNAs and coding genes, are critical for our understanding of molecular mechanisms in schizophrenia. Using clinical, imaging, genetic, and epigenetic data of 103 patients with schizophrenia and 111 healthy controls of the Mind Clinical Imaging Consortium (MCIC) study of schizophrenia, we conducted gene set enrichment analysis to identify markers for schizophrenia-associated intermediate phenotypes. Genes were ranked based on the correlation between DNA methylation patterns and each phenotype, and then searched for enrichment in 221 predicted microRNA target gene sets. We found the predicted hsa-miR-219a-5p target gene set to be significantly enriched for genes (EPHA4, PKNOX1, ESR1, among others) whose methylation status is correlated with hippocampal volume independent of disease status. Our results were strengthened by significant associations between hsa-miR-219a-5p target gene methylation patterns and hippocampus-related neuropsychological variables. IPA pathway analysis of the respective predicted hsa-miR-219a-5p target genes revealed associated network functions in behavior and developmental disorders. Altered methylation patterns of predicted hsa-miR-219a-5p target genes are associated with a structural aberration of the brain that has been proposed as a possible biomarker for schizophrenia. The (dys)regulation of microRNA target genes by epigenetic mechanisms may confer additional risk for developing psychiatric symptoms. Further study is needed to understand possible interactions between microRNAs and epigenetic changes and their impact on risk for brain-based disorders such as schizophrenia.
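
    The two steps described in this abstract, ranking genes by correlation with a phenotype and then testing a gene set for enrichment along that ranking, can be sketched as follows. This is a simplified, unweighted running-sum score without permutation testing, not the analysis pipeline used in the study; the function names, gene labels and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rank_genes(methylation, phenotype):
    """Order genes from most positively to most negatively correlated.

    methylation: dict mapping gene -> per-sample methylation values
    phenotype:   per-sample phenotype values, same sample order
    """
    scores = {g: pearson(v, phenotype) for g, v in methylation.items()}
    return sorted(scores, key=lambda g: scores[g], reverse=True)


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum score: step up at set members, down
    otherwise, and return the largest deviation from zero."""
    hits = [g in gene_set for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(hits) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical data: three genes measured in three samples.
methylation = {"g1": [1, 2, 3], "g2": [3, 2, 1], "g3": [1, 3, 2]}
phenotype = [1, 2, 3]
ranking = rank_genes(methylation, phenotype)
score = enrichment_score(ranking, {"g1"})
```

    In practice the significance of a score is usually judged against scores from permuted phenotypes, and the test is repeated for every gene set of interest with correction for multiple testing.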

  1. Different gene sets contribute to different symptom dimensions of depression and anxiety.

    Science.gov (United States)

    van Veen, Tineke; Goeman, Jelle J; Monajemi, Ramin; Wardenaar, Klaas J; Hartman, Catharina A; Snieder, Harold; Nolte, Ilja M; Penninx, Brenda W J H; Zitman, Frans G

    2012-07-01

    Although many genetic association studies have been carried out, it remains unclear which genes contribute to depression. This may be due to heterogeneity of the DSM-IV category of depression. Specific symptom-dimensions provide a more homogeneous phenotype. Furthermore, as effects of individual genes are small, analysis of genetic data at the pathway-level provides more power to detect associations and yield valuable biological insight. In 1,398 individuals with a Major Depressive Disorder, the symptom dimensions of the tripartite model of anxiety and depression, General Distress, Anhedonic Depression, and Anxious Arousal, were measured with the Mood and Anxiety Symptoms Questionnaire (30-item Dutch adaptation; MASQ-D30). Association of these symptom dimensions with candidate gene sets and gene sets from two public pathway databases was tested using the Global test. One pathway was associated with General Distress, and concerned molecules expressed in the endoplasmic reticulum lumen. Seven pathways were associated with Anhedonic Depression. Important themes were neurodevelopment, neurodegeneration, and cytoskeleton. Furthermore, three gene sets associated with Anxious Arousal regarded development, morphology, and genetic recombination. The individual pathways explained up to 1.7% of the variance. These data demonstrate mechanisms that influence the specific dimensions. Moreover, they show the value of using dimensional phenotypes on one hand and gene sets on the other hand.

  2. A gene-based information gain method for detecting gene-gene interactions in case-control studies.

    Science.gov (United States)

    Li, Jin; Huang, Dongli; Guo, Maozu; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Zhang, Ruijie; Jiang, Yongshuai; Lv, Hongchao; Wang, Limei

    2015-11-01

    Currently, most methods for detecting gene-gene interactions (GGIs) in genome-wide association studies are divided into SNP-based methods and gene-based methods. Generally, the gene-based methods can be more powerful than SNP-based methods. Some gene-based entropy methods can only capture the linear relationship between genes. We therefore proposed a nonparametric gene-based information gain method (GBIGM) that can capture both linear relationships and nonlinear correlations between genes. Through simulation with different odds ratios, sample sizes and prevalence rates, GBIGM was shown to be valid and more powerful than the classic KCCU method and the SNP-based entropy method. In the analysis of data from 17 genes on rheumatoid arthritis, GBIGM was more effective than the other two methods as it obtained fewer significant results, which is important for biological verification. Therefore, GBIGM is a suitable and powerful tool for detecting GGIs in case-control studies.
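
    The general idea of an entropy-based information gain for a pair of genes can be sketched as below. This is a generic illustration and should not be read as the published GBIGM statistic, whose exact definition is not given in the abstract; the function names and the toy genotypes are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def conditional_entropy(labels, conditions):
    """Entropy of labels remaining after grouping by the condition values."""
    n = len(labels)
    groups = {}
    for cond, lab in zip(conditions, labels):
        groups.setdefault(cond, []).append(lab)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def interaction_gain(status, gene_a, gene_b):
    """Uncertainty about case-control status removed by the joint genotype
    of two genes beyond what the better single gene removes."""
    joint = list(zip(gene_a, gene_b))
    h_single = min(conditional_entropy(status, gene_a),
                   conditional_entropy(status, gene_b))
    return h_single - conditional_entropy(status, joint)


# Toy example with a purely non-additive (XOR-like) effect:
# neither gene alone is informative, but the pair determines status.
gene_a = [0, 0, 1, 1]
gene_b = [0, 1, 0, 1]
status = [0, 1, 1, 0]
gain = interaction_gain(status, gene_a, gene_b)
```

    For a gene-based analysis, each per-individual entry of `gene_a` or `gene_b` could be a tuple of that individual's SNP genotypes within the gene, since any hashable value works as a grouping key. Significance would typically be assessed by permuting the case-control labels.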

  3. Informal Music Education: The Nature of a Young Child's Engagement in an Individual Piano Lesson Setting

    Science.gov (United States)

    Kooistra, Lauren

    2016-01-01

    The purpose of this study was to gain insight into the nature of a young child's engagement in an individual music lesson setting based on principles of informal learning. The informal educational space allowed the child to observe, explore, and interact with a musical environment as a process of enculturation and development (Gordon, 2013;…

  5. An Effective Tri-Clustering Algorithm Combining Expression Data with Gene Regulation Information

    Directory of Open Access Journals (Sweden)

    Ao Li

    2009-04-01

    Motivation: Bi-clustering algorithms aim to identify sets of genes sharing similar expression patterns across a subset of conditions. However direct interpretation or prediction of gene regulatory mechanisms may be difficult as only gene expression data is used. Information about gene regulators may also be available, most commonly about which transcription factors may bind to the promoter region and thus control the expression level of a gene. Thus a method to integrate gene expression and gene regulation information is desirable for clustering and analyzing. Methods: By incorporating gene regulatory information with gene expression data, we define regulated expression values (REV) as indicators of how a gene is regulated by a specific factor. Existing bi-clustering methods are extended to a three dimensional data space by developing a heuristic TRI-Clustering algorithm. An additional approach named Automatic Boundary Searching algorithm (ABS) is introduced to automatically determine the boundary threshold. Results: Results based on incorporating ChIP-chip data representing transcription factor-gene interactions show that the algorithms are efficient and robust for detecting tri-clusters. Detailed analysis of the tri-cluster extracted from yeast sporulation REV data shows genes in this cluster exhibited significant differences during the middle and late stages. The implicated regulatory network was then reconstructed for further study of defined regulatory mechanisms. Topological and statistical analysis of this network demonstrated evidence of significant changes of TF activities during the different stages of yeast sporulation, and suggests this approach might be a general way to study regulatory networks undergoing transformations.

  6. Gene Regulatory Network Reconstruction Using Conditional Mutual Information

    Directory of Open Access Journals (Sweden)

    Xiaodong Wang

    2008-06-01

    The inference of gene regulatory networks from expression data is an important area of research that provides insight into the inner workings of a biological system. The relevance-network-based approaches provide a simple and easily-scalable solution to the understanding of interaction between genes. Up until now, most works based on relevance networks focus on the discovery of direct regulation using correlation coefficient or mutual information. However, some of the more complicated interactions such as interactive regulation and coregulation are not easily detected. In this work, we propose a relevance network model for gene regulatory network inference which employs both mutual information and conditional mutual information to determine the interactions between genes. For this purpose, we propose a conditional mutual information estimator based on adaptive partitioning which allows us to condition on both discrete and continuous random variables. We provide experimental results that demonstrate that the proposed regulatory network inference algorithm can provide better performance when the target network contains coregulated and interactively regulated genes.
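
    Why conditional mutual information helps can be shown with a small sketch. The abstract's estimator uses adaptive partitioning and handles continuous variables; the code below is a much simpler plug-in estimate on already-discretized data, and all names and toy vectors in it are assumptions for illustration only.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def conditional_mutual_information(x, y, z):
    """I(X; Y | Z) = H(X, Z) + H(Y, Z) - H(X, Y, Z) - H(Z)."""
    return (entropy(list(zip(x, z))) + entropy(list(zip(y, z)))
            - entropy(list(zip(x, y, z))) - entropy(z))


# Coregulation (toy): x and y both copy a shared regulator z.
z = [0, 0, 1, 1]
x_co, y_co = list(z), list(z)

# Interactive regulation (toy): the target is the XOR of two regulators.
reg_a = [0, 0, 1, 1]
reg_b = [0, 1, 0, 1]
target = [a ^ b for a, b in zip(reg_a, reg_b)]

print(mutual_information(x_co, y_co),
      conditional_mutual_information(x_co, y_co, z))
print(mutual_information(reg_a, target),
      conditional_mutual_information(reg_a, target, reg_b))
```

    In the coregulated case the pairwise mutual information is high but vanishes once the shared regulator is conditioned on; in the interactive case the pairwise value is zero and the dependence only appears after conditioning. These are the two situations the abstract says plain relevance networks tend to miss.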

  7. Global adaptive rank truncated product method for gene-set analysis in association studies.

    Science.gov (United States)

    Vilor-Tejedor, Natalia; Calle, M Luz

    2014-09-01

    Gene set analysis (GSA) aims to assess the overall association of a set of genetic variants with a phenotype and has the potential to detect subtle effects of variants in a gene or a pathway that might be missed when assessed individually. We present a new implementation of the Adaptive Rank Truncated Product method (ARTP) for analyzing the association of a set of Single Nucleotide Polymorphisms (SNPs) in a gene or pathway. The new implementation, referred to as globalARTP, improves the original one by allowing the different SNPs in the set to have different modes of inheritance. We perform a simulation study for exploring the power of the proposed methodology in a set of scenarios with different numbers of causal SNPs with different effect sizes. Moreover, we show the advantage of using the gene set approach in the context of an Alzheimer's disease case-control study where we explore the endocytosis pathway. The new method is implemented in the R function globalARTP of the globalGSA package available at http://cran.r-project.org. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. How information systems should support the information needs of general dentists in clinical settings: suggestions from a qualitative study

    Directory of Open Access Journals (Sweden)

    Wali Teena

    2010-02-01

    Background: A major challenge in designing useful clinical information systems in dentistry is to incorporate clinical evidence based on dentists' information needs and then integrate the system seamlessly into the complex clinical workflow. However, little is known about the actual information needs of dentists during treatment sessions. The purpose of this study is to identify general dentists' information needs and the information sources they use to meet those needs in clinical settings so as to inform the design of dental information systems. Methods: A semi-structured interview was conducted with a convenience sample of 18 general dentists in the Pittsburgh area during clinical hours. One hundred and five patient cases were reported by these dentists. Interview transcripts were coded and analyzed using thematic analysis with a constant comparative method to identify categories and themes regarding information needs and information source use patterns. Results: Two top-level categories of information needs were identified: foreground and background information needs. To meet these needs, dentists used four types of information sources: clinical information/tasks, administrative tasks, patient education and professional development. Major themes of dentists' unmet information needs include: (1) timely access to information on various subjects; (2) better visual representations of dental problems; (3) access to patient-specific evidence-based information; and (4) accurate, complete and consistent documentation of patient records. Resource use patterns include: (1) dentists' information needs matched information source use; (2) little use of electronic sources took place during treatment; (3) source use depended on the nature and complexity of the dental problems; and (4) dentists routinely practiced cross-referencing to verify patient information. Conclusions: Dentists have various information needs at the point of care. Among them, the needs

  9. Glutamatergic and GABAergic gene sets in attention-deficit/hyperactivity disorder

    DEFF Research Database (Denmark)

    Naaijen, J; Bralten, J; Poelmans, G

    2017-01-01

    Attention-deficit/hyperactivity disorder (ADHD) and autism spectrum disorders (ASD) often co-occur. Both are highly heritable; however, it has been difficult to discover genetic risk variants. Glutamate and GABA are main excitatory and inhibitory neurotransmitters in the brain; their balance...... within glutamatergic and GABAergic genes were investigated using the MAGMA software in an ADHD case-only sample (n=931), in which we assessed ASD symptoms and response inhibition on a Stop task. Gene set analysis for ADHD symptom severity, divided into inattention and hyperactivity/impulsivity symptoms......, autism symptom severity and inhibition were performed using principal component regression analyses. Subsequently, gene-wide association analyses were performed. The glutamate gene set showed an association with severity of hyperactivity/impulsivity (P=0.009), which was robust to correcting for genome...

  10. The Core Mouse Response to Infection by Neospora Caninum Defined by Gene Set Enrichment Analyses

    Science.gov (United States)

    Ellis, John; Goodswen, Stephen; Kennedy, Paul J; Bush, Stephen

    2012-01-01

    In this study, the BALB/c and Qs mouse responses to infection by the parasite Neospora caninum were investigated in order to identify host response mechanisms. Investigation was done using gene set (enrichment) analyses of microarray data. GSEA, MANOVA, Romer, subGSE and SAM-GS were used to study the contrasts Neospora strain type, Mouse type (BALB/c and Qs) and time post infection (6 hours post infection and 10 days post infection). The analyses show that the major signal in the core mouse response to infection is from time post infection and can be defined by gene ontology terms Protein Kinase Activity, Cell Proliferation and Transcription Initiation. Several terms linked to signaling, morphogenesis, response and fat metabolism were also identified. At 10 days post infection, genes associated with fatty acid metabolism were identified as up regulated in expression. The value of gene set (enrichment) analyses in the analysis of microarray data is discussed. PMID:23012496

  11. Intervene: a tool for intersection and visualization of multiple gene or genomic region sets.

    Science.gov (United States)

    Khan, Aziz; Mathelier, Anthony

    2017-05-31

    A common task for scientists relies on comparing lists of genes or genomic regions derived from high-throughput sequencing experiments. While several tools exist to intersect and visualize sets of genes, similar tools dedicated to the visualization of genomic region sets are currently limited. To address this gap, we have developed the Intervene tool, which provides an easy and automated interface for the effective intersection and visualization of genomic region or list sets, thus facilitating their analysis and interpretation. Intervene contains three modules: venn to generate Venn diagrams of up to six sets, upset to generate UpSet plots of multiple sets, and pairwise to compute and visualize intersections of multiple sets as clustered heat maps. Intervene, and its interactive web ShinyApp companion, generate publication-quality figures for the interpretation of genomic region and list sets. Intervene and its web application companion provide an easy command line and an interactive web interface to compute intersections of multiple genomic and list sets. They have the capacity to plot intersections using easy-to-interpret visual approaches. Intervene is developed and designed to meet the needs of both computer scientists and biologists. The source code is freely available at https://bitbucket.org/CBGR/intervene , with the web application available at https://asntech.shinyapps.io/intervene .
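
    The pairwise intersection step described above can be sketched with plain Python sets. This does not use or reproduce Intervene's own interface; the function name and the gene lists are hypothetical and serve only to show what a pairwise intersection table contains before it is drawn as a heat map.

```python
from itertools import combinations


def pairwise_intersections(named_sets):
    """Return {(name_a, name_b): shared items} for every unordered pair."""
    return {
        (a, b): named_sets[a] & named_sets[b]
        for a, b in combinations(sorted(named_sets), 2)
    }


# Hypothetical gene lists from three experiments.
gene_lists = {
    "exp_A": {"TP53", "BRCA1", "MYC"},
    "exp_B": {"BRCA1", "MYC", "EGFR"},
    "exp_C": {"EGFR"},
}

shared = pairwise_intersections(gene_lists)
for (a, b), genes in shared.items():
    print(a, b, len(genes), sorted(genes))
```

    The resulting counts form the symmetric matrix that a clustered heat map would display; for genomic regions rather than gene names, overlap would have to be computed on coordinates instead of by exact set membership.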

  12. Investigating the effect of paralogs on microarray gene-set analysis

    LENUS (Irish Health Repository)

    Faure, Andre J

    2011-01-24

    Background: In order to interpret the results obtained from a microarray experiment, researchers often shift focus from analysis of individual differentially expressed genes to analyses of sets of genes. These gene-set analysis (GSA) methods use previously accumulated biological knowledge to group genes into sets and then aim to rank these gene sets in a way that reflects their relative importance in the experimental situation in question. We suspect that the presence of paralogs affects the ability of GSA methods to accurately identify the most important sets of genes for subsequent research. Results: We show that paralogs, which typically have high sequence identity and similar molecular functions, also exhibit high correlation in their expression patterns. We investigate this correlation as a potential confounding factor common to current GSA methods using Indygene (http://www.cbio.uct.ac.za/indygene), a web tool that reduces a supplied list of genes so that it includes no pairwise paralogy relationships above a specified sequence similarity threshold. We use the tool to reanalyse previously published microarray datasets and determine the potential utility of accounting for the presence of paralogs. Conclusions: The Indygene tool efficiently removes paralogy relationships from a given dataset and we found that such a reduction, performed prior to GSA, has the ability to generate significantly different results that often represent novel and plausible biological hypotheses. This was demonstrated for three different GSA approaches when applied to the reanalysis of previously published microarray datasets and suggests that the redundancy and non-independence of paralogs is an important consideration when dealing with GSA methodologies.

  13. Defining diversity, specialization, and gene specificity in transcriptomes through information theory

    Science.gov (United States)

    Martínez, Octavio; Reyes-Valdés, M. Humberto

    2008-01-01

    The transcriptome is a set of genes transcribed in a given tissue under specific conditions and can be characterized by a list of genes with their corresponding frequencies of transcription. Transcriptome changes can be measured by counting gene tags from mRNA libraries or by measuring light signals in DNA microarrays. In any case, it is difficult to completely comprehend the global changes that occur in the transcriptome, given that thousands of gene expression measurements are involved. We propose an approach to define and estimate the diversity and specialization of transcriptomes and gene specificity. We define transcriptome diversity as the Shannon entropy of its frequency distribution. Gene specificity is defined as the mutual information between the tissues and the corresponding transcript, allowing detection of either housekeeping or highly specific genes and clarifying the meaning of these concepts in the literature. Tissue specialization is measured by average gene specificity. We introduce the formulae using a simple example and show their application in two datasets of gene expression in human tissues. Visualization of the positions of transcriptomes in a system of diversity and specialization coordinates makes it possible to understand at a glance their interrelations, summarizing in a powerful way which transcriptomes are richer in diversity of expressed genes, or which are relatively more specialized. The framework presented enlightens the relation among transcriptomes, allowing a better understanding of their changes through the development of the organism or in response to environmental stimuli. PMID:18606989
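
    The three quantities named in the abstract can be sketched as follows. This is one formulation consistent with the abstract's wording (diversity as Shannon entropy, gene specificity as an information measure across tissues, specialization as the average specificity weighted by expression); the exact formulae are in the paper, and the function names and toy counts here are assumptions.

```python
from math import log2


def relative_frequencies(counts):
    """counts[j][i] is the count of gene i in tissue j; rows are normalized."""
    return [[c / sum(row) for c in row] for row in counts]


def diversity(p_tissue):
    """Shannon entropy (bits) of one tissue's frequency distribution."""
    return -sum(p * log2(p) for p in p_tissue if p > 0)


def gene_specificity(p, i):
    """Average over tissues of (p_ij / mean_i) * log2(p_ij / mean_i)."""
    t = len(p)
    mean_p = sum(row[i] for row in p) / t
    if mean_p == 0:
        return 0.0
    total = 0.0
    for row in p:
        ratio = row[i] / mean_p
        if ratio > 0:
            total += ratio * log2(ratio)
    return total / t


def specialization(p, j):
    """Average gene specificity in tissue j, weighted by gene frequency."""
    return sum(p[j][i] * gene_specificity(p, i) for i in range(len(p[j])))


# Toy data (hypothetical): two tissues, two genes.
specific = relative_frequencies([[5, 0], [0, 5]])      # each gene in one tissue
housekeeping = relative_frequencies([[5, 5], [5, 5]])  # both genes everywhere

print(diversity(specific[0]), specialization(specific, 0))
print(diversity(housekeeping[0]), specialization(housekeeping, 0))
```

    Under these definitions a tissue expressing a single tissue-exclusive gene has zero diversity and maximal specialization (log2 of the number of tissues), while a tissue expressing only uniformly shared genes shows the opposite pattern.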

  14. Grouping miRNAs of similar functions via weighted information content of gene ontology.

    Science.gov (United States)

    Lan, Chaowang; Chen, Qingfeng; Li, Jinyan

    2016-12-22

    Regulation mechanisms between miRNAs and genes are complicated. To accomplish a biological function, a miRNA may regulate multiple target genes, and similarly a target gene may be regulated by multiple miRNAs. Wet-lab knowledge of co-regulating miRNAs is limited. This work introduces a computational method to group miRNAs of similar functions to identify co-regulating miRNAs from a similarity matrix of miRNAs. We define a novel information content of gene ontology (GO) to measure similarity between two sets of GO graphs corresponding to the two sets of target genes of two miRNAs. This between-graph similarity is then transferred as a functional similarity between the two miRNAs. Our definition of the information content is based on the size of a GO term's descendants, but adjusted by a weight derived from its depth level and the GO relationships at its path to the root node or to the most informative common ancestor (MICA). Further, a self-tuning technique and the eigenvalues of the normalized Laplacian matrix are applied to determine the optimal parameters for the spectral clustering of the similarity matrix of the miRNAs. Experimental results demonstrate that our method has better clustering performance than the existing edge-based, node-based or hybrid methods. Our method has also demonstrated a novel usefulness for the function annotation of new miRNAs, as reported in the detailed case studies.

  15. Minimum Information about a Biosynthetic Gene cluster : commentary

    NARCIS (Netherlands)

    Medema, Marnix H; Kottmann, Renzo; Yilmaz, Pelin; Cummings, Matthew; Biggins, John B; Blin, Kai; de Bruijn, Irene; Chooi, Yit Heng; Claesen, Jan; Coates, R Cameron; Cruz-Morales, Pablo; Duddela, Srikanth; Dusterhus, Stephanie; Edwards, Daniel J; Fewer, David P; Garg, Neha; Geiger, Christoph; Gomez-Escribano, Juan Pablo; Greule, Anja; Hadjithomas, Michalis; Haines, Anthony S; Helfrich, Eric J N; Hillwig, Matthew L; Ishida, Keishi; Jones, Adam C; Jones, Carla S; Jungmann, Katrin; Kegler, Carsten; Kim, Hyun Uk; Kotter, Peter; Krug, Daniel; Masschelein, Joleen; Melnik, Alexey V; Mantovani, Simone M; Monroe, Emily A; Moore, Marcus; Moss, Nathan; Nutzmann, Hans-Wilhelm; Pan, Guohui; Pati, Amrita; Petras, Daniel; Reen, F Jerry; Rosconi, Federico; Rui, Zhe; Tian, Zhenhua; Tobias, Nicholas J; Tsunematsu, Yuta; Wiemann, Philipp; Wyckoff, Elizabeth; Yan, Xiaohui; Yim, Grace; Yu, Fengan; Xie, Yunchang; Aigle, Bertrand; Apel, Alexander K; Balibar, Carl J; Balskus, Emily P; Barona-Gomez, Francisco; Bechthold, Andreas; Bode, Helge B; Borriss, Rainer; Brady, Sean F; Brakhage, Axel A; Caffrey, Patrick; Cheng, Yi-Qiang; Clardy, Jon; Cox, Russell J; De Mot, Rene; Donadio, Stefano; Donia, Mohamed S; van der Donk, Wilfred A; Dorrestein, Pieter C; Doyle, Sean; Driessen, Arnold J M; Ehling-Schulz, Monika; Entian, Karl-Dieter; Fischbach, Michael A; Gerwick, Lena; Gerwick, William H; Gross, Harald; Gust, Bertolt; Hertweck, Christian; Hofte, Monica; Jensen, Susan E; Ju, Jianhua; Katz, Leonard; Kaysser, Leonard; Klassen, Jonathan L; Keller, Nancy P; Kormanec, Jan; Kuipers, Oscar P; Kuzuyama, Tomohisa; Kyrpides, Nikos C; Kwon, Hyung-Jin; Lautru, Sylvie; Lavigne, Rob; Lee, Chia Y; Linquan, Bai; Liu, Xinyu; Liu, Wen; Luzhetskyy, Andriy; Mahmud, Taifo; Mast, Yvonne; Mendez, Carmen; Metsa-Ketela, Mikko; Micklefield, Jason; Mitchell, Douglas A; Moore, Bradley S; Moreira, Leonilde M; Muller, Rolf; Neilan, Brett A; Nett, Markus; Nielsen, Jens; O'Gara, Fergal; Oikawa, Hideaki; Osbourn, Anne; 
Osburne, Marcia S; Ostash, Bohdan; Payne, Shelley M; Pernodet, Jean-Luc; Petricek, Miroslav; Piel, Jorn; Ploux, Olivier; Raaijmakers, Jos M; Salas, Jose A; Schmitt, Esther K; Scott, Barry; Seipke, Ryan F; Shen, Ben; Sherman, David H; Sivonen, Kaarina; Smanski, Michael J; Sosio, Margherita; Stegmann, Evi; Sussmuth, Roderich D; Tahlan, Kapil; Thomas, Christopher M; Tang, Yi; Truman, Andrew W; Viaud, Muriel; Walton, Jonathan D; Walsh, Christopher T; Weber, Tilmann; van Wezel, Gilles P; Wilkinson, Barrie; Willey, Joanne M; Wohlleben, Wolfgang; Wright, Gerard D; Ziemert, Nadine; Zhang, Changsheng; Zotchev, Sergey B; Breitling, Rainer; Takano, Eriko; Glockner, Frank Oliver

    2015-01-01

    A wide variety of enzymatic pathways that produce specialized metabolites in bacteria, fungi and plants are known to be encoded in biosynthetic gene clusters. Information about these clusters, pathways and metabolites is currently dispersed throughout the literature, making it difficult to exploit.

  16. Accurate Gene Expression-Based Biodosimetry Using a Minimal Set of Human Gene Transcripts

    Energy Technology Data Exchange (ETDEWEB)

    Tucker, James D., E-mail: jtucker@biology.biosci.wayne.edu [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Joiner, Michael C. [Department of Radiation Oncology, Wayne State University, Detroit, Michigan (United States); Thomas, Robert A.; Grever, William E.; Bakhmutsky, Marina V. [Department of Biological Sciences, Wayne State University, Detroit, Michigan (United States); Chinkhota, Chantelle N.; Smolinski, Joseph M. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States); Divine, George W. [Department of Public Health Sciences, Henry Ford Hospital, Detroit, Michigan (United States); Auner, Gregory W. [Department of Electrical and Computer Engineering, Wayne State University, Detroit, Michigan (United States)

    2014-03-15

    Purpose: Rapid and reliable methods for conducting biological dosimetry are a necessity in the event of a large-scale nuclear event. Conventional biodosimetry methods lack the speed, portability, ease of use, and low cost required for triaging numerous victims. Here we address this need by showing that polymerase chain reaction (PCR) on a small number of gene transcripts can provide accurate and rapid dosimetry. The low cost and relative ease of PCR compared with existing dosimetry methods suggest that this approach may be useful in mass-casualty triage situations. Methods and Materials: Human peripheral blood from 60 adult donors was acutely exposed to cobalt-60 gamma rays at doses of 0 (control) to 10 Gy. mRNA expression levels of 121 selected genes were obtained 0.5, 1, and 2 days after exposure by reverse-transcriptase real-time PCR. Optimal dosimetry at each time point was obtained by stepwise regression of dose received against individual gene transcript expression levels. Results: Only 3 to 4 different gene transcripts, ASTN2, CDKN1A, GDF15, and ATM, are needed to explain ≥0.87 of the variance (R²). Receiver-operator characteristics, a measure of sensitivity and specificity, of 0.98 for these statistical models were achieved at each time point. Conclusions: The actual and predicted radiation doses agree very closely up to 6 Gy. Dosimetry at 8 and 10 Gy shows some effect of saturation, thereby slightly diminishing the ability to quantify higher exposures. Analyses of these gene transcripts may be advantageous for use in a field-portable device designed to assess exposures in mass casualty situations or in clinical radiation emergencies.

  17. Gene regulatory network inference using fused LASSO on multiple data sets.

    Science.gov (United States)

    Omranian, Nooshin; Eloundou-Mbebi, Jeanne M O; Mueller-Roeber, Bernd; Nikoloski, Zoran

    2016-02-11

    Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed into a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions.
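
    For orientation, a generic fused LASSO objective over K data sets is written out below. It is a textbook-style sketch, not the formulation from the paper: the symbols (target gene expression y, transcription factor expression matrix X, coefficient vectors beta, tuning parameters lambda_1 and lambda_2) are assumptions, and only constraints (1) and (2) from the abstract are represented.

```latex
\min_{\beta^{(1)},\dots,\beta^{(K)}}
  \sum_{k=1}^{K} \bigl\lVert y^{(k)} - X^{(k)} \beta^{(k)} \bigr\rVert_2^2
  \;+\; \lambda_1 \sum_{k=1}^{K} \bigl\lVert \beta^{(k)} \bigr\rVert_1
  \;+\; \lambda_2 \sum_{k < k'} \bigl\lVert \beta^{(k)} - \beta^{(k')} \bigr\rVert_1
```

    The first penalty keeps the number of regulators per gene small, and the second (the fusion term) pulls the networks inferred from different data sets toward each other. How constraint (3) enters the objective is not stated in the abstract and is left out here.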

  18. Primer Sets Developed for Functional Genes Reveal Shifts in Functionality of Fungal Community in Soils

    Science.gov (United States)

    Hannula, S. Emilia; van Veen, Johannes A.

    2016-01-01

    Phylogenetic diversity of soil microbes is a hot topic at the moment. However, the molecular tools for the assessment of functional diversity in the fungal community are less developed than tools based on genes encoding the ribosomal operon. Here 20 sets of primers targeting genes involved mainly in carbon cycling were designed and/or validated and the functioning of soil fungal communities along a chronosequence of land abandonment from agriculture was evaluated using them. We hypothesized that changes in fungal community structure during secondary succession would lead to difference in the types of genes present in soils and that these changes would be directional. We expected an increase in genes involved in degradation of recalcitrant organic matter in time since agriculture. Out of the investigated genes, the richness of the genes related to carbon cycling was significantly higher in fields abandoned for longer time. The composition of six of the genes analyzed revealed significant differences between fields abandoned for shorter and longer time. However, all genes revealed significant variance over the fields studied, and this could be related to other parameters than the time since agriculture such as pH, organic matter, and the amount of available nitrogen. Contrary to our initial hypothesis, the genes significantly different between fields were not related to the decomposition of more recalcitrant matter but rather involved in degradation of cellulose and hemicellulose. PMID:27965632

  19. Primer Sets Developed for Functional Genes Reveal Shifts in Functionality of Fungal Community in Soils.

    Science.gov (United States)

    Hannula, S Emilia; van Veen, Johannes A

    2016-01-01

    Phylogenetic diversity of soil microbes is a hot topic at the moment. However, the molecular tools for the assessment of functional diversity in the fungal community are less developed than tools based on genes encoding the ribosomal operon. Here 20 sets of primers targeting genes involved mainly in carbon cycling were designed and/or validated and the functioning of soil fungal communities along a chronosequence of land abandonment from agriculture was evaluated using them. We hypothesized that changes in fungal community structure during secondary succession would lead to difference in the types of genes present in soils and that these changes would be directional. We expected an increase in genes involved in degradation of recalcitrant organic matter in time since agriculture. Out of the investigated genes, the richness of the genes related to carbon cycling was significantly higher in fields abandoned for longer time. The composition of six of the genes analyzed revealed significant differences between fields abandoned for shorter and longer time. However, all genes revealed significant variance over the fields studied, and this could be related to other parameters than the time since agriculture such as pH, organic matter, and the amount of available nitrogen. Contrary to our initial hypothesis, the genes significantly different between fields were not related to the decomposition of more recalcitrant matter but rather involved in degradation of cellulose and hemicellulose.

  20. Primer sets developed for fungal functional genes reveal shifts in functionality of fungal community in soils

    Directory of Open Access Journals (Sweden)

    Emilia Silja Hannula

    2016-11-01

    Phylogenetic diversity of soil microbes is a hot topic at the moment. However, the molecular tools for the assessment of functional diversity in the fungal community are less developed than tools based on genes encoding the ribosomal operon. Here 20 sets of primers targeting genes involved mainly in carbon cycling were designed and/or validated and the functioning of soil fungal communities along a chronosequence of land abandonment from agriculture was evaluated using them. We hypothesized that changes in fungal community structure during secondary succession would lead to difference in the types of genes present in soils and that these changes would be directional. We expected an increase in genes involved in degradation of recalcitrant organic matter in time since agriculture. Out of the investigated genes, the richness of the genes related to carbon cycling was significantly higher in fields abandoned for longer time. The composition of six of the genes analyzed revealed significant differences between fields abandoned for shorter and longer time. However, all genes revealed significant variance over the fields studied, and this could be related to other parameters than the time since agriculture such as pH, organic matter and the amount of available nitrogen. Contrary to our initial hypothesis, the genes significantly different between fields were not related to the decomposition of more recalcitrant matter but rather involved in degradation of cellulose and hemicellulose.

  1. Health information technology and quality of health care: strategies for reducing disparities in underresourced settings.

    Science.gov (United States)

    Millery, Mari; Kukafka, Rita

    2010-10-01

    Health information technology (health IT) has potential for facilitating quality improvement and reducing quality disparities found in underresourced settings (URSs). With this systematic literature review, complemented by key informant interviews, the authors sought to identify evidence regarding health IT and quality outcomes in URSs. The review included 105 peer-reviewed studies (2004-2009) in all settings. Only 15 studies included URSs, and 8 focused on URSs. Based on literature across settings, most evidence was available for quality impact of order entry, clinical decision support systems, and computerized reminders. Study designs were predominantly quasi-experimental (37%) or descriptive (35%); 90% of the studies focused on the microsystem level of quality improvement, indicating a need for expanding research into patient experience and organizational and environmental levels. Key informants highlighted organizational partnerships and health IT champions and emphasized that for health IT to have an impact on quality, there must be an organizational culture of quality improvement.

  2. Developing the Role of a Health Information Professional in a Clinical Research Setting

    Directory of Open Access Journals (Sweden)

    Helen M. Seeley

    2010-06-01

    Objective ‐ This paper examines the role of a health information professional in a large multidisciplinary project to improve services for head injury. Methods ‐ An action research approach was taken, with the information professional acting as co‐ordinator. Change management processes were guided by theory and evidence. The health information professional was responsible for an ongoing literature review on knowledge management (clinical and political issues), data collection and analysis (from patient records), collating and comparing data (to help develop standards), and devising appropriate dissemination strategies. Results ‐ Important elements of the health information management role proved to be (1) co‐ordination; (2) setting up mechanisms for collaborative learning through information sharing; and (3) using the theoretical frameworks (identified from the literature review) to help guide implementation. The role that emerged here has some similarities to the informationist role that stresses domain knowledge, continuous learning and working in context (embedding). This project also emphasised the importance of co‐ordination, and the ability to work across traditional library information analysis (research literature discovery and appraisal) and information analysis of patient data sets (the information management role). Conclusion ‐ Experience with this project indicates that health information professionals will need to be prepared to work with patient record data and synthesis of that data, design systems to co‐ordinate patient data collection, as well as critically appraise external evidence.

  3. Correlating Information Contents of Gene Ontology Terms to Infer Semantic Similarity of Gene Products

    Directory of Open Access Journals (Sweden)

    Mingxin Gan

    2014-01-01

    Full Text Available Successful applications of the gene ontology to the inference of functional relationships between gene products in recent years have raised the need for computational methods to automatically calculate semantic similarity between gene products based on semantic similarity of gene ontology terms. Nevertheless, existing methods, though having been widely used in a variety of applications, may significantly overestimate semantic similarity between genes that are actually not functionally related, thereby yielding misleading results in applications. To overcome this limitation, we propose to represent a gene product as a vector that is composed of information contents of gene ontology terms annotated for the gene product, and we suggest calculating similarity between two gene products as the relatedness of their corresponding vectors using three measures: Pearson’s correlation coefficient, cosine similarity, and the Jaccard index. We focus on the biological process domain of the gene ontology and annotations of yeast proteins to study the effectiveness of the proposed measures. Results show that semantic similarity scores calculated using the proposed measures are more consistent with known biological knowledge than those derived using a list of existing methods, suggesting the effectiveness of our method in characterizing functional relationships between gene products.
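
    The vector representation described in this abstract can be illustrated with a minimal sketch. It assumes information content is estimated as the negative log of a term's annotation frequency, and it uses a weighted (min/max) form of the Jaccard index for real-valued vectors; the abstract does not give the exact formulas, so both choices are assumptions. The term labels T1 to T3 and their counts are hypothetical placeholders, not real gene ontology data.

```python
import math

def information_content(term_counts, total):
    # Assumed definition: IC(t) = -log(annotation frequency of t).
    return {t: -math.log(c / total) for t, c in term_counts.items()}

def ic_vector(annotated_terms, ic, vocabulary):
    # One entry per term: its IC if the gene product is annotated with it, else 0.
    return [ic[t] if t in annotated_terms else 0.0 for t in vocabulary]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv) if su and sv else 0.0

def weighted_jaccard(u, v):
    # Assumed generalisation of the Jaccard index to non-negative vectors.
    den = sum(max(a, b) for a, b in zip(u, v))
    return sum(min(a, b) for a, b in zip(u, v)) / den if den else 0.0

# Hypothetical annotation counts for three terms out of 100 annotations.
term_counts = {"T1": 50, "T2": 5, "T3": 10}
ic = information_content(term_counts, total=100)
vocab = sorted(term_counts)
g1 = ic_vector({"T1", "T2"}, ic, vocab)
g2 = ic_vector({"T1", "T2"}, ic, vocab)
g3 = ic_vector({"T3"}, ic, vocab)
```

    In this toy setting, g1 and g2 share all annotations and score as maximally similar under each measure, while g1 and g3 share none and get a cosine and weighted Jaccard score of zero.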

  4. Correlating information contents of gene ontology terms to infer semantic similarity of gene products.

    Science.gov (United States)

    Gan, Mingxin

    2014-01-01

    Successful applications of the gene ontology to the inference of functional relationships between gene products in recent years have raised the need for computational methods to automatically calculate semantic similarity between gene products based on semantic similarity of gene ontology terms. Nevertheless, existing methods, though having been widely used in a variety of applications, may significantly overestimate semantic similarity between genes that are actually not functionally related, thereby yielding misleading results in applications. To overcome this limitation, we propose to represent a gene product as a vector that is composed of information contents of gene ontology terms annotated for the gene product, and we suggest calculating similarity between two gene products as the relatedness of their corresponding vectors using three measures: Pearson's correlation coefficient, cosine similarity, and the Jaccard index. We focus on the biological process domain of the gene ontology and annotations of yeast proteins to study the effectiveness of the proposed measures. Results show that semantic similarity scores calculated using the proposed measures are more consistent with known biological knowledge than those derived using a list of existing methods, suggesting the effectiveness of our method in characterizing functional relationships between gene products.

  5. Using Mutual Information Criterion to Design an Efficient Phoneme Set for Chinese Speech Recognition

    Science.gov (United States)

    Zhang, Jin-Song; Hu, Xin-Hui; Nakamura, Satoshi

    Chinese is a representative tonal language, and how to process tone information in state-of-the-art large-vocabulary speech recognition systems has been an attractive topic. This paper presents a novel way to derive an efficient phoneme set of tone-dependent units to build a recognition system, by iteratively merging a pair of tone-dependent units according to the principle of minimal loss of the Mutual Information (MI). The mutual information is measured between the word tokens and their phoneme transcriptions in a training text corpus, based on the system lexical and language model. The approach can keep discriminative tonal (and phoneme) contrasts that are most helpful for disambiguating homophone words due to lack of tones, and merge those tonal (and phoneme) contrasts that are not important for word disambiguation for the recognition task. This enables a flexible selection of phoneme set according to a balance between the MI information amount and the number of phonemes. We applied the method to the traditional phoneme set of Initial/Finals, and derived several phoneme sets with different numbers of units. Speech recognition experiments using the derived sets showed its effectiveness.

  6. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    Directory of Open Access Journals (Sweden)

    Hettne Kristina M

    2013-01-01

    Full Text Available Abstract Background: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with gene set analysis (GSA) methods for chemical treatment identification, for pharmacological mechanism elucidation, and for comparing compound toxicity profiles. Methods: We created 30,211 chemical response-specific gene sets for human and mouse by next-gen TM, and derived 1,189 (human) and 588 (mouse) gene sets from the Comparative Toxicogenomics Database (CTD). We tested for significant differential expression (SDE) (false discovery rate-corrected p-values Results: Next-gen TM-derived gene sets matching the chemical treatment were significantly altered in three GE data sets, and the corresponding CTD-derived gene sets were significantly altered in five GE data sets. Six next-gen TM-derived and four CTD-derived fibrate gene sets were significantly altered in the PPARA knock-out GE dataset. None of the fibrate signatures in cMap scored significant against the PPARA GE signature. 33 environmental toxicant gene sets were significantly altered in the triazole GE data sets. 21 of these toxicants had a similar toxicity pattern as the triazoles. We confirmed embryotoxic effects, and discriminated triazoles from other chemicals. Conclusions: Gene set analysis with next-gen TM-derived chemical response-specific gene sets is a scalable method for identifying similarities in gene responses to other chemicals, from which one may infer potential mode of action and/or toxic effect.

  7. Evaluation of a gene information summarization system by users during the analysis process of microarray datasets

    Directory of Open Access Journals (Sweden)

    Cohen Aaron

    2009-02-01

    Full Text Available Abstract Background: Summarization of gene information in the literature has the potential to help genomics researchers translate basic research into clinical benefits. Gene expression microarrays have been used to study biomarkers for disease and discover novel types of therapeutics, and the task of finding information in journal articles on sets of genes is common for translational researchers working with microarray data. However, manually searching and scanning the literature references returned from PubMed is a time-consuming task for scientists. We built and evaluated an automatic summarizer of information on genes studied in microarray experiments. The Gene Information Clustering and Summarization System (GICSS) is a system that integrates two related steps of the microarray data analysis process: functional gene clustering and gene information gathering. The system evaluation was conducted during the process of genomic researchers analyzing their own experimental microarray datasets. Results: The clusters generated by GICSS were validated by scientists during their microarray analysis process. In addition, presenting sentences in the abstract provided significantly more important information to the users than just showing the title in the default PubMed format. Conclusion: The evaluation results suggest that GICSS can be useful for researchers in the genomics area. In addition, the hybrid evaluation method, partway between intrinsic and extrinsic system evaluation, may enable researchers to gauge the true usefulness of the tool for the scientists in their natural analysis workflow and also elicit suggestions for future enhancements. Availability: GICSS can be accessed online at: http://ir.ohsu.edu/jianji/index.html

  8. The identification of informative genes from multiple datasets with increasing complexity

    Directory of Open Access Journals (Sweden)

    't Hoen Peter AC

    2010-01-01

    Full Text Available Abstract Background: In microarray data analysis, factors such as data quality, biological variation, and the increasingly multi-layered nature of more complex biological systems complicate the modelling of regulatory networks that can represent and capture the interactions among genes. We believe that the use of multiple datasets derived from related biological systems leads to more robust models. Therefore, we developed a novel framework for modelling regulatory networks that involves training and evaluation on independent datasets. Our approach includes the following steps: (1) ordering the datasets based on their level of noise and informativeness; (2) selection of a Bayesian classifier with an appropriate level of complexity by evaluation of predictive performance on independent data sets; (3) comparing the different gene selections and the influence of increasing the model complexity; (4) functional analysis of the informative genes. Results: In this paper, we identify the most appropriate model complexity using cross-validation and independent test set validation for predicting gene expression in three published datasets related to myogenesis and muscle differentiation. Furthermore, we demonstrate that models trained on simpler datasets can be used to identify interactions among genes and select the most informative. We also show that these models can explain the myogenesis-related genes (genes of interest) significantly better than others (P et al. in identifying informative genes from multiple datasets with increasing complexity whilst additionally modelling the interaction between genes. Conclusions: We show that Bayesian networks derived from simpler controlled systems have better performance than those trained on datasets from more complex biological systems. Further, we show that highly predictive and consistent genes, from the pool of differentially expressed genes, across independent datasets are more likely to be fundamentally

  9. A small set of extra-embryonic genes defines a new landmark for bovine embryo staging.

    Science.gov (United States)

    Degrelle, Séverine A; Lê Cao, Kim-Anh; Heyman, Yvan; Everts, Robin E; Campion, Evelyne; Richard, Christophe; Ducroix-Crépy, Céline; Tian, X Cindy; Lewin, Harris A; Renard, Jean-Paul; Robert-Granié, Christèle; Hue, Isabelle

    2011-01-01

    Axis specification in mouse is determined by a sequence of reciprocal interactions between embryonic and extra-embryonic tissues so that a few extra-embryonic genes appear as 'patterning' the embryo. Considering these interactions as essential, but lacking in most mammals the genetically driven approaches used in mouse and the corresponding patterning mutants, we examined whether a molecular signature originating from extra-embryonic tissues could relate to the developmental stage of the embryo proper and predict it. To this end, we have profiled bovine extra-embryonic tissues at peri-implantation stages, when gastrulation and early neurulation occur, and analysed the subsequent expression profiles through the use of predictive methods as previously reported for tumour classification. A set of six genes (CALM1, CPA3, CITED1, DLD, HNRNPDL, and TGFB3), half of which had not been previously associated with any extra-embryonic feature, appeared significantly discriminative and mainly dependent on embryonic tissues for its faithful expression. The predictive value of this set of genes for gastrulation and early neurulation stages, as assessed on naive samples, was remarkably high (93%). In silico connected to the bovine orthologues of the mouse patterning genes, this gene set is proposed as a new trait for embryo staging. As such, this will allow saving the bovine embryo proper for molecular or cellular studies. To us, it offers as well new perspectives for developmental phenotyping and modelling of embryonic/extra-embryonic co-differentiation.

  10. Transcriptomic sequencing reveals a set of unique genes activated by butyrate-induced histone modification

    Science.gov (United States)

    Butyrate is a nutritional element with strong epigenetic regulatory activity as an inhibitor of histone deacetylases (HDACs). Based on the analysis of differentially expressed genes induced by butyrate in the bovine epithelial cell using deep RNA-sequencing technology (RNA-seq), a set of unique gen...

  11. Genes are information, so information theory is coming to the aid of evolutionary biology.

    Science.gov (United States)

    Sherwin, William B

    2015-11-01

    Speciation is central to evolutionary biology, and to elucidate it, we need to catch the early genetic changes that set nascent taxa on their path to species status (Via 2009). That challenge is difficult, of course, for two chief reasons: (i) serendipity is required to catch speciation in the act; and (ii) after a short time span with lingering gene flow, differentiation may be low and/or embodied only in rare alleles that are difficult to sample. In this issue of Molecular Ecology Resources, Smouse et al. (2015) have noted that optimal assessment of differentiation within and between nascent species should be robust to these challenges, and they identified a measure based on Shannon's information theory that has many advantages for this and numerous other tasks. The Shannon measure exhibits complete additivity of information at different levels of subdivision. Of all the family of diversity measures ('0' or allele counts, '1' or Shannon, '2' or heterozygosity, F(ST) and related metrics) Shannon's measure comes closest to weighting alleles by their frequencies. For the Shannon measure, rare alleles that represent early signals of nascent speciation are neither down-weighted to the point of irrelevance, as for level 2 measures, nor up-weighted to overpowering importance, as for level 0 measures (Chao et al. 2010, 2015). Shannon measures have a long history in population genetics, dating back to Shannon's PhD thesis in 1940 (Crow 2001), but have received only sporadic attention, until a resurgence of interest in the last ten years, as reviewed briefly by Smouse et al. (2015). © 2015 John Wiley & Sons Ltd.
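
    The contrast between the level 0, 1 and 2 measures can be made concrete with a small sketch. It assumes the textbook definitions (allele count, Shannon entropy with natural logarithm, and expected heterozygosity as one minus the sum of squared frequencies); the allele frequencies are invented for illustration and are not taken from the article.

```python
import math

def richness(freqs):
    # Level 0: number of alleles present, regardless of frequency.
    return sum(1 for p in freqs if p > 0)

def shannon(freqs):
    # Level 1: Shannon entropy, -sum(p * ln p).
    return -sum(p * math.log(p) for p in freqs if p > 0)

def heterozygosity(freqs):
    # Level 2: expected heterozygosity, 1 - sum(p^2).
    return 1.0 - sum(p * p for p in freqs)

def relative_change(measure, before, after):
    return (measure(after) - measure(before)) / measure(before)

# Hypothetical population: two common alleles, then a rare third allele appears.
before = [0.5, 0.5]
after = [0.495, 0.495, 0.01]

changes = {
    m.__name__: relative_change(m, before, after)
    for m in (richness, shannon, heterozygosity)
}
```

    With these numbers the rare allele raises the allele count by half, barely moves heterozygosity, and shifts the Shannon measure by an intermediate amount, which matches the weighting behaviour the abstract describes.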

  12. Next-generation text-mining mediated generation of chemical response-specific gene sets for interpretation of gene expression data

    NARCIS (Netherlands)

    K.M. Hettne (Kristina); J. Boorsma (Jeffrey); D.A.M. van Dartel (Dorien A M); J.J. Goeman (Jelle); E.C. de Jong (Esther); A.H. Piersma (Aldert); R.H. Stierum (Rob); J. Kleinjans (Jos); J.A. Kors (Jan)

    2013-01-01

    textabstractBackground: Availability of chemical response-specific lists of genes (gene sets) for pharmacological and/or toxic effect prediction for compounds is limited. We hypothesize that more gene sets can be created by next-generation text mining (next-gen TM), and that these can be used with

  14. General approach for in vivo recovery of cell type-specific effector gene sets.

    Science.gov (United States)

    Barsi, Julius C; Tu, Qiang; Davidson, Eric H

    2014-05-01

    Differentially expressed, cell type-specific effector gene sets hold the key to multiple important problems in biology, from theoretical aspects of developmental gene regulatory networks (GRNs) to various practical applications. Although individual cell types of interest have been recovered by various methods and analyzed, systematic recovery of multiple cell type-specific gene sets from whole developing organisms has remained problematic. Here we describe a general methodology using the sea urchin embryo, a material of choice because of the large-scale GRNs already solved for this model system. This method utilizes the regulatory states expressed by given cells of the embryo to define cell type and includes a fluorescence activated cell sorting (FACS) procedure that results in no perturbation of transcript representation. We have extensively validated the method by spatial and qualitative analyses of the transcriptome expressed in isolated embryonic skeletogenic cells and as a consequence, generated a prototypical cell type-specific transcriptome database.

  15. Gowinda: unbiased analysis of gene set enrichment for genome-wide association studies.

    Science.gov (United States)

    Kofler, Robert; Schlötterer, Christian

    2012-08-01

    An analysis of gene set [e.g. Gene Ontology (GO)] enrichment assumes that all genes are sampled independently from each other with the same probability. These assumptions are violated in genome-wide association (GWA) studies since (i) longer genes typically have more single-nucleotide polymorphisms resulting in a higher probability of being sampled and (ii) overlapping genes are sampled in clusters. Herein, we introduce Gowinda, a software specifically designed to test for enrichment of gene sets in GWA studies. We show that GO tests on GWA data could result in a substantial number of false-positive GO terms. Permutation tests implemented in Gowinda eliminate these biases, but maintain sufficient power to detect enrichment of GO terms. Since sufficient resolution for large datasets requires millions of permutations, we use multi-threading to keep computation times reasonable. Gowinda is implemented in Java (v1.6) and freely available at http://code.google.com/p/gowinda/. Contact: christian.schloetterer@vetmeduni.ac.at. Manual: http://code.google.com/p/gowinda/wiki/Manual. Test data and tutorial: http://code.google.com/p/gowinda/wiki/Tutorial. Validation: http://code.google.com/p/gowinda/wiki/VALIDATION.
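
    The idea of permuting at the SNP level rather than the gene level can be sketched as follows. This is a simplified illustration written for this listing, not Gowinda's actual Java implementation: the function names, the toy genome and the empirical p-value formula are all assumptions. Random SNP sets of the same size as the candidate set are drawn, mapped to genes, and the overlap with the gene set is compared with the observed overlap, so that long genes are hit as often in the null distribution as they are in real data.

```python
import random

def genes_hit(snps, snp_to_genes):
    # A SNP may fall in several overlapping genes, so each maps to a set.
    hit = set()
    for s in snps:
        hit |= snp_to_genes.get(s, set())
    return hit

def permutation_pvalue(candidate_snps, all_snps, snp_to_genes, gene_set,
                       n_perm=1000, seed=0):
    rng = random.Random(seed)
    observed = len(genes_hit(candidate_snps, snp_to_genes) & gene_set)
    at_least = 0
    for _ in range(n_perm):
        # Sample SNPs, not genes: long genes are drawn more often.
        sample = rng.sample(all_snps, len(candidate_snps))
        if len(genes_hit(sample, snp_to_genes) & gene_set) >= observed:
            at_least += 1
    # Empirical p-value with the usual +1 correction.
    return observed, (at_least + 1) / (n_perm + 1)

# Hypothetical genome: one long gene with 90 SNPs, ten short genes with 1 SNP.
snp_to_genes = {f"long_{i}": {"LONG"} for i in range(90)}
snp_to_genes.update({f"short_{i}": {f"SHORT{i}"} for i in range(10)})
all_snps = sorted(snp_to_genes)

# Five candidate SNPs that all fall in the long gene.
candidates = [f"long_{i}" for i in range(5)]
observed, p_value = permutation_pvalue(candidates, all_snps, snp_to_genes, {"LONG"})
```

    In this toy case the long gene is hit by almost every random SNP sample, so finding it among the candidates is unsurprising and the p-value is close to 1, whereas a test that treated all eleven genes as equally likely would overstate the enrichment.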

  16. Comprehensive set of integrative plasmid vectors for copper-inducible gene expression in Myxococcus xanthus.

    Science.gov (United States)

    Gómez-Santos, Nuria; Treuner-Lange, Anke; Moraleda-Muñoz, Aurelio; García-Bravo, Elena; García-Hernández, Raquel; Martínez-Cayuela, Marina; Pérez, Juana; Søgaard-Andersen, Lotte; Muñoz-Dorado, José

    2012-04-01

    Myxococcus xanthus is widely used as a model system for studying gliding motility, multicellular development, and cellular differentiation. Moreover, M. xanthus is a rich source of novel secondary metabolites. The analysis of these processes has been hampered by the limited set of tools for inducible gene expression. Here we report the construction of a set of plasmid vectors to allow copper-inducible gene expression in M. xanthus. Analysis of the effect of copper on strain DK1622 revealed that copper concentrations of up to 500 μM during growth and 60 μM during development do not affect physiological processes such as cell viability, motility, or aggregation into fruiting bodies. Of the copper-responsive promoters in M. xanthus reported so far, the multicopper oxidase cuoA promoter was used to construct expression vectors, because no basal expression is observed in the absence of copper and induction linearly depends on the copper concentration in the culture medium. Four different plasmid vectors have been constructed, with different marker selection genes and sites of integration in the M. xanthus chromosome. The vectors have been tested and gene expression quantified using the lacZ gene. Moreover, we demonstrate the functional complementation of the motility defect caused by lack of PilB by the copper-induced expression of the pilB gene. These versatile vectors are likely to deepen our understanding of the biology of M. xanthus and may also have biotechnological applications.

  17. Identification of a set of genes showing regionally enriched expression in the mouse brain

    Directory of Open Access Journals (Sweden)

    Marra Marco A

    2008-07-01

    Full Text Available Abstract Background: The Pleiades Promoter Project aims to improve gene therapy by designing human mini-promoters ( Results: We have utilized LongSAGE to identify regionally enriched transcripts in the adult mouse brain. As supplemental strategies, we also performed a meta-analysis of published literature and inspected the Allen Brain Atlas in situ hybridization data. From a set of approximately 30,000 mouse genes, 237 were identified as showing specific or enriched expression in 30 target regions of the mouse brain. GO term over-representation among these genes revealed co-involvement in various aspects of central nervous system development and physiology. Conclusion: Using a multi-faceted expression validation approach, we have identified mouse genes whose human orthologs are good candidates for design of mini-promoters. These mouse genes represent molecular markers in several discrete brain regions/cell-types, which could potentially provide a mechanistic explanation of unique functions performed by each region. This set of markers may also serve as a resource for further studies of gene regulatory elements influencing brain expression.

  18. Can survival prediction be improved by merging gene expression data sets?

    Directory of Open Access Journals (Sweden)

    Haleh Yasrebi

    Full Text Available BACKGROUND: High-throughput gene expression profiling technologies, generating a wealth of data, are increasingly used for characterization of tumor biopsies for clinical trials. By applying machine learning algorithms to such clinically documented data sets, one hopes to improve tumor diagnosis, prognosis, as well as prediction of treatment response. However, the limited number of patients enrolled in a single trial study limits the power of machine learning approaches due to over-fitting. One could partially overcome this limitation by merging data from different studies. Nevertheless, such data sets differ from each other with regard to technical biases, patient selection criteria and follow-up treatment. It is therefore not clear at all whether the advantage of increased sample size outweighs the disadvantage of higher heterogeneity of merged data sets. Here, we present a systematic study to answer this question specifically for breast cancer data sets. We use survival prediction based on Cox regression as an assay to measure the added value of merged data sets. RESULTS: Using time-dependent Receiver Operating Characteristic-Area Under the Curve (ROC-AUC) and hazard ratio as performance measures, we see overall no significant improvement or deterioration of survival prediction with merged data sets as compared to individual data sets. This apparently was due to the fact that a few genes with strong prognostic power were not available on all microarray platforms and thus were not retained in the merged data sets. Surprisingly, we found that the overall best performance was achieved with a single-gene predictor consisting of CYB5D1. CONCLUSIONS: Merging did not deteriorate performance on average despite (a) the diversity of microarray platforms used; (b) the heterogeneity of patient cohorts; (c) the heterogeneity of breast cancer disease; (d) substantial variation of time to death or relapse; (e) the reduced number of genes in the merged data

  19. Large-scale modeling of condition-specific gene regulatory networks by information integration and inference.

    Science.gov (United States)

    Ellwanger, Daniel Christian; Leonhardt, Jörn Florian; Mewes, Hans-Werner

    2014-12-01

    Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional as these are reflecting the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE (http://mips.helmholtz-muenchen.de/cogere), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources to a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η(2) (eta squared) that is derived by a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research.
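
    The η² score mentioned in this abstract can be illustrated in simplified form. The abstract derives it from a two-way analysis of variance; the sketch below uses the one-way version (the correlation ratio, between-group sum of squares divided by total sum of squares) purely as an assumption to keep the example short. The grouping of target expression by binned regulator level and all values are hypothetical.

```python
def eta_squared(groups):
    # One-way correlation ratio: SS_between / SS_total.
    # Each group holds target-gene expression values observed at one
    # (binned) level of the regulator; groups are assumed non-empty.
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total if ss_total else 0.0

# Hypothetical target expression at low, medium and high regulator levels.
# The dependency is strong but non-monotonic, so a linear correlation
# with the regulator level would miss it.
nonlinear = eta_squared([[1.0, 1.0], [5.0, 5.0], [1.0, 1.0]])

# No dependency: every regulator level shows the same spread of values.
independent = eta_squared([[1.0, 3.0], [1.0, 3.0]])
```

    Because the score only asks how much of the target's variance is explained by the regulator's level, it captures the peaked relationship in the first case, which is the sense in which the abstract calls the coefficient non-parametric and nonlinear.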

  20. Comparing nursing handover and documentation: forming one set of patient information.

    Science.gov (United States)

    Johnson, M; Sanchez, P; Suominen, H; Basilakis, J; Dawson, L; Kelly, B; Hanlen, L

    2014-03-01

    The aim of this study was to explore the potential for one set of patient information for nursing handover and documentation. Communication of patient information requires two processes in nursing: a verbal summary of the patients' care and another report within the nursing notes, creating duplication. Advances in speech recognition technology have provided an opportunity to consider the practicality of one set of information at the nursing end-of-shift. We used content analysis to compare transcripts from 162 digitally recorded handovers and written nursing notes for similar patients within general medical-surgical wards from two metropolitan hospitals in Sydney, Australia. Using the Nursing Handover Minimum Dataset analysis framework, similar content [n = 2109 (handover), n = 1902 (nursing notes)] was found within the handovers and notes at the end-of-shift (7:00 am and 2:00 pm). Analysis of the overarching categories demonstrated the emphasis within the differing data sources as: patient identification (31%), care planning or interventions (25%), clinical history (13%), and clinical status (13%) for handover, vs. care planning (47%), clinical status (24%), and outcomes or goals of care (12%) for nursing notes. This study has demonstrated that similar patient information is presented at handover and within documentation. Major categories are consistent with international nursing minimum datasets in use. We can use one set of patient information (within some limitations) for two purposes with system design, practice change and education. Experiments are currently being conducted trialling speech recognition within laboratory and clinical settings. One set of patient information, verbally generated at handover delivering electronic documentation within one process, will transform international nursing policy for nursing handover and documentation. © 2013 International Council of Nurses.

  1. ABAEnrichment: an R package to test for gene set expression enrichment in the adult and developing human brain.

    Science.gov (United States)

    Grote, Steffi; Prüfer, Kay; Kelso, Janet; Dannemann, Michael

    2016-10-15

    We present ABAEnrichment, an R package that tests for expression enrichment in specific brain regions at different developmental stages using expression information gathered from multiple regions of the adult and developing human brain, together with ontologically organized structural information about the brain, both provided by the Allen Brain Atlas. We validate ABAEnrichment by successfully recovering the origin of gene sets identified in specific brain cell-types and developmental stages. ABAEnrichment was implemented as an R package and is available under GPL (≥ 2) from the Bioconductor website (http://bioconductor.org/packages/3.3/bioc/html/ABAEnrichment.html). Contact: steffi_grote@eva.mpg.de, kelso@eva.mpg.de or michael_dannemann@eva.mpg.de. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  2. The Impact of Ranking Information on Students’ Behavior and Performance in Peer Review Settings

    DEFF Research Database (Denmark)

    Papadopoulos, Pantelis M.; Lagkas, Thomas D.; Demetriadis, Stavros N.

    2015-01-01

    The paper explores the potential of usage and ranking information in increasing student engagement in a double-blinded peer review setting, where students are allowed to select freely which/how many peer works to review. The study employed 56 volunteering sophomore students majoring in Informatic...

  3. Recruiting Science Majors into Secondary Science Teaching: Paid Internships in Informal Science Settings

    Science.gov (United States)

    Worsham, Heather M.; Friedrichsen, Patricia; Soucie, Marilyn; Barnett, Ellen; Akiba, Motoko

    2014-01-01

    Despite the importance of recruiting highly qualified individuals into the science teaching profession, little is known about the effectiveness of particular recruitment strategies. Over 3 years, 34 college science majors and undecided students were recruited into paid internships in informal science settings to consider secondary science teaching…

  4. Stereoscopy in Static Scientific Imagery in an Informal Education Setting: Does It Matter?

    Science.gov (United States)

    Price, C. Aaron; Lee, H.-S.; Malatesta, K.

    2014-01-01

    Stereoscopic technology (3D) is rapidly becoming ubiquitous across research, entertainment and informal educational settings. Children of today may grow up never knowing a time when movies, television and video games were not available stereoscopically. Despite this rapid expansion, the field's understanding of the impact of stereoscopic…

  6. Setting up Information Literacy Workshops in School Libraries: Imperatives, Principles and Methods

    Directory of Open Access Journals (Sweden)

    Reza Mokhtarpour

    2010-09-01

    Full Text Available While much of the professional literature has discussed at length the importance of information literacy in school libraries in the ICT-dominated era, few works have dealt with the nature and mode of implementation or offered a road map. The strategy emphasized in this paper is to hold information literacy sessions through effective workshops. While explaining why such workshops are essential for enhancing information literacy skills, the paper presents the most important principles and stages for setting up such workshops in a step-by-step manner.

  7. Meta-analysis of differentiating mouse embryonic stem cell gene expression kinetics reveals early change of a small gene set.

    Directory of Open Access Journals (Sweden)

    Clive H Glover

    2006-11-01

    Stem cell differentiation involves critical changes in gene expression. Identification of these should provide endpoints useful for optimizing stem cell propagation as well as potential clues about mechanisms governing stem cell maintenance. Here we describe the results of a new meta-analysis methodology applied to multiple gene expression datasets from three mouse embryonic stem cell (ESC) lines obtained at specific time points during the course of their differentiation into various lineages. We developed methods to identify genes with expression changes that correlated with the altered frequency of functionally defined, undifferentiated ESC in culture. In each dataset, we computed a novel statistical confidence measure for every gene which captured the certainty that a particular gene exhibited an expression pattern of interest within that dataset. This permitted a joint analysis of the datasets, despite the different experimental designs. Using a ranking scheme that favored genes exhibiting patterns of interest, we focused on the top 88 genes whose expression was consistently changed when ESC were induced to differentiate. Seven of these (103728_at, 8430410A17Rik, Klf2, Nr0b1, Sox2, Tcl1, and Zfp42) showed a rapid decrease in expression concurrent with a decrease in frequency of undifferentiated cells and remained predictive when evaluated in additional maintenance and differentiating protocols. Through a novel meta-analysis, this study identifies a small set of genes whose expression is useful for identifying changes in stem cell frequencies in cultures of mouse ESC. The methods and findings have broader applicability to understanding the regulation of self-renewal of other stem cell types.

  8. Ubiquitous information for ubiquitous computing: expressing clinical data sets with openEHR archetypes.

    Science.gov (United States)

    Garde, Sebastian; Hovenga, Evelyn; Buck, Jasmin; Knaup, Petra

    2006-01-01

    Ubiquitous computing requires ubiquitous access to information and knowledge. With the release of openEHR Version 1.0 there is a common model available to solve some of the problems related to accessing information and knowledge by improving semantic interoperability between clinical systems. Considerable work has been undertaken by various bodies to standardise Clinical Data Sets. Notwithstanding their value, several problems remain unsolved with Clinical Data Sets without the use of a common model underpinning them. This paper outlines these problems, such as incompatible basic data types and overlapping and incompatible definitions of clinical content. A solution based on openEHR archetypes is motivated, and an approach to transform existing Clinical Data Sets into archetypes is presented. To avoid significant overlaps and unnecessary effort, archetype development needs to be coordinated nationwide and beyond, and also across the various health professions, in a formalized process.

  9. Psychology of Agenda-Setting Effects. Mapping the Paths of Information Processing

    Directory of Open Access Journals (Sweden)

    Maxwell McCombs

    2014-01-01

    The concept of Need for Orientation introduced in the early years of agenda-setting research provided a psychological explanation for why agenda-setting effects occur in terms of what individuals bring to the media experience that determines the strength of these effects. Until recently, there had been no significant additions to our knowledge about the psychology of agenda-setting effects. However, the concept of Need for Orientation is only one part of the answer to the question about why agenda setting occurs. Recent research outlines a second way to answer the why question by describing the psychological process through which these effects occur. In this review, we integrate four contemporary studies that explicate dual psychological paths that lead to agenda-setting effects at the first and second levels. We then examine how information preferences and selective exposure can be profitably included in the agenda-setting framework. Complementing these new models of information processing and varying attention to media content and presentation cues, an expanded concept of psychological relevance, motivated reasoning goals (accuracy versus directional goals), and issue publics are discussed.

  10. Inferring phylogenies with incomplete data sets: a 5-gene, 567-taxon analysis of angiosperms

    Directory of Open Access Journals (Sweden)

    Hilu Khidir W

    2009-03-01

    Background: Phylogenetic analyses of angiosperm relationships have used only a small percentage of available sequence data, but phylogenetic data matrices often can be augmented with existing data, especially if one allows missing characters. We explore the effects on phylogenetic analyses of adding 378 matK sequences and 240 26S rDNA sequences to the complete 3-gene, 567-taxon angiosperm phylogenetic matrix of Soltis et al. Results: We performed maximum likelihood bootstrap analyses of the complete, 3-gene 567-taxon data matrix and the incomplete, 5-gene 567-taxon data matrix. Although the 5-gene matrix has more missing data (27.5%) than the 3-gene data matrix (2.9%), the 5-gene analysis resulted in higher levels of bootstrap support. Within the 567-taxon tree, the increase in support is most evident for relationships among the 170 taxa for which both matK and 26S rDNA sequences were added, and there is little gain in support for relationships among the 119 taxa having neither matK nor 26S rDNA sequences. The 5-gene analysis also places the enigmatic Hydrostachys in Lamiales (BS = 97%) rather than in Cornales (BS = 100% in 3-gene analysis). The placement of Hydrostachys in Lamiales is unprecedented in molecular analyses, but it is consistent with embryological and morphological data. Conclusion: Adding available, and often incomplete, sets of sequences to existing data sets can be a fast and inexpensive way to increase support for phylogenetic relationships and produce novel and credible phylogenetic hypotheses.

  11. Information Integration and Energy Expenditure in Gene Regulation.

    Science.gov (United States)

    Estrada, Javier; Wong, Felix; DePace, Angela; Gunawardena, Jeremy

    2016-06-30

    The quantitative concepts used to reason about gene regulation largely derive from bacterial studies. We show that this bacterial paradigm cannot explain the sharp expression of a canonical developmental gene in response to a regulating transcription factor (TF). In the absence of energy expenditure, with regulatory DNA at thermodynamic equilibrium, information integration across multiple TF binding sites can generate the required sharpness, but with strong constraints on the resultant "higher-order cooperativities." Even with such integration, there is a "Hopfield barrier" to sharpness; for n TF binding sites, this barrier is represented by the Hill function with the Hill coefficient n. If, however, energy is expended to maintain regulatory DNA away from thermodynamic equilibrium, as in kinetic proofreading, this barrier can be breached and greater sharpness achieved. Our approach is grounded in fundamental physics, leads to testable experimental predictions, and suggests how a quantitative paradigm for eukaryotic gene regulation can be formulated.
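
    The equilibrium bound named in this abstract can be illustrated with a minimal sketch of the Hill function. The half-saturation constant k and the helper names are assumptions for illustration only; the abstract states just that the barrier for n binding sites is the Hill function with coefficient n.

```python
def hill(x: float, n: float, k: float = 1.0) -> float:
    """Hill function x**n / (k**n + x**n) for a TF concentration x >= 0.

    k (half-saturation constant) is an illustrative assumption.
    """
    xn = x ** n
    return xn / (k ** n + xn)


def slope_at_half_max(n: float, k: float = 1.0, h: float = 1e-6) -> float:
    """Central-difference slope at x = k; analytically this is n / (4 * k)."""
    return (hill(k + h, n, k) - hill(k - h, n, k)) / (2 * h)


# Sharpness at the half-maximal point grows linearly with the coefficient n.
for n_sites in (1, 2, 4, 8):
    print(n_sites, round(slope_at_half_max(n_sites), 4))
```

    Under these assumptions, a larger number of binding sites raises the steepest slope an equilibrium system can reach; the abstract's claim is that exceeding this bound requires energy expenditure.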

  12. Development of a Minimum Data Set (MDS) for C-Section Anesthesia Information Management System (AIMS).

    Science.gov (United States)

    Sheykhotayefeh, Mostafa; Safdari, Reza; Ghazisaeedi, Marjan; Khademi, Seyed Hossein; Seyed Farajolah, Seyedeh Sedigheh; Maserat, Elham; Jebraeily, Mohamad; Torabi, Vahid

    2017-04-01

    Caesarean section, also known as C-section, is a very common procedure worldwide. A minimum data set (MDS) is defined as a set of data elements holding information regarding a series of target entities to provide a basis for planning, management, and performance evaluation. MDS has found wide use in health care information systems. It can also serve as a basis for medical information management and has shown great potential for contributing to the provision of high-quality care and disease control measures. The principal aim of this research was to determine the MDS and required capabilities for an anesthesia information management system (AIMS) for C-section in Iran. Data items collected from several selected AIMS were studied to establish an initial set of data. The study population, composed of 115 anesthesiologists, was asked to review the proposed data elements and score them in order of importance using a five-point Likert scale. The items scored as important or highly important by at least 75% of the experts were included in the final minimum data set. Overall, 8 classes of data (consisting of 81 key data elements) were determined as the final set. The most important required capabilities were related to airway management and to hypertension and hypotension management. In the development of an information system (IS) based on MDS and identification, because of the broad involvement of users, IS capabilities must focus on the users' needs to form a successful system. Therefore, it is essential to assess the MDS carefully by considering the planned uses of the data. The IS should also have the essential capabilities to meet the needs of its users.
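
    The 75% inclusion rule described above can be sketched as follows. The item names and scores are invented for illustration, and mapping "important" and "highly important" to Likert scores 4 and 5 is an assumption; the abstract does not give the scale coding.

```python
def select_minimum_data_set(scores_by_item, threshold=0.75):
    """Keep items rated 4 or 5 by at least `threshold` of the raters.

    Scores are assumed to be integers on a 1-5 Likert scale.
    """
    selected = []
    for item, scores in scores_by_item.items():
        if not scores:
            continue  # an item nobody rated cannot reach consensus
        agree = sum(1 for s in scores if s >= 4)
        if agree / len(scores) >= threshold:
            selected.append(item)
    return selected


# Hypothetical ratings from four experts for two candidate data elements.
ratings = {
    "airway_assessment": [5, 4, 4, 2],  # 3 of 4 agree, so it is kept
    "patient_hobby": [3, 3, 5, 4],      # 2 of 4 agree, so it is dropped
}
print(select_minimum_data_set(ratings))
```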

  13. Evidence for intron length conservation in a set of mammalian genes associated with embryonic development

    LENUS (Irish Health Repository)

    2011-10-05

    Background: We carried out an analysis of intron length conservation across a diverse group of nineteen mammalian species. Motivated by recent research suggesting a role for time delays associated with intron transcription in gene expression oscillations required for early embryonic patterning, we searched for examples of genes that showed the most extreme conservation of total intron content in mammals. Results: Gene sets annotated as being involved in pattern specification in the early embryo or containing the homeobox DNA-binding domain were significantly enriched among genes with highly conserved intron content. We used ancestral sequences reconstructed with probabilistic models that account for insertion and deletion mutations to distinguish insertion and deletion events on lineages leading to human and mouse from their last common ancestor. Using a randomization procedure, we show that genes containing the homeobox domain show less change in intron content than expected, given the number of insertion and deletion events within their introns. Conclusions: Our results suggest selection for gene expression precision or the existence of additional development-associated genes for which transcriptional delay is functionally significant.

  14. Evidence for intron length conservation in a set of mammalian genes associated with embryonic development

    Directory of Open Access Journals (Sweden)

    Korir Paul K

    2011-10-01

    Background: We carried out an analysis of intron length conservation across a diverse group of nineteen mammalian species. Motivated by recent research suggesting a role for time delays associated with intron transcription in gene expression oscillations required for early embryonic patterning, we searched for examples of genes that showed the most extreme conservation of total intron content in mammals. Results: Gene sets annotated as being involved in pattern specification in the early embryo or containing the homeobox DNA-binding domain were significantly enriched among genes with highly conserved intron content. We used ancestral sequences reconstructed with probabilistic models that account for insertion and deletion mutations to distinguish insertion and deletion events on lineages leading to human and mouse from their last common ancestor. Using a randomization procedure, we show that genes containing the homeobox domain show less change in intron content than expected, given the number of insertion and deletion events within their introns. Conclusions: Our results suggest selection for gene expression precision or the existence of additional development-associated genes for which transcriptional delay is functionally significant.

  15. A rough set based rational clustering framework for determining correlated genes.

    Science.gov (United States)

    Jeyaswamidoss, Jeba Emilyn; Thangaraj, Kesavan; Ramar, Kadarkarai; Chitra, Muthusamy

    2016-06-01

    Cluster analysis plays a foremost role in identifying groups of genes that show similar behavior under a set of experimental conditions. Several clustering algorithms have been proposed for identifying gene behaviors and to understand their significance. The principal aim of this work is to develop an intelligent rough clustering technique, which will efficiently remove the irrelevant dimensions in a high-dimensional space and obtain appropriate meaningful clusters. This paper proposes a novel biclustering technique that is based on rough set theory. The proposed algorithm uses correlation coefficient as a similarity measure to simultaneously cluster both the rows and columns of a gene expression data matrix and mean squared residue to generate the initial biclusters. Furthermore, the biclusters are refined to form the lower and upper boundaries by determining the membership of the genes in the clusters using mean squared residue. The algorithm is illustrated with yeast gene expression data and the experiment proves the effectiveness of the method. The main advantage is that it overcomes the problem of selection of initial clusters and also the restriction of one object belonging to only one cluster by allowing overlapping of biclusters.
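
    The mean squared residue mentioned in this abstract is commonly defined, following Cheng and Church's biclustering work, as the average squared deviation of each entry from its row mean, column mean, and overall mean. The sketch below assumes that standard definition; the abstract does not spell out which variant the authors use, and the function name is illustrative.

```python
def mean_squared_residue(block):
    """Mean squared residue of a bicluster given as a list of equal-length rows.

    residue(i, j) = a_ij - row_mean_i - col_mean_j + overall_mean
    A score of 0 means the block is perfectly additive (coherent).
    """
    n_rows, n_cols = len(block), len(block[0])
    row_means = [sum(row) / n_cols for row in block]
    col_means = [sum(block[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            r = block[i][j] - row_means[i] - col_means[j] + overall
            total += r * r
    return total / (n_rows * n_cols)


coherent = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # rows differ by a constant shift
checkerboard = [[1, 0], [0, 1]]               # no additive structure
print(mean_squared_residue(coherent), mean_squared_residue(checkerboard))
```

    A low score marks a submatrix whose genes rise and fall together across the selected conditions, which is why the measure suits both seeding and refining biclusters.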

  16. Shrinkage covariance matrix approach based on robust trimmed mean in gene sets detection

    Science.gov (United States)

    Karjanto, Suryaefiza; Ramli, Norazan Mohamed; Ghani, Nor Azura Md; Aripin, Rasimah; Yusop, Noorezatty Mohd

    2015-02-01

    Microarray technology involves placing an orderly arrangement of thousands of gene sequences in a grid on a suitable surface. The technology has enabled novel discoveries since its development and has attracted increasing attention among researchers. Its widespread use is largely due to its ability to analyze thousands of genes simultaneously, in a massively parallel manner, in one experiment. Hence, it provides valuable knowledge on gene interaction and function. A microarray data set typically consists of tens of thousands of genes (variables) from just dozens of samples due to various constraints. Therefore, the sample covariance matrix in Hotelling's T² statistic is not positive definite and becomes singular, so it cannot be inverted. In this research, Hotelling's T² statistic is combined with a shrinkage approach as an alternative estimator of the covariance matrix to detect significant gene sets. The shrinkage covariance matrix overcomes the singularity problem by converting an unbiased estimator of the covariance matrix into an improved biased one. A robust trimmed mean is integrated into the shrinkage matrix to reduce the influence of outliers and consequently increase its efficiency. The performance of the proposed method is measured using several simulation designs. The results are expected to outperform existing techniques in many tested conditions.
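
    The singularity problem and its shrinkage fix can be seen in a toy case with more variables than samples. This is only a sketch under stated assumptions: the shrinkage target (the diagonal of the sample covariance) and the fixed intensity lam = 0.5 are illustrative choices, and the robust trimmed mean step from the abstract is not reproduced.

```python
def sample_covariance(rows):
    """Unbiased sample covariance of observations given as rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(p)] for a in range(p)]


def shrink_to_diagonal(cov, lam):
    """lam * diag(cov) + (1 - lam) * cov: off-diagonals scaled by (1 - lam)."""
    p = len(cov)
    return [[cov[a][b] if a == b else (1 - lam) * cov[a][b]
             for b in range(p)] for a in range(p)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Two samples, three "genes": the sample covariance has rank 1.
data = [[1.0, 2.0, 3.0], [2.0, 4.0, 7.0]]
s = sample_covariance(data)
s_shrunk = shrink_to_diagonal(s, lam=0.5)
print(det3(s), det3(s_shrunk))
```

    The shrunken matrix is a weighted sum of a positive definite diagonal and a positive semidefinite matrix, so it can be inverted where the raw sample covariance cannot.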

  17. Selection and validation of a set of reliable reference genes for quantitative sod gene expression analysis in C. elegans

    Directory of Open Access Journals (Sweden)

    Vandesompele Jo

    2008-01-01

    Background: In the nematode Caenorhabditis elegans, the conserved Ins/IGF-1 signaling pathway regulates many biological processes including life span, stress response, dauer diapause and metabolism. Detection of differentially expressed genes may contribute to a better understanding of the mechanism by which the Ins/IGF-1 signaling pathway regulates these processes. Appropriate normalization is an essential prerequisite for obtaining accurate and reproducible quantification of gene expression levels. The aim of this study was to establish a reliable set of reference genes for gene expression analysis in C. elegans. Results: Real-time quantitative PCR was used to evaluate the expression stability of 12 candidate reference genes (act-1, ama-1, cdc-42, csq-1, eif-3.C, mdh-1, gpd-2, pmp-3, tba-1, Y45F10D.4, rgs-6 and unc-16) in wild-type, three Ins/IGF-1 pathway mutants, dauers and L3 stage larvae. After geNorm analysis, cdc-42, pmp-3 and Y45F10D.4 showed the most stable expression pattern and were used to normalize 5 sod expression levels. Significant differences in mRNA levels were observed for sod-1 and sod-3 in daf-2 relative to wild-type animals, whereas in dauers sod-1, sod-3, sod-4 and sod-5 are differentially expressed relative to third stage larvae. Conclusion: Our findings emphasize the importance of accurate normalization using stably expressed reference genes. The methodology used in this study is generally applicable to reliably quantify gene expression levels in the nematode C. elegans using quantitative PCR.

  18. Trauma-informed care in inpatient mental health settings: a review of the literature.

    Science.gov (United States)

    Muskett, Coral

    2014-02-01

    Trauma-informed care is an emerging value that is seen as fundamental to effective and contemporary mental health nursing practice. Trauma-informed care, like recovery, leaves mental health nurses struggling to translate these values into day-to-day nursing practice. Many are confused about what individual actions they can take to support these values. To date, the most clearly articulated policy to emerge from the trauma-informed care movement in Australia has been the agreement to reduce, and wherever possible, eliminate the use of seclusion and restraint. Confronted with the constant churn of admissions and readmissions of clients with challenging behaviours, and seemingly intractable mental illness, the elimination of seclusion and restraint is seen to be utopian by many mental health nurses in inpatient settings. Is trauma-informed care solely about eliminating seclusion and restraint, or are there other tangible practices nurses could utilize to effect better health outcomes for mental health clients, especially those with significant abuse histories? This article summarizes the findings from the literature from 2000-2011 in identifying those practices and clinical activities that have been implemented to effect trauma-informed care in inpatient mental health settings.

  19. Minimum mutual information based level set clustering algorithm for fast MRI tissue segmentation.

    Science.gov (United States)

    Dai, Shuanglu; Man, Hong; Zhan, Shu

    2015-01-01

    Accurate and accelerated MRI tissue recognition is a crucial preprocessing step for real-time 3D tissue modeling and medical diagnosis. This paper proposes an information de-correlated clustering algorithm implemented by a variational level set method for fast tissue segmentation. The key idea is to build a local correlation term between the original image and the piecewise constants into the variational framework. The minimized correlation then leads to de-correlated piecewise regions. Firstly, by introducing a continuous bounded variational domain describing the image, a probabilistic image restoration model is assumed to correct the distortion. Secondly, regional mutual information is introduced to measure the correlation between piecewise regions and original images. As a de-correlated description of the image, the piecewise constants are finally solved by numerical approximation and level set evolution. The converged piecewise constants automatically cluster the image domain into discriminative regions. The segmentation results show that the algorithm performs well in terms of computation time, accuracy, convergence and clustering capability.

  20. A framework for scalable parameter estimation of gene circuit models using structural information

    KAUST Repository

    Kuwahara, Hiroyuki

    2013-06-21

    Motivation: Systematic and scalable parameter estimation is key to constructing complex gene regulatory models and to ultimately facilitating an integrative systems biology approach to quantitatively understand the molecular mechanisms underpinning gene regulation. Results: Here, we report a novel framework for efficient and scalable parameter estimation that focuses specifically on modeling of gene circuits. Exploiting the structure commonly found in gene circuit models, this framework decomposes a system of coupled rate equations into individual ones and efficiently integrates them separately to reconstruct the mean time evolution of the gene products. The accuracy of the parameter estimates is refined by iteratively increasing the accuracy of numerical integration using the model structure. As a case study, we applied our framework to four gene circuit models with complex dynamics based on three synthetic datasets and one time series microarray data set. We compared our framework to three state-of-the-art parameter estimation methods and found that our approach consistently and efficiently generated higher quality parameter solutions. Although many general-purpose parameter estimation methods have been applied to the modeling of gene circuits, our results suggest that more tailored approaches that use domain-specific information may be key to reverse engineering of complex biological systems.

  1. Fusion of Imperfect Information in the Unified Framework of Random Sets Theory: Application to Target Identification

    Science.gov (United States)

    2007-11-01

    Informatique WGZ Anne-Laure Jousselme Éloi Bossé DRDC Valcartier Defence R&D Canada – Valcartier Technical Report DRDC Valcartier TR 2003-319 November 2007...Fusion of imperfect information in the unified framework of random sets theory Application to target identification Mihai Cristian Florea Informatique ...Cell CFB Esquimalt P.O. Box 17000 Stn Forces Victoria, British Columbia, V9A 7N2 Attn: Commanding Officer 1 M. C. Florea (author) Informatique WGZ inc

  2. A generalized rough set-based information filling technique for failure analysis of thruster experimental data

    Institute of Scientific and Technical Information of China (English)

    Han Shan; Zhu Qiang; Li Jianxun; Chen Lin

    2013-01-01

    Interval-valued data and incomplete data are two key problems for failure analysis of thruster experimental data and have been basically solved by the proposed methods in this paper. Firstly, information data acquired from the simulation and evaluation system, formed as an interval-valued information system (IIS), is classified by the interval similarity relation. Then, as an improvement of the classical rough set, a new kind of generalized information entropy called "H0-information entropy" is suggested for the measurement of uncertainty and the classification ability of IIS. An innovative information filling technique uses the properties of H0-information entropy to replace missing data with smaller estimation intervals. Finally, an improved method of failure analysis synthesized from the above achievements is presented to classify the thruster experimental data, complete the information, and extract the failure rules. The feasibility and advantage of this method are verified by an actual application of failure analysis, whose performance is evaluated by the quantification of E-condition entropy.

  3. Sensitivity analysis of biological Boolean networks using information fusion based on nonadditive set functions

    Science.gov (United States)

    2014-01-01

    Background: An algebraic method for information fusion based on nonadditive set functions is used to assess the joint contribution of Boolean network attributes to the sensitivity of the network to individual node mutations. The node attributes or characteristics under consideration are: in-degree, out-degree, minimum and average path lengths, bias, average sensitivity of Boolean functions, and canalizing degrees. The impact of node mutations is assessed using as target measure the average Hamming distance between a non-mutated/wild-type network and a mutated network. Results: We find that for a biochemical signal transduction network consisting of several main signaling pathways whose nodes represent signaling molecules (mainly proteins), the algebraic method provides a robust classification of attribute contributions. This method indicates that for the biochemical network, the most significant impact is generated mainly by the combined effects of two attributes: out-degree, and average sensitivity of nodes. Conclusions: The results support the idea that both topological and dynamical properties of the nodes need to be considered. The algebraic method is robust against the choice of initial conditions and partition of data sets in training and testing sets for estimation of the nonadditive set functions of the information fusion procedure. PMID:25189194
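
    The target measure in this abstract, the average Hamming distance between a wild-type and a mutated Boolean network, can be sketched on a toy example. Everything concrete here is an assumption for illustration: the three-node rules, synchronous updating, modelling a mutation as a knock-out that pins the node to 0, and averaging over all start states and a fixed number of steps.

```python
from itertools import product


def step(state, rules, knocked_out=None):
    """One synchronous update; a knocked-out node is forced to 0."""
    nxt = tuple(rule(state) for rule in rules)
    if knocked_out is not None:
        nxt = tuple(0 if i == knocked_out else v for i, v in enumerate(nxt))
    return nxt


def hamming(a, b):
    """Number of positions at which two states differ."""
    return sum(x != y for x, y in zip(a, b))


def average_hamming(rules, knocked_out, steps=5):
    """Mean wild-type vs mutant distance over all start states and steps."""
    total, count = 0, 0
    for start in product((0, 1), repeat=len(rules)):
        wild, mutant = start, start
        for _ in range(steps):
            wild = step(wild, rules)
            mutant = step(mutant, rules, knocked_out)
            total += hamming(wild, mutant)
            count += 1
    return total / count


# Hypothetical network: node 0 = node 1 AND node 2, node 1 = node 0 OR node 2,
# node 2 = NOT node 0.
TOY_RULES = [
    lambda s: s[1] & s[2],
    lambda s: s[0] | s[2],
    lambda s: 1 - s[0],
]
for node in range(3):
    print(node, average_hamming(TOY_RULES, knocked_out=node))
```

    Ranking nodes by this distance gives the per-node sensitivity that the attribute analysis in the abstract then tries to explain.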

  4. Identification of the Core Set of Carbon-Associated Genes in a Bioenergy Grassland Soil

    Science.gov (United States)

    Howe, Adina; Yang, Fan; Williams, Ryan J.; Meyer, Folker; Hofmockel, Kirsten S.

    2016-01-01

    Despite the central role of soil microbial communities in global carbon (C) cycling, little is known about soil microbial community structure and even less about their metabolic pathways. Efforts to characterize soil communities often focus on identifying differences in gene content across environmental gradients, but an alternative question is what genes are similar in soils. These genes may indicate critical species or potential functions that are required in all soils. Here we identified the “core” set of C cycling sequences widely present in multiple soil metagenomes from a fertilized prairie (FP). Of 226,887 sequences associated with known enzymes involved in the synthesis, metabolism, and transport of carbohydrates, 843 were identified to be consistently prevalent across four replicate soil metagenomes. This core metagenome was functionally and taxonomically diverse, representing five enzyme classes and 99 enzyme families within the CAZy database. Though it only comprised 0.4% of all CAZy-associated genes identified in FP metagenomes, the core was found to be comprised of functions similar to those within cumulative soils. The FP CAZy-associated core sequences were present in multiple publicly available soil metagenomes and most similar to soils sharing geographic proximity. In soil ecosystems, where high diversity remains a key challenge for metagenomic investigations, these core genes represent a subset of critical functions necessary for carbohydrate metabolism, which can be targeted to evaluate important C fluxes in these and other similar soils. PMID:27855202

  5. Genome-Wide Temporal Expression Profiling in Caenorhabditis elegans Identifies a Core Gene Set Related to Long-Term Memory.

    Science.gov (United States)

    Freytag, Virginie; Probst, Sabine; Hadziselimovic, Nils; Boglari, Csaba; Hauser, Yannick; Peter, Fabian; Gabor Fenyves, Bank; Milnik, Annette; Demougin, Philippe; Vukojevic, Vanja; de Quervain, Dominique J-F; Papassotiropoulos, Andreas; Stetak, Attila

    2017-07-12

    The identification of genes related to encoding, storage, and retrieval of memories is a major interest in neuroscience. In the current study, we analyzed the temporal gene expression changes in a neuronal mRNA pool during an olfactory long-term associative memory (LTAM) in Caenorhabditis elegans hermaphrodites. Here, we identified a core set of 712 (538 upregulated and 174 downregulated) genes that follows three distinct temporal peaks demonstrating multiple gene regulation waves in LTAM. Compared with the previously published positive LTAM gene set (Lakhina et al., 2015), 50% of the identified upregulated genes here overlap with the previous dataset, possibly representing stimulus-independent memory-related genes. On the other hand, the remaining genes were not previously identified in positive associative memory and may specifically regulate aversive LTAM. Our results suggest a multistep gene activation process during the formation and retrieval of long-term memory and define general memory-implicated genes as well as conditioning-type-dependent gene sets. SIGNIFICANCE STATEMENT: The identification of genes regulating different steps of memory is of major interest in neuroscience. Identification of common memory genes across different learning paradigms and the temporal activation of the genes are poorly studied. Here, we investigated the temporal aspects of Caenorhabditis elegans gene expression changes using aversive olfactory associative long-term memory (LTAM) and identified three major gene activation waves. Like in previous studies, aversive LTAM is also CREB dependent, and CREB activity is necessary immediately after training. Finally, we define a list of memory paradigm-independent core gene sets as well as conditioning-dependent genes.

  6. Speech-language pathologists' informal learning in healthcare settings: behaviours and motivations.

    Science.gov (United States)

    Walden, Patrick R; Bryan, Valerie C

    2011-08-01

    The current research sought to identify the types of informal learning behaviours speech-language pathologists (SLPs) working in healthcare settings engage in as well as SLPs' motivations for engaging in informal learning. Twenty-four American Speech-Language-Hearing Association (ASHA)-certified SLPs participated in this qualitative study. Data collection consisted of computer-mediated interviews, online journaling, and a virtual focus group. These textual data were coded and collapsed into themes. All participant SLPs reported that they learned through collaboration (inter- and intra-disciplinary), worked with patients to learn through trial-and-error, and consulted non-peer-reviewed material on the internet as well as peer-reviewed research in order to learn informally in the workplace. Eighteen of the 24 participants reported being motivated to learn at work to meet a patient's need to meet therapy goals. Five of the 24 participants reported meeting their own personal learning needs was a motivating factor and 10 of the 24 participants reported learning informally to meet the needs of the healthcare organization/SLP profession. Results were compared to past research on SLPs' information retrieval behaviours. It was concluded that SLPs acknowledge their personal work-related gaps in knowledge and skills and actively seek to develop their knowledge and skill base through informal means.

  7. Information Theoretic Approaches to Rapid Discovery of Relationships in Large Climate Data Sets

    Science.gov (United States)

    Knuth, Kevin H.; Rossow, William B.; Clancy, Daniel (Technical Monitor)

    2002-01-01

    Mutual information as the asymptotic Bayesian measure of independence is an excellent starting point for investigating the existence of possible relationships among climate-relevant variables in large data sets. As mutual information is a nonlinear function of its arguments, it is not beholden to the assumption of a linear relationship between the variables in question and can reveal features missed in linear correlation analyses. However, as mutual information is symmetric in its arguments, it only has the ability to reveal the probability that two variables are related. It provides no information as to how they are related; specifically, causal interactions or a relation based on a common cause cannot be detected. For this reason we also investigate the utility of a related quantity called the transfer entropy. The transfer entropy can be written as a difference between mutual informations and has the capability to reveal whether and how the variables are causally related. The application of these information theoretic measures is tested on some familiar examples using data from the International Satellite Cloud Climatology Project (ISCCP) to identify relations between global cloud cover and other variables, including equatorial Pacific sea surface temperature (SST), over seasonal and El Niño Southern Oscillation (ENSO) cycles.
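
    The two properties of mutual information highlighted above, symmetry and sensitivity to nonlinear dependence, can be checked with a small sketch. It uses a simple plug-in estimate from observed frequencies of discrete values, which is an assumption for illustration and not the Bayesian estimator the abstract refers to.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # Sum of p(x, y) * log2( p(x, y) / (p(x) * p(y)) ) over observed pairs.
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


x = [-1, 0, 1]
y = [v * v for v in x]  # y depends on x, yet their linear correlation is zero
print(mutual_information(x, y))
print(mutual_information(y, x))  # same value: the measure is symmetric
```

    Because swapping the arguments leaves the value unchanged, the measure cannot indicate direction, which is the gap the transfer entropy in the abstract is meant to address.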

  8. Gene set analyses of genome-wide association studies on 49 quantitative traits measured in a single genetic epidemiology dataset.

    Science.gov (United States)

    Kim, Jihye; Kwon, Ji-Sun; Kim, Sangsoo

    2013-09-01

    Gene set analysis is a powerful tool for interpreting a genome-wide association study result and is gaining popularity these days. Comparison of the gene sets obtained for a variety of traits measured from a single genetic epidemiology dataset may give insights into the biological mechanisms underlying these traits. Based on the previously published single nucleotide polymorphism (SNP) genotype data on 8,842 individuals enrolled in the Korea Association Resource project, we performed a series of systematic genome-wide association analyses for 49 quantitative traits of basic epidemiological, anthropometric, or blood chemistry parameters. Each analysis result was subjected to subsequent gene set analyses based on Gene Ontology (GO) terms using gene set analysis software, GSA-SNP, identifying a set of GO terms significantly associated to each trait (pcorr neuronal or nerve systems.

  9. Microarray analysis identifies a common set of cellular genes modulated by different HCV replicon clones

    Directory of Open Access Journals (Sweden)

    Gerosolimo Germano

    2008-06-01

    Abstract Background Hepatitis C virus (HCV) RNA synthesis and protein expression affect cell homeostasis by modulation of gene expression. The impact of HCV replication on global cell transcription has not been fully evaluated. Thus, we analysed the expression profiles of different clones of human hepatoma-derived Huh-7 cells carrying a self-replicating HCV RNA which express all viral proteins (HCV replicon system). Results First, we compared the expression profile of HCV replicon clone 21-5 with both the Huh-7 parental cells and the 21-5 cured (21-5c) cells. In the latter, the HCV RNA has been eliminated by IFN-α treatment. To confirm the data, we also analyzed microarray results from both the 21-5 and two other HCV replicon clones, 22-6 and 21-7, compared to the Huh-7 cells. The study was carried out by using the Applied Biosystems (AB) Human Genome Survey Microarray v1.0, which provides 31,700 probes that correspond to 27,868 human genes. Microarray analysis revealed a specific transcriptional program induced by HCV in replicon cells with respect to both IFN-α-cured and Huh-7 cells. From the original datasets of differentially expressed genes, we selected by Venn diagrams a final list of 38 genes modulated by HCV in all clones. Most of the 38 genes have never been described before and showed high fold-change associated with significant p-value, strongly supporting data reliability. Classification of the 38 genes by Panther System identified functional categories that were significantly enriched in this gene set, such as histones and ribosomal proteins as well as extracellular matrix and intracellular protein traffic. The dataset also included new genes involved in lipid metabolism, extracellular matrix and cytoskeletal network, which may be critical for HCV replication and pathogenesis. Conclusion Our data provide a comprehensive analysis of alterations in gene expression induced by HCV replication and reveal modulation of new genes potentially useful

  10. Using Portable Health Information Kiosk to assess chronic disease burden in remote settings.

    Science.gov (United States)

    Joshi, Ashish; Puricelli Perin, Douglas M; Arora, Mohit

    2013-01-01

    Cancer, cardiovascular disease, chronic respiratory disease, and type 2 diabetes, are responsible for over 50% of worldwide mortality. Chronic diseases have broad negative impacts in developing countries. Contributing to the development of chronic diseases are sedentary lifestyles, poor nutrition and eating habits, and air pollution, among other risk factors. These are also greatly increasing, and obesity has become a global phenomenon. Health promotion, and chronic disease prevention and surveillance, can be achieved through information and communication technologies (ICT), which acquire, disseminate and store health-related information electronically. The portable health information kiosk (PHIK) can be a powerful tool for promoting health education in communities in both urban and rural settings. The objective of the study was to utilize a PHIK as a tool to assess the burden of chronic disease and associated risk factors in diverse settings in India. A convenience sample was enrolled from three diverse geographical locations including urban, rural and tribal to explore the utilization of a PHIK for chronic disease health risk assessment in a community setting. Cross-sectional data was recorded during the period of March-May 2010 in Rourkela and Bhubaneswar in the state of Orissa, India. Participants were asked to use a touch screen, electronic kiosk that gathered subjective and objective data to understand the burden of chronic diseases and associated risk in the community setting. The subjective data included responses to a series of multiple-choice questions and the objective data was gathered using multiple physiological sensors such as weight, blood sugar and blood pressure. Descriptive analysis was performed using univariate statistics with results for the continuous variables being reported as means and standard deviations while results for the categorical variables were reported as frequency statistics as appropriate. A total of 429 participants aged 18

  11. Ancestry informative marker sets for determining continental origin and admixture proportions in common populations in America.

    Science.gov (United States)

    Kosoy, Roman; Nassir, Rami; Tian, Chao; White, Phoebe A; Butler, Lesley M; Silva, Gabriel; Kittles, Rick; Alarcon-Riquelme, Marta E; Gregersen, Peter K; Belmont, John W; De La Vega, Francisco M; Seldin, Michael F

    2009-01-01

    To provide a resource for assessing continental ancestry in a wide variety of genetic studies, we identified, validated, and characterized a set of 128 ancestry informative markers (AIMs). The markers were chosen for informativeness, genome-wide distribution, and genotype reproducibility on two platforms (TaqMan assays and Illumina arrays). We analyzed genotyping data from 825 subjects with diverse ancestry, including European, East Asian, Amerindian, African, South Asian, Mexican, and Puerto Rican. A comprehensive set of 128 AIMs and subsets as small as 24 AIMs are shown to be useful tools for ascertaining the origin of subjects from particular continents, and to correct for population stratification in admixed population sample sets. Our findings provide general guidelines for the application of specific AIM subsets as a resource for wide application. We conclude that investigators can use TaqMan assays for the selected AIMs as a simple and cost efficient tool to control for differences in continental ancestry when conducting association studies in ethnically diverse populations.

  12. A novel mutual information-based Boolean network inference method from time-series gene expression data

    Science.gov (United States)

    Barman, Shohag; Kwon, Yung-Keun

    2017-01-01

    Background Inferring a gene regulatory network from time-series gene expression data in systems biology is a challenging problem. Many methods have been suggested, most of which have a scalability limitation due to the combinatorial cost of searching a regulatory set of genes. In addition, they have focused on the accurate inference of a network structure only. Therefore, there is a pressing need to develop a network inference method to search regulatory genes efficiently and to predict the network dynamics accurately. Results In this study, we employed a Boolean network model with a restricted update rule scheme to capture coarse-grained dynamics, and propose a novel mutual information-based Boolean network inference (MIBNI) method. Given time-series gene expression data as an input, the method first identifies a set of initial regulatory genes using mutual information-based feature selection, and then improves the dynamics prediction accuracy by iteratively swapping a pair of genes between sets of the selected regulatory genes and the other genes. Through extensive simulations with artificial datasets, MIBNI showed consistently better performance than six well-known existing methods, REVEAL, Best-Fit, RelNet, CST, CLR, and BIBN in terms of both structural and dynamics prediction accuracy. We further tested the proposed method with two real gene expression datasets for an Escherichia coli gene regulatory network and a fission yeast cell cycle network, and also observed better results using MIBNI compared to the six other methods. Conclusions Taken together, MIBNI is a promising tool for predicting both the structure and the dynamics of a gene regulatory network. PMID:28178334

  13. Management of (pale-)oceanographic data sets using the PANGAEA information system: the SINOPS example

    Science.gov (United States)

    Dittert, Nicolas; Corrin, Lydie; Diepenbroek, Michael; Grobe, Hannes; Heinze, Christoph; Ragueneau, Olivier

    2002-08-01

    During the SINOPS project, an optimal state of the art simulation of the marine silicon cycle is attempted employing a biogeochemical ocean general circulation model (BOGCM) through three particular time steps relevant for global (paleo-) climate. In order to tune the model optimally, results of the simulations are compared to a comprehensive data set of 'real' observations. SINOPS' scientific data management ensures that data structure becomes homogeneous throughout the project. Practical work routine comprises systematic progress from data acquisition, through preparation, processing, quality check and archiving, up to the presentation of data to the scientific community. Meta-information and analytical data are mapped by an n-dimensional catalogue in order to itemize the analytical value and to serve as an unambiguous identifier. In practice, data management is carried out by means of the online-accessible information system PANGAEA, which offers a tool set comprising a data warehouse, Graphical Information System (GIS), 2-D plot, cross-section plot, etc. and whose multidimensional data model promotes scientific data mining. Besides scientific and technical aspects, this alliance between scientific project team and data management crew serves to integrate the participants and allows them to gain mutual respect and appreciation.

  14. Gene set based association analyses for the WSSV resistance of Pacific white shrimp Litopenaeus vannamei

    Science.gov (United States)

    Yu, Yang; Liu, Jingwen; Li, Fuhua; Zhang, Xiaojun; Zhang, Chengsong; Xiang, Jianhai

    2017-01-01

    White Spot Syndrome Virus (WSSV) is regarded as a virus with the strongest pathogenicity to shrimp. For threshold traits such as disease resistance, marker assisted selection (MAS) is considered to be a more effective approach. In the present study, association analyses of single nucleotide polymorphisms (SNPs) located in a set of immune related genes were conducted to identify markers associated with WSSV resistance. SNPs were detected by bioinformatics analysis on RNA sequencing data generated by the Illumina sequencing platform and Roche 454 sequencing technology. A total of 681 SNPs located in the exons of immune related genes were selected as candidate SNPs. Among these SNPs, 77 loci were genotyped in a WSSV susceptible group and a resistant group. Association analysis was performed based on the logistic regression method under additive and dominance models in the GenABEL package. As a result, five SNPs showed associations with WSSV resistance at a significance level of 0.05. In addition, SNP-SNP interaction analysis was conducted. The combination of SNP loci in TRAF6, Cu/Zn SOD and nLvALF2 exhibited a significant effect on the WSSV resistance of shrimp. Gene expression analysis revealed that these SNPs might influence the expression of these immune-related genes. This study provides a useful method for performing MAS in shrimp. PMID:28094323

  15. Optimal Set Cover Formulation for Exclusive Row Biclustering of Gene Expression

    Institute of Scientific and Technical Information of China (English)

    Amichai Painsky; Saharon Rosset

    2014-01-01

    The availability of large microarray data has led to a growing interest in biclustering methods in the past decade. Several algorithms have been proposed to identify subsets of genes and conditions according to different similarity measures and under varying constraints. In this paper we focus on the exclusive row biclustering problem (also known as projected clustering) for gene expression, in which each row can only be a member of a single bicluster while columns can participate in multiple clusters. This type of biclustering may be adequate, for example, for clustering groups of cancer patients where each patient (row) is expected to be carrying only a single type of cancer, while each cancer type is associated with multiple (and possibly overlapping) genes (columns). We present a novel method to identify these exclusive row biclusters in the spirit of the optimal set cover problem. We present our algorithmic solution as a combination of existing biclustering algorithms and combinatorial auction techniques. Furthermore, we devise an approach for tuning the threshold of our algorithm based on comparison with a null model, inspired by the Gap statistic approach. We demonstrate our approach on both synthetic and real world gene expression data and show its power in identifying large span non-overlapping rows submatrices, while considering their unique nature.

  16. The international geosphere biosphere programme data and information system global land cover data set (DIScover)

    Science.gov (United States)

    Loveland, T.R.; Belward, A.S.

    1997-01-01

    The International Geosphere Biosphere Programme Data and Information System (IGBP-DIS), through the mapping expertise of the U.S. Geological Survey and the European Commission's Joint Research Centre, recently guided the completion of a 1-km resolution global land cover data set from advanced very high resolution radiometer (AVHRR) data. The 1-km resolution land cover product, 'DISCover,' was based on monthly normalized difference vegetation index composites from 1992 and 1993. The development of DISCover was coordinated by the IGBP-DIS Land Cover Working Group as part of the IGBP-DIS Focus 1 activity. DISCover is a 17-class land cover data set based on the scientific requirements of IGBP elements. The mapping used unsupervised classification and postclassification refinement using ancillary data. The development of this data set was motivated by the need for global land cover data with higher spatial resolution, improved temporal specificity, and known classification accuracy. The completed DISCover data set will soon be validated to determine the accuracy of the global classification.

  17. Information theory applied to the sparse gene ontology annotation network to predict novel gene function

    Science.gov (United States)

    Tao, Ying; Li, Jianrong

    2010-01-01

    Motivation Despite advances in the gene annotation process, the functions of a large portion of the gene products remain insufficiently characterized. In addition, the “in silico” prediction of novel Gene Ontology (GO) annotations for partially characterized gene functions or processes is highly dependent on reverse genetic or functional genomics approaches. Results We propose a novel approach, Information Theory-based Semantic Similarity (ITSS), to automatically predict molecular functions of genes based on Gene Ontology annotations. We have demonstrated using a 10-fold cross-validation that the ITSS algorithm obtains prediction accuracies (Precision 97%, Recall 77%) comparable to other machine learning algorithms when applied to similarly dense annotated portions of the GO datasets. In addition, this method can generate highly accurate predictions in sparsely annotated portions of GO, where previous algorithms failed to do so. As a result, our technique generates an order of magnitude more gene function predictions than previous methods. Further, this paper presents the first historical rollback validation for the predicted GO annotations, which may represent more realistic conditions for an evaluation than the generally used cross-validation evaluations. By manually assessing a random sample of 100 predictions conducted in a historical roll-back evaluation, we estimate that a minimum precision of 51% (95% confidence interval: 43%–58%) can be achieved for the human GO Annotation file dated 2003. Availability The program is available on request. The 97,732 positive predictions of novel gene annotations from the 2005 GO Annotation dataset are available at http://phenos.bsd.uchicago.edu/mphenogo/prediction_result_2005.txt. PMID:17646340

  18. Gene expression risk signatures maintain prognostic power in multiple myeloma despite microarray probe set translation

    DEFF Research Database (Denmark)

    Hermansen, N E U; Borup, R; Andersen, M K

    2016-01-01

    INTRODUCTION: Gene expression profiling (GEP) risk models in multiple myeloma are based on 3'-end microarrays. We hypothesized that GEP risk signatures could retain prognostic power despite being translated and applied to whole-transcript microarray data. METHODS: We studied CD138-positive bone...... marrow plasma cells in a prospective cohort of 59 samples from newly diagnosed patients eligible for high-dose therapy (HDT) and 67 samples from previous HDT patients with progressive disease. We used Affymetrix Human Gene 1.1 ST microarrays for GEP. Nine GEP risk signatures were translated by probe set......-87). Various translated GEP risk signatures or combinations hereof were significantly correlated with survival: among newly diagnosed patients mainly in combination with cytogenetic high-risk markers and among relapsed patients mainly in combination with ISS stage III. CONCLUSION: Translated GEP risk...

  19. Generalizability and decision studies to inform observational and experimental research in classroom settings.

    Science.gov (United States)

    Bottema-Beutel, Kristen; Lloyd, Blair; Carter, Erik W; Asmus, Jennifer M

    2014-11-01

    Attaining reliable estimates of observational measures can be challenging in school and classroom settings, as behavior can be influenced by multiple contextual factors. Generalizability (G) studies can enable researchers to estimate the reliability of observational data, and decision (D) studies can inform how many observation sessions are necessary to achieve a criterion level of reliability. We conducted G and D studies using observational data from a randomized control trial focusing on social and academic participation of students with severe disabilities in inclusive secondary classrooms. Results highlight the importance of anchoring observational decisions to reliability estimates from existing or pilot data sets. We outline steps for conducting G and D studies and address options when reliability estimates are lower than desired.
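
    The D-study question of how many observation sessions are needed can be sketched as follows. The sketch assumes the simplest one-facet design (persons crossed with sessions), in which the relative generalizability coefficient is var_person / (var_person + var_error / n_sessions); the study itself may have used more facets, and the function names and variance components below are hypothetical.

```python
def g_coefficient(var_person, var_error, n_sessions):
    """Relative generalizability coefficient for a one-facet p x s design.

    var_person: variance component for persons (true-score variance).
    var_error:  relative error variance for a single session
                (person-by-session interaction confounded with residual).
    """
    return var_person / (var_person + var_error / n_sessions)


def sessions_needed(var_person, var_error, criterion=0.80, max_sessions=1000):
    """Smallest number of sessions whose averaged score meets the criterion.

    Returns None if the criterion is not reached within max_sessions.
    """
    for n in range(1, max_sessions + 1):
        if g_coefficient(var_person, var_error, n) >= criterion:
            return n
    return None
```

    With hypothetical variance components of 4.0 for persons and 6.0 for error, a single session yields a coefficient of 0.40, and sessions_needed(4.0, 6.0, criterion=0.75) returns 5.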

  20. In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer

    Energy Technology Data Exchange (ETDEWEB)

    Pandi, Narayanan Sathiya, E-mail: sathiyapandi@gmail.com; Suganya, Sivagurunathan; Rajendran, Suriliyandi

    2013-10-04

    Highlights: •Identified stomach lineage specific gene set (SLSGS) was found to be under expressed in gastric tumors. •Elevated expression of SLSGS in gastric tumor is a molecular predictor of metabolic type gastric cancer. •In silico pathway scanning identified estrogen-α signaling is a putative regulator of SLSGS in gastric cancer. •Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. -- Abstract: Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with other organ tissue. The obtained SLSGS was found to be under expressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the under expression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and prognostic factor in GC.

  1. Genes (including RNA editing information) - RMG | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

  2. Recruiting Science Majors into Secondary Science Teaching: Paid Internships in Informal Science Settings

    Science.gov (United States)

    Worsham, Heather M.; Friedrichsen, Patricia; Soucie, Marilyn; Barnett, Ellen; Akiba, Motoko

    2014-02-01

    Despite the importance of recruiting highly qualified individuals into the science teaching profession, little is known about the effectiveness of particular recruitment strategies. Over 3 years, 34 college science majors and undecided students were recruited into paid internships in informal science settings to consider secondary science teaching as a career. Analysis of interns' subsequent career plans revealed the internships were not effective in recruiting the interns into the secondary science teacher education program, although many interns thought they might consider becoming teachers later in their lives. Reasons for not pursuing teaching included continued indecisiveness, inflexibility of required plans of study, and concerns about teachers' pay and classroom management.

  3. Generalized dominance-based rough set approach to security evaluation with imprecise information

    Institute of Scientific and Technical Information of China (English)

    Zhao Liang; Xue Zhi

    2010-01-01

    The model of grey multi-attribute group decision-making (MAGDM) is studied, in which the attribute values are grey numbers. Based on the generalized dominance-based rough set approach (G-DRSA), a synthetic security evaluation method is presented. With the grey MAGDM security evaluation model as its foundation, the extension of technique for order performance by similarity to ideal solution (TOPSIS) integrates the evaluation of each decision-maker (DM) into a group's consensus and obtains the expected evaluation results of information system. Via the quality of sorting (QoS) of G-DRSA, the inherent information hidden in data is uncovered, and the security attribute weight and DMs' weight are rationally obtained. Taking the computer networks in a certain university as objects, the example illustrates that this method can effectively remove the bottleneck of the grey MAGDM model and has practical significance in the synthetic security evaluation.

  4. The first set of EST resource for gene discovery and marker development in pigeonpea (Cajanus cajan L.)

    Directory of Open Access Journals (Sweden)

    Byregowda Munishamappa

    2010-03-01

    .8% in molecular function. Further, 19 genes were identified as differentially expressed between FW-responsive genotypes and 20 between SMD-responsive genotypes. Generated ESTs were compiled together with 908 ESTs available in the public domain at the time of analysis, and a set of 5,085 unigenes was defined and used for identification of molecular markers in pigeonpea. For instance, 3,583 simple sequence repeat (SSR) motifs were identified in 1,365 unigenes and 383 primer pairs were designed. Assessment of a set of 84 primer pairs on 40 elite pigeonpea lines showed polymorphism with 15 (28.8%) markers with an average of four alleles per marker and an average polymorphic information content (PIC) value of 0.40. Similarly, in silico mining of 133 contigs with ≥ 5 sequences detected 102 single nucleotide polymorphisms (SNPs) in 37 contigs. As an example, a set of 10 contigs was used for confirming in silico predicted SNPs in a set of four genotypes using wet lab experiments. Occurrence of SNPs was confirmed for all the 6 contigs for which scorable and sequenceable amplicons were generated. PCR amplicons were not obtained in the case of 4 contigs. Recognition sites for restriction enzymes were identified for 102 SNPs in 37 contigs, which indicates the possibility of assaying SNPs in 37 genes using the cleaved amplified polymorphic sequences (CAPS) assay. Conclusion The pigeonpea EST dataset generated here provides a transcriptomic resource for gene discovery and development of functional markers associated with biotic stress resistance. Sequence analyses of this dataset have shown conservation of a considerable number of pigeonpea transcripts across the legume and model plant species analysed, as well as some putative pigeonpea-specific genes. Validation of identified biotic stress responsive genes should provide candidate genes for allele mining as well as candidate markers for molecular breeding.
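
    The polymorphic information content (PIC) reported above is commonly computed per marker from allele frequencies as 1 minus the sum of squared frequencies minus the sum over allele pairs of 2 p_i^2 p_j^2. The abstract does not say which formula or software was used, so the following is a sketch under that assumption, with a hypothetical function name.

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphic information content for one marker.

    allele_freqs: frequencies of the marker's alleles, summing to 1.
    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    pair_term = sum(2 * (p * p) * (q * q)
                    for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - pair_term
```

    A marker with two equally frequent alleles gives pic([0.5, 0.5]) = 0.375, and a monomorphic marker gives 0, so an average PIC of 0.40 corresponds to moderately informative markers.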

  5. Balance between noise and information flow maximizes set complexity of network dynamics.

    Directory of Open Access Journals (Sweden)

    Tuomo Mäki-Marttunen

    Boolean networks have been used as a discrete model for several biological systems, including metabolic and genetic regulatory networks. Due to their simplicity they offer a firm foundation for generic studies of physical systems. In this work we show, using a measure of context-dependent information, set complexity, that prior to reaching an attractor, random Boolean networks pass through a transient state characterized by high complexity. We justify this finding with the use of another measure of complexity, namely, the statistical complexity. We show that the networks can be tuned to the regime of maximal complexity by adding a suitable amount of noise to the deterministic Boolean dynamics. In fact, we show that for networks with Poisson degree distributions, all networks ranging from subcritical to slightly supercritical can be tuned with noise to reach maximal set complexity in their dynamics. For networks with a fixed number of inputs this is true for near-to-critical networks. This increase in complexity is obtained at the expense of disruption in information flow. For a large ensemble of networks showing maximal complexity, there exists a balance between noise and contracting dynamics in the state space. In networks that are close to critical the intrinsic noise required for the tuning is smaller and thus also has the smallest effect in terms of the information processing in the system. Our results suggest that the maximization of complexity near to the state transition might be a more general phenomenon in physical systems, and that noise present in a system may in fact be useful in retaining the system in a state with high information content.
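
    The idea of adding noise to deterministic Boolean dynamics can be sketched as below. The sketch assumes a network with a fixed number of inputs per node, synchronous updates, and noise modelled as independent bit flips with probability flip_prob; the abstract does not specify the noise model, and set complexity itself is not computed here. All names are hypothetical.

```python
import random


def make_rbn(n_nodes, k_inputs, rng):
    """Random Boolean network: each node gets k distinct inputs and a random truth table."""
    inputs = [rng.sample(range(n_nodes), k_inputs) for _ in range(n_nodes)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)]
              for _ in range(n_nodes)]
    return inputs, tables


def step(state, inputs, tables, flip_prob, rng):
    """One synchronous update; each output bit is flipped with probability flip_prob."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for j in node_inputs:
            # Build the truth-table row index from the current input bits.
            index = (index << 1) | state[j]
        bit = table[index]
        if rng.random() < flip_prob:
            bit ^= 1
        new_state.append(bit)
    return new_state
```

    With flip_prob set to 0 the trajectory is deterministic and must eventually revisit a state (an attractor); raising flip_prob perturbs the trajectory away from the attractor, which is the regime the abstract tunes.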

  6. Joint genetic analysis using variant sets reveals polygenic gene-context interactions.

    Directory of Open Access Journals (Sweden)

    Francesco Paolo Casale

    2017-04-01

    Joint genetic models for multiple traits have helped to enhance association analyses. Most existing multi-trait models have been designed to increase power for detecting associations, whereas the analysis of interactions has received considerably less attention. Here, we propose iSet, a method based on linear mixed models to test for interactions between sets of variants and environmental states or other contexts. Our model generalizes previous interaction tests and in particular provides a test for local differences in the genetic architecture between contexts. We first use simulations to validate iSet before applying the model to the analysis of genotype-environment interactions in an eQTL study. Our model retrieves a larger number of interactions than alternative methods and reveals that up to 20% of cases show context-specific configurations of causal variants. Finally, we apply iSet to test for sub-group specific genetic effects in human lipid levels in a large human cohort, where we identify a gene-sex interaction for C-reactive protein that is missed by alternative methods.

  7. An information-based rough set approach to critical engineering factor identification

    Institute of Scientific and Technical Information of China (English)

    Jing Ji; Zheng Dongjian

    2008-01-01

    In order to analyze the main critical engineering factors, an information-based rough set approach that considers conditional information entropy as a measurement of information has been developed. An algorithm for continuous attribute discretization based on conditional information entropy and an algorithm for rule extraction considering the supports of rules are proposed. The initial decision system is established by collecting enough monitoring data. Then, the continuous attributes are discretized, and the condition attributes are reduced. Finally, the rules that indicate the action law of the main factors are extracted and the results are explained. By applying this approach to a crack in an arch gravity dam, it can be concluded that the water level and the temperature are the main factors affecting the crack opening, and there is a negative correlation between the crack opening and the temperature. This conclusion corresponds with the observation that cracks in most concrete dams are influenced mainly by water level and temperature, and the influence of temperature is more evident.
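
    The conditional information entropy used as the measurement of information can be sketched on a discretized decision table. The sketch assumes the standard definition H(D|C) over the equivalence classes induced by the condition attributes; the toy table, attribute names, and function name are hypothetical and are not data from the paper.

```python
from collections import Counter
from math import log2


def conditional_entropy(rows, cond_idx, dec_idx):
    """H(D | C) in bits for a discretized decision table.

    rows:     list of tuples, one per observation.
    cond_idx: indices of the condition attributes C.
    dec_idx:  index of the decision attribute D.
    """
    n = len(rows)
    cond_counts = Counter(tuple(r[i] for i in cond_idx) for r in rows)
    joint_counts = Counter((tuple(r[i] for i in cond_idx), r[dec_idx])
                           for r in rows)
    return -sum((c / n) * log2(c / cond_counts[key])
                for (key, _), c in joint_counts.items())


# Hypothetical toy table: (water level, temperature, crack opening).
TOY_TABLE = [
    ("high", "low", "wide"),
    ("high", "high", "narrow"),
    ("low", "low", "narrow"),
    ("low", "high", "narrow"),
]
```

    On the toy table, conditioning on both attributes leaves no uncertainty about the decision, while dropping either one raises H(D|C) to 0.5 bits; an attribute whose removal increases the conditional entropy cannot be reduced away.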

  8. A room with a view: Setting influences information disclosure in investigative interviews.

    Science.gov (United States)

    Dawson, Evan; Hartwig, Maria; Brimbal, Laure; Denisenkov, Philipp

    2017-08-01

    Research on embodied cognition and priming shows that human behavior is influenced nonconsciously by the environment in metaphoric ways. Previous research has shown that conceptual priming can lead people to disclose sensitive information (Davis, Soref, Villalobos, & Mikulincer, 2016; Dawson, Hartwig, & Brimbal, 2015). Here, we sought to examine whether concepts of openness can be activated to promote disclosure within the interview itself, through the physical setting. In two laboratory studies, participants were exposed to details of a mock environmental terrorism conspiracy through a courier task, which they were subsequently interviewed about in different settings. In Study 1, participants were interviewed in either a room designed to activate openness, or a prototypically enclosed, bare custodial interview room. In Study 2, we manipulated both architectural and interior features of both rooms. Challenging the status quo that a small room is optimal for investigative interviewing, our findings offer compelling evidence that the spaciousness of an interview room can influence a person's tendency to be "open" with or "closed" about information. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  9. Methods to evaluate health information systems in healthcare settings: a literature review.

    Science.gov (United States)

    Rahimi, Bahlol; Vimarlund, Vivian

    2007-10-01

    Although information technology (IT)-based applications in healthcare have existed for more than three decades, evaluating the outputs and outcomes of IT-based systems in medical informatics is still a challenge for decision makers, as well as for those who want to measure the effects of ICT in healthcare settings. The aim of this paper is to review published articles on the evaluation of IT-based systems in order to gain knowledge about the methodologies used and the findings obtained from evaluating IT-based systems applied in healthcare settings. The literature review includes studies of IT-based systems between 2003 and 2005. The findings show that economic and organizational aspects dominate evaluation studies in this area. However, the results focus mostly on positive outputs such as user satisfaction, financial benefits and improved organizational work. This review shows that there is no standard framework for evaluating the effects and outputs of the implementation and use of IT in the healthcare setting and that, to date, no studies have explored the impact of IT on the healthcare system's productivity and effectiveness.

  10. In silico analysis of stomach lineage specific gene set expression pattern in gastric cancer.

    Science.gov (United States)

    Pandi, Narayanan Sathiya; Suganya, Sivagurunathan; Rajendran, Suriliyandi

    2013-10-04

    Stomach lineage specific gene products act as a protective barrier in the normal stomach and their expression maintains the normal physiological processes, cellular integrity and morphology of the gastric wall. However, the regulation of stomach lineage specific genes in gastric cancer (GC) is far less clear. In the present study, we sought to investigate the role and regulation of stomach lineage specific gene set (SLSGS) in GC. SLSGS was identified by comparing the mRNA expression profiles of normal stomach tissue with other organ tissue. The obtained SLSGS was found to be under expressed in gastric tumors. Functional annotation analysis revealed that the SLSGS was enriched for digestive function and gastric epithelial maintenance. Employing a single sample prediction method across GC mRNA expression profiles identified the under expression of SLSGS in proliferative type and invasive type gastric tumors compared to the metabolic type gastric tumors. Integrative pathway activation prediction analysis revealed a close association between estrogen-α signaling and SLSGS expression pattern in GC. Elevated expression of SLSGS in GC is associated with an overall increase in the survival of GC patients. In conclusion, our results highlight that estrogen mediated regulation of SLSGS in gastric tumor is a molecular predictor of metabolic type GC and prognostic factor in GC. Copyright © 2013 Elsevier Inc. All rights reserved.

  11. Refining ensembles of predicted gene regulatory networks based on characteristic interaction sets.

    Directory of Open Access Journals (Sweden)

    Lukas Windhager

    Different ensemble voting approaches have been successfully applied for reverse-engineering of gene regulatory networks. They are based on the assumption that a good approximation of true network structure can be derived by considering the frequencies of individual interactions in a large number of predicted networks. Such approximations are typically superior in terms of prediction quality and robustness as compared to considering a single best scoring network only. Nevertheless, ensemble approaches only work well if the predicted gene regulatory networks are sufficiently similar to each other. If the topologies of predicted networks are considerably different, an ensemble of all networks obscures interesting individual characteristics. Instead, networks should be grouped according to local topological similarities and ensemble voting performed for each group separately. We argue that the presence of sets of co-occurring interactions is a suitable indicator for grouping predicted networks. A stepwise bottom-up procedure is proposed, where first mutual dependencies between pairs of interactions are derived from predicted networks. Pairs of co-occurring interactions are subsequently extended to derive characteristic interaction sets that distinguish groups of networks. Finally, ensemble voting is applied separately to the resulting topologically similar groups of networks to create distinct group-ensembles. Ensembles of topologically similar networks constitute distinct hypotheses about the reference network structure. Such group-ensembles are easier to interpret as their characteristic topology becomes clear and dependencies between interactions are known. The availability of distinct hypotheses facilitates the design of further experiments to distinguish between plausible network structures. The proposed procedure is a reasonable refinement step for non-deterministic reverse-engineering applications that produce a large number of candidate
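
    The contrast between voting over all predicted networks and voting within groups can be illustrated with a small frequency-based sketch. The networks, the threshold, and the hand-assigned groups below are hypothetical; the sketch does not reproduce the paper's procedure for deriving characteristic interaction sets.

```python
from collections import Counter

def ensemble_vote(networks, min_frequency=0.5):
    """Keep interactions that occur in at least min_frequency of the networks."""
    counts = Counter(edge for net in networks for edge in set(net))
    n = len(networks)
    return {edge for edge, c in counts.items() if c / n >= min_frequency}

# Hypothetical predicted networks, each a set of (regulator, target) edges.
networks = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("C", "D")},
    {("A", "B"), ("C", "D")},
]

# Voting over everything keeps only the edge shared by all networks.
consensus = ensemble_vote(networks, min_frequency=0.75)

# Voting within (here hand-assigned) groups keeps each group's own edges.
group_1 = ensemble_vote(networks[:2], min_frequency=0.75)
group_2 = ensemble_vote(networks[2:], min_frequency=0.75)
print(consensus, group_1, group_2)
```

    In this toy case the pooled ensemble drops both group-specific edges, while each group-ensemble retains its own, which is the effect the abstract attributes to topologically dissimilar networks.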

  12. Developing a Minimum Data Set for an Information Management System to Study Traffic Accidents in Iran.

    Science.gov (United States)

    Mohammadi, Ali; Ahmadi, Maryam; Gharagozlu, Alireza

    2016-03-01

    Each year, around 1.2 million people die in road traffic incidents. Reducing traffic accidents requires an exact understanding of the risk factors associated with traffic patterns and behaviors. Properly analyzing these factors calls for a comprehensive system for collecting and processing accident data. The aim of this study was to develop a minimum data set (MDS) for an information management system to study traffic accidents in Iran. This descriptive, cross-sectional study was performed in 2014. Data were collected from the traffic police, trauma centers, medical emergency centers, and via the internet. The investigated resources for this study were forms, databases, and documents retrieved from the internet. Forms and databases were identical, and one sample of each was evaluated. The related internet-sourced data were evaluated in their entirety. Data were collected using three checklists. In order to arrive at a consensus about the data elements, the decision Delphi technique was applied using questionnaires. The content validity and reliability of the questionnaires were assessed by experts' opinions and the test-retest method, respectively. The MDS for a traffic accident information management system was divided into three sections: a minimum data set for traffic police with six classes, including 118 data elements; a trauma center with five data classes, including 57 data elements; and a medical emergency center, with 11 classes, including 64 data elements. Planning for the prevention of traffic accidents requires standardized data. As the foundation for crash prevention efforts, existing standard data infrastructures present policymakers and government officials with a great opportunity to strengthen and integrate existing accident information systems to better track road traffic injuries and fatalities.

  13. Developing a Minimum Data Set for an Information Management System to Study Traffic Accidents in Iran

    Science.gov (United States)

    Mohammadi, Ali; Ahmadi, Maryam; Gharagozlu, Alireza

    2016-01-01

    Background: Each year, around 1.2 million people die in road traffic incidents. Reducing traffic accidents requires an exact understanding of the risk factors associated with traffic patterns and behaviors. Properly analyzing these factors calls for a comprehensive system for collecting and processing accident data. Objectives: The aim of this study was to develop a minimum data set (MDS) for an information management system to study traffic accidents in Iran. Materials and Methods: This descriptive, cross-sectional study was performed in 2014. Data were collected from the traffic police, trauma centers, medical emergency centers, and via the internet. The investigated resources for this study were forms, databases, and documents retrieved from the internet. Forms and databases were identical, and one sample of each was evaluated. The related internet-sourced data were evaluated in their entirety. Data were collected using three checklists. In order to arrive at a consensus about the data elements, the decision Delphi technique was applied using questionnaires. The content validity and reliability of the questionnaires were assessed by experts’ opinions and the test-retest method, respectively. Results: The MDS for a traffic accident information management system was divided into three sections: a minimum data set for traffic police with six classes, including 118 data elements; a trauma center with five data classes, including 57 data elements; and a medical emergency center, with 11 classes, including 64 data elements. Conclusions: Planning for the prevention of traffic accidents requires standardized data. As the foundation for crash prevention efforts, existing standard data infrastructures present policymakers and government officials with a great opportunity to strengthen and integrate existing accident information systems to better track road traffic injuries and fatalities. PMID:27247791

  14. An ancient dental gene set governs development and continuous regeneration of teeth in sharks.

    Science.gov (United States)

    Rasch, Liam J; Martin, Kyle J; Cooper, Rory L; Metscher, Brian D; Underwood, Charlie J; Fraser, Gareth J

    2016-07-15

    The evolution of oral teeth is considered a major contributor to the overall success of jawed vertebrates. This is especially apparent in cartilaginous fishes including sharks and rays, which develop elaborate arrays of highly specialized teeth, organized in rows and retain the capacity for life-long regeneration. Perpetual regeneration of oral teeth has been either lost or highly reduced in many other lineages including important developmental model species, so cartilaginous fishes are uniquely suited for deep comparative analyses of tooth development and regeneration. Additionally, sharks and rays can offer crucial insights into the characters of the dentition in the ancestor of all jawed vertebrates. Despite this, tooth development and regeneration in chondrichthyans is poorly understood and remains virtually uncharacterized from a developmental genetic standpoint. Using the emerging chondrichthyan model, the catshark (Scyliorhinus spp.), we characterized the expression of genes homologous to those known to be expressed during stages of early dental competence, tooth initiation, morphogenesis, and regeneration in bony vertebrates. We have found that expression patterns of several genes from Hh, Wnt/β-catenin, Bmp and Fgf signalling pathways indicate deep conservation over ~450 million years of tooth development and regeneration. We describe how these genes participate in the initial emergence of the shark dentition and how they are redeployed during regeneration of successive tooth generations. We suggest that at the dawn of the vertebrate lineage, teeth (i) were most likely continuously regenerative structures, and (ii) utilised a core set of genes from members of key developmental signalling pathways that were instrumental in creating a dental legacy redeployed throughout vertebrate evolution. These data lay the foundation for further experimental investigations utilizing the unique regenerative capacity of chondrichthyan models to answer evolutionary

  15. Transcriptional shift identifies a set of genes driving breast cancer chemoresistance.

    Directory of Open Access Journals (Sweden)

    Laura Vera-Ramirez

    BACKGROUND: Distant recurrences after antineoplastic treatment remain a serious problem for breast cancer clinical management and threaten patients' lives. Systemic therapy is administered to eradicate cancer cells from the organism, both at the site of the primary tumor and at any other potential location. Despite this intervention, a significant proportion of breast cancer patients relapse even many years after their primary tumor has been successfully treated according to current clinical standards, evidencing the existence of a chemoresistant cell subpopulation originating from the primary tumor. METHODS/FINDINGS: To identify key molecules and signaling pathways which drive breast cancer chemoresistance we performed gene expression analysis before and after anthracycline and taxane-based chemotherapy and compared the results between different histopathological response groups (good-, mid- and bad-response), established according to the Miller & Payne grading system. Two cohorts of 33 and 73 breast cancer patients receiving neoadjuvant chemotherapy were recruited for whole-genome expression analysis and validation assay, respectively. Identified genes were subjected to a bioinformatic analysis in order to ascertain the molecular function of the proteins they encode and the signaling in which they participate. High throughput technologies identified 65 gene sequences which were over-expressed in all groups (P ≤ 0.05 Bonferroni test). Notably we found that, after chemotherapy, a significant proportion of these genes were over-expressed in the good responders group, making their tumors indistinguishable from those of the bad responders in their expression profile (P ≤ 0.05 Benjamini-Hochberg's method). CONCLUSIONS: These data identify a set of key molecular pathways selectively up-regulated in post-chemotherapy cancer cells, which may become appropriate targets for the development of future directed therapies against breast cancer.

  16. Transcriptional Shift Identifies a Set of Genes Driving Breast Cancer Chemoresistance

    Science.gov (United States)

    Vera-Ramirez, Laura; Sanchez-Rovira, Pedro; Ramirez-Tortosa, Cesar L.; Quiles, Jose L.; Ramirez-Tortosa, MCarmen; Lorente, Jose A.

    2013-01-01

    Background: Distant recurrences after antineoplastic treatment remain a serious problem for breast cancer clinical management and threaten patients’ lives. Systemic therapy is administered to eradicate cancer cells from the organism, both at the site of the primary tumor and at any other potential location. Despite this intervention, a significant proportion of breast cancer patients relapse even many years after their primary tumor has been successfully treated according to current clinical standards, evidencing the existence of a chemoresistant cell subpopulation originating from the primary tumor. Methods/Findings: To identify key molecules and signaling pathways which drive breast cancer chemoresistance we performed gene expression analysis before and after anthracycline and taxane-based chemotherapy and compared the results between different histopathological response groups (good-, mid- and bad-response), established according to the Miller & Payne grading system. Two cohorts of 33 and 73 breast cancer patients receiving neoadjuvant chemotherapy were recruited for whole-genome expression analysis and validation assay, respectively. Identified genes were subjected to a bioinformatic analysis in order to ascertain the molecular function of the proteins they encode and the signaling in which they participate. High throughput technologies identified 65 gene sequences which were over-expressed in all groups (P ≤ 0.05 Bonferroni test). Notably we found that, after chemotherapy, a significant proportion of these genes were over-expressed in the good responders group, making their tumors indistinguishable from those of the bad responders in their expression profile (P ≤ 0.05 Benjamini-Hochberg's method). Conclusions: These data identify a set of key molecular pathways selectively up-regulated in post-chemotherapy cancer cells, which may become appropriate targets for the development of future directed therapies against breast cancer. PMID:23326553
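
    The two multiple-testing corrections named in this abstract can be compared with a short sketch. The p-values below are hypothetical and unrelated to the study's data; the functions implement the textbook Bonferroni and Benjamini-Hochberg step-up rules at level alpha.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis when its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """Step-up rule: reject the k smallest p-values, where k is the
    largest rank with p_(k) <= (k / m) * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k_max:
            rejected[i] = True
    return rejected

# Hypothetical p-values for five genes.
p_values = [0.001, 0.012, 0.020, 0.030, 0.200]
by_bonferroni = bonferroni(p_values)
by_bh = benjamini_hochberg(p_values)
print(sum(by_bonferroni), sum(by_bh))
```

    Bonferroni controls the family-wise error rate and is the stricter of the two; Benjamini-Hochberg controls the false discovery rate and typically rejects more hypotheses on the same p-values.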

  17. Evaluation of endogenous control genes for gene expression studies across multiple tissues and in the specific sets of fat- and muscle-type samples of the pig.

    Science.gov (United States)

    Gu, Y R; Li, M Z; Zhang, K; Chen, L; Jiang, A A; Wang, J Y; Li, X W

    2011-08-01

    To normalize a set of quantitative real-time PCR (q-PCR) data, it is essential to determine an optimal number/set of housekeeping genes, as the abundance of housekeeping genes can vary across tissues or cells during different developmental stages, or even under certain environmental conditions. In this study, of the 20 commonly used endogenous control genes, 13, 18 and 17 genes exhibited credible stability in 56 different tissues, 10 types of adipose tissue and five types of muscle tissue, respectively. Our analysis clearly showed that three optimal housekeeping genes are adequate for an accurate normalization, which correlated well with the theoretical optimal number (r ≥ 0.94). In terms of economical and experimental feasibility, we recommend the use of the three most stable housekeeping genes for calculating the normalization factor. Based on our results, the three most stable housekeeping genes in all analysed samples (TOP2B, HSPCB and YWHAZ) are recommended for accurate normalization of q-PCR data. We also suggest that two different sets of housekeeping genes are appropriate for 10 types of adipose tissue (the HSPCB, ALDOA and GAPDH genes) and five types of muscle tissue (the TOP2B, HSPCB and YWHAZ genes), respectively. Our report will serve as a valuable reference for other studies aimed at measuring tissue-specific mRNA abundance in porcine samples.
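
    The abstract does not spell out how the normalization factor is computed. A common convention (used by geNorm-style workflows, and assumed here) is the geometric mean of the relative quantities of the selected housekeeping genes. The expression values and the `target_gene` entry below are hypothetical.

```python
import math

def normalization_factor(reference_quantities):
    """Geometric mean of the reference genes' relative quantities."""
    logs = [math.log(q) for q in reference_quantities]
    return math.exp(sum(logs) / len(logs))

# Hypothetical relative quantities for one sample.
sample = {"TOP2B": 2.0, "HSPCB": 4.0, "YWHAZ": 1.0, "target_gene": 6.0}
reference_genes = ["TOP2B", "HSPCB", "YWHAZ"]

nf = normalization_factor([sample[g] for g in reference_genes])
normalized_target = sample["target_gene"] / nf
print(nf, normalized_target)
```

    The geometric mean is preferred over the arithmetic mean in this setting because it dampens the influence of a single reference gene with an outlying abundance.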

  18. Multiple genetic interaction experiments provide complementary information useful for gene function prediction.

    Directory of Open Access Journals (Sweden)

    Magali Michaut

    Genetic interactions help map biological processes and their functional relationships. A genetic interaction is defined as a deviation from the expected phenotype when combining multiple genetic mutations. In Saccharomyces cerevisiae, most genetic interactions are measured under a single phenotype - growth rate in standard laboratory conditions. Recently genetic interactions have been collected under different phenotypic readouts and experimental conditions. How different are these networks and what can we learn from their differences? We conducted a systematic analysis of quantitative genetic interaction networks in yeast performed under different experimental conditions. We find that networks obtained using different phenotypic readouts, in different conditions and from different laboratories overlap less than expected and provide significant unique information. To exploit this information, we develop a novel method to combine individual genetic interaction data sets and show that the resulting network improves gene function prediction performance, demonstrating that individual networks provide complementary information. Our results support the notion that using diverse phenotypic readouts and experimental conditions will substantially increase the amount of gene function information produced by genetic interaction screens.

  19. Robust CPD Algorithm for Non-Rigid Point Set Registration Based on Structure Information.

    Science.gov (United States)

    Peng, Lei; Li, Guangyao; Xiao, Mang; Xie, Li

    2016-01-01

    Recently, the Coherent Point Drift (CPD) algorithm has become a very popular and efficient method for point set registration. However, this method does not take into consideration the neighborhood structure information of points to find the correspondence and requires a manual assignment of the outlier ratio. Therefore, CPD is not robust for large degrees of degradation. In this paper, an improved method is proposed to overcome the two limitations of CPD. A structure descriptor, such as shape context, is used to perform the auxiliary calculation of the correspondence, and the proportion of each GMM component is adjusted by the similarity. The outlier ratio is formulated in the EM framework so that it can be automatically calculated and optimized iteratively. The experimental results on both synthetic data and real data demonstrate that the proposed method described here is more robust to deformation, noise, occlusion, and outliers than CPD and other state-of-the-art algorithms.

  20. Comprehensive yet scalable health information systems for low resource settings: a collaborative effort in Sierra Leone.

    Science.gov (United States)

    Braa, Jørn; Kanter, Andrew S; Lesh, Neal; Crichton, Ryan; Jolliffe, Bob; Sæbø, Johan; Kossi, Edem; Seebregts, Christopher J

    2010-11-13

    We address the problem of how to integrate health information systems in low-income African countries in which technical infrastructure and human resources vary wildly within countries. We describe a set of tools to meet the needs of different service areas including managing aggregate indicators, patient level record systems, and mobile tools for community outreach. We present the case of Sierra Leone and use this case to motivate and illustrate an architecture that allows us to provide services at each level of the health system (national, regional, facility and community) and provide different configurations of the tools as appropriate for the individual area. Finally, we present a collaborative implementation of this approach in Sierra Leone.

  1. Supporting research sites in resource-limited settings: challenges in implementing information technology infrastructure.

    Science.gov (United States)

    Whalen, Christopher J; Donnell, Deborah; Tartakovsky, Michael

    2014-01-01

    As information and communication technology infrastructure becomes more reliable, new methods of electronic data capture, data marts/data warehouses, and mobile computing provide platforms for rapid coordination of international research projects and multisite studies. However, despite the increasing availability of Internet connectivity and communication systems in remote regions of the world, there are still significant obstacles. Sites with poor infrastructure face serious challenges participating in modern clinical and basic research, particularly that relying on electronic data capture and Internet communication technologies. This report discusses our experiences in supporting research in resource-limited settings. We describe examples of the practical and ethical/regulatory challenges raised by the use of these newer technologies for data collection in multisite clinical studies.

  2. Effect of set size, age, and mode of stimulus presentation on information-processing speed.

    Science.gov (United States)

    Norton, J. C.

    1972-01-01

    First, second, and third grade pupils served as subjects in an experiment designed to show the effect of age, mode of stimulus presentation, and information value on recognition time. Stimuli were presented in picture and printed word form and in groups of 2, 4, and 8. The results of the study indicate that first graders are slower than second and third graders who are nearly equal. There is a gross shift in reaction time as a function of mode of stimulus presentation with increase in age. The first graders take much longer to identify words than pictures, while the reverse is true of the older groups. With regard to set size, a slope appears in the pictures condition in the older groups, while for first graders, a large slope occurs in the words condition and only a much smaller one for pictures.

  3. MIrExpress: A Database for Gene Coexpression Correlation in Immune Cells Based on Mutual Information and Pearson Correlation.

    Science.gov (United States)

    Wang, Luman; Mo, Qiaochu; Wang, Jianxin

    2015-01-01

    Most current gene coexpression databases support analysis of the linear correlation of gene pairs but not of their nonlinear correlation, which hinders precise evaluation of gene-gene coexpression strength. Here, we report a new database, MIrExpress, which takes advantage of information theory, as well as the Pearson linear correlation method, to measure the linear correlation, the nonlinear correlation, and their hybrid for cell-specific gene coexpression in immune cells. For a given gene pair or probe set pair input by web users, both mutual information (MI) and the Pearson correlation coefficient (r) are calculated, and several corresponding values are reported to reflect the nature of their coexpression correlation, including the MI and r values, their respective rank orderings, their rank comparison, and their hybrid correlation value. Furthermore, for a given gene, the 10 most relevant genes are displayed from the MI, r, or hybrid perspective. Currently, the database includes a total of 16 human cell groups, involving 20,283 human genes. The expression data and the calculated correlation results are interactively accessible on the web page and can be used in other related applications and research.
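
    Why a database might report both measures can be seen in a small sketch: a purely quadratic relationship has a Pearson r near zero but clearly positive mutual information. The expression vectors are hypothetical, and the equal-width binning estimator is an assumption made for illustration; the abstract does not state which MI estimator MIrExpress uses.

```python
import math
from collections import Counter

def pearson_r(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against constant input
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def mutual_information(x, y, n_bins=3):
    """Plug-in MI estimate in bits after equal-width binning."""
    bx, by = equal_width_bins(x, n_bins), equal_width_bins(y, n_bins)
    n = len(x)
    pxy, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

# Hypothetical expression profiles with a nonlinear (quadratic) dependence.
gene_a = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
gene_b = [a * a for a in gene_a]

r = pearson_r(gene_a, gene_b)
mi = mutual_information(gene_a, gene_b)
print(r, mi)
```

    Binned MI estimates depend on the number of bins and the sample size, so values from different settings are not directly comparable.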

  4. Power distribution system diagnosis with uncertainty information based on rough sets and clouds model

    Science.gov (United States)

    Sun, Qiuye; Zhang, Huaguang

    2006-11-01

    During a distribution system fault, the rapidly growing volume of signals, which carry both fuzziness and randomness, is usually too redundant for the dispatcher to make the right decision. This volume of uncertain data overwhelms classic information systems in the distribution control center and burdens the existing knowledge acquisition process of expert systems. Intelligent methods must therefore be developed to help users maintain and use this abundance of information effectively. An important issue in a distribution fault diagnosis system (DFDS) is to keep the discovered knowledge as close as possible to natural language, so that it satisfies user needs with tractability and gives the DFDS robustness. At this juncture, the paper describes a systematic approach for detecting superfluous data. The approach offers users the opportunity both to learn about the data and to validate the extracted knowledge. It can be considered a "white box" rather than a "black box" such as a neural network. Cloud theory is introduced; its mathematical description of a cloud integrates the fuzziness and randomness of linguistic terms in a unified way. Based on it, a method of knowledge representation in DFDS is developed that bridges the gap between quantitative and qualitative knowledge. Compared with the classical rough set, the cloud-rough method can deal with the uncertainty of attributes and perform a soft discretization of continuous ones (such as current and voltage). A novel approach comprising discretization, attribute reduction, rule reliability computation and equipment reliability computation is presented. Data redundancy is greatly reduced through the integrated use of cloud theory and rough set theory. An illustration with a power distribution DFDS shows the effectiveness and practicality of the proposed approach.

  5. Module network inference from a cancer gene expression data set identifies microRNA regulated modules.

    Directory of Open Access Journals (Sweden)

    Eric Bonnet

    BACKGROUND: MicroRNAs (miRNAs) are small RNAs that recognize and regulate mRNA target genes. Multiple lines of evidence indicate that they are key regulators of numerous critical functions in development and disease, including cancer. However, defining the place and function of miRNAs in complex regulatory networks is not straightforward. Systems approaches, like the inference of a module network from expression data, can help to achieve this goal. METHODOLOGY/PRINCIPAL FINDINGS: During the last decade, much progress has been made in the development of robust and powerful module network inference algorithms. In this study, we analyze and assess experimentally a module network inferred from both miRNA and mRNA expression data, using our recently developed module network inference algorithm based on probabilistic optimization techniques. We show that several miRNAs are predicted as statistically significant regulators for various modules of tightly co-expressed genes. A detailed analysis of three of those modules demonstrates that the specific assignment of miRNAs is functionally coherent and supported by literature. We further designed a set of experiments to test the assignment of miR-200a as the top regulator of a small module of nine genes. The results strongly suggest that miR-200a is regulating the module genes via the transcription factor ZEB1. Interestingly, this module is most likely involved in epithelial homeostasis and its dysregulation might contribute to the malignant process in cancer cells. CONCLUSIONS/SIGNIFICANCE: Our results show that a robust module network analysis of expression data can provide novel insights into miRNA function in important cellular processes. Such a computational approach, starting from expression data alone, can be helpful in the process of identifying the function of miRNAs by suggesting modules of co-expressed genes in which they play a regulatory role. As shown in this study, those modules can then be

  6. Exploring the clinical information system implementation readiness activities to support nursing in hospital settings.

    Science.gov (United States)

    Piscotty, Ronald J; Tzeng, Huey-Ming

    2011-11-01

    The implementation of clinical information systems can have a profound impact on nurses and their productivity. Poorly implemented systems can lead to unintended consequences that may have a negative impact on clinical processes and patient outcomes. Executives must have adequate knowledge to address nurses' concerns related to implementation. This study explored the clinical information system implementation readiness activities adopted by chief nurse executives in hospital settings. A descriptive qualitative design was used, including interviews with six chief nurse executives, held from December 2003 through March 2004. The constant comparative method was used to analyze the interviews to extract readiness activity themes and compare these to the literature. The synthesized themes showed that the executives were knowledgeable about and engaged in several key areas, but not all, of the implementation readiness process. The majority of responses were classified into the thematic areas of champion support, staff preparation for change, training, organizational alignment, planning, and vendor support. The theme of a lack of vendor support was not identified in previous studies but was clear in the responses of the chief nurse executives interviewed.

  7. A Multistage Feature Selection Model for Document Classification Using Information Gain and Rough Set

    Directory of Open Access Journals (Sweden)

    Mrs. Leena. H. Patil

    2014-11-01

    The number of documents is increasing rapidly, so organizing them in digitized form makes text categorization a challenging issue. A major difficulty in text categorization is the large number of features. Most of the features are noisy, irrelevant and redundant, which may mislead the classifier. Hence, it is important to reduce the dimensionality of the data to obtain a smaller subset that provides the most information gain. Feature selection techniques reduce the dimensionality of the feature space and also improve overall accuracy and performance, so feature selection is considered an efficient technique for overcoming the issues of text categorization. We therefore propose a multistage feature selection model to improve the overall accuracy and performance of classification. In the first stage, document preprocessing is performed. Second, each term within the documents is ranked according to its importance for classification using information gain. Third, the rough set technique is applied to the highly ranked terms and feature reduction is carried out. Finally, document classification is performed on the core features using Naive Bayes and KNN classifiers. Experiments are carried out on three UCI datasets: Reuters 21578, Classic 04 and Newsgroup 20. The results show that the proposed model achieves better accuracy and performance.
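
    The term-ranking stage can be sketched with the standard information gain of a binary "term present" feature. The toy corpus and labels are hypothetical, and the sketch covers only this ranking step, not the rough set reduction or the classifiers used in the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """IG(term) = H(class) - H(class | term present or absent)."""
    present = [lab for doc, lab in zip(docs, labels) if term in doc]
    absent = [lab for doc, lab in zip(docs, labels) if term not in doc]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part)
                      for part in (present, absent) if part)
    return entropy(labels) - conditional

# Hypothetical preprocessed documents, each reduced to a set of terms.
docs = [
    {"goal", "match", "the"},
    {"goal", "team", "the"},
    {"stock", "market", "the"},
    {"stock", "bank", "the"},
]
labels = ["sport", "sport", "finance", "finance"]

vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary,
                key=lambda t: information_gain(docs, labels, t),
                reverse=True)
print(ranked)
```

    Terms that appear in every document, such as "the" here, carry no information about the class and fall to the bottom of the ranking, which is why a later stage can discard them.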

  8. Improving core outcome set development: qualitative interviews with developers provided pointers to inform guidance.

    Science.gov (United States)

    Gargon, Elizabeth; Williamson, Paula R; Young, Bridget

    2017-06-01

    The objective of the study was to explore core outcome set (COS) developers' experiences of their work to inform methodological guidance on COS development and identify areas for future methodological research. Semistructured, audio-recorded interviews were conducted with a purposive sample of 32 COS developers. Analysis of transcribed interviews was informed by the constant comparative method and framework analysis. Developers found COS development to be challenging, particularly in relation to patient participation and accessing funding. Their accounts raised fundamental questions about the status of COS development and whether it is consultation or research. Developers emphasized how the absence of guidance had affected their work and identified areas where guidance or evidence about COS development would be useful, including patient participation, ethics, international development, and implementation. They particularly wanted guidance on systematic reviews, Delphi, and consensus meetings. The findings raise important questions about the funding, status, and process of COS development and indicate ways that it could be strengthened. Guidance could help developers to strengthen their work, but overspecification could threaten quality in COS development. Guidance should therefore highlight common issues to consider and encourage tailoring of COS development to the context and circumstances of particular COS. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  9. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species.

    Science.gov (United States)

    Irizarry, Kristopher J L; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L; Barrett, Gini; Barr, Margaret C

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost-effective genomic sequencing and genotyping technologies provide an unparalleled opportunity for incorporating genomics knowledge in the management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provide value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enables an informed approach to endangered species management.

  10. Considerations for Using Genetic and Epigenetic Information in Occupational Health Risk Assessment and Standard Setting.

    Science.gov (United States)

    Schulte, P A; Whittaker, C; Curran, C P

    2015-01-01

    Risk assessment forms the basis for both occupational health decision-making and the development of occupational exposure limits (OELs). Although genetic and epigenetic data have not been widely used in risk assessment and ultimately, standard setting, it is possible to envision such uses. A growing body of literature demonstrates that genetic and epigenetic factors condition biological responses to occupational and environmental hazards or serve as targets of them. This presentation addresses the considerations for using genetic and epigenetic information in risk assessments, provides guidance on using this information within the classic risk assessment paradigm, and describes a framework to organize thinking about such uses. The framework is a 4 × 4 matrix involving the risk assessment functions (hazard identification, dose-response modeling, exposure assessment, and risk characterization) on one axis and inherited and acquired genetic and epigenetic data on the other axis. The cells in the matrix identify how genetic and epigenetic data can be used for each risk assessment function. Generally, genetic and epigenetic data might be used as endpoints in hazard identification, as indicators of exposure, as effect modifiers in exposure assessment and dose-response modeling, as descriptors of mode of action, and to characterize toxicity pathways. Vast amounts of genetic and epigenetic data may be generated by high-throughput technologies. These data can be useful for assessing variability and reducing uncertainty in extrapolations, and they may serve as the foundation upon which identification of biological perturbations would lead to a new paradigm of toxicity pathway-based risk assessments.

  11. Educating the Next Generation of Geoscientists: Strategies for Formal and Informal Settings

    Science.gov (United States)

    Burrell, S.

    2013-12-01

    ENGAGE, Educating the Next Generation of Geoscientists, is an effort funded by the National Science Foundation to provide academic opportunities for members of underrepresented groups to learn geology in formal and informal settings through collaboration with other universities and science organizations. The program design tests the hypothesis that developing a culture of on-going dialogue around science issues through special guest lectures and workshops, creating opportunities for mentorship through informal lunches, incorporating experiential learning in the field into the geoscience curriculum in lower division courses, partnership-building through the provision of paid summer internships and research opportunities, enabling students to participate in professional conferences, and engaging family members in science education through family science nights and special presentations, will remove the academic, social and economic obstacles that have traditionally hindered members of underrepresented groups from participation in the geosciences and will result in an increase in geoscience literacy and enrollment. Student feedback and anecdotal evidence indicate an increased interest in geology as a course of study and increased awareness of the relevance of geology to everyday life. Preliminary statistics from two years of program implementation indicate increased student comprehension of Earth science concepts and ability to use data to identify trends in the natural environment.

  12. Inferring gene dependency network specific to phenotypic alteration based on gene expression data and clinical information of breast cancer.

    Directory of Open Access Journals (Sweden)

    Xionghui Zhou

    Full Text Available Although many methods have been proposed to reconstruct gene regulatory networks, most of them, when applied to sample-based data, cannot reveal the gene regulatory relations underlying the phenotypic change (e.g. normal versus cancer). In this paper, we adopt phenotype as a variable when constructing the gene regulatory network, whereas previous studies either neglected it or only used it to select the differentially expressed genes as the inputs to construct the gene regulatory network. To be specific, we integrate phenotype information with gene expression data to identify the gene dependency pairs by using the method of conditional mutual information. A gene dependency pair (A,B) means that the influence of gene A on the phenotype depends on gene B. All identified gene dependency pairs constitute a directed network underlying the phenotype, namely the gene dependency network. In this way, we have constructed the gene dependency network of breast cancer from gene expression data along with two different phenotype states (metastasis and non-metastasis). Moreover, we have found the network to be scale-free, indicating that its hub genes with high out-degrees may play critical roles in the network. After functional investigation, these hub genes are found to be biologically significant and specifically related to breast cancer, which suggests that our gene dependency network is meaningful. The validity has also been justified by literature investigation. From the network, we have selected 43 discriminative hubs as a signature to build the classification model for distinguishing the distant metastasis risks of breast cancer patients, and the result outperforms those classification models with published signatures. In conclusion, we have proposed a promising way to construct the gene regulatory network by using sample-based data, which has been shown to be effective and accurate in uncovering the hidden mechanism of the biological process and identifying the gene
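
    To make the idea of a dependency pair concrete, the sketch below computes conditional mutual information I(X;Y|Z) from counts over discrete values. It is only an illustration under stated assumptions: expression values are assumed to be already discretized (here to 0/1), the plug-in estimator is the simplest possible choice, and the variable names and toy data are hypothetical. The paper's own estimator and thresholds are not given in the abstract.

```python
import math
from collections import Counter


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X;Y|Z) in bits for equal-length discrete sequences."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), c in n_xyz.items():
        # p(x,y,z) * log2[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ], written with counts.
        cmi += (c / n) * math.log2((c * n_z[zi]) / (n_xz[(xi, zi)] * n_yz[(yi, zi)]))
    return cmi


# Hypothetical toy data: gene A tracks the phenotype only when gene B is "on".
gene_a    = [0, 0, 1, 1, 0, 0, 1, 1]
phenotype = [0, 1, 0, 1, 0, 0, 1, 1]
gene_b    = [0, 0, 0, 0, 1, 1, 1, 1]
cmi_a_pheno_given_b = conditional_mutual_information(gene_a, phenotype, gene_b)
```

    In the toy data, gene A carries no information about the phenotype when gene B is 0 and one full bit when gene B is 1, so the estimate is 0.5 bits.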

  13. Informative conditions for the data set in an MIMO networked control system with delays, packet dropout and transmission scheduling

    Science.gov (United States)

    Zhang, Cong; Xiong, Zhihua; Ye, Hao

    2014-07-01

    In system identification, a data set needs to be informative to guarantee that the identification criterion has a unique global minimum asymptotically and the parameter estimation is consistent. In this paper, we study the informativity of the data set in a multiple-input and multiple-output (MIMO) networked control system (NCS), which contains possible network-induced delays, packet dropout, transmission scheduling, or a combination of these factors in network transmission. Moreover, to guarantee the data set of this MIMO NCS to be informative, a group of conditions for network transmission and controller's proportional term are developed. Finally, simulation studies are given to illustrate the result.

  14. A novel CpG island set identifies tissue-specific methylation at developmental gene loci.

    Directory of Open Access Journals (Sweden)

    Robert Illingworth

    2008-01-01

    Full Text Available CpG islands (CGIs) are dense clusters of CpG sequences that punctuate the CpG-deficient human genome and associate with many gene promoters. As CGIs also differ from bulk chromosomal DNA by their frequent lack of cytosine methylation, we devised a CGI enrichment method based on nonmethylated CpG affinity chromatography. The resulting library was sequenced to define a novel human blood CGI set that includes many that are not detected by current algorithms. Approximately half of CGIs were associated with annotated gene transcription start sites, the remainder being intra- or intergenic. Using an array representing over 17,000 CGIs, we established that 6%-8% of CGIs are methylated in genomic DNA of human blood, brain, muscle, and spleen. Inter- and intragenic CGIs are preferentially susceptible to methylation. CGIs showing tissue-specific methylation were overrepresented at numerous genetic loci that are essential for development, including HOX and PAX family members. The findings enable a comprehensive analysis of the roles played by CGI methylation in normal and diseased human tissues.

  15. Integrating genome-wide association study and expression quantitative trait loci data identifies multiple genes and gene set associated with neuroticism.

    Science.gov (United States)

    Fan, Qianrui; Wang, Wenyu; Hao, Jingcan; He, Awen; Wen, Yan; Guo, Xiong; Wu, Cuiyan; Ning, Yujie; Wang, Xi; Wang, Sen; Zhang, Feng

    2017-08-01

    Neuroticism is a fundamental personality trait with a significant genetic determinant. To identify novel susceptibility genes for neuroticism, we conducted an integrative analysis of genomic and transcriptomic data from a genome-wide association study (GWAS) and an expression quantitative trait locus (eQTL) study. GWAS summary data were derived from published studies of neuroticism, involving a total of 170,906 subjects. An eQTL dataset containing 927,753 eQTLs was obtained from an eQTL meta-analysis of 5311 samples. Integrative analysis of GWAS and eQTL data was conducted with summary data-based Mendelian randomization (SMR) analysis software. To identify neuroticism-associated gene sets, the SMR analysis results were further subjected to gene set enrichment analysis (GSEA). The gene set annotation dataset (containing 13,311 annotated gene sets) of the GSEA Molecular Signatures Database was used. SMR single gene analysis identified 6 significant genes for neuroticism, including MSRA (p value=2.27×10^-10), MGC57346 (p value=6.92×10^-7), BLK (p value=1.01×10^-6), XKR6 (p value=1.11×10^-6), C17ORF69 (p value=1.12×10^-6) and KIAA1267 (p value=4.00×10^-6). Gene set enrichment analysis observed a significant association for the Chr8p23 gene set (false discovery rate=0.033). Our results provide novel clues for the genetic mechanism studies of neuroticism. Copyright © 2017. Published by Elsevier Inc.

  16. Cross-Cultural Astronomy in Informal Education Settings - Collaboration with Integrity

    Science.gov (United States)

    Maryboy, Nancy; Hawkins, I.; Begay, D.; Sakimoto, P.

    2008-05-01

    The richness of astronomical knowledge and traditions from diverse cultures can engage participants of all ages and backgrounds. We will present astronomy-focused programs for museums, planetariums, and community centers designed to enhance participation of underserved populations in celebrating the International Year of Astronomy (IYA) in 2009. We will share examples of how the indigenous astronomies from the Southwestern US and Mesoamerica can be juxtaposed with Western astronomy to enhance education efforts and understanding for all audiences. In these examples, the traditional knowledge has been highlighted and incorporated into the realm of innovative and unique multimedia resources that engage students and the public, and which often ignite a deeper and more authentic interest in western astronomy and astrophysics. We will discuss approaches to displaying the Navajo sky in a digital planetarium in a manner that is true to the Navajo worldview and that also presents images and information from Western astronomy. We will share multi-media resources that highlight the importance of solar alignments in architecture and in landscape within the context of the seasons. We will also discuss how we are exploring ways to protect the intellectual property rights of indigenous sky knowledge while making aspects of it available to the general public. Our collaboration upholds the integrity of both Western and Indigenous astronomy knowledge and research protocols, and honors indigenous languages. We will discuss collaborative and relationship-based evaluation strategies emerging from the above efforts and from a new effort, Cosmic Serpent, a professional development program to increase the capacity of museum practitioners to bridge indigenous and western science learning in informal settings. We will provide links and information to access products and programs to engage all audiences in the wonder, complexity, and beauty of our Universe. We acknowledge the generous

  17. The use of qualitative methods to inform Delphi surveys in core outcome set development.

    Science.gov (United States)

    Keeley, T; Williamson, P; Callery, P; Jones, L L; Mathers, J; Jones, J; Young, B; Calvert, M

    2016-05-04

    Core outcome sets (COS) help to minimise bias in trials and facilitate evidence synthesis. Delphi surveys are increasingly being used as part of a wider process to reach consensus about what outcomes should be included in a COS. Qualitative research can be used to inform the development of Delphi surveys. This is an advance in the field of COS development and one which is potentially valuable; however, little guidance exists for COS developers on how best to use qualitative methods and what the challenges are. This paper aims to provide early guidance on the potential role and contribution of qualitative research in this area. We hope the ideas we present will be challenged, critiqued and built upon by others exploring the role of qualitative research in COS development. This paper draws upon the experiences of using qualitative methods in the pre-Delphi stage of the development of three different COS. Using these studies as examples, we identify some of the ways that qualitative research might contribute to COS development, the challenges in using such methods and areas where future research is required. Qualitative research can help to identify what outcomes are important to stakeholders; facilitate understanding of why some outcomes may be more important than others; determine the scope of outcomes; identify appropriate language for use in the Delphi survey; and inform comparisons between stakeholder data and other sources, such as systematic reviews. Developers need to consider a number of methodological points when using qualitative research: specifically, which stakeholders to involve, how to sample participants, which data collection methods are most appropriate, how to consider outcomes with stakeholders and how to analyse these data. A number of areas for future research are identified. Qualitative research has the potential to increase the research community's confidence in COS, although this will be dependent upon using rigorous and appropriate

  18. Mechanical Unloading of Mouse Bone in Microgravity Significantly Alters Cell Cycle Gene Set Expression

    Science.gov (United States)

    Blaber, Elizabeth; Dvorochkin, Natalya; Almeida, Eduardo; Kaplan, Warren; Burns, Brnedan

    2012-07-01

    unloading in spaceflight, we conducted genome wide microarray analysis of total RNA isolated from the mouse pelvis. Specifically, 16 week old mice were subjected to 15 days spaceflight onboard NASA's STS-131 space shuttle mission. The pelvis of the mice was dissected, the bone marrow was flushed and the bones were briefly stored in RNAlater. The pelves were then homogenized, and RNA was isolated using TRIzol. RNA concentration and quality were measured using a Nanodrop spectrometer and 0.8% agarose gel electrophoresis. Samples of cDNA were analyzed using an Affymetrix GeneChip® Gene 1.0 ST (Sense Target) Array System for Mouse and GenePattern Software. We normalized the ST gene arrays using Robust Multichip Average (RMA) normalization, which summarizes perfectly matched spots on the array through the median polish algorithm, rather than normalizing according to mismatched spots. We also used Limma for statistical analysis, using the BioConductor Limma Library by Gordon Smyth, and differential expression analysis to identify genes with significant changes in expression between the two experimental conditions. Finally we used GSEApreRanked for Gene Set Enrichment Analysis (GSEA), with Kolmogorov-Smirnov style statistics to identify groups of genes that are regulated together using the t-statistics derived from Limma. Preliminary results show that 6,603 genes expressed in pelvic bone had statistically significant alterations in spaceflight compared to ground controls. These prominently included cell cycle arrest molecules p21 and p18, cell survival molecule Crbp1, and cell cycle molecules cyclin D1 and Cdk1. Additionally, GSEA results indicated alterations in molecular targets of cyclin D1 and Cdk4, senescence pathways resulting from abnormal laminin maturation, cell-cell contacts via E-cadherin, and several pathways relating to protein translation and metabolism. In total 111 gene sets out of 2,488, about 4%, showed statistically significant set alterations. These
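
    The "Kolmogorov-Smirnov style" statistic mentioned above can be sketched as a running sum over a list of genes that has already been ranked (for example by a t-statistic). The version below is the simple unweighted form and is an assumption made for illustration: the GSEA software uses a weighted variant and assesses significance by permutation, neither of which is reproduced here, and the gene identifiers are placeholders.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted KS-style enrichment score for a pre-ranked gene list.

    Walk down the ranking, stepping up for genes in the set and down for
    genes outside it; return the signed running sum of largest magnitude.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Placeholder identifiers, ordered from most up-regulated to most down-regulated.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2", "g3"})     # set clustered at the top
es_bottom = enrichment_score(ranked, {"g4", "g5", "g6"})  # set clustered at the bottom
```

    A set concentrated at the top of the ranking gives a score near +1, and one concentrated at the bottom gives a score near -1; sets scattered through the ranking stay close to zero.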

  19. HIGEDA: a hierarchical gene-set genetics based algorithm for finding subtle motifs in biological sequences.

    Science.gov (United States)

    Le, Thanh; Altman, Tom; Gardiner, Katheleen

    2010-02-01

    Identification of motifs in biological sequences is a challenging problem because such motifs are often short, degenerate, and may contain gaps. Most algorithms that have been developed for motif-finding use the expectation-maximization (EM) algorithm iteratively. Although EM algorithms can converge quickly, they depend strongly on initialization parameters and can converge to local sub-optimal solutions. In addition, they cannot generate gapped motifs. The effectiveness of EM algorithms in motif finding can be improved by incorporating methods that choose different sets of initial parameters to enable escape from local optima, and that allow gapped alignments within motif models. We have developed HIGEDA, an algorithm that uses the hierarchical gene-set genetic algorithm (HGA) with EM to initiate and search for the best parameters for the motif model. In addition, HIGEDA can identify gapped motifs using a position weight matrix and dynamic programming to generate an optimal gapped alignment of the motif model with sequences from the dataset. We show that HIGEDA outperforms MEME and other motif-finding algorithms on both DNA and protein sequences. Source code and test datasets are available for download at http://ouray.cudenver.edu/~tnle/, implemented in C++ and supported on Linux and MS Windows.

  20. Generation of an algorithm based on minimal gene sets to clinically subtype triple negative breast cancer patients.

    Science.gov (United States)

    Ring, Brian Z; Hout, David R; Morris, Stephan W; Lawrence, Kasey; Schweitzer, Brock L; Bailey, Daniel B; Lehmann, Brian D; Pietenpol, Jennifer A; Seitz, Robert S

    2016-02-23

    Recently, a gene expression algorithm, TNBCtype, was developed that can divide triple-negative breast cancer (TNBC) into molecularly-defined subtypes. The algorithm has potential to provide predictive value for TNBC subtype-specific response to various treatments. TNBCtype used in a retrospective analysis of neoadjuvant clinical trial data of TNBC patients demonstrated that TNBC subtype and pathological complete response to neoadjuvant chemotherapy were significantly associated. Herein we describe an expression algorithm reduced to 101 genes with the power to subtype TNBC tumors similar to the original 2188-gene expression algorithm and predict patient outcomes. The new classification model was built using the same expression data sets used for the original TNBCtype algorithm. Gene set enrichment followed by shrunken centroid analysis were used for feature reduction, then elastic-net regularized linear modeling was used to identify genes for a centroid model classifying all subtypes, comprised of 101 genes. The predictive capability of both this new "lean" algorithm and the original 2188-gene model were applied to an independent clinical trial cohort of 139 TNBC patients treated initially with neoadjuvant doxorubicin/cyclophosphamide and then randomized to receive either paclitaxel or ixabepilone to determine association of pathologic complete response within the subtypes. The new 101-gene expression model reproduced the classification provided by the 2188-gene algorithm and was highly concordant in the same set of seven TNBC cohorts used to generate the TNBCtype algorithm (87%), as well as in the independent clinical trial cohort (88%), when cases with significant correlations to multiple subtypes were excluded. Clinical responses to both neoadjuvant treatment arms found BL2 to be significantly associated with poor response (Odds Ratio (OR) =0.12, p=0.03 for the 2188-gene model; OR = 0.23, p sets can recapitulate the TNBC subtypes identified by the original 2188

  1. A transcriptomic approach to identify regulatory genes involved in fruit set of wild-type and parthenocarpic tomato genotypes.

    Science.gov (United States)

    Ruiu, Fabrizio; Picarella, Maurizio Enea; Imanishi, Shunsuke; Mazzucato, Andrea

    2015-10-01

    The tomato parthenocarpic fruit (pat) mutation associates a strong competence for parthenocarpy with homeotic transformation of anthers and aberrancy of ovules. To dissect this complex floral phenotype, genes involved in the pollination-independent fruit set of the pat mutant were investigated by microarray analysis using wild-type and mutant ovaries. Normalized expression data were subjected to one-way ANOVA, and 2499 differentially expressed genes (DEGs) displaying a >1.5 log-fold change in at least one of the pairwise comparisons analyzed were detected. DEGs were categorized into 20 clusters, and the clusters were classified into five groups representing transcripts with similar expression dynamics. The "regulatory function" group (685 DEGs) contained putative negative or positive fruit set regulators, the "pollination-dependent" group (411 DEGs) included genes activated by pollination, and the "fruit growth-related" group (815 DEGs) included genes activated at early fruit growth. The last groups listed genes with different or similar expression patterns at all stages in the two genotypes. qRT-PCR validation of 20 DEGs plus four other selected genes confirmed the high reliability of the microarray expression data; the average correlation coefficient for the 20 DEGs was 0.90. Relevant transcription factors encoding proteins that regulate meristem differentiation and floral organ development were found in all groups, along with genes involved in metabolism, transport and response of hormones, and genes involved in cell division and in primary and secondary metabolism. Among pathways related to secondary metabolites, genes related to the synthesis of flavonoids emerged, supporting the recent evidence that these compounds are important at the fruit set phase. Selected genes showing a de-regulated expression pattern in pat were studied in four other parthenocarpic genotypes, either genetically anonymous or carrying lesions in known gene sequences. This comparative approach offered novel insights for improving the present

  2. Can frameworks inform knowledge about health policy processes? Reviewing health policy papers on agenda setting and testing them against a specific priority-setting framework.

    Science.gov (United States)

    Walt, Gill; Gilson, Lucy

    2014-12-01

    This article systematically reviews a set of health policy papers on agenda setting and tests them against a specific priority-setting framework. The article applies the Shiffman and Smith framework in extracting and synthesizing data from an existing set of papers, purposively identified for their relevance and systematically reviewed. Its primary aim is to assess how far the component parts of the framework help to identify the factors that influence the agenda setting stage of the policy process at global and national levels. It seeks to advance the field and inform the development of theory in health policy by examining the extent to which the framework offers a useful approach for organizing and analysing data. Applying the framework retrospectively to the selected set of papers, it aims to explore influences on priority setting and to assess how far the framework might gain from further refinement or adaptation, if used prospectively. In pursuing its primary aim, the article also demonstrates how the approach of framework synthesis can be used in health policy analysis research.

  3. A new set of reference genes for RT-qPCR assays in the yeast Dekkera bruxellensis.

    Science.gov (United States)

    de Barros Pita, Will; Leite, Fernanda Cristina Bezerra; de Souza Liberal, Anna Theresa; Pereira, Luciana Filgueira; Carazzolle, Marcelo Falsarella; Pereira, Gonçalo Amarante; de Morais, Marcos Antonio

    2012-12-01

    The yeast Dekkera bruxellensis has been recently regarded as an important microorganism for bioethanol production owing to its ability to convert glucose, sucrose, and cellobiose to ethanol. The aim of this work was to validate a new set of reference genes for gene expression analysis by quantitative real-time PCR in D. bruxellensis and compare the influence of the method of choice for quantification of mRNA levels with the reliability of our data. Three candidate reference genes, DbEFA1, DbEFB1, and DbYNA1, were used in a quantitative analysis of 4 genes of interest, DbYNR1, DbTPS1, DbADH7, and DbUBA4, based on an approach for calculating the normalization factors by means of the geNorm applet. Each reference gene was also individually used for a 2^(-ΔΔCq) (comparative Cq method) calculation of the relative expression of genes of interest. Our results showed that the 3 reference genes provided enough stability and were complementary to the normalization factors method in different culture conditions. This work was able to confirm the usefulness of a previously reported reference gene, EFA1/TEF1, and increased the set of possible reference genes in D. bruxellensis to 4. Moreover, this can improve the reliability of the analysis of the regulation of gene expression in the industrial yeast D. bruxellensis.
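
    The two quantification routes mentioned above can be written out in a few lines. This is a minimal sketch, assuming 100% amplification efficiency for the comparative Cq calculation and assuming that the normalization factor is the geometric mean of the reference genes' relative quantities, as in the geNorm approach. The function names and all Cq values are hypothetical and are not data from the study.

```python
import math


def relative_expression_ddcq(cq_target_treated, cq_ref_treated,
                             cq_target_control, cq_ref_control):
    """Comparative Cq method: fold change = 2^-(ΔCq_treated - ΔCq_control)."""
    delta_treated = cq_target_treated - cq_ref_treated
    delta_control = cq_target_control - cq_ref_control
    return 2.0 ** -(delta_treated - delta_control)


def normalization_factor(reference_quantities):
    """Geometric mean of the relative quantities of several reference genes."""
    logs = [math.log(q) for q in reference_quantities]
    return math.exp(sum(logs) / len(logs))


# Hypothetical Cq values: the target gene crosses threshold two cycles earlier
# (relative to the reference gene) in the treated sample than in the control.
fold_change = relative_expression_ddcq(22.0, 18.0, 24.0, 18.0)

# Hypothetical relative quantities for three reference genes in one sample.
nf = normalization_factor([1.0, 2.0, 4.0])
```

    With these made-up inputs the fold change is 4 and the normalization factor is 2. Dividing a gene of interest's relative quantity by the normalization factor of the same sample gives its normalized expression.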

  4. Biomedical Information Extraction: Mining Disease Associated Genes from Literature

    Science.gov (United States)

    Huang, Zhong

    2014-01-01

    Disease-associated gene discovery is a critical step to realize the future of personalized medicine. However, empirical and clinical validation of disease-associated genes is time-consuming and expensive. In silico discovery of disease-associated genes from literature is therefore becoming the first essential step for biomarker discovery to…

  6. A meta-analysis of multiple matched copy number and transcriptomics data sets for inferring gene regulatory relationships.

    Directory of Open Access Journals (Sweden)

    Richard Newton

    Full Text Available Inferring gene regulatory relationships from observational data is challenging. Manipulation and intervention are often required to unravel causal relationships unambiguously. However, gene copy number changes, as they frequently occur in cancer cells, might be considered natural manipulation experiments on gene expression. An increasing number of data sets on matched array comparative genomic hybridisation and transcriptomics experiments from a variety of cancer pathologies are becoming publicly available. Here we explore the potential of a meta-analysis of thirty such data sets. The aim of our analysis was to assess the potential of in silico inference of trans-acting gene regulatory relationships from this type of data. We found sufficient correlation signal in the data to infer gene regulatory relationships, with interesting similarities between data sets. A number of genes had highly correlated copy number and expression changes in many of the data sets and we present predicted potential trans-acting regulatory relationships for each of these genes. The study also investigates to what extent heterogeneity between cell types and between pathologies determines the number of statistically significant predictions available from a meta-analysis of experiments.

  7. Higher primates, but not New World monkeys, have a duplicate set of enhancers flanking their apoC-I genes.

    Science.gov (United States)

    Puppione, Donald L

    2014-09-01

    Previous studies have demonstrated that the apoC-I gene and its pseudogene on human chromosome 19 are flanked by a duplicate set of enhancers. Multienhancers, ME.1 and ME.2, are located upstream from the genes and the hepatic control region enhancers, HCR.1 and HCR.2, are located downstream. The duplication of the enhancers has been thought to have occurred when the apoC-I gene was duplicated during primate evolution. Currently, the only primate data are for the human enhancers. Examining the genome of other primates (great and lesser apes, Old and New World monkeys), it was possible to locate the duplicate set of enhancers in apes and Old World monkeys. However, only a single set was found in New World monkeys. These observations provide additional evidence that the apoC-I gene and the flanking enhancers underwent duplication after the divergence of Old and New World monkeys.

  8. Informativeness of Early Huntington Disease Signs about Gene Status.

    Science.gov (United States)

    Oster, Emily; Eberly, Shirley W; Dorsey, E Ray; Kayson-Rubin, Elise; Oakes, David; Shoulson, Ira

    2015-01-01

    The cohort-level risk of Huntington disease (HD) is related to the age and symptom level of the cohort, but this relationship has not been made precise. To predict the evolving likelihood of carrying the Huntington disease (HD) gene for at-risk adults using age and sign level. Using data from adults with early signs and symptoms of HD linked to information on genetic status, we use Bayes' theorem to calculate the probability that an undiagnosed individual of a certain age and sign level has an expanded CAG repeat. Both age and sign levels have substantial influence on the likelihood of HD onset, and the probability of eventual diagnosis changes as those at risk age and exhibit (or fail to exhibit) symptoms. For example, our data suggest that in a cohort of individuals age 26 with a Unified Huntington's Disease Rating Scale (UHDRS) motor score of 7-10, 70% will carry the HD mutation. For individuals age 56, the same motor score suggests only a 40% chance of carrying the mutation. Early motor signs of HD, overall and the chorea subscore, were highly predictive of disease onset at any age. However, body mass index (BMI) and cognitive performance scores were not as highly predictive. These results suggest that if researchers or clinicians are looking for early clues of HD, it may be more foretelling to look at motor rather than cognitive signs. Application of similar approaches could be used with other adult-onset genetic conditions.
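
    The Bayes' theorem step mentioned in this abstract can be illustrated with a minimal sketch. The function name and every input value below are hypothetical and are not taken from the study: a prior of 0.5 is assumed for an at-risk adult, and the two likelihoods per age group were picked only so that the posteriors land near the 70% and 40% figures quoted above.

```python
def posterior_carrier(prior, p_signs_if_carrier, p_signs_if_noncarrier):
    """Bayes' theorem: P(carrier | observed signs).

    prior                  -- P(carrier) before looking at signs (assumed)
    p_signs_if_carrier     -- P(signs | carrier)      (hypothetical value)
    p_signs_if_noncarrier  -- P(signs | non-carrier)  (hypothetical value)
    """
    numerator = p_signs_if_carrier * prior
    evidence = numerator + p_signs_if_noncarrier * (1.0 - prior)
    return numerator / evidence


# Illustrative inputs only. The same motor score is assumed to be more common
# among older non-carriers, which lowers the posterior for the older cohort.
younger = posterior_carrier(0.5, 0.14, 0.06)  # about 0.70
older = posterior_carrier(0.5, 0.14, 0.21)    # about 0.40
```

    The sketch shows only the direction of the effect: as a sign becomes more common among non-carriers of the same age, observing it says less about gene status.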

  9. MiRNA-TF-gene network analysis through ranking of biomolecules for multi-informative uterine leiomyoma dataset.

    Science.gov (United States)

    Mallik, Saurav; Maulik, Ujjwal

    2015-10-01

    Gene ranking is an important problem in bioinformatics. Here, we propose a new framework for ranking biomolecules (viz., miRNAs, transcription-factors/TFs and genes) in a multi-informative uterine leiomyoma dataset having both gene expression and methylation data using (statistical) eigenvector centrality based approach. At first, genes that are both differentially expressed and methylated, are identified using Limma statistical test. A network, comprising these genes, corresponding TFs from TRANSFAC and ITFP databases, and targeter miRNAs from miRWalk database, is then built. The biomolecules are then ranked based on eigenvector centrality. Our proposed method provides better average accuracy in hub gene and non-hub gene classifications than other methods. Furthermore, pre-ranked Gene set enrichment analysis is applied on the pathway database as well as GO-term databases of Molecular Signatures Database with providing a pre-ranked gene-list based on different centrality values for comparing among the ranking methods. Finally, top novel potential gene-markers for the uterine leiomyoma are provided.
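
    As a rough illustration of the ranking step, eigenvector centrality can be computed by power iteration on an undirected network. This is a generic sketch, not the authors' pipeline: the node labels and edges are made up, and the iteration is run on A + I (adjacency plus identity), an assumption made here so that it also converges on bipartite graphs.

```python
def eigenvector_centrality(adj, iters=500, tol=1e-12):
    """Power iteration on (A + I) for an undirected graph.

    adj maps each node to the list of its neighbours (symmetric).
    Returns a dict of centrality scores with unit Euclidean norm.
    """
    nodes = sorted(adj)
    x = {n: 1.0 for n in nodes}
    for _ in range(iters):
        # (A + I) x: each node keeps its own score and adds its neighbours'.
        new = {n: x[n] + sum(x[m] for m in adj[n]) for n in nodes}
        norm = sum(v * v for v in new.values()) ** 0.5
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - x[n]) for n in nodes) < tol:
            return new
        x = new
    return x


# Hypothetical toy network: one TF, one miRNA, three genes.
network = {
    "TF1": ["geneA", "geneB", "geneC", "miR1"],
    "miR1": ["geneA", "TF1"],
    "geneA": ["TF1", "miR1"],
    "geneB": ["TF1"],
    "geneC": ["TF1"],
}
scores = eigenvector_centrality(network)
ranking = sorted(scores, key=scores.get, reverse=True)
```

    In this toy network the most connected node, TF1, ranks first; a ranking of this kind is what would then be passed to a pre-ranked enrichment analysis.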

  10. Classification of Non-Small Cell Lung Cancer Using Significance Analysis of Microarray-Gene Set Reduction Algorithm

    Directory of Open Access Journals (Sweden)

    Lei Zhang

    2016-01-01

    Among non-small cell lung cancer (NSCLC), adenocarcinoma (AC), and squamous cell carcinoma (SCC) are two major histology subtypes, accounting for roughly 40% and 30% of all lung cancer cases, respectively. Since AC and SCC differ in their cell of origin, location within the lung, and growth pattern, they are considered as distinct diseases. Gene expression signatures have been demonstrated to be an effective tool for distinguishing AC and SCC. Gene set analysis is regarded as irrelevant to the identification of gene expression signatures. Nevertheless, we found that one specific gene set analysis method, significance analysis of microarray-gene set reduction (SAMGSR), can be adopted directly to select relevant features and to construct gene expression signatures. In this study, we applied SAMGSR to a NSCLC gene expression dataset. When compared with several novel feature selection algorithms, for example, LASSO, SAMGSR has equivalent or better performance in terms of predictive ability and model parsimony. Therefore, SAMGSR is a feature selection algorithm, indeed. Additionally, we applied SAMGSR to AC and SCC subtypes separately to discriminate their respective stages, that is, stage II versus stage I. Few overlaps between these two resulting gene signatures illustrate that AC and SCC are technically distinct diseases. Therefore, stratified analyses on subtypes are recommended when diagnostic or prognostic signatures of these two NSCLC subtypes are constructed.

  11. Classification of Non-Small Cell Lung Cancer Using Significance Analysis of Microarray-Gene Set Reduction Algorithm.

    Science.gov (United States)

    Zhang, Lei; Wang, Linlin; Du, Bochuan; Wang, Tianjiao; Tian, Pu; Tian, Suyan

    2016-01-01

    Among non-small cell lung cancer (NSCLC), adenocarcinoma (AC), and squamous cell carcinoma (SCC) are two major histology subtypes, accounting for roughly 40% and 30% of all lung cancer cases, respectively. Since AC and SCC differ in their cell of origin, location within the lung, and growth pattern, they are considered as distinct diseases. Gene expression signatures have been demonstrated to be an effective tool for distinguishing AC and SCC. Gene set analysis is regarded as irrelevant to the identification of gene expression signatures. Nevertheless, we found that one specific gene set analysis method, significance analysis of microarray-gene set reduction (SAMGSR), can be adopted directly to select relevant features and to construct gene expression signatures. In this study, we applied SAMGSR to a NSCLC gene expression dataset. When compared with several novel feature selection algorithms, for example, LASSO, SAMGSR has equivalent or better performance in terms of predictive ability and model parsimony. Therefore, SAMGSR is a feature selection algorithm, indeed. Additionally, we applied SAMGSR to AC and SCC subtypes separately to discriminate their respective stages, that is, stage II versus stage I. Few overlaps between these two resulting gene signatures illustrate that AC and SCC are technically distinct diseases. Therefore, stratified analyses on subtypes are recommended when diagnostic or prognostic signatures of these two NSCLC subtypes are constructed.

  12. The Ph.D.-candidate as an information literate resource: developing research support and information literacy skills in an informal setting

    Directory of Open Access Journals (Sweden)

    Hilde Daland

    2013-11-01

    This article aims at suggesting a new way of developing research support for PhD-candidates. Previous research on the field of research support greatly focuses on the librarians’ competencies and how to assist researchers with what they lack in information literacy (IL) skills. There is little focus on collaboration with researchers to achieve a mutual learning outcome in regard to developing research support and IL skills. A socio-cultural view on IL indicates that IL skills are developed in a context, and therefore are situated. A high level of IL in one situation could be regarded as insufficient in another. Therefore, a librarian’s view on IL could be incomparable to a PhD-student’s everyday information needs. Many liaison librarians do not have a PhD, but are still expected to provide PhD-candidates with research support of high quality. How can we do so if we only see the librarian’s perspective? Can informal settings and user involvement be a productive way of developing research support and IL skills? As librarians it is not always easy to know what researchers need. However, if the threshold has been lowered, in an informal setting, one might obtain the questions that reveal difficulties for researchers when it comes to library services and resources. Also, through user involvement, the researchers can teach librarians about the research process. This study includes an anonymous survey among PhD-candidates at the Faculty of Humanities and Education at the University of Agder (UoA) and interviews with two of the PhD-candidates in addition to interviews with all of Agder University Library’s (AUL) liaison librarians. In general, PhD-candidates that interact informally with their liaison librarian have a higher confidence in their own overview when it comes to library resources. They do not have problems contacting their librarians for help, but they do not expect the librarians to do their searching for them.

  13. Developing a Minimum Data Set of the Information Management System for Orthopedic Injuries in Iran

    Science.gov (United States)

    Ahmadi, Maryam; Mohammadi, Ali; Chraghbaigi, Ramin; Fathi, Taimur; Shojaee Baghini, Mahdieh

    2014-01-01

    Background: Orthopedic injuries are the most common types of injuries. To identify the main causes of injuries, collecting data in a standard manner at the national level is needed, which justifies the necessity of creating a minimum data set (MDS). Objectives: The aim of this study was to develop an MDS of the information management system for orthopedic injuries in Iran. Materials and Methods: This descriptive cross-sectional study was performed in 2013. Data were collected from hospitals affiliated with Tehran University of Medical Sciences that had an orthopedic department, medical documents centers, legal medicine centers, emergency centers, internet access, and a library. Investigated documents were orthopedic injury records from 2012, documents retrieved from the internet, and printed materials. Records were selected by random sampling from ICD-10 categories S22-S99, and the related internet-sourced data were evaluated entirely. Data were collected using a checklist. In order to reach a consensus about the data elements, the decision Delphi technique was applied by a questionnaire. The content validity and reliability of the questionnaire were assessed by experts’ opinions and the test-retest method, respectively. Results: The MDS of orthopedic injuries was assigned to two categories: an administrative category with six classes including 142 data elements, and a clinical category with 17 classes including 250 data elements. Conclusions: This study showed that some of the essential data elements included in other countries’ MDS or required for organizations and healthcare providers were not included. Therefore, a complete list of MDS elements was created. Existence of comprehensive data concerning the causes and mechanisms of injuries informs public health policy-makers about injury occurrence and enables them to take rational measures to deal with these problems. PMID:25237576

  14. Climate change education in informal settings: Using boundary objects to frame network dissemination

    Science.gov (United States)

    Steiner, Mary Ann

    This study of climate change education dissemination takes place in the context of a larger project where institutions in four cities worked together to develop a linked set of informal learning experiences about climate change. Each city developed an organizational network to explore new ways to connect urban audiences with climate change education. The four city-specific networks shared tools, resources, and knowledge with each other. The networks were related in mission and goals, but were structured and functioned differently depending on the city context. This study illustrates how the tools, resources, and knowledge developed in one network were shared with networks in two additional cities. Boundary crossing theory frames the study to describe the role of objects and processes in sharing between networks. Findings suggest that the goals, capacity and composition of networks resulted in a different emphasis in dissemination efforts, in one case to push the approach out to partners for their own work and in the other to pull partners into a more collaborative stance. Learning experiences developed in each city as a result of the dissemination reflected these differences in the city-specific emphasis with the push city diving into messy examples of the approach to make their own examples, and the pull city offering polished experiences to partners in order to build confidence in the climate change messaging. The networks themselves underwent different kinds of growth and change as a result of dissemination. The emphasis on push and use of messy examples resulted in active use of the principles of the approach and the pull emphasis with polished examples resulted in the cultivation of partnerships with the hub and the potential to engage in the educational approach. These findings have implications for boundary object theory as a useful grounding for dissemination designs in the context of networks of informal learning organizations to support a shift in

  15. How preferences, information and institutions interactively drive agenda-setting: questions in the Belgian parliament, 1993-2000

    NARCIS (Netherlands)

    R. Vliegenthart; S. Walgrave; B. Zicha

    2013-01-01

    In this article an integrated framework of agenda-setting is proposed that incorporates the two main accounts of agenda-setting: the information-processing approach by Comparative Agenda Project scholars and the preference-centred account advanced by Comparative Manifestoes Project scholars. The stu

  16. META-GSA: Combining Findings from Gene-Set Analyses across Several Genome-Wide Association Studies.

    Directory of Open Access Journals (Sweden)

    Albert Rosenberger

    Gene-set analysis (GSA) methods are used as complementary approaches to genome-wide association studies (GWASs). The single marker association estimates of a predefined set of genes are either contrasted with those of all remaining genes or with a null non-associated background. To pool the p-values from several GSAs, it is important to take into account the concordance of the observed patterns resulting from single marker association point estimates across any given gene set. Here we propose META-GSA, an enhanced version of Fisher's inverse χ²-method that weights each study to account for imperfect correlation between association patterns. We investigated the performance of META-GSA by simulating GWASs with 500 cases and 500 controls at 100 diallelic markers in 20 different scenarios, simulating different relative risks between 1 and 1.5 in gene sets of 10 genes. Wilcoxon's rank sum test was applied as GSA for each study. We found that META-GSA has greater power to discover truly associated gene sets than simple pooling of the p-values, by e.g. 59% versus 37%, when the true relative risk for 5 of 10 genes was assumed to be 1.5. Under the null hypothesis of no difference in the true association pattern between the gene set of interest and the set of remaining genes, the results of both approaches are almost uncorrelated. We recommend not relying on p-values alone when combining the results of independent GSAs. We applied META-GSA to pool the results of four case-control GWASs of lung cancer risk (Central European Study and Toronto/Lunenfeld-Tanenbaum Research Institute Study; German Lung Cancer Study and MD Anderson Cancer Center Study), which had already been analyzed separately with four different GSA methods (EASE, SLAT, mSUMSTAT and GenGen). This application revealed the pathway GO0015291 "transmembrane transporter activity" as significantly enriched with associated genes (GSA-method: EASE, p = 0.0315 corrected for multiple testing). Similar
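
    For orientation, the unweighted baseline that this abstract builds on, Fisher's inverse chi-square method, can be sketched in a few lines. This is the classical method only; the study-specific weighting that distinguishes META-GSA is not reproduced here, and the function name and example p-values are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Classical (unweighted) Fisher's method for k independent p-values.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the upper tail has the closed form
        P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # equals x / 2
    series = sum(half_x ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_x) * series


# Hypothetical p-values for one gene set from four independent studies.
pooled = fisher_combined_p([0.04, 0.20, 0.01, 0.30])
```

    With a single study the function returns the input p-value unchanged, which is a convenient sanity check.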

  17. Interests-in-motion in an informal, media-rich learning setting

    Directory of Open Access Journals (Sweden)

    Ty Hollett

    2016-01-01

    Much of the literature related to connected learning approaches youth interests as fixed on specific disciplines or activities (e.g. STEM, music production, or game design). As such, mentors design youth-focused programs to serve those interests. Through a micro-ethnographic analysis of two youth’s Minecraft-centered gameplay in a public library, this article makes two primary contributions to research on learning within, and the design of, informal, media-rich settings. First, rather than approach youth interests as fixed on specific disciplines or activities (e.g. STEM, music production, or video games), this article traces youth interests as they spark and emerge among individuals and groups. Then, it follows those interests as they subsequently spread over time, becoming interests-in-motion. Second, recognition of these interests-in-motion can lead mentors to develop program designs that enable learners to work with artifacts (digital and physical) that learners can progressively configure and re-configure over time. Mentors, then, design-in-time as they harness the energy surrounding those emergent interests, creating extended learning opportunities in response.

  18. Managing Information Uncertainty in Wave Height Modeling for the Offshore Structural Analysis through Random Set

    Directory of Open Access Journals (Sweden)

    Keqin Yan

    2017-01-01

    This chapter presents a reliability study for an offshore jacket structure with emphasis on the features of nonconventional modeling. Firstly, a random set model is formulated for modeling the random waves in an ocean site. Then, a jacket structure is investigated in a pushover analysis to identify the critical wave direction and key structural elements. This is based on the ultimate base shear strength. The selected probabilistic models are adopted for the important structural members and the wave direction is specified in the weakest direction of the structure for a conservative safety analysis. The wave height model is processed in a P-box format when it is used in the numerical analysis. The models are applied to find the bounds of the failure probabilities for the jacket structure. The propagation of this wave model to the uncertainty in results is investigated in both an interval analysis and Monte Carlo simulation. The results are compared in the context of information content and numerical accuracy. Further, the failure probability bounds are compared with the conventional probabilistic approach.

  19. Stereoscopy in Static Scientific Imagery in an Informal Education Setting: Does It Matter?

    Science.gov (United States)

    Price, C. Aaron; Lee, H.-S.; Malatesta, K.

    2014-12-01

    Stereoscopic technology (3D) is rapidly becoming ubiquitous across research, entertainment and informal educational settings. Children of today may grow up never knowing a time when movies, television and video games were not available stereoscopically. Despite this rapid expansion, the field's understanding of the impact of stereoscopic visualizations on learning is rather limited. Much of the excitement of stereoscopic technology could be due to a novelty effect, which will wear off over time. This study controlled for the novelty factor using a variety of techniques. On the floor of an urban science center, 261 children were shown 12 photographs and visualizations of highly spatial scientific objects and scenes. The images were randomly shown in either traditional (2D) format or in stereoscopic format. The children were asked two questions of each image—one about a spatial property of the image and one about a real-world application of that property. At the end of the test, the child was asked to draw from memory the last image they saw. Results showed no overall significant difference in response to the questions associated with 2D or 3D images. However, children who saw the final slide only in 3D drew more complex representations of the slide than those who did not. Results are discussed through the lenses of cognitive load theory and the effect of novelty on engagement.

  20. Health literacy in the urgent care setting: What factors impact consumer comprehension of health information?

    Science.gov (United States)

    Alberti, Traci L; Morris, Nancy J

    2017-05-01

    An increasing number of Americans are using urgent care (UC) clinics due to: improved health insurance coverage, the need to decrease cost, primary care offices with limited appointment availability, and a desire for convenient care. Patients are treated by providers they may not know for episodic illness or injuries while in pain or not feeling well. Treatment instructions and follow-up directions are provided quickly. To examine health literacy in the adult UC population and identify patient characteristics associated with health literacy risk. As part of a larger cross-sectional study, UC patients seen between October 2013 and January 2014 completed a demographic questionnaire and the Newest Vital Sign. Descriptive, nonparametric analyses, and a multinomial logistic regression were done to assess health literacy, associated and predictive factors. A total of 57.5% of 285 participants had adequate health literacy. The likelihood of limited health literacy was associated with increased age (p literacy is common in a suburban UC setting, increasing the risk that consumers may not understand vital health information. Clear provider communication and confirmation of comprehension of discharge instructions for self-management is essential to optimize outcomes for UC patients. ©2017 American Association of Nurse Practitioners.

  1. Computational identification of transcription factor binding sites by functional analysis of sets of genes sharing overrepresented upstream motifs

    Directory of Open Access Journals (Sweden)

    Silengo Lorenzo

    2004-05-01

    Background: Transcriptional regulation is a key mechanism in the functioning of the cell, and is mostly effected through transcription factors binding to specific recognition motifs located upstream of the coding region of the regulated gene. The computational identification of such motifs is made easier by the fact that they often appear several times in the upstream region of the regulated genes, so that the number of occurrences of relevant motifs is often significantly larger than expected by pure chance. Results: To exploit this fact, we construct sets of genes characterized by the statistical overrepresentation of a certain motif in their upstream regions. Then we study the functional characterization of these sets by analyzing their annotation to Gene Ontology terms. For the sets showing a statistically significant specific functional characterization, we conjecture that the upstream motif characterizing the set is a binding site for a transcription factor involved in the regulation of the genes in the set. Conclusions: The method we propose is able to identify many known binding sites in S. cerevisiae and new candidate targets of regulation by known transcription factors. Its application to less well studied organisms is likely to be valuable in the exploration of their regulatory interaction network.
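
    The abstract does not say which test is used to score functional characterization. A common choice for asking whether a gene set is enriched for a Gene Ontology term is the hypergeometric upper tail, sketched below as an assumption rather than as the authors' procedure. The function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, drawn, hits):
    """P(X >= hits) when `drawn` genes are sampled without replacement from
    `population` genes, of which `annotated` carry the GO term of interest."""
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 6000 genes in the genome, 120 annotated to the term,
# 40 genes share the overrepresented motif, and 8 of those carry the term.
p_enrichment = hypergeom_upper_tail(6000, 120, 40, 8)
```

    In practice one such p-value would be computed per motif-defined set and GO term, so a multiple-testing correction would be needed before calling any set significant.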

  2. Data-Driven Derivation of an "Informer Compound Set" for Improved Selection of Active Compounds in High-Throughput Screening.

    Science.gov (United States)

    Paricharak, Shardul; IJzerman, Adriaan P; Jenkins, Jeremy L; Bender, Andreas; Nigsch, Florian

    2016-09-26

    Despite the usefulness of high-throughput screening (HTS) in drug discovery, for some systems, low assay throughput or high screening cost can prohibit the screening of large numbers of compounds. In such cases, iterative cycles of screening involving active learning (AL) are employed, creating the need for smaller "informer sets" that can be routinely screened to build predictive models for selecting compounds from the screening collection for follow-up screens. Here, we present a data-driven derivation of an informer compound set with improved predictivity of active compounds in HTS, and we validate its benefit over randomly selected training sets on 46 PubChem assays comprising at least 300,000 compounds and covering a wide range of assay biology. The informer compound set showed improvement in BEDROC(α = 100), PRAUC, and ROCAUC values averaged over all assays of 0.024, 0.014, and 0.016, respectively, compared to randomly selected training sets, all with paired t-test p-values <10^-15. A per-assay assessment showed that the BEDROC(α = 100), which is of particular relevance for early retrieval of actives, improved for 38 out of 46 assays, increasing the success rate of smaller follow-up screens. Overall, we showed that an informer set derived from historical HTS activity data can be employed for routine small-scale exploratory screening in an assay-agnostic fashion. This approach led to a consistent improvement in hit rates in follow-up screens without compromising scaffold retrieval. The informer set is adjustable in size depending on the number of compounds one intends to screen, as performance gains are realized for sets with more than 3,000 compounds, and this set is therefore applicable to a variety of situations. Finally, our results indicate that random sampling may not adequately cover descriptor space, drawing attention to the importance of the composition of the training set for predicting actives.

  3. GeneNotes – A novel information management software for biologists

    Directory of Open Access Journals (Sweden)

    Wong Wing H

    2005-02-01

    Background: Collecting and managing information is a challenging task in a genome-wide profiling research project. Most databases and online computational tools require a direct human involvement. Information and computational results are presented in various multimedia formats (e.g., text, image, PDF, word files, etc.), many of which cannot be automatically processed by computers in biologically meaningful ways. In addition, the quality of computational results is far from perfect and requires nontrivial manual examination. The timely selection, integration and interpretation of heterogeneous biological information still heavily rely on the sensibility of biologists. Biologists often feel overwhelmed by the huge amount of and the great diversity of distributed heterogeneous biological information. Description: We developed an information management application called GeneNotes. GeneNotes is the first application that allows users to collect and manage multimedia biological information about genes/ESTs. GeneNotes provides an integrated environment for users to surf the Internet, collect notes for genes/ESTs, and retrieve notes. GeneNotes is supported by a server that integrates gene annotations from many major databases (e.g., HGNC, MGI, etc.). GeneNotes uses the integrated gene annotations to (a) identify genes given various types of gene IDs (e.g., RefSeq ID, GenBank ID, etc.), and (b) provide quick views of genes. GeneNotes is free for academic usage. The program and the tutorials are available at: http://bayes.fas.harvard.edu/genenotes/. Conclusions: GeneNotes provides a novel human-computer interface to assist researchers to collect and manage biological information. It also provides a platform for studying how users behave when they manipulate biological information. The results of such study can lead to innovation of more intelligent human-computer interfaces that greatly shorten the cycle of biology research.

  4. Correlation of a set of gene variants, life events and personality features on adult ADHD severity.

    Science.gov (United States)

    Müller, Daniel J; Chiesa, Alberto; Mandelli, Laura; De Luca, Vincenzo; De Ronchi, Diana; Jain, Umesh; Serretti, Alessandro; Kennedy, James L

    2010-07-01

    Increasing evidence suggests that symptoms of attention deficit hyperactivity disorder (ADHD) could persist into adult life in a substantial proportion of cases. The aim of the present study was to investigate the impact of (1) adverse events, (2) personality traits and (3) genetic variants chosen on the basis of previous findings and (4) their possible interactions on adult ADHD severity. One hundred and ten individuals diagnosed with adult ADHD were evaluated for occurrence of adverse events in childhood and adulthood, and personality traits by the Temperament and Character Inventory (TCI). Common polymorphisms within a set of nine important candidate genes (SLC6A3, DBH, DRD4, DRD5, HTR2A, CHRNA7, BDNF, PRKG1 and TAAR9) were genotyped for each subject. Life events, personality traits and genetic variations were analyzed in relationship to severity of current symptoms, according to the Brown Attention Deficit Disorder Scale (BADDS). Genetic variations were not significantly associated with severity of ADHD symptoms. Life stressors displayed only a minor effect as compared to personality traits. Indeed, symptoms' severity was significantly correlated with the temperamental trait of Harm avoidance and the character trait of Self directedness. The results of the present work are in line with previous evidence of a significant correlation between some personality traits and adult ADHD. However, several limitations such as the small sample size and the exclusion of patients with other severe comorbid psychiatric disorders could have influenced the significance of present findings.

  5. Gene set based integrated data analysis reveals phenotypic differences in a brain cancer model.

    Directory of Open Access Journals (Sweden)

    Kjell Petersen

    A key challenge in the data analysis of biological high-throughput experiments is to handle the often low number of samples in the experiments compared to the number of biomolecules that are simultaneously measured. Combining experimental data using independent technologies to illuminate the same biological trends, as well as complementing each other in a larger perspective, is one natural way to overcome this challenge. In this work we investigated if integrating proteomics and transcriptomics data from a brain cancer animal model using gene set based analysis methodology, could enhance the biological interpretation of the data relative to more traditional analysis of the two datasets individually. The brain cancer model used is based on serial passaging of transplanted human brain tumor material (glioblastoma, GBM) through several generations in rats. These serial transplantations lead over time to genotypic and phenotypic changes in the tumors and represent a medically relevant model with a rare access to samples and where consequent analyses of individual datasets have revealed relatively few significant findings on their own. We found that the integrated analysis both performed better in terms of significance measure of its findings compared to individual analyses, as well as providing independent verification of the individual results. Thus a better context for overall biological interpretation of the data can be achieved.

  6. Calibrated Peer Review: A New Tool for Integrating Information Literacy Skills in Writing-Intensive Large Classroom Settings.

    OpenAIRE

    Fosmire, Michael

    2010-01-01

    Calibrated Peer Review™ (CPR) is a program that can significantly enhance the ability to integrate intensive information literacy exercises into large classroom settings. CPR is founded on a solid pedagogic base for learning, and it is formulated in such a way that information skills can easily be inserted. However, there is no mention of its application for information literacy in the library literature. A sample implementation of CPR in a course co-taught by science disciplinary faculty and...

  7. Visual presentation as a welcome alternative to textual presentation of gene annotation information.

    Science.gov (United States)

    Desai, Jairav; Flatow, Jared M; Song, Jie; Zhu, Lihua J; Du, Pan; Huang, Chiang-Ching; Lu, Hui; Lin, Simon M; Kibbe, Warren A

    2010-01-01

    The functions of a gene are traditionally annotated textually using either free text (Gene Reference Into Function or GeneRIF) or controlled vocabularies (e.g., Gene Ontology or Disease Ontology). Inspired by the latest word cloud tools developed by the Information Visualization Group at IBM Research, we have prototyped a visual system for capturing gene annotations, which we named Gene Graph Into Function or GeneGIF. Fully developing the GeneGIF system would be a significant effort. To justify the necessity and to specify the design requirements of GeneGIF, we first surveyed the end-user preferences. From 53 responses, we found that a majority (64%, p < 0.05) of the users favored visual presentation of information (GeneGIF) compared to textual (GeneRIF) information. The results of this study indicate that a visual presentation tool, such as GeneGIF, can complement standard textual presentation of gene annotations. Moreover, the survey participants provided many constructive comments that will specify the development of a phase-two project (http://128.248.174.241/) to visually annotate each gene in the human genome.

  8. Performance of single and concatenated sets of mitochondrial genes at inferring metazoan relationships relative to full mitogenome data.

    Directory of Open Access Journals (Sweden)

    Justin C Havird

    Full Text Available Mitochondrial (mt) genes are some of the most popular and widely-utilized genetic loci in phylogenetic studies of metazoan taxa. However, their linked nature has raised questions on whether using the entire mitogenome for phylogenetics is overkill (at best) or pseudoreplication (at worst). Moreover, no studies have addressed the comparative phylogenetic utility of mitochondrial genes across individual lineages within the entire Metazoa. To comment on the phylogenetic utility of individual mt genes as well as concatenated subsets of genes, we analyzed mitogenomic data from 1865 metazoan taxa in 372 separate lineages spanning genera to subphyla. Specifically, phylogenies inferred from these datasets were statistically compared to ones generated from all 13 mt protein-coding (PC) genes (i.e., the "supergene" set) to determine which single genes performed "best" at, and the minimum number of genes required to, recover the "supergene" topology. Surprisingly, the popular marker COX1 performed poorest, while ND5, ND4, and ND2 were most likely to reproduce the "supergene" topology. Averaged across all lineages, the longest ∼2 mt PC genes were sufficient to recreate the "supergene" topology, although this average increased to ∼5 genes for datasets with 40 or more taxa. Furthermore, concatenation of the three "best" performing mt PC genes outperformed that of the three longest mt PC genes (i.e., ND5, COX1, and ND4). Taken together, while not all mt PC genes are equally interchangeable in phylogenetic studies of the metazoans, some subset can serve as a proxy for the 13 mt PC genes. However, the exact number and identity of these genes is specific to the lineage in question and cannot be applied indiscriminately across the Metazoa.

  9. A set of vectors for introduction of antibiotic resistance genes by in vitro Cre-mediated recombination

    OpenAIRE

    Vassetzky Yegor S; Dmitriev Petr V

    2008-01-01

    Abstract Background Introduction of new antibiotic resistance genes in the plasmids of interest is a frequent task in molecular cloning practice. Classical approaches involving digestion with restriction endonucleases and ligation are time-consuming. Findings We have created a set of insertion vectors (pINS) carrying genes that provide resistance to various antibiotics (puromycin, blasticidin and G418) and containing a loxP site. Each vector (pINS-Puro, pINS-Blast or pINS-Neo) contains either...

  10. Tailoring Healthy Workplace Interventions to Local Healthcare Settings: A Complexity Theory-Informed Workplace of Well-Being Framework.

    Science.gov (United States)

    Brand, Sarah L; Fleming, Lora E; Wyatt, Katrina M

    2015-01-01

    Many healthy workplace interventions have been developed for healthcare settings to address the consistently low scores of healthcare professionals on assessments of mental and physical well-being. Complex healthcare settings present challenges for the scale-up and spread of successful interventions from one setting to another. Despite general agreement regarding the importance of the local setting in affecting intervention success across different settings, there is no consensus on what it is about a local setting that needs to be taken into account to design healthy workplace interventions appropriate for different local settings. Complexity theory principles were used to understand a workplace as a complex adaptive system and to create a framework of eight domains (system characteristics) that affect the emergence of system-level behaviour. This Workplace of Well-being (WoW) framework is responsive and adaptive to local settings and allows a shared understanding of the enablers and barriers to behaviour change by capturing local information for each of the eight domains. We use the results of applying the WoW framework to one workplace, a UK National Health Service ward, to describe the utility of this approach in informing design of setting-appropriate healthy workplace interventions that create workplaces conducive to healthy behaviour change.

  11. The YEASTRACT database: an upgraded information system for the analysis of gene and genomic transcription regulation in Saccharomyces cerevisiae.

    Science.gov (United States)

    Teixeira, Miguel Cacho; Monteiro, Pedro Tiago; Guerreiro, Joana Fernandes; Gonçalves, Joana Pinho; Mira, Nuno Pereira; dos Santos, Sandra Costa; Cabrito, Tânia Rodrigues; Palma, Margarida; Costa, Catarina; Francisco, Alexandre Paulo; Madeira, Sara Cordeiro; Oliveira, Arlindo Limede; Freitas, Ana Teresa; Sá-Correia, Isabel

    2014-01-01

    The YEASTRACT (http://www.yeastract.com) information system is a tool for the analysis and prediction of transcription regulatory associations in Saccharomyces cerevisiae. Last updated in June 2013, this database contains over 200,000 regulatory associations between transcription factors (TFs) and target genes, including 326 DNA binding sites for 113 TFs. All regulatory associations stored in YEASTRACT were revisited and new information was added on the experimental conditions in which those associations take place and on whether the TF is acting on its target genes as activator or repressor. Based on this information, new queries were developed allowing the selection of specific environmental conditions, experimental evidence or positive/negative regulatory effect. This release further offers tools to rank the TFs controlling a gene or genome-wide response by their relative importance, based on (i) the percentage of target genes in the data set; (ii) the enrichment of the TF regulon in the data set when compared with the genome; or (iii) the score computed using the TFRank system, which selects and prioritizes the relevant TFs by walking through the yeast regulatory network. We expect that with the new data and services made available, the system will continue to be instrumental for yeast biologists and systems biology researchers.
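    The first two ranking criteria described above can be illustrated with a small sketch. This is not YEASTRACT's actual implementation: the function names `hypergeom_sf` and `rank_tfs`, the input layout, and the choice of a hypergeometric test for "enrichment of the TF regulon in the data set when compared with the genome" are all assumptions made for illustration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K marked items, n draws)."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def rank_tfs(regulons, gene_list, genome_size):
    """Rank transcription factors for a user-supplied gene list.

    regulons    : dict mapping TF name -> iterable of its target genes (hypothetical layout)
    gene_list   : genes in the data set of interest (assumed to be part of the genome)
    genome_size : total number of genes in the genome
    Returns (tf, percent_of_data_set_regulated, enrichment_p_value) tuples,
    sorted by ascending p-value.
    """
    data_set = set(gene_list)
    rows = []
    for tf, targets in regulons.items():
        targets = set(targets)
        hits = len(data_set & targets)
        percent = 100.0 * hits / len(data_set)  # criterion (i)
        p_value = hypergeom_sf(hits, genome_size, len(targets), len(data_set))  # criterion (ii), assumed test
        rows.append((tf, percent, p_value))
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    toy_regulons = {
        "TF_A": ["g1", "g2", "g3"],
        "TF_B": ["g4", "g5", "g6", "g7", "g8", "g9"],
    }
    for tf, percent, p in rank_tfs(toy_regulons, ["g1", "g2", "g4"], genome_size=10):
        print(f"{tf}: {percent:.1f}% of data set, p = {p:.4f}")
```

    The toy regulons and gene names are invented. The third criterion (TFRank) depends on traversing the regulatory network and is not sketched here.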

  12. TXTGate: profiling gene groups with text-based information

    DEFF Research Database (Denmark)

    Glenisson, P.; Coessens, B.; Van Vooren, S.

    2004-01-01

    We implemented a framework called TXTGate that combines literature indices of selected public biological resources in a flexible text-mining system designed towards the analysis of groups of genes. By means of tailored vocabularies, term-as well as gene-centric views are offered on selected textual...... fields and MEDLINE abstracts used in LocusLink and the Saccharomyces Genome Database. Subclustering and links to external resources allow for in-depth analysis of the resulting term profiles....

  13. Tailored patient information using a database system: Increasing patient compliance in a day surgery setting

    DEFF Research Database (Denmark)

    Grode, Jesper Nicolai Riis; Grode, Louise; Steinsøe, Ulla

    2013-01-01

    rehabilitation. The hospital is responsible of providing the patients with accurate information enabling the patient to prepare for surgery. Often patients are overloaded with uncoordinated information, letters and leaflets. The contribution of this project is a database system enabling health professionals...... was established to support these requirements. A relational database system holds all information pieces in a granular, structured form. Each individual piece of information can be joined with other pieces thus supporting the tailoring of information. A web service layer caters for integration with output systems....../media (word processing engines, web, mobile apps, and information kiosks). To lower the adoption bar of the system, an MS Word user interface was integrated with the web service layer, and information can now quickly be categorised and grouped according to purpose of use, users can quickly setup information...

  14. The development of an efficient multipurpose bean pod mottle virus viral vector set for foreign gene expression and RNA silencing.

    Science.gov (United States)

    Zhang, Chunquan; Bradshaw, Jeffrey D; Whitham, Steven A; Hill, John H

    2010-05-01

    Plant viral vectors are valuable tools for heterologous gene expression, and because of virus-induced gene silencing (VIGS), they also have important applications as reverse genetics tools for gene function studies. Viral vectors are especially useful for plants such as soybean (Glycine max) that are recalcitrant to transformation. Previously, two generations of bean pod mottle virus (BPMV; genus Comovirus) vectors have been developed for overexpressing and silencing genes in soybean. However, the design of the previous vectors imposes constraints that limit their utility. For example, VIGS target sequences must be expressed as fusion proteins in the same reading frame as the viral polyprotein. This requirement limits the design of VIGS target sequences to open reading frames. Furthermore, expression of multiple genes or simultaneous silencing of one gene and expression of another was not possible. To overcome these and other issues, a new BPMV-based vector system was developed to facilitate a variety of applications for gene function studies in soybean as well as in common bean (Phaseolus vulgaris). These vectors are designed for simultaneous expression of multiple foreign genes, insertion of noncoding/antisense sequences, and simultaneous expression and silencing. The simultaneous expression of green fluorescent protein and silencing of phytoene desaturase shows that marker gene-assisted silencing is feasible. These results demonstrate the utility of this BPMV vector set for a wide range of applications in soybean and common bean, and they have implications for improvement of other plant virus-based vector systems.

  15. Genomic determinants of sporulation in Bacilli and Clostridia: towards the minimal set of sporulation-specific genes.

    Science.gov (United States)

    Galperin, Michael Y; Mekhedov, Sergei L; Puigbo, Pere; Smirnov, Sergey; Wolf, Yuri I; Rigden, Daniel J

    2012-11-01

    Three classes of low-G+C Gram-positive bacteria (Firmicutes), Bacilli, Clostridia and Negativicutes, include numerous members that are capable of producing heat-resistant endospores. Spore-forming firmicutes include many environmentally important organisms, such as insect pathogens and cellulose-degrading industrial strains, as well as human pathogens responsible for such diseases as anthrax, botulism, gas gangrene and tetanus. In the best-studied model organism Bacillus subtilis, sporulation involves over 500 genes, many of which are conserved among other bacilli and clostridia. This work aimed to define the genomic requirements for sporulation through an analysis of the presence of sporulation genes in various firmicutes, including those with smaller genomes than B. subtilis. Cultivable spore-formers were found to have genomes larger than 2300 kb and encompass over 2150 protein-coding genes of which 60 are orthologues of genes that are apparently essential for sporulation in B. subtilis. Clostridial spore-formers lack, among others, spoIIB, sda, spoVID and safA genes and have non-orthologous displacements of spoIIQ and spoIVFA, suggesting substantial differences between bacilli and clostridia in the engulfment and spore coat formation steps. Many B. subtilis sporulation genes, particularly those encoding small acid-soluble spore proteins and spore coat proteins, were found only in the family Bacillaceae, or even in a subset of Bacillus spp. Phylogenetic profiles of sporulation genes, compiled in this work, confirm the presence of a common sporulation gene core, but also illuminate the diversity of the sporulation processes within various lineages. These profiles should help further experimental studies of uncharacterized widespread sporulation genes, which would ultimately allow delineation of the minimal set(s) of sporulation-specific genes in Bacilli and Clostridia. Published 2012. This article is a U.S. Government work and is in the public domain in the USA.

  16. Mapping of three QTLs for seed setting and analysis on the candidate gene for qSS-1 in rice (Oryza sativa L.)

    Institute of Scientific and Technical Information of China (English)

    Elsheikh Y M Ahmed; ZHANG Yan-pei; YU Jian-ping; Rashid M A Rehman; ZHANG Zhan-ying; ZHANG Hong-liang; LI Jin-jie; LI Zi-chao

    2016-01-01

    Low seed setting is one of the major hindrances to grain yield in rice. One of the main causes of low spikelet fertility (seed setting) is male sterility or pollen abortion. Notably, pollen abortion has been frequently observed in advanced progenies of rice. In the present study, 149 BC2F6 individuals with significant segregation in spikelet fertility were generated from the cross between N040212 (indica) and Nipponbare (japonica) and used for primary gene mapping. Three QTLs, qSS-1, qSS-7 and qSS-9, on chromosomes 1, 7 and 9, respectively, were found to be associated with seed setting. Recombinant analysis and physical mapping information from publicly available resources showed that the qSS-1, qSS-7 and qSS-9 loci mapped to intervals of 188, 701 and 3741 kb, respectively. The QTL qSS-1 responsible for seed setting was further fine mapped to 93.5 kb using a BC2F7 population of 1849 individuals. There are 16 putative genes in this 93.5 kb region. Pollen vitality tests and artificial pollination indicated that the male gamete has abnormal pollen while the female gamete was normal. These data showed that the low seed setting rate related to qSS-1 may be caused by abnormal pollen grains. These results will be useful for cloning and functional analysis of the target gene governing spikelet fertility (seed setting) and for understanding the genetic bases of pollen sterility.

  17. KAGIANA: An Excel-Based Tool for Retrieving Summary Information on Arabidopsis Genes

    Science.gov (United States)

    Ogata, Yoshiyuki; Sakurai, Nozomu; Aoki, Koh; Suzuki, Hideyuki; Okazaki, Koei; Saito, Kazuki; Shibata, Daisuke

    2009-01-01

    Various public databases provide Arabidopsis gene information via the internet. It is useful to abstract information obtained from such databases. We have developed the KAGIANA tool, which allows a user to retrieve summary information obtained from selective databases and to access pages for a gene of interest in those databases. The tool is based on Microsoft Excel and provides several macro programs for gene expression analyses. It can assist plant biologists in accessing omics information for plant biology. The KAGIANA tool is freely available at http://pmnedo.kazusa.or.jp/kagiana/. PMID:19043069

  18. [A fast algorithm to build a supertree with a set of gene trees].

    Science.gov (United States)

    Gorbunov, K Iu; Liubetskiĭ, V A

    2012-01-01

    Important desired properties of an algorithm to construct a supertree (species tree) by reconciling input trees are its low complexity and applicability to large biological data. In its common statement the problem is proved to be NP-hard, i.e. to have an exponential complexity in practice. We propose a reformulation of the supertree building problem that allows a computationally effective solution. We introduce a biologically natural requirement that the supertree is sought for such that it does not contain clades incompatible with those existing in the input trees. The algorithm was tested with simulated and biological trees and was shown to possess an almost square complexity even if horizontal transfers are allowed. If HGTs are not assumed, the algorithm is mathematically correct and possesses the longest running time of n^3 × [V0]^3, where n is the number of input trees and [V0] is the total number of species. The authors are unaware of analogous solutions in published evidence. The corresponding inferring program, its usage examples and manual are freely available at http://lab6.iitp.ru/en/super3gl. The available program does not implement HGTs. The generalized case is described in the publication "A tree nearest in average to a set of trees" (Information Transmission Problems, 2011).
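    The compatibility requirement mentioned above can be made concrete with a short sketch. It uses the standard notion that two clades (sets of species) are compatible when they are disjoint or one contains the other. This is only an illustration, not the authors' algorithm: the helper names are hypothetical, and the sketch assumes all clades are drawn over the same species set, whereas real supertree inputs usually cover different subsets of species.

```python
def compatible(clade_a, clade_b):
    """Two clades are compatible if they are disjoint or nested."""
    a, b = frozenset(clade_a), frozenset(clade_b)
    return a.isdisjoint(b) or a <= b or b <= a


def find_conflicts(candidate_clades, input_clades):
    """List every (candidate, input) pair of clades that cannot coexist in one tree."""
    return [
        (frozenset(c), frozenset(d))
        for c in candidate_clades
        for d in input_clades
        if not compatible(c, d)
    ]


if __name__ == "__main__":
    # Toy clades over species a..d; the names are invented for illustration.
    input_clades = [{"a", "b"}, {"a", "b", "c"}]
    candidates = [{"a", "b", "c", "d"}, {"b", "c"}]
    for cand, inp in find_conflicts(candidates, input_clades):
        print(sorted(cand), "conflicts with", sorted(inp))
```

    In this toy case the candidate clade {b, c} overlaps {a, b} without nesting, so a supertree satisfying the stated requirement could not contain it.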

  19. GabiPD – The GABI Primary Database integrates plant proteomic data with gene-centric information

    Directory of Open Access Journals (Sweden)

    Björn Usadel

    2012-07-01

    Full Text Available GabiPD is an integrative plant omics database that has been established as part of the German initiative for Genome Analysis of the Plant Biological System (GABI). Data from different omics disciplines are integrated and interactively visualized. Proteomics is represented by data and tools aiding studies on the identification of posttranslational modification and function of proteins. Annotated 2DE-gel images are offered to inspect protein sets expressed in different tissues of Arabidopsis thaliana and Brassica napus. From a given protein spot, a link will direct the user to the related GreenCard Gene entry where detailed gene-centric information will support the functional annotation. Besides MapMan- and GO-classification, information on conserved protein domains and on orthologs is integrated in this GreenCard service. Moreover, all other GabiPD data related to the gene, including transcriptomic data, as well as gene-specific links to external resources are provided. Researchers interested in plant protein phosphorylation will find information on potential MAP kinase substrates identified in different protein microarray studies integrated in GabiPD’s Phosphoproteomics page. These data can be easily compared to experimentally identified or predicted phosphorylation sites in PhosPhAt via the related Gene GreenCard. This will allow the selection of interesting candidates for further experimental validation of their phosphorylation.

  20. 77 FR 38634 - Request for Information: Collection and Use of Patient Work Information in the Clinical Setting...

    Science.gov (United States)

    2012-06-28

    ... information from providers of primary care and occupational medicine, vendors and creators of EHR software... From the Federal Register Online via the Government Publishing Office DEPARTMENT OF HEALTH AND...), Department of Health and Human Services (HHS). ACTION: Request for public comments. SUMMARY: The...

  1. A genetic signature of spina bifida risk from pathway-informed comprehensive gene-variant analysis.

    Directory of Open Access Journals (Sweden)

    Nicholas J Marini

    Full Text Available Despite compelling epidemiological evidence that folic acid supplements reduce the frequency of neural tube defects (NTDs) in newborns, common variant association studies with folate metabolism genes have failed to explain the majority of NTD risk. The contribution of rare alleles as well as genetic interactions within the folate pathway have not been extensively studied in the context of NTDs. Thus, we sequenced the exons in 31 folate-related genes in a 480-member NTD case-control population to identify the full spectrum of allelic variation and determine whether rare alleles or obvious genetic interactions within this pathway affect NTD risk. We constructed a pathway model, predetermined independent of the data, which grouped genes into coherent sets reflecting the distinct metabolic compartments in the folate/one-carbon pathway (purine synthesis, pyrimidine synthesis, and homocysteine recycling to methionine). By integrating multiple variants based on these groupings, we uncovered two provocative, complex genetic risk signatures. Interestingly, these signatures differed by race/ethnicity: a Hispanic risk profile pointed to alterations in purine biosynthesis, whereas that in non-Hispanic whites implicated homocysteine metabolism. In contrast, parallel analyses that focused on individual alleles, or individual genes, as the units by which to assign risk revealed no compelling associations. These results suggest that the ability to layer pathway relationships onto clinical variant data can be uniquely informative for identifying genetic risk as well as for generating mechanistic hypotheses. Furthermore, the identification of ethnic-specific risk signatures for spina bifida resonated with epidemiological data suggesting that the underlying pathogenesis may differ between Hispanic and non-Hispanic groups.

  2. Validation of a set of reference genes to study response to herbicide stress in grasses

    OpenAIRE

    Petit Cécile; Pernin Fanny; Heydel Jean-Marie; Délye Christophe

    2012-01-01

    Abstract Background Non-target-site based resistance to herbicides is a major threat to the chemical control of agronomically noxious weeds. This adaptive trait is endowed by differences in the expression of a number of genes in plants that are resistant or sensitive to herbicides. Quantification of the expression of such genes requires normalising qPCR data using reference genes with stable expression in the system studied as internal standards. The aim of this study was to validate referenc...

  3. Platform dependence of inference on gene-wise and gene-set involvement in human lung development

    Directory of Open Access Journals (Sweden)

    Kho Alvin T

    2009-06-01

    Full Text Available Abstract Background With the recent development of microarray technologies, the comparability of gene expression data obtained from different platforms poses an important problem. We evaluated two widely used platforms, Affymetrix U133 Plus 2.0 and the Illumina HumanRef-8 v2 Expression Bead Chips, for comparability in a biological system in which changes may be subtle, namely fetal lung tissue as a function of gestational age. Results We performed the comparison via sequence-based probe matching between the two platforms. "Significance grouping" was defined as a measure of comparability. Using both expression correlation and significance grouping as measures of comparability, we demonstrated that despite overall cross-platform differences at the single gene level, increased correlation between the two platforms was found in genes with higher expression level, higher probe overlap, and lower p-value. We also demonstrated that biological function as determined via KEGG pathways or GO categories is more consistent across platforms than single gene analysis. Conclusion We conclude that while the comparability of the platforms at the single gene level may be increased by increasing sample size, they are highly comparable ontologically even for subtle differences in a relatively small sample size. Biologically relevant inference should therefore be reproducible across laboratories using different platforms.
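    The "expression correlation" measure of comparability can be sketched as a per-gene Pearson correlation between the two platforms over the same samples. This is an assumption about how such a comparison might be set up, not the authors' pipeline: the function names, the dictionary layout, and the idea that probes have already been matched to shared gene identifiers are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def cross_platform_correlation(platform_a, platform_b):
    """Correlate each gene measured on both platforms.

    platform_a, platform_b : dict mapping gene id -> expression values,
    listed in the same sample order on both platforms (assumed layout).
    """
    shared = platform_a.keys() & platform_b.keys()
    return {gene: pearson(platform_a[gene], platform_b[gene]) for gene in sorted(shared)}


if __name__ == "__main__":
    # Invented values for three samples; gene ids are placeholders.
    affy = {"G1": [1.0, 2.0, 3.0], "G2": [1.0, 2.0, 3.0], "G3": [5.0, 5.0, 6.0]}
    illumina = {"G1": [2.0, 4.0, 6.0], "G2": [3.0, 2.0, 1.0]}
    print(cross_platform_correlation(affy, illumina))
```

    Genes present on only one platform (G3 in the toy data) are dropped, which mirrors the restriction to sequence-matched probes described in the abstract.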

  4. The Information Seeking and Use of English Language Learners in a High School Setting

    Science.gov (United States)

    Kim, Sung Un

    2010-01-01

    This study examines the information seeking and use behaviors of English language learners (ELLs) while performing a research task, using Vygotsky's Zone of Proximal Development and Kuhlthau's Information Search Process as theoretical frameworks. The research tasks implemented in this study were curriculum based units where students engaged a…

  5. The Schizophrenia-Associated BRD1 Gene Regulates Behavior, Neurotransmission, and Expression of Schizophrenia Risk Enriched Gene Sets in Mice

    DEFF Research Database (Denmark)

    Qvist, Per; Christensen, Jane Hvarregaard; Vardya, Irina;

    2016-01-01

    BACKGROUND: The schizophrenia-associated BRD1 gene encodes a transcriptional regulator whose comprehensive chromatin interactome is enriched with schizophrenia risk genes. However, the biology underlying the disease association of BRD1 remains speculative. METHODS: This study assessed......-inhibition imbalances involving loss of parvalbumin immunoreactive interneurons. RNA-sequencing analyses of cortical and striatal micropunches from Brd1(+/-) and wild-type mice revealed differential expression of genes enriched for schizophrenia risk, including several schizophrenia genome-wide association study risk...... the transcriptional drive of a schizophrenia-associated BRD1 risk variant in vitro. Accordingly, to examine the effects of reduced Brd1 expression, we generated a genetically modified Brd1(+/-) mouse and subjected it to behavioral, electrophysiological, molecular, and integrative genomic analyses with focus...

  6. Information resource preferences by general pediatricians in office settings: a qualitative study

    Directory of Open Access Journals (Sweden)

    Lehmann Harold P

    2005-10-01

    Full Text Available Abstract Background Information needs and resource preferences of office-based general pediatricians have not been well characterized. Methods Data collected from a sample of twenty office-based urban/suburban general pediatricians consisted of: (a) a demographic survey about participants' practice and computer use, (b) semi-structured interviews on their use of different types of information resources and (c) semi-structured interviews on perceptions of information needs and resource preferences in response to clinical vignettes representing cases in Genetics and Infectious Diseases. Content analysis of interviews provided participants' perceived use of resources and their perceived questions and preferred resources in response to vignettes. Results Participants' average time in practice was 15.4 years (2–28 years). All had in-office online access. Participants identified specialist/generalist colleagues, general/specialty pediatric texts, drug formularies, federal government/professional organization Websites and medical portals (when available) as preferred information sources. They did not identify decision-making texts, evidence-based reviews, journal abstracts, medical librarians or consumer health information for routine office use. In response to clinical vignettes in Genetics and Infectious Diseases, participants identified Question Types about patient-specific (diagnosis, history and findings) and general medical (diagnostic, therapeutic and referral guidelines) information. They identified specialists and specialty textbooks, history and physical examination, colleagues and general pediatric textbooks, and federal and professional organizational Websites as information sources. Participants with access to portals identified them as information resources in lieu of texts. For Genetics vignettes, participants identified questions about prenatal history, disease etiology and treatment guidelines. For Genetics vignettes, they identified

  7. Different gene sets contribute to different symptom dimensions of depression and anxiety

    NARCIS (Netherlands)

    van Veen, Tineke; Goeman, Jelle J.; Monajemi, Ramin; Wardenaar, Klaas J.; Hartman, Catharina A.; Snieder, Harold; Nolte, Ilja M.; Penninx, Brenda W. J. H.; Zitman, Frans G.

    2012-01-01

    Although many genetic association studies have been carried out, it remains unclear which genes contribute to depression. This may be due to heterogeneity of the DSM-IV category of depression. Specific symptom-dimensions provide a more homogenous phenotype. Furthermore, as effects of individual gene

  8. A mixture model-based strategy for selecting sets of genes in multiclass response microarray experiments.

    Science.gov (United States)

    Broët, Philippe; Lewin, Alex; Richardson, Sylvia; Dalmasso, Cyril; Magdelenat, Henri

    2004-11-01

    Multiclass response (MCR) experiments are those in which there are more than two classes to be compared. In these experiments, though the null hypothesis is simple, there are typically many patterns of gene expression changes across the different classes that lead to complex alternatives. In this paper, we propose a new strategy for selecting genes in MCR that is based on a flexible mixture model for the marginal distribution of a modified F-statistic. Using this model, false positive and negative discovery rates can be estimated and combined to produce a rule for selecting a subset of genes. Moreover, the method proposed allows calculation of these rates for any predefined subset of genes. We illustrate the performance of our approach using simulated datasets and a real breast cancer microarray dataset. In this latter study, we investigate predefined subsets of genes and point out interesting differences between three distinct biological pathways. http://www.bgx.org.uk/software.html

  9. Minimum information about a marker gene sequence (MIMARKS) and minimum information about any (x) sequence (MIxS) specifications

    DEFF Research Database (Denmark)

    Yilmaz, Pelin; Kottmann, Renzo; Field, Dawn

    2011-01-01

    Here we present a standard developed by the Genomic Standards Consortium (GSC) for reporting marker gene sequences--the minimum information about a marker gene sequence (MIMARKS). We also introduce a system for describing the environment from which a biological sample originates. The 'environmental...... present the minimum information about any (x) sequence (MIxS). Adoption of MIxS will enhance our ability to analyze natural genetic diversity documented by massive DNA sequencing efforts from myriad ecosystems in our ever-changing biosphere....

  10. Comparative mitogenomic analyses of three scallops (Bivalvia: Pectinidae) reveal high level variation of genomic organization and a diversity of transfer RNA gene sets

    Directory of Open Access Journals (Sweden)

    Kong Xiaoyu

    2009-05-01

    exhibit a high level of genomic variation and a diversity of tRNA gene sets, characterized by extensive translocation of genes. These features provide useful clues and information for evolutionary analysis of scallop mitogenomes.

  11. Genetic network and gene set enrichment analysis to identify biomarkers related to cigarette smoking and lung cancer.

    Science.gov (United States)

    Fang, Xiaocong; Netzer, Michael; Baumgartner, Christian; Bai, Chunxue; Wang, Xiangdong

    2013-02-01

    Cigarette smoking is the best-established risk factor for the development of lung cancer, while the related genetic mechanisms are still unclear. The preprocessed microarray expression dataset was downloaded from the Gene Expression Omnibus database. Samples were classified according to disease state, stage and smoking state. A new computational strategy was applied for the identification and biological interpretation of new candidate genes in lung cancer and smoking by coupling a network-based approach with gene set enrichment analysis. Network analysis was performed by pair-wise comparison according to the disease states (tumor or normal), smoking states (current smokers, nonsmokers or former smokers), or the disease stage (stages I-IV). The most activated metabolic pathways were identified by gene set enrichment analysis. Panels of top ranked gene candidates in smoking or cancer development were identified, including genes involved in cell proliferation and drug metabolism like cytochrome P450 and WW domain containing transcription regulator 1. Semaphorin 5A and protein phosphatase 1F are the common genes represented as major hubs in both the smoking and cancer related networks. Six pathways (cell cycle, DNA replication, RNA transport, protein processing in endoplasmic reticulum, vascular smooth muscle contraction and endocytosis) were commonly involved in smoking and lung cancer when comparing the top ten selected pathways. This bioinformatics approach to biomarker identification and validation can probe the genetic relationships between cigarette smoking and lung cancer. Our studies indicate that disease-specific network biomarkers, interactions between genes/proteins, or cross-talk between pathways provide more specific value for the development of precision therapies for lung cancer. Copyright © 2012 Elsevier Ltd. All rights reserved.

  12. Sharing clinical information across care settings: the birth of an integrated assessment system

    Directory of Open Access Journals (Sweden)

    Henrard Jean-Claude

    2009-04-01

    Background: Population ageing, the emergence of chronic illness, and the shift away from institutional care challenge conventional approaches to assessment systems, which traditionally are problem and setting specific. Methods: From 2002, the interRAI research collaborative undertook development of a suite of assessment tools to support assessment and care planning of persons with chronic illness, frailty, disability, or mental health problems across care settings. The suite constitutes an early example of a "third generation" assessment system. Results: The rationale and development strategy for the suite is described, together with a description of potential applications. To date, ten instruments comprise the suite, each comprising "core" items shared among the majority of instruments and "optional" items that are specific to particular care settings or situations. Conclusion: This comprehensive suite offers the opportunity for integrated multi-domain assessment, enabling electronic clinical records, data transfer, ease of interpretation and streamlined training.

  13. ReMashed – Recommendation Approaches for Mash-Up Personal Learning Environments in Formal and Informal Learning Settings

    NARCIS (Netherlands)

    Drachsler, Hendrik; Pecceu, Dries; Arts, Tanja; Hutten, Edwin; Rutledge, Lloyd; Van Rosmalen, Peter; Hummel, Hans; Koper, Rob

    2009-01-01

    Drachsler, H., Pecceu, D., Arts, T., Hutten, E., Rutledge, L., Van Rosmalen, P., Hummel, H. G. K., & Koper, R. (2009). ReMashed – Recommendation Approaches for Mash-Up Personal Learning Environments in Formal and Informal Learning Settings. Presentation at the 2nd Workshop Mash-Up Personal Learning

  15. ReMashed – Recommendation Approaches for Mash-Up Personal Learning Environments in Formal and Informal Learning Settings

    NARCIS (Netherlands)

    Drachsler, Hendrik; Pecceu, Dries; Arts, Tanja; Hutten, Edwin; Rutledge, Lloyd; Van Rosmalen, Peter; Hummel, Hans; Koper, Rob

    2009-01-01

    Drachsler, H., Pecceu, D., Arts, T., Hutten, E., Rutledge, L., Van Rosmalen, P., Hummel, H. G. K., & Koper, R. (2009). ReMashed – Recommendation Approaches for Mash-Up Personal Learning Environments in Formal and Informal Learning Settings. In F. Wild, M. Kalz, M. Palmér & D. Müller (Eds.), Proceedi

  16. Similarity-potency trees: a method to search for SAR information in compound data sets and derive SAR rules.

    Science.gov (United States)

    Wawer, Mathias; Bajorath, Jürgen

    2010-08-23

    An intuitive and generally applicable analysis method, termed similarity-potency tree (SPT), is introduced to mine structure-activity relationship (SAR) information in compound data sets of any source. Only compound potency values and nearest-neighbor similarity relationships are considered. Rather than analyzing a data set as a whole, in part overlapping compound neighborhoods are systematically generated and represented as SPTs. This local analysis scheme simplifies the evaluation of SAR information and SPTs of high SAR information content are easily identified. By inspecting only a limited number of compound neighborhoods, it is also straightforward to determine whether data sets contain only little or no interpretable SAR information. Interactive analysis of SPTs is facilitated by reading the trees in two directions, which makes it possible to extract SAR rules, if available, in a consistent manner. The simplicity and interpretability of the data structure and the ease of calculation are characteristic features of this approach. We apply the methodology to high-throughput screening and lead optimization data sets, compare the approach to standard clustering techniques, illustrate how SAR rules are derived, and provide some practical guidance how to best utilize the methodology. The SPT program is made freely available to the scientific community.

  17. Developing Information Literacy through Independent Learning Projects in a UK Setting: Pilot Projects for Year 9 and Year 6 Pupils

    Science.gov (United States)

    Jones, Rebecca

    2010-01-01

    Two information literacy skills pilot projects are being undertaken at Malvern St James School (MSJ) with Year 6 and Year 9 pupils during 2009-10. The projects encourage the development of independent learning skills, with pupils planning, managing and executing both the research and practical elements of their project. Each pupil sets their own…

  18. Predictive information speeds up visual awareness in an individuation task by modulating threshold setting, not processing efficiency.

    Science.gov (United States)

    De Loof, Esther; Van Opstal, Filip; Verguts, Tom

    2016-04-01

    Theories on visual awareness claim that predicted stimuli reach awareness faster than unpredicted ones. In the current study, we disentangle whether prior information about the upcoming stimulus affects visual awareness of stimulus location (i.e., individuation) by modulating processing efficiency or threshold setting. Analogous research on stimulus identification revealed that prior information modulates threshold setting. However, as identification and individuation are two functionally and neurally distinct processes, the mechanisms underlying identification cannot simply be extrapolated directly to individuation. The goal of this study was therefore to investigate how individuation is influenced by prior information about the upcoming stimulus. To do so, a drift diffusion model was fitted to estimate the processing efficiency and threshold setting for predicted versus unpredicted stimuli in a cued individuation paradigm. Participants were asked to locate a picture, following a cue that was congruent, incongruent or neutral with respect to the picture's identity. Pictures were individuated faster in the congruent and neutral condition compared to the incongruent condition. In the diffusion model analysis, the processing efficiency was not significantly different across conditions. However, the threshold setting was significantly higher following an incongruent cue compared to both congruent and neutral cues. Our results indicate that predictive information about the upcoming stimulus influences visual awareness by shifting the threshold for individuation rather than by enhancing processing efficiency.
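
    The distinction the abstract draws between threshold setting and processing efficiency can be illustrated with a generic two-boundary diffusion simulation. This is only a sketch and not the authors' fitting procedure: the parameter names (`drift` for processing efficiency, `threshold` for the decision boundary), the numeric values, and the Euler time step are all assumptions chosen for illustration.

```python
import random


def simulate_trial(drift, threshold, rng, dt=0.005, noise=1.0, max_time=10.0):
    """Return the decision time of one trial of a two-boundary diffusion process.

    Evidence starts at 0 and accumulates until it reaches +threshold or
    -threshold (or until max_time, a safety cap for this sketch).
    """
    evidence, elapsed = 0.0, 0.0
    step_sd = noise * dt ** 0.5  # standard deviation of the noise per time step
    while abs(evidence) < threshold and elapsed < max_time:
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
    return elapsed


def mean_decision_time(drift, threshold, n_trials=500, seed=0):
    """Average decision time over n_trials simulated trials."""
    rng = random.Random(seed)
    total = sum(simulate_trial(drift, threshold, rng) for _ in range(n_trials))
    return total / n_trials


# Same drift (processing efficiency), different thresholds (hypothetical values).
low_threshold_rt = mean_decision_time(drift=1.0, threshold=1.0)
high_threshold_rt = mean_decision_time(drift=1.0, threshold=1.5)
print(low_threshold_rt, high_threshold_rt)
```

    With the drift held fixed, raising the threshold alone lengthens the simulated decision times, which is the qualitative pattern the abstract reports for incongruent cues.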

  19. A database paradigm for the management of DICOM-RT structure sets using a geographic information system

    Science.gov (United States)

    Shao, Weber; Kupelian, Patrick A.; Wang, Jason; Low, Daniel A.; Ruan, Dan

    2014-03-01

    We devise a paradigm for representing the DICOM-RT structure sets in a database management system, in such way that secondary calculations of geometric information can be performed quickly from the existing contour definitions. The implementation of this paradigm is achieved using the PostgreSQL database system and the PostGIS extension, a geographic information system commonly used for encoding geographical map data. The proposed paradigm eliminates the overhead of retrieving large data records from the database, as well as the need to implement various numerical and data parsing routines, when additional information related to the geometry of the anatomy is desired.

  20. Identification of candidate genes for prostate cancer-risk SNPs utilizing a normal prostate tissue eQTL data set

    Science.gov (United States)

    Thibodeau, S. N.; French, A. J.; McDonnell, S. K.; Cheville, J.; Middha, S.; Tillmans, L.; Riska, S.; Baheti, S.; Larson, M. C.; Fogarty, Z.; Zhang, Y.; Larson, N.; Nair, A.; O'Brien, D.; Wang, L.; Schaid, D J.

    2015-01-01

    Multiple studies have identified loci associated with the risk of developing prostate cancer but the associated genes are not well studied. Here we create a normal prostate tissue-specific eQTL data set and apply this data set to previously identified prostate cancer (PrCa)-risk SNPs in an effort to identify candidate target genes. The eQTL data set is constructed by the genotyping and RNA sequencing of 471 samples. We focus on 146 PrCa-risk SNPs, including all SNPs in linkage disequilibrium with each risk SNP, resulting in 100 unique risk intervals. We analyse cis-acting associations where the transcript is located within 2 Mb (±1 Mb) of the risk SNP interval. Of all SNP–gene combinations tested, 41.7% of SNPs demonstrate a significant eQTL signal after adjustment for sample histology and 14 expression principal component covariates. Of the 100 PrCa-risk intervals, 51 have a significant eQTL signal and these are associated with 88 genes. This study provides a rich resource to study biological mechanisms underlying genetic risk to PrCa. PMID:26611117
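
    The cis-window criterion described above (transcript within ±1 Mb of the risk SNP interval) can be sketched as a simple coordinate check. The tuple layout, the function name, and the example coordinates are hypothetical, and treating any overlap with the window as "within" is an assumption; the abstract does not say whether overlap or full containment was required.

```python
FLANK_BP = 1_000_000  # ±1 Mb on each side of the risk interval, as in the abstract


def in_cis_window(transcript, interval, flank=FLANK_BP):
    """Return True if the transcript overlaps the risk interval extended by `flank`.

    Both arguments are (chromosome, start, end) tuples in base pairs
    (a hypothetical layout used only for this sketch).
    """
    t_chrom, t_start, t_end = transcript
    i_chrom, i_start, i_end = interval
    if t_chrom != i_chrom:
        return False
    window_start = max(1, i_start - flank)
    window_end = i_end + flank
    # Standard interval-overlap test against the extended window.
    return t_start <= window_end and t_end >= window_start


# Made-up coordinates, for illustration only.
risk_interval = ("chr8", 128_000_000, 128_050_000)
near_transcript = ("chr8", 128_900_000, 128_950_000)
far_transcript = ("chr8", 130_000_000, 130_020_000)
other_chromosome = ("chr1", 128_000_000, 128_010_000)

print(in_cis_window(near_transcript, risk_interval))
print(in_cis_window(far_transcript, risk_interval))
```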

  1. 75 FR 2013 - Health Information Technology: Initial Set of Standards, Implementation Specifications, and...

    Science.gov (United States)

    2010-01-13

    ... health record, including for the segmentation and protection from disclosure of specific and sensitive... patient demographic data, including, at a minimum, race, ethnicity, primary language, and gender... levels. For example, one organization may use an information model to describe patient...

  2. Discovery of gene-gene interactions across multiple independent data sets of late onset Alzheimer disease from the Alzheimer Disease Genetics Consortium.

    Science.gov (United States)

    Hohman, Timothy J; Bush, William S; Jiang, Lan; Brown-Gentry, Kristin D; Torstenson, Eric S; Dudek, Scott M; Mukherjee, Shubhabrata; Naj, Adam; Kunkle, Brian W; Ritchie, Marylyn D; Martin, Eden R; Schellenberg, Gerard D; Mayeux, Richard; Farrer, Lindsay A; Pericak-Vance, Margaret A; Haines, Jonathan L; Thornton-Wells, Tricia A

    2016-02-01

    Late-onset Alzheimer disease (AD) has a complex genetic etiology, involving locus heterogeneity, polygenic inheritance, and gene-gene interactions; however, the investigation of interactions in recent genome-wide association studies has been limited. We used a biological knowledge-driven approach to evaluate gene-gene interactions for consistency across 13 data sets from the Alzheimer Disease Genetics Consortium. Fifteen single nucleotide polymorphism (SNP)-SNP pairs within 3 gene-gene combinations were identified: SIRT1 × ABCB1, PSAP × PEBP4, and GRIN2B × ADRA1A. In addition, we extend a previously identified interaction from an endophenotype analysis between RYR3 × CACNA1C. Finally, post hoc gene expression analyses of the implicated SNPs further implicate SIRT1 and ABCB1, and implicate CDH23 which was most recently identified as an AD risk locus in an epigenetic analysis of AD. The observed interactions in this article highlight ways in which genotypic variation related to disease may depend on the genetic context in which it occurs. Further, our results highlight the utility of evaluating genetic interactions to explain additional variance in AD risk and identify novel molecular mechanisms of AD pathogenesis.

  3. Candidate genes for chronic obstructive pulmonary disease in two large data sets

    DEFF Research Database (Denmark)

    Bakke, P S; Zhu, G; Gulsvik, A;

    2011-01-01

    to these phenotypes in this first study were tested in a second, family based, study that included 635 pedigrees with 1910 individuals. Significant associations to the binary COPD phenotype in both populations were seen for STAT1 (rs13010343) and NFKBIB/SIRT2 (rs2241704) (p... of the GC gene were significantly associated with FEV1 in percent predicted and FEV1/FVC, respectively in both populations (pSIRT2, and GC genes in two independent populations, the associations of the former two genes...

  4. Implementation of BacMam virus gene delivery technology in a drug discovery setting.

    Science.gov (United States)

    Kost, Thomas A; Condreay, J Patrick; Ames, Robert S; Rees, Stephen; Romanos, Michael A

    2007-05-01

    Membrane protein targets constitute a key segment of drug discovery portfolios and significant effort has gone into increasing the speed and efficiency of pursuing these targets. However, issues still exist in routine gene expression and stable cell-based assay development for membrane proteins, which are often multimeric or toxic to host cells. To enhance cell-based assay capabilities, modified baculovirus (BacMam virus) gene delivery technology has been successfully applied to the transient expression of target proteins in mammalian cells. Here, we review the development, full implementation and benefits of this platform-based gene expression technology in support of SAR and HTS assays across GlaxoSmithKline.

  5. Identification and Validation of a New Set of Five Genes for Prediction of Risk in Early Breast Cancer

    Directory of Open Access Journals (Sweden)

    Giorgio Mustacchi

    2013-05-01

    Molecular tests predicting the outcome of breast cancer patients based on gene expression levels can be used to assist in making treatment decisions after consideration of conventional markers. In this study we identified a subset of 20 mRNAs differentially regulated in breast cancer by analyzing several publicly available array gene expression datasets using R/Bioconductor packages. Using RTqPCR we evaluated 261 consecutive invasive breast cancer cases, not selected for age, adjuvant treatment, nodal and estrogen receptor status, from paraffin embedded sections. The biological samples dataset was split into a training set (137 cases) and a validation set (124 cases). The gene signature was developed on the training set, and a multivariate stepwise Cox analysis selected five genes independently associated with DFS: FGF18 (HR = 1.13, p = 0.05), BCL2 (HR = 0.57, p = 0.001), PRC1 (HR = 1.51, p = 0.001), MMP9 (HR = 1.11, p = 0.08), SERF1A (HR = 0.83, p = 0.007). These five genes were combined into a linear score (signature) weighted according to the coefficients of the Cox model, as: 0.125FGF18 − 0.560BCL2 + 0.409PRC1 + 0.104MMP9 − 0.188SERF1A (HR = 2.7, 95% CI = 1.9–4.0, p < 0.001). The signature was then evaluated on the validation set, assessing the discrimination ability by a Kaplan Meier analysis, using the same cut-offs classifying patients at low, intermediate or high risk of disease relapse as defined on the training set (p < 0.001). Our signature, after further clinical validation, could be proposed as a prognostic signature for disease free survival in breast cancer patients where the indication for adjuvant chemotherapy added to endocrine treatment is uncertain.
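
    The linear score quoted in the abstract can be written out as a small function. The coefficients below are the ones stated above. The expression values and the two cut-offs are made up for illustration, because the abstract does not report the cut-offs or the measurement scale; the function names are hypothetical as well.

```python
# Cox-model weights exactly as quoted in the abstract.
COEFFICIENTS = {
    "FGF18": 0.125,
    "BCL2": -0.560,
    "PRC1": 0.409,
    "MMP9": 0.104,
    "SERF1A": -0.188,
}


def signature_score(expression):
    """Weighted linear score for one patient.

    `expression` maps gene symbol -> expression level; the scale of these
    values is an assumption, since the abstract does not specify it.
    """
    missing = set(COEFFICIENTS) - set(expression)
    if missing:
        raise KeyError(f"missing expression values for: {sorted(missing)}")
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def risk_group(score, low_cutoff, high_cutoff):
    """Assign a risk class from two cut-offs (hypothetical; not given in the abstract)."""
    if score < low_cutoff:
        return "low"
    if score < high_cutoff:
        return "intermediate"
    return "high"


# Made-up expression values for a single patient.
example = {"FGF18": 1.0, "BCL2": 2.0, "PRC1": 1.5, "MMP9": 0.5, "SERF1A": 1.0}
score = signature_score(example)
print(round(score, 4), risk_group(score, low_cutoff=-0.3, high_cutoff=0.3))
```

    Negative weights (BCL2, SERF1A) correspond to hazard ratios below 1 in the abstract, so higher expression of those genes lowers the score.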

  6. Annotation of gene function in citrus using gene expression information and co-expression networks

    OpenAIRE

    Wong, Darren CJ; Sweetman, Crystal; Ford, Christopher M.

    2014-01-01

    Background The genus Citrus encompasses major cultivated plants such as sweet orange, mandarin, lemon and grapefruit, among the world’s most economically important fruit crops. With increasing volumes of transcriptomics data available for these species, Gene Co-expression Network (GCN) analysis is a viable option for predicting gene function at a genome-wide scale. GCN analysis is based on a “guilt-by-association” principle whereby genes encoding proteins involved in similar and/or related bi...

  8. Assessing the Association of Mitochondrial Genetic Variation With Primary Open-Angle Glaucoma Using Gene-Set Analyses.

    Science.gov (United States)

    Khawaja, Anthony P; Cooke Bailey, Jessica N; Kang, Jae Hee; Allingham, R Rand; Hauser, Michael A; Brilliant, Murray; Budenz, Donald L; Christen, William G; Fingert, John; Gaasterland, Douglas; Gaasterland, Terry; Kraft, Peter; Lee, Richard K; Lichter, Paul R; Liu, Yutao; Medeiros, Felipe; Moroi, Syoko E; Richards, Julia E; Realini, Tony; Ritch, Robert; Schuman, Joel S; Scott, William K; Singh, Kuldev; Sit, Arthur J; Vollrath, Douglas; Wollstein, Gadi; Zack, Donald J; Zhang, Kang; Pericak-Vance, Margaret; Weinreb, Robert N; Haines, Jonathan L; Pasquale, Louis R; Wiggs, Janey L

    2016-09-01

    Recent studies indicate that mitochondrial proteins may contribute to the pathogenesis of primary open-angle glaucoma (POAG). In this study, we examined the association between POAG and common variations in gene-encoding mitochondrial proteins. We examined genetic data from 3430 POAG cases and 3108 controls derived from the combination of the GLAUGEN and NEIGHBOR studies. We constructed biological-system coherent mitochondrial nuclear-encoded protein gene-sets by intersecting the MitoCarta database with the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. We examined the mitochondrial gene-sets for association with POAG and with normal-tension glaucoma (NTG) and high-tension glaucoma (HTG) subsets using Pathway Analysis by Randomization Incorporating Structure. We identified 22 KEGG pathways with significant mitochondrial protein-encoding gene enrichment, belonging to six general biological classes. Among the pathway classes, mitochondrial lipid metabolism was associated with POAG overall (P = 0.013) and with NTG (P = 0.0006), and mitochondrial carbohydrate metabolism was associated with NTG (P = 0.030). Examining the individual KEGG pathway mitochondrial gene-sets, fatty acid elongation and synthesis and degradation of ketone bodies, both lipid metabolism pathways, were significantly associated with POAG (P = 0.005 and P = 0.002, respectively) and NTG (P = 0.0004 and P < 0.0001, respectively). Butanoate metabolism, a carbohydrate metabolism pathway, was significantly associated with POAG (P = 0.004), NTG (P = 0.001), and HTG (P = 0.010). We present an effective approach for assessing the contributions of mitochondrial genetic variation to open-angle glaucoma. Our findings support a role for mitochondria in POAG pathogenesis and specifically point to lipid and carbohydrate metabolism pathways as being important.
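
    The gene-set construction step, intersecting a mitochondrial gene catalogue with pathway membership, amounts to a per-pathway set intersection. The sketch below uses placeholder gene and pathway names and a hypothetical `min_size` filter; it does not reproduce the actual MitoCarta or KEGG contents or the authors' filtering rules.

```python
def mitochondrial_gene_sets(pathways, mito_genes, min_size=1):
    """Intersect each pathway's genes with a mitochondrial gene catalogue.

    Pathways whose intersection has fewer than `min_size` genes are dropped
    (the size filter is an assumption made for this sketch).
    """
    mito = set(mito_genes)
    gene_sets = {}
    for pathway, genes in pathways.items():
        overlap = set(genes) & mito
        if len(overlap) >= min_size:
            gene_sets[pathway] = overlap
    return gene_sets


# Placeholder identifiers, not real database entries.
mito_catalogue = {"GENE_A", "GENE_B", "GENE_C"}
pathways = {
    "pathway_1": {"GENE_A", "GENE_B", "GENE_X"},
    "pathway_2": {"GENE_X", "GENE_Y"},
    "pathway_3": {"GENE_C"},
}

gene_sets = mitochondrial_gene_sets(pathways, mito_catalogue)
print(sorted(gene_sets))
```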

  9. Candidate genes for chronic obstructive pulmonary disease in two large data sets

    DEFF Research Database (Denmark)

    Bakke, P S; Zhu, G; Gulsvik, A

    2011-01-01

    Lack of reproducibility of findings has been a criticism of genetic association studies in complex diseases like chronic obstructive pulmonary disease (COPD). We selected 257 polymorphisms of 16 genes with reported or potential relationships to COPD and genotyped these variants in a case-control s...... of the GC gene were significantly associated with FEV1 in percent predicted and FEV1/FVC, respectively in both populations (pgenes in two independent populations, the associations of the former two genes...

  10. Genome-wide survey and developmental expression mapping of zebrafish SET domain-containing genes

    National Research Council Canada - National Science Library

    Sun, Xiao-Jian; Xu, Peng-Fei; Zhou, Ting; Hu, Ming; Fu, Chun-Tang; Zhang, Yong; Jin, Yi; Chen, Yi; Chen, Sai-Juan; Huang, Qiu-Hua; Liu, Ting Xi; Chen, Zhu

    2008-01-01

    .... Since some of these genes have been revealed to be essential for embryonic development, we propose that the zebrafish, a vertebrate model organism possessing many advantages for developmental studies...

  11. Different gene sets contribute to different symptom dimensions of depression and anxiety

    OpenAIRE

    van Veen, Tineke; Goeman, Jelle J.; Monajemi, Ramin; Wardenaar, Klaas J; Hartman, Catharina A; Snieder, Harold; Nolte, Ilja M; Penninx, Brenda W. J. H.; Zitman, Frans G.

    2012-01-01

    Although many genetic association studies have been carried out, it remains unclear which genes contribute to depression. This may be due to heterogeneity of the DSM-IV category of depression. Specific symptom-dimensions provide a more homogenous phenotype. Furthermore, as effects of individual genes are small, analysis of genetic data at the pathway-level provides more power to detect associations and yield valuable biological insight. In 1,398 individuals with a Major Depressive Disorder, t...

  12. Prioritizing predicted cis-regulatory elements for co-expressed gene sets based on Lasso regression models.

    Science.gov (United States)

    Hu, Hong; Roqueiro, Damian; Dai, Yang

    2011-01-01

    Computational prediction of cis-regulatory elements for a set of co-expressed genes based on sequence analysis provides an overwhelming volume of potential transcription factor binding sites. It presents a challenge to prioritize transcription factors for regulatory functional studies. A novel approach based on the use of Lasso regression models is proposed to address this problem. We examine the ability of the Lasso model using time-course microarray data obtained from a comprehensive study of gene expression profiles in skin and mucosal wounds in mouse over all stages of wound healing.

  13. TropGENE-DB, a multi-tropical crop information system.

    Science.gov (United States)

    Ruiz, Manuel; Rouard, Mathieu; Raboin, Louis Marie; Lartaud, Marc; Lagoda, Pierre; Courtois, Brigitte

    2004-01-01

    TropGENE-DB is a crop information system created to store genetic, molecular and phenotypic data of the numerous yet poorly documented tropical crop species. The most common data stored in TropGENE-DB are information on genetic resources (agro-morphological data, parentages, allelic diversity), molecular markers, genetic maps, results of quantitative trait loci analyses, data from physical mapping, sequences, genes, as well as the corresponding references. TropGENE-DB is organized on a crop basis with currently three running modules (sugarcane, cocoa and banana), with plans to create additional modules for rice, cotton, oil palm, coconut, rubber tree, pineapple, taro, yam and sorghum. The TropGENE-DB information system is accessible for consultation via the internet at http://tropgenedb.cirad.fr. Specific web consultation interfaces have been designed to allow quick consultations as well as complex queries.

  14. Mining disease genes using integrated protein-protein interaction and gene-gene co-regulation information.

    Science.gov (United States)

    Li, Jin; Wang, Limei; Guo, Maozu; Zhang, Ruijie; Dai, Qiguo; Liu, Xiaoyan; Wang, Chunyu; Teng, Zhixia; Xuan, Ping; Zhang, Mingming

    2015-01-01

    In humans, despite the rapid increase in disease-associated gene discovery, a large proportion of disease-associated genes are still unknown. Many network-based approaches have been used to prioritize disease genes. Many networks, such as the protein-protein interaction (PPI), KEGG, and gene co-expression networks, have been used. Expression quantitative trait loci (eQTLs) have been successfully applied for the determination of genes associated with several diseases. In this study, we constructed an eQTL-based gene-gene co-regulation network (GGCRN) and used it to mine for disease genes. We adopted the random walk with restart (RWR) algorithm to mine for genes associated with Alzheimer disease. Compared to the Human Protein Reference Database (HPRD) PPI network alone, the integrated HPRD PPI and GGCRN networks provided faster convergence and revealed new disease-related genes. Therefore, using the RWR algorithm for integrated PPI and GGCRN is an effective method for disease-associated gene mining.
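
    The random walk with restart (RWR) algorithm named above can be sketched in its standard iterative form: at each step the walker moves to a neighbour with probability 1 - r or jumps back to the seed genes with probability r. The toy network, the restart probability of 0.3, and the handling of isolated nodes are assumptions for illustration; the abstract does not give the authors' parameter values or network details.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for every node.

    adjacency: dict mapping node -> list of neighbouring nodes.
    seeds: set of nodes the walker restarts from (e.g. known disease genes).
    """
    nodes = sorted(adjacency)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: send its probability mass back to the seeds
                # (a choice made for this sketch so that mass is conserved).
                for n in nodes:
                    nxt[n] += (1 - restart) * p[u] * p0[n]
                continue
            share = (1 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Toy undirected network with placeholder node names.
network = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
scores = random_walk_with_restart(network, seeds={"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

    Candidate genes are then ranked by their steady-state probability; nodes that are well connected to the seeds score higher than peripheral ones.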

  15. Phospholipase C isozymes are deregulated in colorectal cancer--insights gained from gene set enrichment analysis of the transcriptome.

    Directory of Open Access Journals (Sweden)

    Stine A Danielsen

    Colorectal cancer (CRC) is one of the most common cancer types in developed countries. To identify molecular networks and biological processes that are deregulated in CRC compared to normal colonic mucosa, we applied Gene Set Enrichment Analysis to two independent transcriptome datasets, including a total of 137 CRC and ten normal colonic mucosa samples. Eighty-two gene sets as described by the Kyoto Encyclopedia of Genes and Genomes database had significantly altered gene expression in both datasets. These included networks associated with cell division, DNA maintenance, and metabolism. Among signaling pathways with known changes in key genes, the "Phosphatidylinositol signaling network", comprising part of the PI3K pathway, was found deregulated. The downregulated genes in this pathway included several members of the Phospholipase C protein family, and the reduced expression of two of these, PLCD1 and PLCE1, was successfully validated in CRC biopsies (n = 70) and cell lines (n = 19) by quantitative analyses. The repression of both genes was found to be associated with KRAS mutations (P = 0.005 and 0.006, respectively), and we observed that microsatellite stable carcinomas with reduced PLCD1 expression more frequently had TP53 mutations (P = 0.002). Promoter methylation analyses of PLCD1 and PLCE1 performed in cell lines and tumor biopsies revealed that methylation of PLCD1 can contribute to reduced expression in 40% of the microsatellite instable carcinomas. In conclusion, we have identified significantly deregulated pathways in CRC, and validated repression of PLCD1 and PLCE1 expression. This illustrates that the GSEA approach may guide discovery of novel biomarkers in cancer.

  16. High resolution remote sensing information identification for characterizing uranium mineralization setting in Namibia

    Science.gov (United States)

    Zhang, Jie-Lin; Wang, Jun-hu; Zhou, Mi; Huang, Yan-ju; Xuan, Yan-xiu; Wu, Ding

    2011-11-01

    Modern Earth Observation System (EOS) technology plays an important role in uranium geological exploration, and high resolution remote sensing, as one of the key parts of EOS, is vital for characterizing the spectral and spatial information of uranium mineralization factors. Using satellite high spatial resolution and hyperspectral remote sensing data (QuickBird, Radarsat2, ASTER), field spectral measurements (ASD data) and geological survey, this paper establishes the spectral identification characteristics of uranium mineralization factors, including six different types of alaskite, the lower and upper marble of the Rössing formation, dolerite, alkali metasomatism, hematization and chloritization, in the central zone of the Damara Orogen, Namibia. Moreover, by adopting texture information identification technology, the geographical distribution zones of ore-controlling faults and the boundaries between different strata were delineated. Based on the above approaches, the remote sensing geological anomaly information and image interpretation signs of uranium mineralization factors were extracted, the metallogenic conditions were evaluated, and prospective areas were predicted.

  17. How to Set Up Information Systems A Non-specialist's Guide to the Multiview Approach

    CERN Document Server

    Bell, Simon

    2012-01-01

    This introductory user's guide to systems analysis and systems design focuses on building sustainable information systems to meet tomorrow's needs. It shows how practitioners can apply multiple participatory perspectives in development, so as to avoid future problems. As a practical guide, it is presented to be readily comprehensible and is organized to enable users to concentrate on their goals efficiently, and with minimum theoretical elaboration. The chapters follow the sequence involved in planning an information system, explaining key words, the time involved in each step, ending with a t

  18. Improving gene regulatory network inference using network topology information.

    Science.gov (United States)

    Nair, Ajay; Chetty, Madhu; Wangikar, Pramod P

    2015-09-01

    Inferring the gene regulatory network (GRN) structure from data is an important problem in computational biology. However, it is a computationally complex problem and approximate methods such as heuristic search techniques, restriction of the maximum-number-of-parents (maxP) for a gene, or an optimal search under special conditions are required. The limitations of a heuristic search are well known but literature on the detailed analysis of the widely used maxP technique is lacking. The optimal search methods require large computational time. We report the theoretical analysis and experimental results of the strengths and limitations of the maxP technique. Further, using an optimal search method, we combine the strengths of the maxP technique and the known GRN topology to propose two novel algorithms. These algorithms are implemented in a Bayesian network framework and tested on biological, realistic, and in silico networks of different sizes and topologies. They overcome the limitations of the maxP technique and show superior computational speed when compared to the current optimal search algorithms.
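
    The effect of the maximum-number-of-parents (maxP) restriction on the search space can be made concrete by counting candidate parent sets per gene. The counting below is plain combinatorics rather than the authors' algorithm, and the network size of 20 genes is an arbitrary example.

```python
from math import comb


def candidate_parent_sets(n_genes, max_parents=None):
    """Number of possible parent sets for one gene in a network of n_genes.

    Without a limit, every subset of the other genes is a candidate.
    With max_parents = k, only subsets of size 0..k are considered.
    """
    others = n_genes - 1
    if max_parents is None:
        return 2 ** others
    limit = min(max_parents, others)
    return sum(comb(others, k) for k in range(limit + 1))


# Arbitrary example size: 20 genes, at most 3 parents per gene.
print(candidate_parent_sets(20))                  # unrestricted
print(candidate_parent_sets(20, max_parents=3))   # restricted by maxP
```

    For 20 genes the restriction reduces the candidates per gene from 524288 to 1160, which shows why maxP is widely used and also why true parent sets larger than the limit cannot be recovered.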

  19. A two-sample test for high-dimensional data with applications to gene-set testing

    CERN Document Server

    Chen, Song Xi; 10.1214/09-AOS716

    2010-01-01

    We propose a two-sample test for the means of high-dimensional data when the data dimension is much larger than the sample size. Hotelling's classical $T^2$ test does not work for this "large $p$, small $n$" situation. The proposed test does not require explicit conditions in the relationship between the data dimension and sample size. This offers much flexibility in analyzing high-dimensional data. An application of the proposed test is in testing significance for sets of genes which we demonstrate in an empirical study on a leukemia data set.
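    The abstract does not state the test statistic, so the sketch below is only an assumption about what a statistic of this kind can look like: an unbiased estimate of the squared distance between the two mean vectors, built from inner products between distinct observations so that no p x p covariance matrix has to be inverted. The permutation p-value is a stand-in chosen for simplicity here, not the authors' asymptotic procedure, and all function names are hypothetical.

```python
import numpy as np


def mean_difference_statistic(x, y):
    """Unbiased estimate of ||mu_x - mu_y||^2 from rows of x (n1 x p) and y (n2 x p)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[0], y.shape[0]
    gram_x = x @ x.T
    gram_y = y @ y.T
    # Use only inner products between distinct observations (off-diagonal terms).
    within_x = (gram_x.sum() - np.trace(gram_x)) / (n1 * (n1 - 1))
    within_y = (gram_y.sum() - np.trace(gram_y)) / (n2 * (n2 - 1))
    cross = (x @ y.T).sum() / (n1 * n2)
    return within_x + within_y - 2.0 * cross


def permutation_p_value(x, y, n_perm=1000, seed=0):
    """Permutation p-value for the statistic above (a simplification, see lead-in)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    observed = mean_difference_statistic(x, y)
    pooled = np.vstack([x, y])
    n1 = x.shape[0]
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(pooled.shape[0])
        stat = mean_difference_statistic(pooled[idx[:n1]], pooled[idx[n1:]])
        exceed += stat >= observed
    return (exceed + 1) / (n_perm + 1)
```

    For gene-set testing, x and y would hold the expression values of the genes in one set for the two groups of samples, so p is the size of the gene set and can far exceed the number of samples.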

  20. Hospital status admission determination: the use of Boolean logic, set theory, and information theory to improve accuracy.

    Science.gov (United States)

    Cohen, Daniel H

    2012-01-01

    To evaluate methods of logic, set theory, and information theory in developing a conceptual framework that would be useful in an educational process as well as in developing a consistent and rational method for hospital status determination. To implement these methods on a daily basis in interaction with nurse case managers, physicians, and in documentation of the process. A tertiary private, not-for-profit institution within the department of case management and utilization review. These methods were well accepted by those involved in the decision process and allowed a Case Management Assignment Protocol to function well in the hospital environment with a low level of disagreement and conflict. Medical information can be processed effectively with conceptual models of logic and information theory. The commercial screening systems in use are described well by set theory: they are intersecting sets of patient variables and characteristics. These methods can be used in educational processes in practice settings apart from those using the Case Management Assignment Protocol. They provide a basis for evaluating patients' presentations using important factors such as clinical uncertainty, patient-specific data, and reference to preexisting admission criteria.

  1. An ancestry informative marker set for determining continental origin: validation and extension using human genome diversity panels

    Directory of Open Access Journals (Sweden)

    Gregersen Peter K

    2009-07-01

    Background: Case-control genetic studies of complex human diseases can be confounded by population stratification. This issue can be addressed using panels of ancestry informative markers (AIMs) that can provide substantial population substructure information. Previously, we described a panel of 128 SNP AIMs that were designed as a tool for ascertaining the origins of subjects from Europe, Sub-Saharan Africa, the Americas, and East Asia. Results: In this study, genotypes from Human Genome Diversity Panel populations were used to further evaluate a 93 SNP AIM panel, a subset of the 128 AIM set, for distinguishing continental origins. Using both model-based and relatively model-independent methods, we here confirm the ability of this AIM set to distinguish diverse population groups that were not previously evaluated. This study included multiple population groups from Oceania, South Asia, East Asia, Sub-Saharan Africa, North and South America, and Europe. In addition, the 93 AIM set provides population substructure information that can, for example, distinguish Arab and Ashkenazi from Northern European population groups and Pygmy from other Sub-Saharan African population groups. Conclusion: These data provide additional support for using the 93 AIM set to efficiently identify continental subject groups for genetic studies, to identify study population outliers, and to control for admixture in association studies.

  2. Transcriptome datasets supply basic gene information for RNAi pest management and gene functional studies in Nephotettix cincticeps (Uhler)

    Institute of Scientific and Technical Information of China (English)

    CHEN Tai-yu; HOU Ji-xiang; LIN Yong-jun

    2016-01-01

    RNA interference (RNAi) technology has the potential to be used in pest management in crop production. Here, the transcriptome of Nephotettix cincticeps (Uhler) was deeply sequenced to investigate the systematic RNAi mechanism and candidate genes for dsRNA feeding. In our datasets, a total of 81225 transcripts were obtained, with lengths from 150 bp to about 4.2 kb. Almost all the genes related to the RNAi core pathway were shown to be present in the N. cincticeps transcriptome. Two transcripts, each encoding a systemic interference defective (SID) protein, were identified in our database, indicating that the systematic RNAi pathway can function effectively in N. cincticeps. Our datasets not only supply basic gene information for studies of gene expression and function in N. cincticeps, such as the control genes for gene expression analysis, but also provide candidate genes for RNAi pest management, such as the genes that encode P450 monooxygenase, V-ATPase and chitin synthase.

  3. Lessons learned obtaining informed consent in research with vulnerable populations in community health center settings

    Directory of Open Access Journals (Sweden)

    Riden Heather E

    2012-11-01

    Background: To improve equity in access to medical research, successful strategies are needed to recruit diverse populations. Here, we examine experiences of community health center (CHC) staff who guided an informed consent process to overcome recruitment barriers in a medical record review study. Methods: We conducted ten semi-structured interviews with CHC staff members. Interviews were audiotaped, transcribed, and structurally and thematically coded. We used NVivo, an ethnographic data management software program, to analyze themes related to recruitment challenges. Results: CHC interviewees reported that a key challenge to recruitment was the difficult balance between institutional review board (IRB) requirements for informed consent and conveying an appropriate level of risk to patients. CHC staff perceived that the requirements of IRB certification itself posed a barrier to allowing diverse staff to participate in recruitment efforts. Another key barrier to recruitment was the lack of updated contact information on CHC patients. CHC interviewees reported that the successes they experienced reflected an alignment between study aims and CHC goals, and trusted relationships between CHCs and staff and the patients they recruited. Conclusions: Making IRB training more accessible to CHC-based staff, improving consent form clarity for participants, and developing processes for routinely updating patient information would greatly lower recruitment barriers for diverse populations in health services research.

  4. Digital Resource Developments for Mathematics Education Involving Homework across Formal, Non-Formal and Informal Settings

    Science.gov (United States)

    Radovic, Slaviša; Passey, Don

    2016-01-01

    The aim of this paper is to explore further an under-developed area--how drivers of curriculum, pedagogy and assessment conceptions and practices shape the creation and uses of technologically based resources to support mathematics learning across informal, non-formal and formal learning environments. The paper considers: the importance of…

  5. Empowering Interviews: Narrative Interviews in the Study of Information Literacy in Everyday Life Settings

    Science.gov (United States)

    Eckerdal, Johanna Rivano

    2013-01-01

    Introduction: This paper presents a way to design and conduct interviews, within a sociocultural perspective, for studying information literacy practices in everyday life. Methods: A framework was developed combining a sociocultural perspective with a narrative interview. Interviewees were invited to participate by talking and using…

  6. Integration of Information and Communication Technology and Pupils' Motivation in a Physical Education Setting

    Science.gov (United States)

    Legrain, Pascal; Gillet, Nicolas; Gernigon, Christophe; Lafreniere, Marc-André

    2015-01-01

    The purpose of this study was to test an integrative model regarding the impact of information and communication technology (ICT) on achievement in physical education. Pupils' perceptions of autonomy-support from teacher, satisfaction of basic psychological needs, and self-determined motivation were considered to mediate the impact of ICT on…

  8. Informal interpreters in medical settings: a socio-comparative study of the Netherlands and Turkey

    NARCIS (Netherlands)

    Schouten, B.; Ross, J.; Zendedel, R.; Meeuwesen, L.

    2012-01-01

    Between 2008 and 2010, academics in five European countries collaborated on an EU-funded project, Training Intercultural and Bilingual Competences in Health and Social Care (TRICC). Among TRICC’s aims was to deepen understanding of informal interpreting through eliciting the perspectives of interpre

  9. Gene Set-Based Functionome Analysis of Pathogenesis in Epithelial Ovarian Serous Carcinoma and the Molecular Features in Different FIGO Stages

    Directory of Open Access Journals (Sweden)

    Chia-Ming Chang

    2016-06-01

    Serous carcinoma (SC) is the most common subtype of epithelial ovarian carcinoma and is divided into four stages by the Federation of Gynecologists and Obstetrics (FIGO) staging system. Currently, the molecular functions and biological processes of SC at different FIGO stages have not been quantified. Here, we conducted a whole-genome integrative analysis to investigate the functions of SC at different stages. The function, as defined by the GO term or canonical pathway gene set, was quantified by measuring the changes in the gene expressional order between cancerous and normal control states. The quantified function, i.e., the gene set regularity (GSR) index, was utilized to investigate the pathogenesis and functional regulation of SC at different FIGO stages. We showed that the informativeness of the GSR indices was sufficient for accurate pattern recognition and classification by machine learning. The function regularity presented by the GSR indices showed stepwise deterioration during SC progression from FIGO stage I to stage IV. The pathogenesis of SC was centered on cell cycle deregulation and accompanied by multiple functional aberrations as well as their interactions.

  10. HoxBlinc RNA recruits Set1/MLL complexes to activate Hox gene expression patterns and mesoderm lineage development

    Science.gov (United States)

    Deng, Changwang; Li, Ying; Zhou, Lei; Cho, Joonseok; Patel, Bhavita; Terada, Nao; Li, Yangqiu; Bungert, Jörg; Qiu, Yi; Huang, Suming

    2015-01-01

    Summary Trithorax proteins and long-intergenic noncoding RNAs are critical regulators of embryonic stem cell pluripotency; however, how they cooperatively regulate germ layer mesoderm specification remains elusive. We report here that HoxBlinc RNA first specifies Flk1+ mesoderm and then promotes hematopoietic differentiation through regulating hoxb gene pathways. HoxBlinc binds to the hoxb genes, recruits Setd1a/MLL1 complexes, and mediates long-range chromatin interactions to activate transcription of the hoxb genes. Depletion of HoxBlinc by shRNA-mediated KD or CRISPR-Cas9-mediated genetic deletion inhibits expression of hoxb genes and other factors regulating cardiac/hematopoietic differentiation. Reduced hoxb gene expression is accompanied by decreased recruitment of Set1/MLL1 and H3K4me3 modification, as well as by reduced chromatin loop formation. Re-expression of hoxb2-b4 genes in HoxBlinc-depleted embryoid bodies rescues Flk1+ precursors that undergo hematopoietic differentiation. Thus, HoxBlinc plays an important role in controlling hoxb transcription networks that mediate specification of mesoderm-derived Flk1+ precursors and differentiation of Flk1+ cells into hematopoietic lineages. PMID:26725110

  11. A set of vectors for introduction of antibiotic resistance genes by in vitro Cre-mediated recombination

    Directory of Open Access Journals (Sweden)

    Vassetzky Yegor S

    2008-12-01

    Background: Introduction of new antibiotic resistance genes into plasmids of interest is a frequent task in molecular cloning practice. Classical approaches involving digestion with restriction endonucleases and ligation are time-consuming. Findings: We have created a set of insertion vectors (pINS) carrying genes that provide resistance to various antibiotics (puromycin, blasticidin and G418) and containing a loxP site. Each vector (pINS-Puro, pINS-Blast or pINS-Neo) contains either a chloramphenicol or a kanamycin resistance gene and is unable to replicate in most E. coli strains because it contains a conditional R6Kγ replication origin. Introduction of the antibiotic resistance genes into the vector of interest is achieved by Cre-mediated recombination between the replication-incompetent pINS and a replication-competent target vector. The recombination mix is then transformed into E. coli and selected by the resistance marker (kanamycin or chloramphenicol) present in pINS, which allows recovery of the recombinant plasmids with 100% efficiency. Conclusion: Here we propose a simple strategy that allows the introduction of various antibiotic-resistance genes into any plasmid containing a replication origin, an ampicillin resistance gene and a loxP site.

  12. Application description and policy model in collaborative environment for sharing of information on epidemiological and clinical research data sets.

    Directory of Open Access Journals (Sweden)

    Elias César Araujo de Carvalho

    BACKGROUND: Sharing of epidemiological and clinical data sets among researchers is poor at best, to the detriment of science and the community at large. The purpose of this paper is therefore to (1) describe a novel Web application designed to share information on study data sets focusing on epidemiological clinical research in a collaborative environment and (2) create a policy model placing this collaborative environment into the current scientific social context. METHODOLOGY: The Database of Databases application was developed based on feedback from epidemiologists and clinical researchers requiring a Web-based platform that would allow for sharing of information about epidemiological and clinical study data sets in a collaborative environment. This platform should ensure that researchers can modify the information. Model-based predictions of the number of publications and funding resulting from combinations of different policy implementation strategies (for metadata and data sharing) were generated using System Dynamics modeling. PRINCIPAL FINDINGS: The application allows researchers to easily upload information about clinical study data sets, which is searchable and modifiable by other users in a wiki environment. All modifications are filtered by the database principal investigator in order to maintain quality control. The application has been extensively tested and currently contains 130 clinical study data sets from the United States, Australia, China and Singapore. Model results indicated that any policy implementation would be better than the current strategy, that metadata sharing is better than data sharing, and that combined policies achieve the best results in terms of publications. CONCLUSIONS: Based on our empirical observations and resulting model, the social network environment surrounding the application can assist epidemiologists and clinical researchers contribute and search for metadata in a collaborative environment, thus potentially

  13. Evaluation of the utility of gene expression and metabolic information for genomic prediction in maize.

    Science.gov (United States)

    Guo, Zhigang; Magwire, Michael M; Basten, Christopher J; Xu, Zhanyou; Wang, Daolong

    2016-12-01

    Predictive ability derived from gene expression and metabolic information was evaluated using genomic prediction methods based on datasets from a public maize panel. With the rapid development of high throughput biological technologies, information from gene expression and metabolites has received growing attention in plant genetics and breeding. In this study, we evaluated the utility of gene expression and metabolic information for genomic prediction using data obtained from a maize diversity panel. Our results show that, when used as predictor variables, gene expression levels and metabolite abundances provided reasonable predictive abilities relative to those based on genetic markers, although these values were not as large as those with genetic markers. Integrating gene expression levels and metabolite abundances with genetic markers significantly improved predictive abilities in comparison to the benchmark genomic best linear unbiased prediction model using genome-wide markers only. Predictive abilities based on gene expression and metabolites were trait-specific and were affected by the time of measurement and tissue samples as well as the number of genes and metabolites included in the model. In general, our results suggest that, rather than being conventionally used as intermediate phenotypes, gene expression and metabolic information can be used as predictors for genomic prediction and help improve genetic gains for complex traits in breeding programs.

  14. Gene expression profiling identifies a set of transcripts that are up-regulated in human testicular seminoma.

    Science.gov (United States)

    Yamada, Shigeyuki; Kohu, Kazuyoshi; Ishii, Tomohiko; Ishidoya, Shigeto; Ishidoya, Shigeru; Hiramatsu, Masayoshi; Kanto, Satoru; Fukuzaki, Atsushi; Adachi, Yutsu; Endoh, Mareyuki; Moriya, Takuya; Sasaki, Hiroki; Satake, Masanobu; Arai, Yoichi

    2004-10-31

    Seminoma constitutes one subtype of human testicular germ cell tumors and is uniformly composed of cells that are morphologically similar to the primordial germ cells and/or the cells in the carcinoma in situ. We performed a genome-wide exploration of the genes that are specifically up-regulated in seminoma by oligonucleotide-based microarray analysis. This revealed 106 genes that are significantly and consistently up-regulated in the seminomas compared to the adjacent normal tissues of the testes. The microarray data were validated by semi-quantitative RT-PCR analysis. Of the 106 genes, 42 mapped to a small number of specific chromosomal regions, namely, 1q21, 2p23, 6p21-22, 7p14-15, 12p11, 12p13, 12q13-14 and 22q12-13. This list of up-regulated genes may be useful in identifying the causative oncogene(s) and/or the origin of seminoma. Furthermore, immunohistochemical analysis revealed that the seminoma cells specifically expressed the six gene products that were selected randomly from the list. These proteins include CCND2 and DNMT3A and may be useful as molecular pathological markers of seminoma.

  15. rapidGSEA: Speeding up gene set enrichment analysis on multi-core CPUs and CUDA-enabled GPUs.

    Science.gov (United States)

    Hundt, Christian; Hildebrandt, Andreas; Schmidt, Bertil

    2016-09-23

    Gene Set Enrichment Analysis (GSEA) is a popular method to reveal significant dependencies between predefined sets of gene symbols and observed phenotypes by evaluating the deviation of gene expression values between cases and controls. An established measure of inter-class deviation, the enrichment score, is usually computed using a weighted running sum statistic over the whole set of gene symbols. Due to the lack of analytic expressions the significance of enrichment scores is determined using a non-parametric estimation of their null distribution by permuting the phenotype labels of the probed patients. Accordingly, GSEA is a time-consuming task due to the large number of required permutations to accurately estimate the nominal p-value - a circumstance that is even more pronounced during multiple hypothesis testing since its estimate is lower-bounded by the inverse number of samples in permutation space. We present rapidGSEA - a software suite consisting of two tools for facilitating permutation-based GSEA: cudaGSEA and ompGSEA. cudaGSEA is a CUDA-accelerated tool using fine-grained parallelization schemes on massively parallel architectures while ompGSEA is a coarse-grained multi-threaded tool for multi-core CPUs. Nominal p-value estimation of 4,725 gene sets on a data set consisting of 20,639 unique gene symbols and 200 patients (183 cases + 17 controls) each probing one million permutations takes 19 hours on a Xeon CPU and less than one hour on a GeForce Titan X GPU while the established GSEA tool from the Broad Institute (broadGSEA) takes roughly 13 days. cudaGSEA outperforms broadGSEA by around two orders-of-magnitude on a single Tesla K40c or GeForce Titan X GPU. ompGSEA provides around one order-of-magnitude speedup to broadGSEA on a standard Xeon CPU. The rapidGSEA suite is open-source software and can be downloaded at https://github.com/gravitino/cudaGSEA as standalone application or package for the R framework.
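    The computation described above can be sketched in a few lines. This is a minimal single-threaded illustration under stated assumptions, not the rapidGSEA or broadGSEA implementation: the ranking metric is a plain difference of class means, the p-value uses the absolute enrichment score, and all function names are hypothetical. It assumes the gene set is non-empty, does not contain every gene, and has at least one gene with a non-zero score.

```python
import numpy as np


def enrichment_score(scores, in_set, weight=1.0):
    """Weighted running-sum statistic over genes ranked by descending score."""
    scores = np.asarray(scores, dtype=float)
    in_set = np.asarray(in_set, dtype=bool)
    order = np.argsort(-scores)
    hits = in_set[order]
    hit_weights = np.where(hits, np.abs(scores[order]) ** weight, 0.0)
    hit_step = hit_weights / hit_weights.sum()   # step up at genes in the set
    miss_step = (~hits) / (~hits).sum()          # step down at all other genes
    running = np.cumsum(hit_step - miss_step)
    return running[np.argmax(np.abs(running))]   # maximum deviation from zero


def gene_scores(expr, labels):
    """Simplified ranking metric: mean(cases) - mean(controls) per gene."""
    return expr[:, labels].mean(axis=1) - expr[:, ~labels].mean(axis=1)


def nominal_p_value(expr, labels, in_set, n_perm=1000, seed=0):
    """Estimate the null distribution by permuting the phenotype labels."""
    rng = np.random.default_rng(seed)
    observed = enrichment_score(gene_scores(expr, labels), in_set)
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(labels)
        null_es = enrichment_score(gene_scores(expr, shuffled), in_set)
        exceed += abs(null_es) >= abs(observed)
    return (exceed + 1) / (n_perm + 1)
```

    Here expr is a genes x samples array and labels is a boolean NumPy array marking cases. The estimate can never fall below 1 / (n_perm + 1), which is why small p-values require very many permutations, and each permutation is independent of the others, which is what makes the problem attractive for multi-core and GPU parallelization.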

  16. Should patients set the agenda for informed consent? A prospective survey of desire for information and discussion prior to routine cataract surgery

    Directory of Open Access Journals (Sweden)

    Lee Teak Tan

    2008-08-01

    Lee Teak Tan (1,2), Huw Jenkins (1,2), John Roberts-Harry (2), Michael Austin (1). (1) Singleton Hospital, Swansea, UK; (2) West Wales General Hospital, Carmarthen, UK. Purpose: To ascertain the level of information relating to specific risks desired by patients prior to cataract surgery. Setting: Dedicated cataract surgery pre-assessment clinics of 2 hospitals in South West Wales, UK. Methods: Consecutive patients (106) were recruited prospectively. Of these, 6 were formally excluded due to deafness or disorientation. Eligible patients (100) were asked a set of preliminary questions to determine their understanding of the nature of cataract, risk perception, and level of information felt necessary prior to giving consent. Those who desired further information were guided through a standardized questionnaire, which included an audio-visual presentation giving information relating to each potential surgical complication, allowing patients to rate them for relevance to their giving of informed consent. Results: Of the entire group of 100, 32 did not wish to know “anything at all” about risks and would prefer to leave decision making to their ophthalmologist; 22 were interested only in knowing their overall chance of visual improvement; and 46 welcomed a general discussion of possible complications, of whom 25 went on to enquire about specific complications. Of these 25, 18 wished to be informed of posterior capsular (PC) tearing, 17 of endophthalmitis, 16 each of dropped lens, retinal detachment and corneal clouding, and 15 of bleeding, sympathetic ophthalmia, and PC opacification. Conclusion: Patients differ in their desire for information prior to cataract surgery, with one significant minority favoring little or no discussion of risk and another wishing detailed consideration of specific risks. A system of consent where patients have a choice as to the level of discussion undertaken may better suit patients’ wishes than a doctor-specified agenda. Keywords: cataract

  17. Transcriptome analysis of cortical tissue reveals shared sets of downregulated genes in autism and schizophrenia

    Science.gov (United States)

    Ellis, S E; Panitch, R; West, A B; Arking, D E

    2016-01-01

    Autism (AUT), schizophrenia (SCZ) and bipolar disorder (BPD) are three highly heritable neuropsychiatric conditions. Clinical similarities and genetic overlap between the three disorders have been reported; however, the causes and the downstream effects of this overlap remain elusive. By analyzing transcriptomic RNA-sequencing data generated from post-mortem cortical brain tissues from AUT, SCZ, BPD and control subjects, we have begun to characterize the extent of gene expression overlap between these disorders. We report that the AUT and SCZ transcriptomes are significantly correlated (P<0.001), whereas the other two cross-disorder comparisons (AUT–BPD and SCZ–BPD) are not. Among AUT and SCZ, we find that the genes differentially expressed across disorders are involved in neurotransmission and synapse regulation. Despite the lack of global transcriptomic overlap across all three disorders, we highlight two genes, IQSEC3 and COPS7A, which are significantly downregulated compared with controls across all three disorders, suggesting either shared etiology or compensatory changes across these neuropsychiatric conditions. Finally, we tested for enrichment of genes differentially expressed across disorders in genetic association signals in AUT, SCZ or BPD, reporting lack of signal in any of the previously published genome-wide association studies (GWAS). Together, these studies highlight the importance of examining gene expression from the primary tissue involved in neuropsychiatric conditions, the cortical brain. We identify a shared role for altered neurotransmission and synapse regulation in AUT and SCZ, in addition to two genes that may more generally contribute to neurodevelopmental and neuropsychiatric conditions. PMID:27219343

  18. Building outline reconstruction from ALS data set with a priori information

    Science.gov (United States)

    Jarząbek-Rychard, M.; Borkowski, A.

    2011-12-01

    Extraction of building boundaries is an important step towards 3D building reconstruction. The boundaries may also be of interest in their own right, for the real estate industry, GIS and automated updating of cadastral maps. In this paper we propose a comprehensive method for automated extraction and delineation of building outlines from raw airborne laser scanning data. The presented workflow comprises three steps. It starts with identification of the points belonging to each individual building. The second step is to trace the points that compose a building boundary. In the last step an adjustment process is applied that aims at regularizing the boundary lines. The first step, building detection, is the most computationally expensive process and is of fundamental importance for the whole algorithm. The proposed approach is to include building address points, which give exact information about building location. This additional information greatly reduces the complexity of the building point extraction.

  19. Identification of a novel set of genes reflecting different in vivo invasive patterns of human GBM cells

    Directory of Open Access Journals (Sweden)

    Monticone Massimiliano

    2012-08-01

    Background: Most patients affected by Glioblastoma multiforme (GBM, grade IV glioma) experience a recurrence of the disease because of the spreading of tumor cells beyond surgical boundaries. Unveiling mechanisms causing this process is a logical goal to impair the killing capacity of GBM cells by molecular targeting. We noticed that our long-term GBM cultures, established from different patients, may display two categories/types of growth behavior in an orthotopic xenograft model: expansion of the tumor mass and formation of tumor branches/nodules (nodular-like, NL-type) or highly diffuse single tumor cell infiltration (HD-type). Methods: We determined by DNA microarrays the gene expression profiles of three NL-type and three HD-type long-term GBM cultures. Subsequently, individual genes with different expression levels between the two groups were identified using Significance Analysis of Microarrays (SAM). Real time RT-PCR, immunofluorescence and immunoblot analyses were performed for a selected subgroup of regulated gene products to confirm the results obtained by the expression analysis. Results: Here, we report the identification of a set of 34 differentially expressed genes in the two types of GBM cultures. Twenty-three of these genes encode proteins localized to the plasma membrane, and 9 of these encode proteins involved in the process of cell adhesion. Conclusions: This study suggests the participation in the diffuse infiltrative/invasive process of GBM cells within the CNS of a novel set of genes coding for membrane-associated proteins, which should thus be susceptible to an inhibition strategy by specific targeting. Massimiliano Monticone and Antonio Daga contributed equally to this work.

  20. Genetic investigation of 100 heart genes in sudden unexplained death victims in a forensic setting

    DEFF Research Database (Denmark)

    Christiansen, Sofie Lindgren; Hertz, Christin Løth; Ferrero, Laura

    2016-01-01

    indicate that broad genetic investigation of SUD victims increases the diagnostic outcome, and the investigation should comprise genes involved in both cardiomyopathies and cardiac channelopathies. European Journal of Human Genetics advance online publication, 21 September 2016; doi:10.1038/ejhg.2016.118....

  1. Using RNAi in C. "elegans" to Demonstrate Gene Knockdown Phenotypes in the Undergraduate Biology Lab Setting

    Science.gov (United States)

    Roy, Nicole M.

    2013-01-01

    RNA interference (RNAi) is a powerful technology used to knock down genes in basic research and medicine. In 2006 RNAi technology using "Caenorhabditis elegans" ("C. elegans") was awarded the Nobel Prize in medicine and thus students graduating in the biological sciences should have experience with this technology. However,…

  2. Development of a new set of reference genes for normalization of real-time RT-PCR data of porcine backfat and longissimus dorsi muscle, and evaluation with PPARGC1A

    Directory of Open Access Journals (Sweden)

    Van Zeveren Alex

    2006-10-01

    Background: An essential part of using real-time RT-PCR is that expression results have to be normalized before any conclusions can be drawn. This can be done by using one or multiple validated reference genes, depending on the desired accuracy of the results. In the pig, however, very little information is available on the expression stability of reference genes. The aim of this study was therefore to develop a new set of reference genes which can be used for normalization of mRNA expression data of genes expressed in porcine backfat and longissimus dorsi muscle, both representing an economically important part of a pig's carcass. Because of its multiple functions in fat metabolism and muscle fibre type composition, peroxisome proliferative activated receptor γ coactivator 1α (PPARGC1A) is a very interesting candidate gene for meat quality, and was an ideal gene to evaluate our developed set of reference genes for normalization of mRNA expression data of both tissue types. Results: The mRNA expression stability of 10 reference genes was determined. The expression of RPL13A and SDHA appeared to be highly unstable. After normalization to the geometric mean of the three most stably expressed reference genes (ACTB, TBP and TOP2B), the results not only showed that the mRNA expression of PPARGC1A was significantly higher in each of the longissimus dorsi muscle samples than in backfat (P Conclusion: This study provides a new set of reference genes (ACTB, TBP and TOP2B) suitable for normalization of real-time RT-PCR data of backfat and longissimus dorsi muscle in the pig. The obtained PPARGC1A expression results, after application of this set of reference genes, are a first step in unravelling the PPARGC1A expression pattern in the pig and provide a basis for possible selection towards improved meat quality while maintaining a lean carcass.
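    Normalization to the geometric mean of several reference genes can be sketched as follows. This is a minimal illustration, assuming the inputs are positive relative quantities on a linear scale (already converted from Ct values); the function names and any numbers passed in are hypothetical and are not data from the study.

```python
import numpy as np


def normalization_factor(ref_quantities):
    """Geometric mean of the reference-gene quantities for each sample.

    ref_quantities: array of shape (samples, reference genes), all values > 0.
    """
    q = np.asarray(ref_quantities, dtype=float)
    return np.exp(np.log(q).mean(axis=1))


def normalize_target(target_quantities, ref_quantities):
    """Divide each sample's target-gene quantity by its normalization factor."""
    target = np.asarray(target_quantities, dtype=float)
    return target / normalization_factor(ref_quantities)
```

    The geometric rather than the arithmetic mean is used so that one highly abundant reference gene does not dominate the factor; with three reference genes, each contributes equally on the log scale.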

  3. Definitional ceremonies: narrative practices for psychologists to inform interdisciplinary teams' understanding of children's spirituality in pediatric settings.

    Science.gov (United States)

    Moore, Kelsey; Talwar, Victoria; Moxley-Haegert, Linda

    2015-03-01

    In pediatric settings, parents and children often seek spiritual and religious support from their healthcare provider, as they try to find meaning in their illness. Narrative practices, such as definitional ceremonies, can provide a unique framework for psychologists to explore children's spirituality and its role in the midst of illness. In addition, definitional ceremonies can be used as a means for psychologists to inform interdisciplinary teams' understanding of children's spirituality and its relevance in pediatric treatment settings. In this article, our objectives are to (a) provide a brief overview of the literature on children's spirituality, (b) review some of the literature on childhood cancer patients' spirituality, (c) highlight the importance of whole-person care for diverse pediatric patients, and (d) introduce definitional ceremonies as appropriate narrative practices that psychologists can use to both guide their therapy and inform interdisciplinary teams' understanding of children's spirituality.

  4. Applying information and communications technologies to collect health data from remote settings: a systematic assessment of current technologies.

    Science.gov (United States)

    Ashar, Raj; Lewis, Sheri; Blazes, David L; Chretien, J P

    2010-04-01

    Modern information and communications technologies (ICTs) are now so feature-rich and widely available that they can be used to "capture," or collect and transmit, health data from remote settings. Electronic data capture can reduce the time necessary to notify public health authorities, and provide important baseline information. A number of electronic health data capture systems based on specific ICTs have been developed for remote areas. We expand on that body of work by defining and applying an assessment process to characterize ICTs for remote-area health data capture. The process is based on technical criteria, and assesses the feasibility and effectiveness of specific technologies according to the resources and constraints of a given setting. Our characterization of current ICTs compares different system architectures for remote-area health data capture systems. Ultimately, we believe that our criteria-based assessment process will remain useful for characterizing future ICTs.

  5. Anesthesia information management systems in the ambulatory setting: benefits and challenges.

    Science.gov (United States)

    Gottlieb, Ori

    2014-06-01

    Adopting an anesthesia information management system (AIMS) is a challenge for anesthesia departments. The transition requires a physician champion and the support of members in every section. This change can be facilitated by visiting similar institutions that are already using AIMS, shadow charting for a sufficient period of time, and understanding that optimization continues after the go-live date. Once implemented, the benefits outweigh the challenges, but understanding where the potential obstacles lie is critical to removing them efficiently and effectively. As different AIMS continue to spread throughout the medical world, so will their benefits. Copyright © 2014 Elsevier Inc. All rights reserved.

  6. Setting up The Geological information and modelling Thematic Core Service for EPOS

    Science.gov (United States)

    Grellet, Sylvain; Häner, Rainer; Pedersen, Mikael; Lorenz, Henning; Carter, Mary; Cipolloni, Carlo; Robida, François

    2017-04-01

    Geological data and models are key assets for the EPOS community. The Geological information and modelling Thematic Core Service of EPOS is being designed as an efficient and sustainable access system for geological multi-scale data assets for EPOS through the integration of distributed infrastructure components (nodes) of geological surveys, research institutes and the international drilling community (ICDP/IODP). The TCS will develop and take benefit of the synergy between the existing data infrastructures of the Geological Surveys of Europe (EuroGeoSurveys / OneGeology-Europe / EGDI) and of the large amount of information produced by the research organisations. These nodes will offer a broad range of resources including: geological maps, borehole data, borehole associated observations (borehole log data, groundwater level, groundwater quality…) and archived information on physical material (samples, cores), geological models (3D, 4D), geohazards, geophysical data such as active seismic data and other analyses of rocks, soils and minerals. The services will be implemented based on international standards (such as INSPIRE, IUGS/CGI, OGC, W3C, ISO) in order to guarantee their interoperability with other EPOS TCS as well as their compliance with INSPIRE European Directive or international initiatives (such as OneGeology). We present the implementation of the thematic core services for geology and modelling, including scheduling of the development of the different components. The activity with the OGC groups already started in 2016 through an ad-hoc meeting on Borehole and 3D/4D and the way both will be interlinked will also be introduced. This will provide future virtual research environments with means to facilitate the use of existing information for future applications. In addition, workflows will be established that allow the integration of other existing and new data and applications. Processing and the use of simulation and visualization tools will

  7. Clinicians, security and information technology support services in practice settings--a pilot study.

    Science.gov (United States)

    Fernando, Juanita

    2010-01-01

    This case study of 9 information technology (IT) support staff in 3 Australian (Victoria) public hospitals juxtaposes their experiences at the user-level of eHealth security in the Natural Hospital Environment with that previously reported by 26 medical, nursing and allied healthcare clinicians. IT support responsibilities comprised the entire hospital, of which clinician eHealth security needs were only part. IT staff believed their support tasks were often fragmented while work responsibilities were hampered by resource shortages. They perceived clinicians as an ongoing security risk to private health information. By comparison, clinicians believed IT staff would not adequately support the private and secure application of eHealth for patient care. Preliminary data analysis suggests the tension between these cohorts manifests as an eHealth environment where silos of clinical work are disconnected from silos of IT support work. The discipline-based silos hamper health privacy outcomes. Privacy and security policies, especially those influencing the audit process, will benefit from further research into this phenomenon.

  8. Informational structure of genetic sequences and nature of gene splicing

    Science.gov (United States)

    Trifonov, E. N.

    1991-10-01

    Only about 1/20 of the DNA of higher organisms codes for proteins by means of the classical triplet code. The rest of the DNA sequence is largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, sequence-specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of Gnomic, the language of genetic sequences. The coexisting codes have to be degenerate in various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and the necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications, is to resolve such conflicts. New data are presented strongly indicating that gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.

  9. The analysis of translation-related gene set boosts debates around origin and evolution of mimiviruses

    Science.gov (United States)

    Colson, Philippe; La Scola, Bernard

    2017-01-01

    The giant mimiviruses challenged the well-established concept of viruses, blurring the roots of the tree of life, mainly due to their genetic content. Along with other nucleo-cytoplasmic large DNA viruses, they compose a new proposed order—named Megavirales—whose origin and evolution generate heated debate in the scientific community. The presence of an arsenal of genes not widespread in the virosphere related to important steps of the translational process, including transfer RNAs, aminoacyl-tRNA synthetases, and translation factors for peptide synthesis, constitutes an important element of this debate. In this review, we highlight the main findings to date about the translational machinery of the mimiviruses and compare their distribution along the distinct members of the family Mimiviridae. Furthermore, we discuss how the presence and/or absence of the translation-related genes among mimiviruses raises important insights to boost the debate on their origin and evolutionary history. PMID:28207761

  10. GAMYB controls different sets of genes and is differentially regulated by microRNA in aleurone cells and anthers.

    Science.gov (United States)

    Tsuji, Hiroyuki; Aya, Koichiro; Ueguchi-Tanaka, Miyako; Shimada, Yukihisa; Nakazono, Mikio; Watanabe, Ryosuke; Nishizawa, Naoko K; Gomi, Kenji; Shimada, Asako; Kitano, Hidemi; Ashikari, Motoyuki; Matsuoka, Makoto

    2006-08-01

    GAMYB is a component of gibberellin (GA) signaling in cereal aleurone cells, and has an important role in flower development. However, it is unclear how GAMYB function is regulated. We examined the involvement of a microRNA, miR159, in the regulation of GAMYB expression in cereal aleurone cells and flower development. In aleurone cells, no miR159 expression was observed with or without GA treatment, suggesting that miR159 is not involved in the regulation of GAMYB and GAMYB-like genes in this tissue. miR159 was expressed in tissues other than aleurone, and miR159 over-expressors showed similar but more severe phenotypes than the gamyb mutant. GAMYB and GAMYB-like genes are co-expressed with miR159 in anthers, and the mRNA levels for GAMYB and GAMYB-like genes are negatively correlated with miR159 levels during anther development. Thus, OsGAMYB and OsGAMYB-like genes are regulated by miR159 in flowers. A microarray analysis revealed that OsGAMYB and its upstream regulator SLR1 are involved in the regulation of almost all GA-mediated gene expression in rice aleurone cells. Moreover, different sets of genes are regulated by GAMYB in aleurone cells and anthers. GAMYB binds directly to promoter regions of its target genes in anthers as well as aleurone cells. Based on these observations, we suggest that the regulation of GAMYB expression and GAMYB function are different in aleurone cells and flowers in rice.

  11. Transcript and protein profiling identify candidate gene sets of potential adaptive significance in New Zealand Pachycladon

    Directory of Open Access Journals (Sweden)

    Schmidt Silvia

    2010-05-01

    Full Text Available Abstract Background Transcript profiling of closely related species provides a means for identifying genes potentially important in species diversification. However, the predictive value of transcript profiling for inferring downstream-physiological processes has been unclear. In the present study we use shotgun proteomics to validate inferences from microarray studies regarding physiological differences in three Pachycladon species. We compare transcript and protein profiling and evaluate their predictive value for inferring glucosinolate chemotypes characteristic of these species. Results Evidence from heterologous microarrays and shotgun proteomics revealed differential expression of genes involved in glucosinolate hydrolysis (myrosinase-associated proteins) and biosynthesis (methylthioalkylmalate isomerase and dehydrogenase), the interconversion of carbon dioxide and bicarbonate (carbonic anhydrases), water use efficiency (ascorbate peroxidase, 2 cys peroxiredoxin, 20 kDa chloroplastic chaperonin, mitochondrial succinyl CoA ligase) and others (glutathione-S-transferase, serine racemase, vegetative storage proteins, genes related to translation and photosynthesis). Differences in glucosinolate hydrolysis products were directly confirmed. Overall, prediction of protein abundances from transcript profiles was stronger than prediction of transcript abundance from protein profiles. Protein profiles also proved to be more accurate predictors of glucosinolate profiles than transcript profiles. The similarity of species profiles for both transcripts and proteins reflected previously inferred phylogenetic relationships while glucosinolate chemotypes did not. Conclusions We have used transcript and protein profiling to predict physiological processes that evolved differently during diversification of three Pachycladon species. This approach has also identified candidate genes potentially important in adaptation, which are now the focus of ongoing study

  12. Microarray analysis identifies a common set of cellular genes modulated by different HCV replicon clones

    OpenAIRE

    Gerosolimo Germano; Dallapiccola Bruno; Bruni Roberto; Ferraris Alessandro; Tataseo Paola; Tritarelli Elena; Marcantonio Cinzia; Ciccaglione Anna; Costantino Angela; Rapicetta Maria

    2008-01-01

    Abstract Background Hepatitis C virus (HCV) RNA synthesis and protein expression affect cell homeostasis by modulation of gene expression. The impact of HCV replication on global cell transcription has not been fully evaluated. Thus, we analysed the expression profiles of different clones of human hepatoma-derived Huh-7 cells carrying a self-replicating HCV RNA which express all viral proteins (HCV replicon system). Results First, we compared the expression profile of HCV replicon clone 21-5 ...

  13. The smallest known genomes of multicellular and toxic cyanobacteria: comparison, minimal gene sets for linked traits and the evolutionary implications.

    Directory of Open Access Journals (Sweden)

    Karina Stucken

    Full Text Available Cyanobacterial morphology is diverse, ranging from unicellular spheres or rods to multicellular structures such as colonies and filaments. Multicellular species represent an evolutionary strategy to differentiate and compartmentalize certain metabolic functions for reproduction and nitrogen (N(2)) fixation into specialized cell types (e.g. akinetes, heterocysts and diazocytes). Only a few filamentous, differentiated cyanobacterial species, with genome sizes over 5 Mb, have been sequenced. We sequenced the genomes of two strains of closely related filamentous cyanobacterial species to yield further insights into the molecular basis of the traits of N(2) fixation, filament formation and cell differentiation. Cylindrospermopsis raciborskii CS-505 is a cylindrospermopsin-producing strain from Australia, whereas Raphidiopsis brookii D9 from Brazil synthesizes neurotoxins associated with paralytic shellfish poisoning (PSP). Despite their different morphology, toxin composition and disjunct geographical distribution, these strains form a monophyletic group. With genome sizes of approximately 3.9 (CS-505) and 3.2 (D9) Mb, these are the smallest genomes described for free-living filamentous cyanobacteria. We observed remarkable gene order conservation (synteny) between these genomes despite the difference in repetitive element content, which accounts for most of the genome size difference between them. We show here that the strains share a specific set of 2539 genes with >90% average nucleotide identity. The fact that the CS-505 and D9 genomes are small and streamlined compared to those of other filamentous cyanobacterial species and the lack of the ability for heterocyst formation in strain D9 allowed us to define a core set of genes responsible for each trait in filamentous species. We presume that in strain D9 the ability to form proper heterocysts was secondarily lost together with N(2) fixation capacity. Further comparisons to all available cyanobacterial

  14. The smallest known genomes of multicellular and toxic cyanobacteria: comparison, minimal gene sets for linked traits and the evolutionary implications.

    Science.gov (United States)

    Stucken, Karina; John, Uwe; Cembella, Allan; Murillo, Alejandro A; Soto-Liebe, Katia; Fuentes-Valdés, Juan J; Friedel, Maik; Plominsky, Alvaro M; Vásquez, Mónica; Glöckner, Gernot

    2010-02-16

    Cyanobacterial morphology is diverse, ranging from unicellular spheres or rods to multicellular structures such as colonies and filaments. Multicellular species represent an evolutionary strategy to differentiate and compartmentalize certain metabolic functions for reproduction and nitrogen (N(2)) fixation into specialized cell types (e.g. akinetes, heterocysts and diazocytes). Only a few filamentous, differentiated cyanobacterial species, with genome sizes over 5 Mb, have been sequenced. We sequenced the genomes of two strains of closely related filamentous cyanobacterial species to yield further insights into the molecular basis of the traits of N(2) fixation, filament formation and cell differentiation. Cylindrospermopsis raciborskii CS-505 is a cylindrospermopsin-producing strain from Australia, whereas Raphidiopsis brookii D9 from Brazil synthesizes neurotoxins associated with paralytic shellfish poisoning (PSP). Despite their different morphology, toxin composition and disjunct geographical distribution, these strains form a monophyletic group. With genome sizes of approximately 3.9 (CS-505) and 3.2 (D9) Mb, these are the smallest genomes described for free-living filamentous cyanobacteria. We observed remarkable gene order conservation (synteny) between these genomes despite the difference in repetitive element content, which accounts for most of the genome size difference between them. We show here that the strains share a specific set of 2539 genes with >90% average nucleotide identity. The fact that the CS-505 and D9 genomes are small and streamlined compared to those of other filamentous cyanobacterial species and the lack of the ability for heterocyst formation in strain D9 allowed us to define a core set of genes responsible for each trait in filamentous species. We presume that in strain D9 the ability to form proper heterocysts was secondarily lost together with N(2) fixation capacity. Further comparisons to all available cyanobacterial genomes

  15. In vivo validation of a computationally predicted conserved Ath5 target gene set.

    Directory of Open Access Journals (Sweden)

    Filippo Del Bene

    2007-09-01

    Full Text Available So far, the computational identification of transcription factor binding sites is hampered by the complexity of vertebrate genomes. Here we present an in silico procedure to predict target sites of a transcription factor in complex genomes using its binding site. In a first step, sequence comparison of closely related genomes identifies the binding sites in conserved cis-regulatory regions (phylogenetic footprinting). Subsequently, more remote genomes are introduced into the comparison to identify highly conserved and therefore putatively functional binding sites (phylogenetic filtering). When applied to the binding site of atonal homolog 5 (Ath5 or ATOH7), this procedure efficiently filters evolutionarily conserved binding sites out of more than 300,000 instances in a vertebrate genome. We validate a selection of the linked target genes by showing coexpression with and transcriptional regulation by Ath5. Finally, chromatin immunoprecipitation demonstrates the occupancy of the target gene promoters by Ath5. Thus, our procedure, applied to whole genomes, is a fast and predictive tool to in silico filter the target genes of a given transcription factor with defined binding site.

  16. PKA phosphorylation redirects ERα to promoters of a unique gene set to induce tamoxifen resistance.

    Science.gov (United States)

    de Leeuw, R; Flach, K; Bentin Toaldo, C; Alexi, X; Canisius, S; Neefjes, J; Michalides, R; Zwart, W

    2013-07-25

    Protein kinase A (PKA)-induced estrogen receptor alpha (ERα) phosphorylation at serine residue 305 (ERαS305-P) can induce tamoxifen (TAM) resistance in breast cancer. How this phospho-modification affects ERα specificity and translates into TAM resistance is unclear. Here, we show that S305-P modification of ERα reprograms the receptor, redirecting it to new transcriptional start sites, thus modulating the transcriptome. By altering the chromatin-binding pattern, Ser305 phosphorylation of ERα translates into a 26-gene expression classifier that identifies breast cancer patients with a poor disease outcome after TAM treatment. MYC-target genes and networks were significantly enriched in this gene classifier that includes a number of selective targets for ERαS305-P. The enhanced expression of MYC increased cell proliferation in the presence of TAM. We demonstrate that activation of the PKA signaling pathway alters the transcriptome by redirecting ERα to new transcriptional start sites, resulting in altered transcription and TAM resistance.

  17. Health Information Technology, Patient Safety, and Professional Nursing Care Documentation in Acute Care Settings.

    Science.gov (United States)

    Lavin, Mary Ann; Harper, Ellen; Barr, Nancy

    2015-04-14

    The electronic health record (EHR) is a documentation tool that yields data useful in enhancing patient safety, evaluating care quality, maximizing efficiency, and measuring staffing needs. Although nurses applaud the EHR, they also indicate dissatisfaction with its design and cumbersome electronic processes. This article describes the views of nurses shared by members of the Nursing Practice Committee of the Missouri Nurses Association; it encourages nurses to share their EHR concerns with Information Technology (IT) staff and vendors and to take their place at the table when nursing-related IT decisions are made. In this article, we describe the experiential-reflective reasoning and action model used to understand staff nurses' perspectives, share committee reflections and recommendations for improving both documentation and documentation technology, and conclude by encouraging nurses to develop their documentation and informatics skills. Nursing issues include medication safety, documentation and standards of practice, and EHR efficiency. IT concerns include interoperability, vendors, innovation, nursing voice, education, and collaboration.

  18. Stereoscopy in Astronomical Visualizations to Support Learning at Informal Education Settings

    Science.gov (United States)

    Price, Aaron; Lee, Hee-Sun

    2015-08-01

    Stereoscopy has been used in science education for 100 years. Recent innovations in low-cost technology as well as trends in the entertainment industry have made stereoscopy popular among educators and audiences alike. However, experimental studies addressing whether stereoscopy actually impacts science learning are limited. Over the last decade, we have conducted a series of quasi-experimental and experimental studies on how children and adult visitors in science museums and planetariums learned about the structure and function of highly spatial scientific objects such as galaxies, supernovae, etc. We present a synthesis of the results from these studies and implications for stereoscopic visualization development. The overall finding is that the impact of stereoscopy on perceptions of scientific objects is limited when presented as static imagery. However, when presented as full motion films, a significantly positive impact was detected. To conclude, we present a set of stereoscopic design principles that can help design astronomical stereoscopic films that support deep and effective learning. Our studies cover astronomical content such as the engineering of and imagery from the Mars rovers, artistic stereoscopic imagery of nebulae and a high-resolution stereoscopic film about how astronomers measure and model the structure of our galaxy.

  19. PatentMatrix: an automated tool to survey patents related to large sets of genes or proteins

    Directory of Open Access Journals (Sweden)

    de Rinaldis Emanuele

    2007-09-01

    Full Text Available Abstract Background The number of patents associated with genes and proteins and the amount of information contained in each patent often present a real obstacle to the rapid evaluation of the novelty of findings associated with genes from an intellectual property (IP) perspective. This assessment, normally carried out by expert patent professionals, can therefore become cumbersome and time consuming. Here we present PatentMatrix, a novel software tool for the automated analysis of patent sequence text entries. Methods and Results PatentMatrix is written in the Awk language and requires installation of the Derwent GENESEQ™ patent sequence database under the sequence retrieval system SRS. The software works by taking as input two files: (i) a list of genes or proteins with the associated GENESEQ™ patent sequence accession numbers and (ii) a list of keywords describing the research context of interest (e.g. 'lung', 'cancer', 'therapeutics', 'diagnostics'). The GENESEQ™ database is interrogated through the SRS system and each patent entry of interest is screened for the occurrence of user-defined keywords. Moreover, the software extracts the basic information useful for a preliminary assessment of the IP coverage of each patent from the GENESEQ™ database. As output, two tab-delimited files are generated which provide the user with a detailed and an aggregated view of the results. An example is given where the IP position of five genes is evaluated in the context of 'development of antibodies for cancer treatment'. Conclusion PatentMatrix allows a rapid survey of patents associated with genes or proteins in a particular area of interest as defined by keywords. It can be efficiently used to evaluate the IP-related novelty of scientific findings and to rank genes or proteins according to their IP position.
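
    The keyword-screening workflow described in this abstract (two inputs, a detailed and an aggregated tab-delimited output) can be sketched roughly as below. This is an illustrative Python sketch, not PatentMatrix itself, which the abstract says is written in Awk and queries GENESEQ through SRS. The in-memory dictionaries stand in for the database lookup, and every gene name, accession number and patent text is hypothetical.

```python
import csv
import io


def screen_patents(gene_to_accessions, patent_texts, keywords):
    """Screen each patent text for each keyword (case-insensitive substring).

    Returns (detailed, aggregated):
      detailed   - one row per (gene, accession, keyword) with a 0/1 hit flag
      aggregated - per gene, the number of patents with at least one hit
    """
    detailed = []
    aggregated = {}
    for gene, accessions in gene_to_accessions.items():
        patents_with_hit = 0
        for accession in accessions:
            text = patent_texts.get(accession, "").lower()
            hits = [int(kw.lower() in text) for kw in keywords]
            for kw, hit in zip(keywords, hits):
                detailed.append((gene, accession, kw, hit))
            if any(hits):
                patents_with_hit += 1
        aggregated[gene] = patents_with_hit
    return detailed, aggregated


def to_tsv(rows, header):
    """Serialize rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical inputs standing in for the two input files and the database.
gene_to_accessions = {"GENE_A": ["ACC001", "ACC002"], "GENE_B": ["ACC003"]}
patent_texts = {
    "ACC001": "Antibody for lung cancer therapy",
    "ACC002": "Plant growth regulator",
    "ACC003": "Diagnostics kit",
}
keywords = ["lung", "cancer", "diagnostics"]

detailed, aggregated = screen_patents(gene_to_accessions, patent_texts, keywords)
detailed_tsv = to_tsv(detailed, ["gene", "accession", "keyword", "hit"])
aggregated_tsv = to_tsv(sorted(aggregated.items()), ["gene", "patents_with_hit"])
```

    Plain substring matching is the simplest possible screen; a real tool would likely need word-boundary or field-aware matching to avoid spurious hits.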

  20. Inference of Gene Regulatory Networks Using Bayesian Nonparametric Regression and Topology Information

    Science.gov (United States)

    2017-01-01

    Gene regulatory networks (GRNs) play an important role in cellular systems and are important for understanding biological processes. Many algorithms have been developed to infer the GRNs. However, most algorithms only pay attention to the gene expression data but do not consider the topology information in their inference process, while incorporating this information can partially compensate for the lack of reliable expression data. Here we develop a Bayesian group lasso with spike and slab priors to perform gene selection and estimation for nonparametric models. B-spline basis functions are used to capture the nonlinear relationships flexibly and penalties are used to avoid overfitting. Further, we incorporate the topology information into the Bayesian method as a prior. We present the application of our method on DREAM3 and DREAM4 datasets and two real biological datasets. The results show that our method performs better than existing methods and the topology information prior can improve the result. PMID:28133490

  1. Inference of Gene Regulatory Networks Using Bayesian Nonparametric Regression and Topology Information

    Directory of Open Access Journals (Sweden)

    Yue Fan

    2017-01-01

    Full Text Available Gene regulatory networks (GRNs) play an important role in cellular systems and are important for understanding biological processes. Many algorithms have been developed to infer the GRNs. However, most algorithms only pay attention to the gene expression data but do not consider the topology information in their inference process, while incorporating this information can partially compensate for the lack of reliable expression data. Here we develop a Bayesian group lasso with spike and slab priors to perform gene selection and estimation for nonparametric models. B-spline basis functions are used to capture the nonlinear relationships flexibly and penalties are used to avoid overfitting. Further, we incorporate the topology information into the Bayesian method as a prior. We present the application of our method on DREAM3 and DREAM4 datasets and two real biological datasets. The results show that our method performs better than existing methods and the topology information prior can improve the result.

  2. Informational gene phylogenies do not support a fourth domain of life for nucleocytoplasmic large DNA viruses.

    Directory of Open Access Journals (Sweden)

    Tom A Williams

    Full Text Available Mimivirus is a nucleocytoplasmic large DNA virus (NCLDV) with a genome size (1.2 Mb) and coding capacity (approximately 1000 genes) comparable to that of some cellular organisms. Unlike other viruses, Mimivirus and its NCLDV relatives encode homologs of broadly conserved informational genes found in Bacteria, Archaea, and Eukaryotes, raising the possibility that they could be placed on the tree of life. A recent phylogenetic analysis of these genes showed the NCLDVs emerging as a monophyletic group branching between Eukaryotes and Archaea. These trees were interpreted as evidence for an independent "fourth domain" of life that may have contributed DNA processing genes to the ancestral eukaryote. However, the analysis of ancient evolutionary events is challenging, and tree reconstruction is susceptible to bias resulting from non-phylogenetic signals in the data. These include compositional heterogeneity and homoplasy, which can lead to the spurious grouping of compositionally-similar or fast-evolving sequences. Here, we show that these informational gene alignments contain both significant compositional heterogeneity and homoplasy, which were not adequately modelled in the original analysis. When we use more realistic evolutionary models that better fit the data, the resulting trees are unable to reject a simple null hypothesis in which these informational genes, like many other NCLDV genes, were acquired by horizontal transfer from eukaryotic hosts. Our results suggest that a fourth domain is not required to explain the available sequence data.

  3. Gene

    Data.gov (United States)

    U.S. Department of Health & Human Services — Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes,...

  4. Comparative genomic analysis of SET domain family reveals the origin, expansion, and putative function of the arthropod-specific SmydA genes as histone modifiers in insects

    Science.gov (United States)

    Jiang, Feng; Liu, Qing; Wang, Yanli; Zhang, Jie; Wang, Huimin; Song, Tianqi; Yang, Meiling

    2017-01-01

    The SET domain is an evolutionarily conserved motif present in histone lysine methyltransferases, which are important in the regulation of chromatin and gene expression in animals. In this study, we searched for SET domain–containing genes (SET genes) in all of the 147 arthropod genomes sequenced at the time of carrying out this experiment to understand the evolutionary history by which SET domains have evolved in insects. Phylogenetic and ancestral state reconstruction analysis revealed an arthropod-specific SET gene family, named SmydA, that is ancestral to arthropod animals and specifically diversified during insect evolution. Considering that pseudogenization is the most probable fate of the new emerging gene copies, we provided experimental and evolutionary evidence to demonstrate their essential functions. Fluorescence in situ hybridization analysis and in vitro methyltransferase activity assays showed that the SmydA-2 gene was transcriptionally active and retained the original histone methylation activity. Expression knockdown by RNA interference significantly increased mortality, implying that the SmydA genes may be essential for insect survival. We further showed predominantly strong purifying selection on the SmydA gene family and a potential association between the regulation of gene expression and insect phenotypic plasticity by transcriptome analysis. Overall, these data suggest that the SmydA gene family retains essential functions that may possibly define novel regulatory pathways in insects. This work provides insights into the roles of lineage-specific domain duplication in insect evolution. PMID:28444351

  5. Functional Gene-Set Analysis Does Not Support a Major Role for Synaptic Function in Attention Deficit/Hyperactivity Disorder (ADHD)

    Directory of Open Access Journals (Sweden)

    Anke R. Hammerschlag

    2014-07-01

    Attention Deficit/Hyperactivity Disorder (ADHD) is one of the most common childhood-onset neuropsychiatric disorders. Despite high heritability estimates, genome-wide association studies (GWAS) have failed to find significant genetic associations, likely due to the polygenic character of ADHD. Nevertheless, genetic studies suggested the involvement of several processes important for synaptic function. Therefore, we applied a functional gene-set analysis to formally test whether synaptic functions are associated with ADHD. Gene-set analysis tests the joint effect of multiple genetic variants in groups of functionally related genes. This method provides increased statistical power compared to conventional GWAS. We used data from the Psychiatric Genomics Consortium including 896 ADHD cases and 2455 controls, and 2064 parent-affected offspring trios, providing sufficient statistical power to detect gene sets representing a genotype relative risk of at least 1.17. Although all synaptic genes together showed a significant association with ADHD, this association was not stronger than that of randomly generated gene sets matched for the same number of genes. Further analyses showed no association of specific synaptic function categories with ADHD after correction for multiple testing. Given current sample size and gene sets based on current knowledge of genes related to synaptic function, our results do not support a major role for common genetic variants in synaptic genes in the etiology of ADHD.
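
    The comparison against randomly generated gene sets of matching size, as described in this abstract, can be sketched as a simple permutation test. This is a minimal illustration and not the authors' pipeline: the function name `competitive_p_value`, the per-gene scores, and the choice of a summed score as the set statistic are all assumptions made for the example.

```python
import random


def competitive_p_value(gene_scores, gene_set, n_random=1000, seed=0):
    """Empirical p-value for a gene set against random sets of equal size.

    gene_scores: dict mapping gene name -> association score (hypothetical input).
    gene_set:    list of gene names forming the set under test.
    The set statistic is the plain sum of scores, an assumption for illustration.
    """
    rng = random.Random(seed)
    all_genes = sorted(gene_scores)
    observed = sum(gene_scores[g] for g in gene_set)

    hits = 0
    for _ in range(n_random):
        # Draw a random gene set matched for the number of genes.
        sample = rng.sample(all_genes, len(gene_set))
        if sum(gene_scores[g] for g in sample) >= observed:
            hits += 1

    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_random + 1)
```

    A small value would suggest that the set scores higher than same-size random sets; a large value corresponds to the situation reported above, where the synaptic set was not more strongly associated than random sets.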

  6. Root Exudates of Various Host Plants of Rhizobium leguminosarum Contain Different Sets of Inducers of Rhizobium Nodulation Genes.

    Science.gov (United States)

    Zaat, S A; Wijffelman, C A; Mulders, I H; van Brussel, A A; Lugtenberg, B J

    1988-04-01

    Rhizobium promoters involved in the formation of root nodules on leguminous plants are activated by flavonoids in plant root exudate. A series of Rhizobium strains which all contain the inducible Rhizobium leguminosarum nodA promoter fused to the Escherichia coli lacZ gene, and which differ only in the source of the regulatory nodD gene, were recently used to show that the regulatory nodD gene determines which flavonoids are able to activate the nodA promoter (HP Spaink, CA Wijffelman, E Pees, RJH Okker, BJJ Lugtenberg 1987 Nature 328: 337-340). Since these strains therefore are able to discriminate between various flavonoids, they were used to determine whether or not plants that are nodulated by R. leguminosarum produce different inducers. After chromatographic separation of root exudate constituents from Vicia sativa L. subsp. nigra (L.), V. hirsuta (L.) S.F. Gray, Pisum sativum L. cv Rondo, and Trifolium subterraneum L., the fractions were tested with a set of strains containing a nodD gene of R. leguminosarum, R. trifolii, or Rhizobium meliloti, respectively. It appeared that the source of nodD determined whether, and to what extent, the R. leguminosarum nodA promoter was induced. Lack of induction could not be attributed to the presence of inhibitors. Most of the inducers were able to activate the nodA promoter in the presence of one particular nodD gene only. The inducers that were active in the presence of the R. leguminosarum nodD gene were different in each root exudate.

  7. Gene by Social-Context Interactions for Number of Sexual Partners Among White Male Youths: Genetics-informed Sociology

    Science.gov (United States)

    Guo, Guang; Tong, Yuying; Cai, Tianji

    2010-01-01

    In this study, we set out to investigate whether introducing molecular genetic measures into an analysis of sexual partner variety will yield novel sociological insights. The data source is the white male DNA sample in the National Longitudinal Study of Adolescent Health. Our empirical analysis has produced a robust protective effect of the 9R/9R genotype relative to the Any10R genotype in the dopamine transporter gene (DAT1). The gene-environment interaction analysis demonstrates that the protective effect of 9R/9R tends to be lost in schools in which higher proportions of students start having sex early or among those with relatively low levels of cognitive ability. Our genetics-informed sociological analysis suggests that the “one size” of a single social theory may not fit all. Explaining a human trait or behavior may require a theory that accommodates the complex interplay between social contextual and individual influences and genetic predispositions. PMID:19569400

  8. The transcriptional response to encystation stimuli in Giardia lamblia is restricted to a small set of genes.

    Science.gov (United States)

    Morf, Laura; Spycher, Cornelia; Rehrauer, Hubert; Fournier, Catharine Aquino; Morrison, Hilary G; Hehl, Adrian B

    2010-10-01

    The protozoan parasite Giardia lamblia undergoes stage differentiation in the small intestine of the host to an environmentally resistant and infectious cyst. Encystation involves the secretion of an extracellular matrix comprised of cyst wall proteins (CWPs) and a β(1-3)-GalNAc homopolymer. Upon the induction of encystation, genes coding for CWPs are switched on, and mRNAs coding for a Myb transcription factor and enzymes involved in cyst wall glycan synthesis are upregulated. Encystation in vitro is triggered by several protocols, which call for changes in bile concentrations or availability of lipids, and elevated pH. However, the conditions for induction are not standardized and we predicted significant protocol-specific side effects. This makes reliable identification of encystation factors difficult. Here, we exploited the possibility of inducing encystation with two different protocols, which we show to be equally effective, for a comparative mRNA profile analysis. The standard encystation protocol induced a bipartite transcriptional response with surprisingly minor involvement of stress genes. A comparative analysis revealed a core set of only 18 encystation genes and showed that a majority of genes was indeed upregulated as a side effect of inducing conditions. We also established a Myb binding sequence as a signature motif in encystation promoters, suggesting coordinated regulation of these factors.

  9. HoxBlinc RNA Recruits Set1/MLL Complexes to Activate Hox Gene Expression Patterns and Mesoderm Lineage Development

    Directory of Open Access Journals (Sweden)

    Changwang Deng

    2016-01-01

    Trithorax proteins and long-intergenic noncoding RNAs are critical regulators of embryonic stem cell pluripotency; however, how they cooperatively regulate germ layer mesoderm specification remains elusive. We report here that HoxBlinc RNA first specifies Flk1+ mesoderm and then promotes hematopoietic differentiation through regulation of hoxb pathways. HoxBlinc binds to the hoxb genes, recruits Setd1a/MLL1 complexes, and mediates long-range chromatin interactions to activate transcription of the hoxb genes. Depletion of HoxBlinc by shRNA-mediated knockdown or CRISPR-Cas9-mediated genetic deletion inhibits expression of hoxb genes and other factors regulating cardiac/hematopoietic differentiation. Reduced hoxb expression is accompanied by decreased recruitment of Set1/MLL1 and H3K4me3 modification, as well as by reduced chromatin loop formation. Re-expression of hoxb2–b4 genes in HoxBlinc-depleted embryoid bodies rescues Flk1+ precursors that undergo hematopoietic differentiation. Thus, HoxBlinc plays an important role in controlling hoxb transcription networks that mediate specification of mesoderm-derived Flk1+ precursors and differentiation of Flk1+ cells into hematopoietic lineages.

  10. The mammalian adult neurogenesis gene ontology (MANGO) provides a structural framework for published information on genes regulating adult hippocampal neurogenesis.

    Science.gov (United States)

    Overall, Rupert W; Paszkowski-Rogacz, Maciej; Kempermann, Gerd

    2012-01-01

    Adult hippocampal neurogenesis is not a single phenotype, but consists of a number of sub-processes, each of which is under complex genetic control. Interpretation of gene expression studies using existing resources often does not lead to results that address the interrelatedness of these processes. Formal structure, such as provided by ontologies, is essential in any field for comprehensive interpretation of existing knowledge but, until now, such a structure has been lacking for adult neurogenesis. We have created a resource with three components 1. A structured ontology describing the key stages in the development of adult hippocampal neural stem cells into functional granule cell neurons. 2. A comprehensive survey of the literature to annotate the results of all published reports on gene function in adult hippocampal neurogenesis (257 manuscripts covering 228 genes) to the appropriate terms in our ontology. 3. An easy-to-use searchable interface to the resulting database made freely available online. The manuscript presents an overview of the database highlighting global trends such as the current bias towards research on early proliferative stages, and an example gene set enrichment analysis. A limitation of the resource is the current scope of the literature which, however, is growing by around 100 publications per year. With the ontology and database in place, new findings can be rapidly annotated and regular updates of the database will be made publicly available. The resource we present allows relevant interpretation of gene expression screens in terms of defined stages of postnatal neuronal development. Annotation of genes by hand from the adult neurogenesis literature ensures the data are directly applicable to the system under study. We believe this approach could also serve as an example to other fields in a 'bottom-up' community effort complementing the already successful 'top-down' approach of the Gene Ontology.

  11. The mammalian adult neurogenesis gene ontology (MANGO) provides a structural framework for published information on genes regulating adult hippocampal neurogenesis.

    Directory of Open Access Journals (Sweden)

    Rupert W Overall

    BACKGROUND: Adult hippocampal neurogenesis is not a single phenotype, but consists of a number of sub-processes, each of which is under complex genetic control. Interpretation of gene expression studies using existing resources often does not lead to results that address the interrelatedness of these processes. Formal structure, such as provided by ontologies, is essential in any field for comprehensive interpretation of existing knowledge but, until now, such a structure has been lacking for adult neurogenesis. METHODOLOGY/PRINCIPAL FINDINGS: We have created a resource with three components: 1. A structured ontology describing the key stages in the development of adult hippocampal neural stem cells into functional granule cell neurons. 2. A comprehensive survey of the literature to annotate the results of all published reports on gene function in adult hippocampal neurogenesis (257 manuscripts covering 228 genes) to the appropriate terms in our ontology. 3. An easy-to-use searchable interface to the resulting database made freely available online. The manuscript presents an overview of the database highlighting global trends such as the current bias towards research on early proliferative stages, and an example gene set enrichment analysis. A limitation of the resource is the current scope of the literature which, however, is growing by around 100 publications per year. With the ontology and database in place, new findings can be rapidly annotated and regular updates of the database will be made publicly available. CONCLUSIONS/SIGNIFICANCE: The resource we present allows relevant interpretation of gene expression screens in terms of defined stages of postnatal neuronal development. Annotation of genes by hand from the adult neurogenesis literature ensures the data are directly applicable to the system under study. We believe this approach could also serve as an example to other fields in a 'bottom-up' community effort complementing the already

  12. Use of Information Measures and Their Approximations to Detect Predictive Gene-Gene Interaction

    Directory of Open Access Journals (Sweden)

    Jan Mielniczuk

    2017-01-01

    We reconsider the properties and relationships of the interaction information and its modified versions in the context of detecting the interaction of two SNPs for the prediction of a binary outcome when interaction information is positive. This property is called predictive interaction, and we state some new sufficient conditions for it to hold true. We also study chi square approximations to these measures. It is argued that interaction information is a different and sometimes more natural measure of interaction than the logistic interaction parameter especially when SNPs are dependent. We introduce a novel measure of predictive interaction based on interaction information and its modified version. In numerical experiments, which use copulas to model dependence, we study examples when the logistic interaction parameter is zero or close to zero for which predictive interaction is detected by the new measure, while it remains undetected by the likelihood ratio test.
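
    For readers unfamiliar with interaction information, the sketch below computes a plug-in estimate from samples. It assumes the convention II(X1; X2; Y) = I((X1, X2); Y) - I(X1; Y) - I(X2; Y), under which a positive value indicates synergy between the two predictors; other sources use the opposite sign. The function names are hypothetical, and the code does not implement the modified measures or chi square approximations studied in the paper.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in Shannon entropy (bits) of a list of hashable observations."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def mutual_information(xs, ys):
    """Plug-in I(X; Y) = H(X) + H(Y) - H(X, Y), in bits."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def interaction_information(x1, x2, y):
    """II(X1; X2; Y) = I((X1, X2); Y) - I(X1; Y) - I(X2; Y) (assumed sign convention)."""
    pair = list(zip(x1, x2))
    return (mutual_information(pair, y)
            - mutual_information(x1, y)
            - mutual_information(x2, y))


# Toy example: Y is the XOR of two independent binary predictors, so neither
# predictor is informative alone but the pair determines Y completely.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [a ^ b for a, b in zip(x1, x2)]
ii_xor = interaction_information(x1, x2, y)
```

    In the XOR toy case the two marginal mutual informations are zero while the joint one equals one bit, which is the kind of purely synergistic effect a marginal test would miss.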

  13. A Method of Gene-Function Annotation Based on Variable Precision Rough Sets

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    In bioinformatics, it is important to annotate the function of newly sequenced biological sequences computationally. Based on the GO database and the BLAST program, a novel method for the function annotation of new biological sequences is presented using variable-precision rough set theory. The proposed method is applied to real data in the GO database to examine its effectiveness. Numerical results show that the proposed method achieves better precision, recall, and harmonic mean values than existing methods.

  14. Search of phenotype related candidate genes using gene ontology-based semantic similarity and protein interaction information: application to Brugada syndrome.

    Science.gov (United States)

    Massanet, Raimon; Gallardo-Chacon, Joan-Josep; Caminal, Pere; Perera, Alexandre

    2009-01-01

    This work presents a methodology for finding phenotype candidate genes starting from a set of known related genes. This is accomplished by automatically mining and organizing the available scientific literature using Gene Ontology-based semantic similarity. As a case study, Brugada syndrome-related genes were used as input to obtain a list of other possible candidate genes related to this disease. The Brugada anomaly produces a typical alteration in the electrocardiogram, and carriers of the disease show an increased probability of sudden death. Results show a set of semantically coherent proteins that are related to the physiological processes of synaptic transmission and muscle contraction.

  15. Mutual information and the fidelity of response of gene regulatory models

    Science.gov (United States)

    Tabbaa, Omar P.; Jayaprakash, C.

    2014-08-01

    We investigate cellular response to extracellular signals by using information theory techniques motivated by recent experiments. We present results for the steady state of the following gene regulatory models found in both prokaryotic and eukaryotic cells: a linear transcription-translation model and a positive or negative auto-regulatory model. We calculate both the information capacity and the mutual information exactly for simple models and approximately for the full model. We find that (1) small changes in mutual information can lead to potentially important changes in cellular response and (2) there are diminishing returns in the fidelity of response as the mutual information increases. We calculate the information capacity using Gillespie simulations of a model for the TNF-α-NF-κB network and find good agreement with the measured value for an experimental realization of this network. Our results provide a quantitative understanding of the differences in cellular response when comparing experimentally measured mutual information values of different gene regulatory models. Our calculations demonstrate that Gillespie simulations can be used to compute the mutual information of more complex gene regulatory models, providing a potentially useful tool in synthetic biology.
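
    The idea of estimating mutual information from Gillespie simulations can be illustrated with a small sketch of a linear transcription-translation model. Everything here is an assumption for illustration: the rate constants, the function names, the use of the transcription rate as the "input", and the naive plug-in estimator (which is biased upward for small samples). It is not the calculation performed in the paper.

```python
import random
from collections import Counter
from math import log2


def gillespie_protein(k_m, g_m, k_p, g_p, t_end, rng):
    """Simulate mRNA (m) and protein (p) counts; return p at time t_end.

    Requires k_m > 0 so that the total propensity is never zero.
    """
    t, m, p = 0.0, 0, 0
    while True:
        # Propensities: transcription, mRNA decay, translation, protein decay.
        rates = [k_m, g_m * m, k_p * m, g_p * p]
        t += rng.expovariate(sum(rates))  # waiting time to the next reaction
        if t > t_end:
            return p
        reaction = rng.choices(range(4), weights=rates)[0]
        if reaction == 0:
            m += 1
        elif reaction == 1:
            m -= 1
        elif reaction == 2:
            p += 1
        else:
            p -= 1


def plugin_mutual_information(pairs):
    """Plug-in estimate (bits) of I(X; Y) from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def input_output_information(inputs, n_runs=50, t_end=10.0, seed=0):
    """Estimate I(input; protein count) for equiprobable transcription rates."""
    rng = random.Random(seed)
    pairs = [(k_m, gillespie_protein(k_m, 1.0, 5.0, 1.0, t_end, rng))
             for k_m in inputs for _ in range(n_runs)]
    return plugin_mutual_information(pairs)
```

    With two equiprobable input levels the estimate cannot exceed one bit; well-separated inputs such as `input_output_information([1.0, 20.0])` should approach that bound, while closely spaced inputs yield less.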

  16. Mutual information and the fidelity of response of gene regulatory models.

    Science.gov (United States)

    Tabbaa, Omar P; Jayaprakash, C

    2014-08-01

    We investigate cellular response to extracellular signals by using information theory techniques motivated by recent experiments. We present results for the steady state of the following gene regulatory models found in both prokaryotic and eukaryotic cells: a linear transcription-translation model and a positive or negative auto-regulatory model. We calculate both the information capacity and the mutual information exactly for simple models and approximately for the full model. We find that (1) small changes in mutual information can lead to potentially important changes in cellular response and (2) there are diminishing returns in the fidelity of response as the mutual information increases. We calculate the information capacity using Gillespie simulations of a model for the TNF-α-NF-κB network and find good agreement with the measured value for an experimental realization of this network. Our results provide a quantitative understanding of the differences in cellular response when comparing experimentally measured mutual information values of different gene regulatory models. Our calculations demonstrate that Gillespie simulations can be used to compute the mutual information of more complex gene regulatory models, providing a potentially useful tool in synthetic biology.

  17. Fuzzy approach to analysis of flood risk based on variable fuzzy sets and improved information diffusion methods

    Directory of Open Access Journals (Sweden)

    Q. Li

    2013-02-01

    The predictive analysis of natural disasters and their consequences is challenging because of uncertainties and incomplete data. The present article studies the use of variable fuzzy sets (VFS) and the improved information diffusion method (IIDM) to construct a composite method. The proposed method aims to integrate multiple factors and quantification of uncertainties within a consistent system for catastrophic risk assessment. The fuzzy methodology is proposed in the area of flood disaster risk assessment to improve probability estimation. The purpose of the current study is to establish a fuzzy model to evaluate flood risk with incomplete data sets. The results of the example indicate that the methodology is effective and practical; thus, it has the potential to forecast the flood risk in flood risk management.

  18. [Expression of SET-NUP214 fusion gene in patients with T-cell acute lymphoblastic leukemia and its clinical significance].

    Science.gov (United States)

    Dai, Hai-Ping; Wang, Qian; Wu, Li-Li; Ping, Na-Na; Wu, Chun-Xiao; Xie, Jun-Dan; Pan, Jin-Lan; Xue, Yong-Quan; Wu, De-Pei; Chen, Su-Ning

    2012-10-01

    This study aimed to investigate the occurrence and clinical significance of the SET-NUP214 fusion gene in patients with T-cell acute lymphoblastic leukemia (T-ALL) and to analyse the clinical and biological characteristics of this disease. RT-PCR was used to detect expression of the SET-NUP214 fusion gene in 58 T-ALL cases. Interphase FISH and array-CGH were used to detect deletion of 9q34. Direct sequencing was applied to detect mutations of PHF6 and NOTCH1. The SET-NUP214 fusion gene was detected by RT-PCR in 6 of 58 T-ALL cases (10.3%). Besides T-lineage antigens, expression of CD13 and/or CD33 was detected in all 6 cases. Deletions of 9q34 were detected by FISH in 4 of the 6 patients. Array-CGH results from 3 SET-NUP214-positive T-ALL patients confirmed that the fusion gene resulted from a cryptic deletion of 9q34.11q34.13. PHF6 and NOTCH1 gene mutations were found in 4 and 5 of the 6 SET-NUP214-positive T-ALL patients, respectively. It is concluded that the SET-NUP214 fusion gene often results from del(9)(q34), and that PHF6 and NOTCH1 mutations may be potential leukemogenic events in SET-NUP214-positive T-ALL.

  19. Fibroblast and lymphoblast gene expression profiles in schizophrenia: are non-neural cells informative?

    Directory of Open Access Journals (Sweden)

    Nicholas A Matigian

    Lymphoblastoid cell lines (LCLs) and fibroblasts provide conveniently derived non-neuronal samples in which to investigate the aetiology of schizophrenia (SZ) using gene expression profiling. This assumes that heritable mechanisms associated with risk of SZ have systemic effects and result in changes to gene expression in all tissues. The broad aim of this and other similar studies is that comparison of the transcriptomes of non-neuronal tissues from SZ patients and healthy controls may identify gene/pathway dysregulation underpinning the neurobiological defects associated with SZ. Using microarrays consisting of 18,664 probes we compared gene expression profiles of LCLs from SZ cases and healthy controls. To identify robust associations with SZ that were not patient or tissue specific, we also examined fibroblasts from an independent series of SZ cases and controls using the same microarrays. In both tissue types ANOVA analysis returned approximately the number of differentially expressed genes expected by chance. No genes were significantly differentially expressed in either tissue when corrected for multiple testing. Even using relaxed parameters (p or = 2-fold change between the groups of SZ cases and controls common to both LCLs and fibroblasts. We conclude that despite encouraging data from previous microarray studies assessing non-neural tissues, the lack of a convergent set of differentially expressed genes associated with SZ using fibroblasts and LCLs indicates the utility of non-neuronal tissues for detection of gene expression differences and/or pathways associated with SZ remains to be demonstrated.

  20. Pressure sores and pressure sore prevention in a rehabilitation setting: building information for improving outcomes and allocating resources.

    Science.gov (United States)

    Baggerly, J; DiBlasi, M

    1996-01-01

    Quantifiable information regarding pressure sore prevention and management is a prerequisite for program development, outcome evaluation, and resource allocation. In this study, all patients admitted to an acute rehabilitation setting (N = 446) during a 2-month period were assessed for the presence of a pressure sore, the risk for developing a pressure sore, the rate of agreement between "objective" (Braden scale) and "subjective" (standard nursing admission data) measures of risk and outcome, and the status of pressure sores at discharge. This article provides the details of the project and implications for rehabilitation nursing practice.

  1. Gene set enrichment analysis and ingenuity pathway analysis of metastatic clear cell renal cell carcinoma cell line.

    Science.gov (United States)

    Khan, Mohammed I; Dębski, Konrad J; Dabrowski, Michał; Czarnecka, Anna M; Szczylik, Cezary

    2016-08-01

    In recent years, genome-wide RNA expression analysis has become a routine tool that offers a great opportunity to study and understand the key role of genes that contribute to carcinogenesis. Various microarray platforms and statistical approaches can be used to identify genes that might serve as prognostic biomarkers and be developed as antitumor therapies in the future. Metastatic renal cell carcinoma (mRCC) is a serious, life-threatening disease, and there are few treatment options for patients. In this study, we performed one-color microarray gene expression (4×44K) analysis of the mRCC cell line Caki-1 and the healthy kidney cell line ASE-5063. A total of 1,921 genes were differentially expressed in the Caki-1 cell line (1,023 upregulated and 898 downregulated). Gene Set Enrichment Analysis (GSEA) and Ingenuity Pathway Analysis (IPA) approaches were used to analyze the differential-expression data. The objective of this research was to identify complex biological changes that occur during metastatic development using Caki-1 as a model mRCC cell line. Our data suggest that there are multiple deregulated pathways associated with metastatic clear cell renal cell carcinoma (mccRCC), including integrin-linked kinase (ILK) signaling, leukocyte extravasation signaling, IGF-I signaling, CXCR4 signaling, and phosphoinositol 3-kinase/AKT/mammalian target of rapamycin signaling. The IPA upstream analysis predicted top transcriptional regulators that are either activated or inhibited, such as estrogen receptors, TP53, KDM5B, SPDEF, and CDKN1A. The GSEA approach was used to further confirm enriched pathway data following IPA.

  2. Agenda-setting for Canadian caregivers: using media analysis of the maternity leave benefit to inform the compassionate care benefit.

    Science.gov (United States)

    Dykeman, Sarah; Williams, Allison M

    2014-04-24

    The Compassionate Care Benefit was implemented in Canada in 2004 to support employed informal caregivers, the majority of whom are known to be women given the gendered nature of caregiving. In order to examine how this policy might evolve over time, we examine the evolution of a similar employment insurance program, Canada's Maternity Leave Benefit. National media articles were reviewed (n = 2,698) and, based on explicit criteria, were analyzed using content analysis. Through the application of Kingdon's policy agenda-setting framework, the results define key recommendations for the Compassionate Care Benefit, as informed by the developmental trajectory of the Maternity Leave Benefit. Recommendations for revising the Compassionate Care Benefit are made.

  3. A statistical approach towards the derivation of predictive gene sets for potency ranking of chemicals in the mouse embryonic stem cell test.

    Science.gov (United States)

    Schulpen, Sjors H W; Pennings, Jeroen L A; Tonk, Elisa C M; Piersma, Aldert H

    2014-03-21

    The embryonic stem cell test (EST) is applied as a model system for detection of embryotoxicants. The application of transcriptomics allows a more detailed effect assessment compared to the morphological endpoint. Genes involved in cell differentiation, modulated by chemical exposures, may be useful as biomarkers of developmental toxicity. We describe a statistical approach to obtain a predictive gene set for toxicity potency ranking of compounds within one class. This resulted in a gene set based on differential gene expression across concentration-response series of phthalatic monoesters. We determined the concentration at which gene expression was changed at least 1.5-fold. Genes responding with the same potency ranking in vitro and in vivo embryotoxicity were selected. A leave-one-out cross-validation showed that the relative potency of each phthalate was always predicted correctly. The classical morphological 50% effect level (ID50) in EST was similar to the predicted concentration using gene set expression responses. A general down-regulation of development-related genes and up-regulation of cell-cycle related genes was observed, reminiscent of the differentiation inhibition in EST. This study illustrates the feasibility of applying dedicated gene set selections as biomarkers for developmental toxicity potency ranking on the basis of in vitro testing in the EST.

  4. Setting the most robust effluent level under severe uncertainty: application of information-gap decision theory to chemical management.

    Science.gov (United States)

    Yokomizo, Hiroyuki; Naito, Wataru; Tanaka, Yoshinari; Kamo, Masashi

    2013-11-01

    Decisions in ecological risk management for chemical substances must be made based on incomplete information due to uncertainties. To protect the ecosystems from the adverse effect of chemicals, a precautionary approach is often taken. The precautionary approach, which is based on conservative assumptions about the risks of chemical substances, can be applied selecting management models and data. This approach can lead to an adequate margin of safety for ecosystems by reducing exposure to harmful substances, either by reducing the use of target chemicals or putting in place strict water quality criteria. However, the reduction of chemical use or effluent concentrations typically entails a financial burden. The cost effectiveness of the precautionary approach may be small. Hence, we need to develop a formulaic methodology in chemical risk management that can sufficiently protect ecosystems in a cost-effective way, even when we do not have sufficient information for chemical management. Information-gap decision theory can provide the formulaic methodology. Information-gap decision theory determines which action is the most robust to uncertainty by guaranteeing an acceptable outcome under the largest degree of uncertainty without requiring information about the extent of parameter uncertainty at the outset. In this paper, we illustrate the application of information-gap decision theory to derive a framework for setting effluent limits of pollutants for point sources under uncertainty. Our application incorporates a cost for reduction in pollutant emission and a cost to wildlife species affected by the pollutant. Our framework enables us to settle upon actions to deal with severe uncertainty in ecological risk management of chemicals.

  5. A new sequence data set of SSU rRNA gene for Scleractinia and its phylogenetic and ecological applications

    KAUST Repository

    Arrigoni, Roberto

    2016-11-27

    Scleractinian corals (i.e. hard corals) play a fundamental role in building and maintaining coral reefs, one of the most diverse ecosystems on Earth. Nevertheless, their phylogenies remain largely unresolved and little is known about dispersal and survival of their planktonic larval phase. The small subunit ribosomal RNA (SSU rRNA) is a commonly used gene for DNA barcoding in several metazoans, and small variable regions of SSU rRNA are widely adopted as barcode markers to investigate marine plankton community structure worldwide. Here, we provide a large sequence data set of the complete SSU rRNA gene from 298 specimens, representing all known extant reef coral families and a total of 106 genera. The secondary structure was extremely conserved within the order with few exceptions due to insertions or deletions occurring in the variable regions. Remarkable differences in SSU rRNA length and base composition were detected between and within acroporids (Acropora, Montipora, Isopora and Alveopora) compared to other corals. The V4 and V9 regions seem to be promising barcode loci because variation at commonly used barcode primer binding sites was extremely low, while their levels of divergence allowed families and genera to be distinguished. A time-calibrated phylogeny of Scleractinia is provided, and mutation rate heterogeneity is demonstrated across main lineages. The use of this data set as a valuable reference for investigating aspects of ecology, biology, molecular taxonomy and evolution of scleractinian corals is discussed.

  6. Transcriptional Differences between Normal and Glioma-Derived Glial Progenitor Cells Identify a Core Set of Dysregulated Genes

    Directory of Open Access Journals (Sweden)

    Romane M. Auvergne

    2013-06-01

    Glial progenitor cells (GPCs) are a potential source of malignant gliomas. We used A2B5-based sorting to extract tumorigenic GPCs from human gliomas spanning World Health Organization grades II–IV. Messenger RNA profiling identified a cohort of genes that distinguished A2B5+ glioma tumor progenitor cells (TPCs) from A2B5+ GPCs isolated from normal white matter. A core set of genes and pathways was substantially dysregulated in A2B5+ TPCs, which included the transcription factor SIX1 and its principal cofactors, EYA1 and DACH2. Small hairpin RNAi silencing of SIX1 inhibited the expansion of glioma TPCs in vitro and in vivo, suggesting a critical and unrecognized role of the SIX1-EYA1-DACH2 system in glioma genesis or progression. By comparing the expression patterns of glioma TPCs with those of normal GPCs, we have identified a discrete set of pathways by which glial tumorigenesis may be better understood and more specifically targeted.

  7. Transcriptional repression and DNA hypermethylation of a small set of ES cell marker genes in male germline stem cells

    Directory of Open Access Journals (Sweden)

    Kanatsu-Shinohara Mito

    2006-07-01

    Background: We previously identified a set of genes called ECATs (ES cell-associated transcripts) that are expressed at high levels in mouse ES cells. Here, we examine the expression and DNA methylation of ECATs in somatic cells and germ cells. Results: In all ECATs examined, the promoter region had low methylation levels in ES cells, but higher levels in somatic cells. In contrast, in spite of their lack of pluripotency, male germline stem (GS) cells expressed most ECATs and exhibited hypomethylation of ECAT promoter regions. We observed a similar hypomethylation of ECAT loci in adult testis and isolated sperm. Some ECATs were even less methylated in male germ cells than in ES cells. However, a few ECATs were not expressed in GS cells, and most of them were targets of Oct3/4 and Sox2. The Octamer/Sox regulatory elements were hypermethylated in these genes. In addition, we found that GS cells express little Sox2 protein and low Oct3/4 protein despite abundant expression of their transcripts. Conclusion: Our results suggest that DNA hypermethylation and transcriptional repression of a small set of ECATs, together with post-transcriptional repression of Oct3/4 and Sox2, contribute to the loss of pluripotency in male germ cells.

  8. Comprehensive screening for a complete set of Japanese-population-specific filaggrin gene mutations.

    Science.gov (United States)

    Kono, M; Nomura, T; Ohguchi, Y; Mizuno, O; Suzuki, S; Tsujiuchi, H; Hamajima, N; McLean, W H I; Shimizu, H; Akiyama, M

    2014-04-01

    Mutations in FLG, the gene encoding profilaggrin, cause ichthyosis vulgaris and are an important predisposing factor for atopic dermatitis. Until now, most case-control studies and population-based screenings have been performed only for prevalent mutations. In this study, we established a high-throughput FLG mutation detection system by real-time PCR with a set of two double-dye probes and conducted comprehensive screening for almost all of the Japanese-population-specific FLG mutations (ten FLG mutations). The present comprehensive screening for all ten FLG mutations provided a more precise prevalence rate for FLG mutations (11.1%, n = 820), which seemed high compared with data from previous reports based on screening for limited numbers of FLG mutations. Our comprehensive screening suggested that population-specific FLG mutations may be a significant predisposing factor for hay fever (odds ratio = 2.01 [95% CI: 1.027-3.936, P < 0.05]). However, the sample sizes of this study were too small for reliable subphenotype analysis of the association between FLG mutations and hay fever in the eczema patients and the noneczema individuals, and it is not clear whether the association between FLG mutations and hay fever is due to the close association between FLG mutations and hay fever patients with eczema.

  9. DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures.

    Science.gov (United States)

    Mazandu, Gaston K; Mulder, Nicola J

    2013-09-25

    The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.
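
    As a rough illustration of what precomputing term information content can involve, the sketch below uses the common annotation-based definition IC(t) = -log p(t), where p(t) is the fraction of annotated proteins that carry term t or any of its descendants. This is an assumption about one standard approach, not a description of DaGO-Fun's implementation; the tiny ontology, the protein names and the function names are all hypothetical.

```python
import math


def ancestors(term, parents):
    """Return the term together with all of its ancestors in the ontology DAG."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(annotations, parents):
    """Annotation-based IC: -log of the fraction of proteins annotated to a
    term directly or through any descendant term."""
    proteins_per_term = {}
    for protein, terms in annotations.items():
        covered = set()
        for term in terms:
            # An annotation to a term also counts for every ancestor of it.
            covered |= ancestors(term, parents)
        for term in covered:
            proteins_per_term.setdefault(term, set()).add(protein)
    total = len(annotations)
    return {t: -math.log(len(ps) / total) for t, ps in proteins_per_term.items()}


# Hypothetical toy ontology (child -> parents) and annotations.
PARENTS = {"root": [], "A": ["root"], "B": ["root"], "C": ["A", "B"]}
ANNOTATIONS = {"P1": ["C"], "P2": ["A"], "P3": ["B"], "P4": ["B"]}

if __name__ == "__main__":
    for term, ic in sorted(information_content(ANNOTATIONS, PARENTS).items()):
        print(term, round(ic, 3))
```

    Because the values depend only on the ontology and the annotation corpus, they can be computed once and cached, which is consistent with the abstract's point about rapid responses to user queries.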

  10. Information theory in systems biology. Part I: Gene regulatory and metabolic networks.

    Science.gov (United States)

    Mousavian, Zaynab; Kavousi, Kaveh; Masoudi-Nejad, Ali

    2016-03-01

    "A Mathematical Theory of Communication", was published in 1948 by Claude Shannon to establish a framework that is now known as information theory. In recent decades, information theory has gained much attention in the area of systems biology. The aim of this paper is to provide a systematic review of those contributions that have applied information theory in inferring or understanding of biological systems. Based on the type of system components and the interactions between them, we classify the biological systems into 4 main classes: gene regulatory, metabolic, protein-protein interaction and signaling networks. In the first part of this review, we attempt to introduce most of the existing studies on two types of biological networks, including gene regulatory and metabolic networks, which are founded on the concepts of information theory. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    OpenAIRE

    Alamar Santiago; Arribas Raquel; Forment Javier; Alonso-Cantabrana Hugo; Marques M Carmen; Conejero Vicente; Perez-Amador Miguel A

    2009-01-01

    Background: Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information an...

  12. Sector Information Data Set

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Fishing sectors were established in the Greater Atlantic region in 2010 under catch share management initiatives. Sector data kept at GARFO is mostly a collection of...

  13. Transport and transformation of genetic information in the critical zone: The case of antibiotic resistance genes

    Science.gov (United States)

    Zhu, Y. G.

    2015-12-01

    In addition to material and energy flows, the dynamics and functions of the Earth's critical zone are intensively mediated by biological actions performed by diverse organisms. These biological actions are modulated by the expression of functional genes and their translation into enzymes that catalyze geochemical reactions, such as nutrient turnover and pollutant biodegradation. Although geobiology, as an interdisciplinary research area, is playing a vital role in linking biological and geochemical processes at different temporal and spatial scales, the distribution and transport of functional genes have rarely been investigated from the perspective of the Earth's critical zone. To illustrate the framework of studies on the transport and transformation of genetic information in the critical zone, antibiotic resistance is taken as an example. Antibiotic resistance genes are considered a group of emerging contaminants; their emergence and spread within the critical zone are, on the one hand, induced by anthropogenic activities and, on the other hand, threaten human health worldwide. The transport and transformation of antibiotic resistance genes are controlled by both horizontal gene transfer between bacterial cells and the movement of bacteria harboring antibiotic resistance genes. In this paper, the fate and behavior of antibiotic resistance genes will be discussed in the following aspects: 1) general overview of environmental antibiotic resistance; 2) high-throughput quantification of the resistome in various environmental media; 3) pathways of resistance gene flow within the critical zone; and 4) potential strategies for mitigating antibiotic resistance, particularly from the critical zone perspective.

  14. Using answer set programming to integrate RNA expression with signalling pathway information to infer how mutations affect ageing.

    Directory of Open Access Journals (Sweden)

    Irene Papatheodorou

    A challenge of systems biology is to integrate incomplete knowledge on pathways with existing experimental data sets and relate these to measured phenotypes. Research on ageing often generates such incomplete data, creating difficulties in integrating RNA expression with information about biological processes and the phenotypes of ageing, including longevity. Here, we develop a logic-based method that employs Answer Set Programming, and use it to infer signalling effects of genetic perturbations, based on a model of the insulin signalling pathway. We apply our method to RNA expression data from Drosophila mutants in the insulin pathway that alter lifespan in a foxo-dependent fashion. We use this information to deduce how the pathway influences lifespan in the mutant animals. We also develop a method for inferring the largest common sub-paths within each of our signalling predictions. Our comparisons reveal consistent homeostatic mechanisms across both long- and short-lived mutants. The transcriptional changes observed in each mutation usually provide negative feedback to signalling predicted for that mutation. We also identify an S6K-mediated feedback in two long-lived mutants that suggests a crosstalk between these pathways in mutants of the insulin pathway, in vivo. By formulating the problem as a logic-based theory in a qualitative fashion, we are able to use the efficient search facilities of Answer Set Programming, allowing us to explore larger pathways, combine molecular changes with pathways and phenotype, and infer effects on signalling in in vivo, whole-organism mutants, where direct signalling stimulation assays are difficult to perform. Our methods are available in the web-service NetEffects: http://www.ebi.ac.uk/thornton-srv/software/NetEffects.

  15. Using answer set programming to integrate RNA expression with signalling pathway information to infer how mutations affect ageing.

    Science.gov (United States)

    Papatheodorou, Irene; Ziehm, Matthias; Wieser, Daniela; Alic, Nazif; Partridge, Linda; Thornton, Janet M

    2012-01-01

    A challenge of systems biology is to integrate incomplete knowledge on pathways with existing experimental data sets and relate these to measured phenotypes. Research on ageing often generates such incomplete data, creating difficulties in integrating RNA expression with information about biological processes and the phenotypes of ageing, including longevity. Here, we develop a logic-based method that employs Answer Set Programming, and use it to infer signalling effects of genetic perturbations, based on a model of the insulin signalling pathway. We apply our method to RNA expression data from Drosophila mutants in the insulin pathway that alter lifespan in a foxo-dependent fashion. We use this information to deduce how the pathway influences lifespan in the mutant animals. We also develop a method for inferring the largest common sub-paths within each of our signalling predictions. Our comparisons reveal consistent homeostatic mechanisms across both long- and short-lived mutants. The transcriptional changes observed in each mutation usually provide negative feedback to signalling predicted for that mutation. We also identify an S6K-mediated feedback in two long-lived mutants that suggests a crosstalk between these pathways in mutants of the insulin pathway, in vivo. By formulating the problem as a logic-based theory in a qualitative fashion, we are able to use the efficient search facilities of Answer Set Programming, allowing us to explore larger pathways, combine molecular changes with pathways and phenotype, and infer effects on signalling in in vivo, whole-organism mutants, where direct signalling stimulation assays are difficult to perform. Our methods are available in the web-service NetEffects: http://www.ebi.ac.uk/thornton-srv/software/NetEffects.

  16. Beyond the French Flag Model: Exploiting Spatial and Gene Regulatory Interactions for Positional Information

    Science.gov (United States)

    Hillenbrand, Patrick; Gerland, Ulrich; Tkačik, Gašper

    2016-01-01

    A crucial step in the early development of multicellular organisms involves the establishment of spatial patterns of gene expression which later direct proliferating cells to take on different cell fates. These patterns enable the cells to infer their global position within a tissue or an organism by reading out local gene expression levels. The patterning system is thus said to encode positional information, a concept that was formalized recently in the framework of information theory. Here we introduce a toy model of patterning in one spatial dimension, which can be seen as an extension of Wolpert’s paradigmatic “French Flag” model, to patterning by several interacting, spatially coupled genes subject to intrinsic and extrinsic noise. Our model, a variant of an Ising spin system, allows us to systematically explore expression patterns that optimally encode positional information. We find that optimal patterning systems use positional cues, as in the French Flag model, together with gene-gene interactions to generate combinatorial codes for position which we call “Counter” patterns. Counter patterns can also be stabilized against noise and variations in system size or morphogen dosage by longer-range spatial interactions of the type invoked in the Turing model. The simple setup proposed here qualitatively captures many of the experimentally observed properties of biological patterning systems and allows them to be studied in a single, theoretically consistent framework. PMID:27676252

  17. Comparative genomic analysis of the family Iridoviridae: re-annotating and defining the core set of iridovirus genes

    Directory of Open Access Journals (Sweden)

    Upton Chris

    2007-01-01

    Background: Members of the family Iridoviridae can cause severe diseases resulting in significant economic and environmental losses. Very little is known about how iridoviruses cause disease in their host. In the present study, we describe the re-analysis of the Iridoviridae family of complex DNA viruses using a variety of comparative genomic tools to yield a greater consensus among the annotated sequences of its members. Results: A series of genomic sequence comparisons were made among and between the Ranavirus and Megalocytivirus genera in order to identify novel conserved ORFs. Of these two genera, the Megalocytivirus genomes required the greatest number of altered annotations. Prior to our re-analysis, the Megalocytivirus species orange-spotted grouper iridovirus and rock bream iridovirus shared 99% sequence identity, but only 82 out of 118 potential ORFs were annotated; in contrast, we predict that these species share an identical complement of genes. These annotation changes allowed the redefinition of the group of core genes shared by all iridoviruses. Seven new core genes were identified, bringing the total number to 26. Conclusion: Our re-analysis of genomes within the Iridoviridae family provides a unifying framework to understand the biology of these viruses. Further re-defining the core set of iridovirus genes will continue to lead us to a better understanding of the phylogenetic relationships between individual iridoviruses as well as giving us a much deeper understanding of iridovirus replication. In addition, this analysis will provide a better framework for characterizing and annotating currently unclassified iridoviruses.

  18. Data set for diet specific differential gene expression analysis in three Spodoptera moths

    Directory of Open Access Journals (Sweden)

    A. Roy

    2016-09-01

    Examination of closely related species pairs is suggested for evolutionary comparisons of different degrees of polyphagy. We did this here with three taxa of lepidopteran herbivores, Spodoptera spp. (S. littoralis and the S. frugiperda maize (C) and rice (R) strains), using RNA-seq analysis of the midguts of 3rd-instar larvae to identify differential metabolic responses after feeding on a pinto bean-based artificial diet vs. maize leaves. Paired-end (2×100 bp) Illumina HiSeq2500 sequencing resulted in a total of 24, 23, 24, and 21 million reads for the SF-C-Maize, SF-C-Pinto, SF-R-Maize, and SF-R-Pinto samples, and a total of 35 and 36 million reads for the SL-Maize and SL-Pinto samples, respectively. After quality control measures, a total of 62.2 million reads from SL and 71.7 million reads from SF were used for transcriptome assembly (TA). The resulting final de novo reference TA (backbone) for the SF taxa contained 37,985 contigs with an N50 contig size of 1030 bp and a maximum contig length of 17,093 bp, while for SL, 28,329 contigs were generated with an N50 contig size of 1980 bp and a maximum contig length of 18,267 bp. The data presented herein contain supporting information related to our research article, Roy et al. (2016), http://dx.doi.org/10.1016/j.ibmb.2016.02.006 [1].
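
    For readers unfamiliar with the N50 statistic quoted for these assemblies, the following short Python sketch shows the usual definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembly length. The function name and the example lengths are illustrative only and are not taken from the data set.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the cumulative length first reaches at least half
    of the summed length of all contigs.
    """
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("n50 is undefined for an empty set of contigs")


if __name__ == "__main__":
    # Hypothetical contig lengths in bp; total is 32, half is 16.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

    A higher N50 at a similar total length generally indicates a less fragmented assembly, which is why it is reported alongside the contig count and maximum contig length.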

  19. Utilizing social media to study information-seeking and ethical issues in gene therapy.

    Science.gov (United States)

    Robillard, Julie M; Whiteley, Louise; Johnson, Thomas Wade; Lim, Jonathan; Wasserman, Wyeth W; Illes, Judy

    2013-03-04

    The field of gene therapy is rapidly evolving, and while hopes of treating disorders of the central nervous system and ethical concerns have been articulated within the academic community, little is known about views and opinions of different stakeholder groups. To address this gap, we utilized social media to investigate the kind of information public users are seeking about gene therapy and the hopes, concerns, and attitudes they express. We conducted a content analysis of questions containing the keywords "gene therapy" from the Q&A site "Yahoo! Answers" for the 5-year period between 2006 and 2010. From the pool of questions retrieved (N=903), we identified those containing at least one theme related to ethics, environment, economics, law, or society (n=173) and then characterized the content of relevant answers (n=399) through emergent coding. The results show that users seek a wide range of information regarding gene therapy, with requests for scientific information and ethical issues at the forefront of enquiry. The question sample reveals high expectations for gene therapy that range from cures for genetic and nongenetic diseases to pre- and postnatal enhancement of physiological attributes. Ethics questions are commonly expressed as fears about the impact of gene therapy on self and society. The answer sample echoes these concerns but further suggests that the acceptability of gene therapy varies depending on the specific application. Overall, the findings highlight the powerful role of social media as a rich resource for research into attitudes toward biomedicine and as a platform for knowledge exchange and public engagement for topics relating to health and disease.

  20. Identification of a core set of 58 gene transcripts with broad and specific expression in the microvasculature.

    Science.gov (United States)

    Wallgard, Elisabet; Larsson, Erik; He, Liqun; Hellström, Mats; Armulik, Annika; Nisancioglu, Maya H; Genove, Guillem; Lindahl, Per; Betsholtz, Christer

    2008-08-01

    Pathological angiogenesis is an integral component of many diseases. Antiangiogenesis and vascular targeting are therefore promising new therapeutic principles. However, few endothelial-specific putative drug targets have been identified, and information is still limited about endothelial-specific molecular processes. Here we aimed at determining the endothelial cell-specific core transcriptome in vivo. Analysis of publicly available microarray data identified a mixed vascular/lung cluster of 132 genes that correlated with known endothelial markers. Filtering against kidney glomerular/nonglomerular and brain vascular/nonvascular microarray profiles separated contaminating lung markers, leaving 58 genes with broad and specific microvascular expression. More than half of these have not previously been linked to endothelial functions or studied in detail. The endothelial cell-specific expression of a selected subset of these, Eltd1, Gpr116, Ramp2, Slc9a3r2, Slc43a3, Rasip1, and NM_023516, was confirmed by real-time quantitative polymerase chain reaction and/or immunohistochemistry. We have used a combination of publicly available and our own microarray data to identify 58 gene transcripts with broad yet specific expression in microvascular endothelium. Most of these have unknown functions, but many of them are predicted to be cell surface expressed or implicated in cell signaling processes and should therefore be explored as putative microvascular drug targets.

  1. MIrExpress: A Database for Gene Coexpression Correlation in Immune Cells Based on Mutual Information and Pearson Correlation

    OpenAIRE

    Luman Wang; Qiaochu Mo; Jianxin Wang

    2015-01-01

    Most current gene coexpression databases support the analysis of linear correlation between gene pairs but not of nonlinear correlation, which hinders precise evaluation of gene-gene coexpression strengths. Here, we report a new database, MIrExpress, which takes advantage of information theory, as well as the Pearson linear correlation method, to measure the linear correlation, nonlinear correlation, and their hybrid of cell-specific gene coexpressions in immune cells. For a given ge...
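
    The distinction between linear and nonlinear coexpression can be made concrete with a small, self-contained Python sketch. It is not MIrExpress code: the equal-width binning estimator, the bin count and the synthetic "expression" values are assumptions chosen for illustration. A symmetric quadratic relationship has a Pearson correlation of essentially zero, yet a clearly positive mutual information.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def discretize(values, bins):
    """Assign each value to one of `bins` equal-width bins (indices 0..bins-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(xs, ys, bins=4):
    """Plug-in mutual information estimate (in nats) from a 2-D histogram."""
    bx, by = discretize(xs, bins), discretize(ys, bins)
    n = len(xs)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log((c / n) / ((px[i] / n) * (py[j] / n)))
        for (i, j), c in joint.items()
    )


# Synthetic example: gene Y depends on gene X only through X squared.
XS = [i / 10 for i in range(-20, 21)]
YS = [x * x for x in XS]

if __name__ == "__main__":
    print("Pearson:", pearson(XS, YS))
    print("Mutual information:", mutual_information(XS, YS))
```

    Histogram-based estimates are sensitive to the number of bins and the sample size, so the absolute value matters less than the contrast with the near-zero linear correlation.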

  2. Systematic Approach to Computational Design of Gene Regulatory Networks with Information Processing Capabilities.

    Science.gov (United States)

    Moskon, Miha; Mraz, Miha

    2014-01-01

    We present several measures that can be used in de novo computational design of biological systems with information processing capabilities. Their main purpose is to objectively evaluate the behavior and identify the biological information processing structures with the best dynamical properties. They can be used to define constraints that allow one to simplify the design of more complex biological systems. These measures can be applied to existing computational design approaches in synthetic biology, i.e., rational and automatic design approaches. We demonstrate their use on (a) the computational models of several basic information processing structures implemented with gene regulatory networks and (b) a modular design of a synchronous toggle switch.

  3. The effects of shared information on semantic calculations in the gene ontology.

    Science.gov (United States)

    Bible, Paul W; Sun, Hong-Wei; Morasso, Maria I; Loganantharaj, Rasiah; Wei, Lai

    2017-01-01

    The structured vocabulary that describes gene function, the gene ontology (GO), serves as a powerful tool in biological research. One application of GO in computational biology calculates semantic similarity between two concepts to make inferences about the functional similarity of genes. A class of term similarity algorithms explicitly calculates the shared information (SI) between concepts then substitutes this calculation into traditional term similarity measures such as Resnik, Lin, and Jiang-Conrath. Alternative SI approaches, when combined with ontology choice and term similarity type, lead to many gene-to-gene similarity measures. No thorough investigation has been made into the behavior, complexity, and performance of semantic methods derived from distinct SI approaches. We apply bootstrapping to compare the generalized performance of 57 gene-to-gene semantic measures across six benchmarks. Considering the number of measures, we additionally evaluate whether these methods can be leveraged through ensemble machine learning to improve prediction performance. Results showed that the choice of ontology type most strongly influenced performance across all evaluations. Combining measures into an ensemble classifier reduces cross-validation error beyond any individual measure for protein interaction prediction. This improvement resulted from information gained through the combination of ontology types as ensemble methods within each GO type offered no improvement. These results demonstrate that multiple SI measures can be leveraged for machine learning tasks such as automated gene function prediction by incorporating methods from across the ontologies. To facilitate future research in this area, we developed the GO Graph Tool Kit (GGTK), an open source C++ library with Python interface (github.com/paulbible/ggtk).
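
    The way a shared information (SI) value feeds into the three classical term similarity measures named in this abstract can be sketched as follows. The sketch assumes the most common SI choice, the information content of the most informative common ancestor, whereas the paper compares several alternatives. The ancestor sets, IC values and function names are hypothetical and do not reflect the GGTK API.

```python
# Hypothetical toy ontology: each term maps to itself plus all its ancestors.
ANCESTORS = {
    "root": {"root"},
    "A": {"A", "root"},
    "B": {"B", "root"},
    "C": {"C", "A", "root"},
    "D": {"D", "A", "root"},
}

# Hypothetical precomputed information content (IC) per term, in nats.
IC = {"root": 0.0, "A": 1.0, "B": 1.5, "C": 3.0, "D": 2.0}


def shared_information(a, b):
    """SI as the IC of the most informative common ancestor (one of several
    possible SI definitions)."""
    return max(IC[t] for t in ANCESTORS[a] & ANCESTORS[b])


def resnik(a, b):
    """Resnik similarity: the shared information itself."""
    return shared_information(a, b)


def lin(a, b):
    """Lin similarity: shared information normalised by the terms' own IC."""
    denom = IC[a] + IC[b]
    # Convention assumed here: two zero-IC terms (the root) are fully similar.
    return 2.0 * shared_information(a, b) / denom if denom > 0 else 1.0


def jiang_conrath_distance(a, b):
    """Jiang-Conrath distance: IC not explained by the shared ancestor."""
    return IC[a] + IC[b] - 2.0 * shared_information(a, b)


if __name__ == "__main__":
    for pair in (("C", "D"), ("C", "B"), ("C", "C")):
        print(pair, resnik(*pair), lin(*pair), jiang_conrath_distance(*pair))
```

    Swapping in a different `shared_information` function changes all three measures at once, which is how a handful of SI approaches, ontology choices and similarity types can multiply into the many gene-to-gene measures the abstract mentions.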

  4. P-sets, Inverse P-sets and the Intelligent Fusion-filter Identification of Information

    Institute of Scientific and Technical Information of China (English)

    史开泉

    2012-01-01

    ..., which also have dynamic characteristics. Inverse P-sets have the opposite mathematical structure to P-sets: they are a set pair composed of the internal inverse P-set XF (internal inverse packet set XF) and the outer inverse P-set XF (outer inverse packet set XF); that is, (XF, XF) is the inverse P-sets. Under certain conditions, inverse P-sets can be restored to the finite general set X. Inverse P-sets are the mathematical expression of another class of dynamic systems. P-reasoning (packet reasoning) is the dynamic reasoning generated by P-sets, and inverse P-reasoning (inverse packet reasoning) is the dynamic reasoning generated by inverse P-sets. By intersecting and infiltrating P-sets, inverse P-sets, P-reasoning and inverse P-reasoning with information fusion, the intelligent fusion-filter identification theory of information and its applications are studied. The paper gives the structures, the separations and the equivalence class characteristics of P-sets and inverse P-sets; P-information fusion and inverse P-information fusion; the reasoning discovery, fusion measure and filter-identification of P-information fusion and inverse P-information fusion; and the application of the intelligent fusion-filter identification of information. P-sets and inverse P-sets are new theories and methods for studying information fusion theory and its applications.

  5. Human Disease Insight: An integrated knowledge-based platform for disease-gene-drug information.

    Science.gov (United States)

    Tasleem, Munazzah; Ishrat, Romana; Islam, Asimul; Ahmad, Faizan; Hassan, Md Imtaiyaz

    2016-01-01

    The scope of the Human Disease Insight (HDI) database is not limited to researchers or physicians as it also provides basic information to non-professionals and creates disease awareness, thereby reducing the chances of patient suffering due to ignorance. HDI is a knowledge-based resource providing information on human diseases to both scientists and the general public. Here, our mission is to provide a comprehensive human disease database containing most of the available useful information, with extensive cross-referencing. HDI is a knowledge management system that acts as a central hub to access information about human diseases and associated drugs and genes. In addition, HDI contains well-classified bioinformatics tools with helpful descriptions. These integrated bioinformatics tools enable researchers to annotate disease-specific genes and perform protein analysis, search for biomarkers and identify potential vaccine candidates. Eventually, these tools will facilitate the analysis of disease-associated data. The HDI provides two types of search capabilities and includes provisions for downloading, uploading and searching disease/gene/drug-related information. The logistical design of the HDI allows for regular updating. The database is designed to work best with Mozilla Firefox and Google Chrome and is freely accessible at http://humandiseaseinsight.com.

  6. Exploring the Solar System Activities Outline: Hands-On Planetary Science for Formal Education K-14 and Informal Settings

    Science.gov (United States)

    Allen, J. S.; Tobola, K. W.; Lindstrom, M. L.

    2003-01-01

    Activities by NASA scientists and teachers focus on integrating Planetary Science activities with existing Earth science, math, and language arts curriculum. The wealth of activities that highlight missions and research pertaining to exploring the solar system allows educators to choose activities that fit a particular concept or theme within their curriculum. Most of the activities use simple, inexpensive techniques that help students understand the how and why of what scientists are learning about comets, asteroids, meteorites, moons and planets. With these NASA-developed activities, students experience recent mission information about our solar system, such as Mars geology and the search for life using Mars meteorites and robotic data. The Johnson Space Center ARES Education team has compiled a variety of NASA solar system activities to produce an annotated thematic outline useful to classroom educators and informal educators as they teach space science. An important aspect of the outline annotation is that it highlights appropriate science content information and key science and math concepts so educators can easily identify activities that will enhance curriculum development. The outline contains URLs for the activities and NASA educator guides as well as links to NASA mission science and technology. In the informal setting, educators can use solar system exploration activities to reinforce learning in association with thematic displays, planetarium programs, youth group gatherings, or community events. Within formal education at the primary level, some of the activities are appropriately designed to excite interest and arouse curiosity. Middle school educators will find activities that enhance thematic science and encourage students to think about the scientific process of investigation. Some of the activities offered are appropriate for the upper levels of high school and early college in that they require students to use and analyze data.

  7. Optimization to the Culture Conditions for Phellinus Production with Regression Analysis and Gene-Set Based Genetic Algorithm.

    Science.gov (United States)

    Li, Zhongwei; Xin, Yuezhen; Wang, Xun; Sun, Beibei; Xia, Shengyu; Li, Hui; Zhu, Hu

    2016-01-01

    Phellinus is a fungus known as one of the elemental components in drugs used to prevent cancers. To find optimized culture conditions for Phellinus production in the laboratory, many single-factor experiments were carried out and a large amount of experimental data was generated. In this work, we use the data collected from these experiments for regression analysis, obtaining a mathematical model that predicts Phellinus production. Subsequently, a gene-set based genetic algorithm is developed to optimize the values of the parameters involved in the culture conditions, including inoculum size, pH value, initial liquid volume, temperature, seed age, fermentation time, and rotation speed. The optimized parameter values are in accordance with biological experimental results, which indicates that our method has good predictive ability for culture condition optimization.
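
    The abstract does not give the regression model or the details of the gene-set based genetic algorithm, so the following is only a minimal sketch of the general idea: a plain real-valued genetic algorithm searching culture parameters against a surrogate model. The parameter bounds, the `PEAK` values and the quadratic `predicted_yield` function are hypothetical stand-ins, not values from the paper.

```python
import random

# Hypothetical bounds for three culture parameters (illustrative values only).
BOUNDS = {
    "ph": (4.0, 8.0),
    "temperature_c": (20.0, 35.0),
    "rotation_rpm": (100.0, 200.0),
}
# Hypothetical optimum of the stand-in response surface.
PEAK = {"ph": 6.0, "temperature_c": 28.0, "rotation_rpm": 150.0}


def predicted_yield(params):
    """Stand-in for a fitted regression model: a concave quadratic peaking at PEAK."""
    score = 100.0
    for name, (low, high) in BOUNDS.items():
        scaled = (params[name] - PEAK[name]) / (high - low)
        score -= 50.0 * scaled ** 2
    return score


def random_individual(rng):
    return {name: rng.uniform(low, high) for name, (low, high) in BOUNDS.items()}


def crossover(rng, a, b):
    """Uniform crossover: each parameter is copied from one parent at random."""
    return {name: (a if rng.random() < 0.5 else b)[name] for name in BOUNDS}


def mutate(rng, individual, rate=0.2):
    """Gaussian mutation, clipped to the allowed range of each parameter."""
    child = dict(individual)
    for name, (low, high) in BOUNDS.items():
        if rng.random() < rate:
            step = rng.gauss(0.0, 0.1 * (high - low))
            child[name] = min(high, max(low, child[name] + step))
    return child


def optimise(generations=60, pop_size=30, elite=2, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    history = []  # best predicted yield at the start of each generation
    for _ in range(generations):
        population.sort(key=predicted_yield, reverse=True)
        history.append(predicted_yield(population[0]))
        parents = population[: pop_size // 2]
        children = [mutate(rng, crossover(rng, *rng.sample(parents, 2)))
                    for _ in range(pop_size - elite)]
        # Elitism: the best individuals survive unchanged.
        population = population[:elite] + children
    best = max(population, key=predicted_yield)
    return best, history


best, history = optimise()
print(best, predicted_yield(best))
```

    Because the elite individuals are carried over unchanged, the best predicted yield never decreases from one generation to the next; in a real study the surrogate would be replaced by the regression model fitted to the experimental data.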

  8. Gene Sets for Utilization of Primary and Secondary Nutrition Supplies in the Distal Gut of Endangered Iberian Lynx

    Science.gov (United States)

    Alcaide, María; Messina, Enzo; Richter, Michael; Bargiela, Rafael; Peplies, Jörg; Huws, Sharon A.; Newbold, Charles J.; Golyshin, Peter N.; Simón, Miguel A.; López, Guillermo; Yakimov, Michail M.; Ferrer, Manuel

    2012-01-01

    Recent studies have indicated the existence of an extensive trans-genomic trans-mural co-metabolism between gut microbes and animal hosts that is diet-, host phylogeny- and provenance-influenced. Here, we analyzed the biodiversity at the level of small subunit rRNA gene sequence and the metabolic composition of 18 Mbp of consensus metagenome sequences and activity characteristics of bacterial intra-cellular extracts, in wild Iberian lynx (Lynx pardinus) fecal samples. Bacterial signatures (14.43% of all of the Firmicutes reads and 6.36% of total reads) related to the uncultured anaerobic commensals Anaeroplasma spp., which are typically found in ovine and bovine rumen, were first identified. The lynx gut was further characterized by an over-representation of ‘presumptive’ aquaporin aqpZ genes and genes encoding ‘active’ lysosomal-like digestive enzymes that are possibly needed to acquire glycerol, sugars and amino acids from glycoproteins, glyco(amino)lipids, glyco(amino)glycans and nucleoside diphosphate sugars. Lynx gut was highly enriched (28% of the total glycosidases) in genes encoding α-amylase and related enzymes, although it exhibited low rate of enzymatic activity indicative of starch degradation. The preponderance of β-xylosidase activity in protein extracts further suggests lynx gut microbes being most active for the metabolism of β-xylose containing plant N-glycans, although β-xylosidases sequences constituted only 1.5% of total glycosidases. These collective and unique bacterial, genetic and enzymatic activity signatures suggest that the wild lynx gut microbiota not only harbors gene sets underpinning sugar uptake from primary animal tissues (with the monotypic dietary profile of the wild lynx consisting of 80–100% wild rabbits) but also for the hydrolysis of prey-derived plant biomass. Although, the present investigation corresponds to a single sample and some of the statements should be considered qualitative, the data most likely

  9. Gene sets for utilization of primary and secondary nutrition supplies in the distal gut of endangered Iberian lynx.

    Directory of Open Access Journals (Sweden)

    María Alcaide

    Recent studies have indicated the existence of an extensive trans-genomic trans-mural co-metabolism between gut microbes and animal hosts that is diet-, host phylogeny- and provenance-influenced. Here, we analyzed the biodiversity at the level of small subunit rRNA gene sequence and the metabolic composition of 18 Mbp of consensus metagenome sequences and activity characteristics of bacterial intra-cellular extracts, in wild Iberian lynx (Lynx pardinus) fecal samples. Bacterial signatures (14.43% of all of the Firmicutes reads and 6.36% of total reads) related to the uncultured anaerobic commensals Anaeroplasma spp., which are typically found in ovine and bovine rumen, were first identified. The lynx gut was further characterized by an over-representation of 'presumptive' aquaporin aqpZ genes and genes encoding 'active' lysosomal-like digestive enzymes that are possibly needed to acquire glycerol, sugars and amino acids from glycoproteins, glyco(amino)lipids, glyco(amino)glycans and nucleoside diphosphate sugars. Lynx gut was highly enriched (28% of the total glycosidases) in genes encoding α-amylase and related enzymes, although it exhibited low rate of enzymatic activity indicative of starch degradation. The preponderance of β-xylosidase activity in protein extracts further suggests lynx gut microbes being most active for the metabolism of β-xylose containing plant N-glycans, although β-xylosidases sequences constituted only 1.5% of total glycosidases. These collective and unique bacterial, genetic and enzymatic activity signatures suggest that the wild lynx gut microbiota not only harbors gene sets underpinning sugar uptake from primary animal tissues (with the monotypic dietary profile of the wild lynx consisting of 80-100% wild rabbits) but also for the hydrolysis of prey-derived plant biomass. Although, the present investigation corresponds to a single sample and some of the statements should be considered qualitative, the data most likely

  10. Combining qualitative and quantitative operational research methods to inform quality improvement in pathways that span multiple settings.

    Science.gov (United States)

    Crowe, Sonya; Brown, Katherine; Tregay, Jenifer; Wray, Jo; Knowles, Rachel; Ridout, Deborah A; Bull, Catherine; Utley, Martin

    2017-08-01

    Improving integration and continuity of care across sectors within resource constraints is a priority in many health systems. Qualitative operational research methods of problem structuring have been used to address quality improvement in services involving multiple sectors but not in combination with quantitative operational research methods that enable targeting of interventions according to patient risk. We aimed to combine these methods to augment and inform an improvement initiative concerning infants with congenital heart disease (CHD) whose complex care pathway spans multiple sectors. Soft systems methodology was used to consider systematically changes to services from the perspectives of community, primary, secondary and tertiary care professionals and a patient group, incorporating relevant evidence. Classification and regression tree (CART) analysis of national audit datasets was conducted along with data visualisation designed to inform service improvement within the context of limited resources. A 'Rich Picture' was developed capturing the main features of services for infants with CHD pertinent to service improvement. This was used, along with a graphical summary of the CART analysis, to guide discussions about targeting interventions at specific patient risk groups. Agreement was reached across representatives of relevant health professions and patients on a coherent set of targeted recommendations for quality improvement. These fed into national decisions about service provision and commissioning. When tackling complex problems in service provision across multiple settings, it is important to acknowledge and work with multiple perspectives systematically and to consider targeting service improvements in response to confined resources. Our research demonstrates that applying a combination of qualitative and quantitative operational research methods is one approach to doing so that warrants further consideration. 
    Published by the BMJ Publishing Group

  11. Hands-on Activities for Exploring the Solar System in K-14 Formal and Informal Education Settings

    Science.gov (United States)

    Allen, J. S.; Tobola, K. W.

    2004-12-01

    Introduction: Activities developed by NASA scientists and teachers focus on integrating Planetary Science activities with existing Earth science, math, and language arts curriculum. Educators may choose activities that fit a particular concept or theme within their curriculum from activities that highlight missions and research pertaining to exploring the solar system. Most of the activities use simple, inexpensive techniques that help students understand the how and why of what scientists are learning about comets, asteroids, meteorites, moons and planets. The web sites for the activities contain current information so students experience recent mission information such as data from Mars rovers or the status of Stardust sample return. The Johnson Space Center Astromaterials Research and Exploration Science education team has compiled a variety of NASA solar system activities to produce an annotated thematic syllabus useful to classroom educators and informal educators as they teach space science. An important aspect of the syllabus is that it highlights appropriate science content information and key science and math concepts so educators can easily identify activities that will enhance curriculum development. The outline contains URLs for the activities and NASA educator guides as well as links to NASA mission science and technology. In the informal setting, educators can use solar system exploration activities to reinforce learning in association with thematic displays, planetarium programs, youth group gatherings, or community events. In both the informal and the primary education levels the activities are appropriately designed to excite interest, arouse curiosity and easily take the participants from pre-awareness to the awareness stage. Middle school educators will find activities that enhance thematic science and encourage students to think about the scientific process of investigation. Some of the activities offered may easily be adapted for the upper

  12. SARANEA: a freely available program to mine structure-activity and structure-selectivity relationship information in compound data sets.

    Science.gov (United States)

    Lounkine, Eugen; Wawer, Mathias; Wassermann, Anne Mai; Bajorath, Jürgen

    2010-01-01

    We introduce SARANEA, an open-source Java application for interactive exploration of structure-activity relationship (SAR) and structure-selectivity relationship (SSR) information in compound sets of any source. SARANEA integrates various SAR and SSR analysis functions and utilizes a network-like similarity graph data structure for visualization. The program enables the systematic detection of activity and selectivity cliffs and corresponding key compounds across multiple targets. Advanced SAR analysis functions implemented in SARANEA include, among others, layered chemical neighborhood graphs, cliff indices, selectivity trees, editing functions for molecular networks and pathways, bioactivity summaries of key compounds, and markers for bioactive compounds having potential side effects. We report the application of SARANEA to identify SAR and SSR determinants in different sets of serine protease inhibitors. It is found that key compounds can influence SARs and SSRs in rather different ways. Such compounds and their SAR/SSR characteristics can be systematically identified and explored using SARANEA. The program and source code are made freely available under the GNU General Public License.

  13. Incorporating 16S Gene Copy Number Information Improves Estimates of Microbial Diversity and Abundance

    Science.gov (United States)

    Kembel, Steven W.; Wu, Martin; Eisen, Jonathan A.; Green, Jessica L.

    2012-01-01

    The abundance of different SSU rRNA (“16S”) gene sequences in environmental samples is widely used in studies of microbial ecology as a measure of microbial community structure and diversity. However, the genomic copy number of the 16S gene varies greatly – from one in many species to up to 15 in some bacteria and to hundreds in some microbial eukaryotes. As a result of this variation the relative abundance of 16S genes in environmental samples can be attributed both to variation in the relative abundance of different organisms, and to variation in genomic 16S copy number among those organisms. Despite this fact, many studies assume that the abundance of 16S gene sequences is a surrogate measure of the relative abundance of the organisms containing those sequences. Here we present a method that uses data on sequences and genomic copy number of 16S genes along with phylogenetic placement and ancestral state estimation to estimate organismal abundances from environmental DNA sequence data. We use theory and simulations to demonstrate that 16S genomic copy number can be accurately estimated from the short reads typically obtained from high-throughput environmental sequencing of the 16S gene, and that organismal abundances in microbial communities are more strongly correlated with estimated abundances obtained from our method than with gene abundances. We re-analyze several published empirical data sets and demonstrate that the use of gene abundance versus estimated organismal abundance can lead to different inferences about community diversity and structure and the identity of the dominant taxa in microbial communities. Our approach will allow microbial ecologists to make more accurate inferences about microbial diversity and abundance based on 16S sequence data. PMID:23133348
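
    The core arithmetic behind the correction can be illustrated with a small sketch. The method in the abstract estimates copy numbers through phylogenetic placement and ancestral state estimation; here the copy numbers are simply assumed to be known, and the taxon names and counts are made up for illustration.

```python
def relative_abundance(counts):
    """Normalise raw counts so they sum to 1."""
    total = sum(counts.values())
    return {taxon: value / total for taxon, value in counts.items()}


def correct_for_copy_number(read_counts, copy_numbers):
    """Estimate organismal abundance by dividing 16S reads by genomic copy number."""
    estimated_cells = {taxon: read_counts[taxon] / copy_numbers[taxon]
                       for taxon in read_counts}
    return relative_abundance(estimated_cells)


# Hypothetical example: three taxa with different 16S copy numbers.
read_counts = {"taxon_a": 600, "taxon_b": 300, "taxon_c": 100}
copy_numbers = {"taxon_a": 6, "taxon_b": 3, "taxon_c": 1}

gene_based = relative_abundance(read_counts)
organism_based = correct_for_copy_number(read_counts, copy_numbers)
print(gene_based)
print(organism_based)
```

    In this toy case the gene-based view makes taxon_a look dominant, while the copy-number-corrected estimate gives all three taxa the same share, which is the kind of change in inference the abstract describes.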

  14. A set of vectors with a tetracycline-regulatable promoter system for modulated gene expression in Saccharomyces cerevisiae.

    Science.gov (United States)

    Garí, E; Piedrafita, L; Aldea, M; Herrero, E

    1997-07-01

    A set of Saccharomyces cerevisiae expression vectors has been developed in which transcription is driven by a hybrid tetO-CYC1 promoter through the action of a tetR-VP16 (tTA) activator. Expression from the promoter is regulated by tetracycline or derivatives. Various modalities of promoter and activator are used in order to achieve different levels of maximal expression. In the presence of antibiotic in the growth medium at concentrations that do not affect cell growth, expression from the tetO promoter is negligible, and upon antibiotic removal induction ratios of up to 1000-fold are observed with a lacZ reporter system. With the strongest system, overexpression levels comparable with those observed with GAL1-driven promoters are reached. For each particular promoter/tTA combination, expression can be modulated by changing the tetracycline concentration in the growth medium. These vectors may be useful for the study of the function of essential genes in yeast, as well as for phenotypic analysis of genes in overexpression conditions, without restrictions imposed by growth medium composition.

  15. Does information form matter when giving tailored risk information to patients in clinical settings? A review of patients’ preferences and responses

    Directory of Open Access Journals (Sweden)

    Harris R

    2017-03-01

    Rebecca Harris, Claire Noble, Victoria Lowers. Institute of Psychology, Health and Society, University of Liverpool, Liverpool, UK. Abstract: Neoliberal emphasis on “responsibility” has colonized many aspects of public life, including how health care is provided. Clinical risk assessment of patients based on a range of data concerned with lifestyle, behavior, and health status has assumed a growing importance in many health systems. It is a mechanism whereby responsibility for self (preventive) care can be shifted to patients, provided that risk assessment data is communicated to patients in a way which is engaging and motivates change. This study aimed to look at whether the form in which tailored risk information was presented in a clinical setting (for example, using photographs, online data, diagrams etc.) was associated with differences in patients’ responses and preferences to the material presented. We undertook a systematic review using electronic searching of nine databases, along with handsearching specialist journals and backward and forward citation searching. We identified eleven studies (eight with a randomized controlled trial design). Seven studies involved the use of computerized health risk assessments in primary care. Beneficial effects were relatively modest, even in studies merely aiming to enhance patient–clinician communication or to modify patients’ risk perceptions. In our paper, we discuss the apparent importance of the accompanying discourse between patient and clinician, which appears to be necessary in order to impart meaning to information on “risk,” irrespective of whether the material is personalized, or even presented in a vivid way. Thus, while expanding computer technologies might be able to generate a highly personalized account of patients’ risk in a time efficient way, the need for face-to-face interactions to impart meaning to the data means that these new technologies cannot fully address the

  16. The politics of agenda setting at the global level: key informant interviews regarding the International Labour Organization Decent Work Agenda.

    Science.gov (United States)

    Di Ruggiero, Erica; Cohen, Joanna E; Cole, Donald C

    2014-07-01

    Global labour markets continue to undergo significant transformations resulting from socio-political instability combined with rises in structural inequality, employment insecurity, and poor working conditions. Confronted by these challenges, global institutions are providing policy guidance to protect and promote the health and well-being of workers. This article provides an account of how the International Labour Organization's Decent Work Agenda contributes to the work policy agendas of the World Health Organization and the World Bank. This qualitative study involved semi-structured interviews with representatives from three global institutions--the International Labour Organization (ILO), the World Health Organization and the World Bank. Of the 25 key informants invited to participate, 16 took part in the study. Analysis for key themes was followed by interpretation using selected agenda setting theories. Interviews indicated that through the Decent Work Agenda, the International Labour Organization is shaping the global policy narrative about work among UN agencies, and that the pursuit of decent work and the Agenda were perceived as important goals with the potential to promote just policies. The Agenda was closely linked to the World Health Organization's conception of health as a human right. However, decent work was consistently identified by World Bank informants as ILO terminology in contrast to terms such as job creation and job access. The limited evidence base and its conceptual nature were offered as partial explanations for why the Agenda has yet to fully influence other global institutions. Catalytic events such as the economic crisis were identified as creating the enabling conditions to influence global work policy agendas. Our evidence aids our understanding of how an issue like decent work enters and stays on the policy agendas of global institutions, using the Decent Work Agenda as an illustrative example. Catalytic events and policy

  17. Level of data quality from Health Management Information Systems in a resources limited setting and its associated factors, eastern Ethiopia

    Directory of Open Access Journals (Sweden)

    Kidist Teklegiorgis

    2016-04-01

    Background: A Health Information System (HIS) is a system that integrates data collection, processing, reporting, and use of the information necessary for improving health service effectiveness and efficiency through better management at all levels of health services. Despite the credible use of HIS for evidence-based decision-making, the countries with the highest burden of ill health and the greatest need for accurate and timely data, including the vast majority of the world’s poorest countries, have the weakest HIS. Although a Health Management Information System (HMIS) forms a backbone for strong health systems, most developing countries still face a challenge in strengthening routine HIS. The main focus of this study was to assess the current HIS performance and identify factors affecting data quality in a resource-limited setting, such as Ethiopian health facilities. Methods: A cross-sectional study was conducted by using structured questionnaires in Dire Dawa Administration health facilities. All unit and/or department heads from all government health facilities were selected. The data was analysed using STATA version 11. Frequency and percentages were computed to present the descriptive findings. Association between variables was computed using binary logistic regression. Results: Overall data quality was found to be 75.3% in units and/or departments. Having staff trained to fill in formats, decisions based on supervisor directives, and department heads seeking feedback were significantly associated with data quality, with magnitudes of (AOR = 2.253, 95% CI [1.082, 4.692]), (AOR = 2.131, 95% CI [1.073, 4.233]) and (AOR = 2.481, 95% CI [1.262, 4.876]), respectively. Conclusion: Overall data quality was found to be below the national expectation level. Low data quality was found at health posts compared to health centres and hospitals. There was also a shortage of assigned HIS personnel, separate HIS offices, and assigned budgets for HIS across all units and/or departments.
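
    The three adjusted odds ratios quoted in this abstract can be sanity-checked against their confidence intervals. The sketch below assumes the intervals are standard Wald intervals from logistic regression, that is, symmetric around the coefficient on the log-odds scale with z = 1.96; the abstract does not state this, so it is an assumption. Under it, the odds ratio should equal the geometric mean of the interval bounds. The dictionary keys are paraphrased labels.

```python
import math

# (reported AOR, lower 95% bound, upper 95% bound), copied from the abstract.
reported = {
    "staff trained to fill formats": (2.253, 1.082, 4.692),
    "decisions based on supervisor directives": (2.131, 1.073, 4.233),
    "department heads seek feedback": (2.481, 1.262, 4.876),
}


def implied_estimate(lower, upper, z=1.96):
    """Recover the odds ratio and the log-scale standard error from a Wald interval."""
    log_or = (math.log(lower) + math.log(upper)) / 2.0
    std_err = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return math.exp(log_or), std_err


implied = {}
for label, (aor, lower, upper) in reported.items():
    odds_ratio, std_err = implied_estimate(lower, upper)
    implied[label] = odds_ratio
    print(f"{label}: reported {aor:.3f}, implied {odds_ratio:.3f}, SE(log OR) {std_err:.3f}")
```

    All three lower bounds lie above 1, which is what "significantly associated" means at the 5% level under the same Wald assumption.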

  18. Missing value imputation for microarray gene expression data using histone acetylation information

    Directory of Open Access Journals (Sweden)

    Feng Jihua

    2008-05-01

    Abstract. Background: Accurately estimating missing values in microarray data is an important pre-processing step, because complete datasets are required in numerous expression profile analyses in bioinformatics. Although several methods have been suggested, their performance is not satisfactory for datasets with high missing percentages. Results: The paper explores the feasibility of doing missing value imputation with the help of gene regulatory mechanisms. An imputation framework called the histone acetylation information aided imputation method (HAIimpute method) is presented. It incorporates histone acetylation information into the conventional KNN (k-nearest neighbor) and LLS (local least square) imputation algorithms for final prediction of the missing values. The experimental results indicated that the use of acetylation information can provide significant improvements in microarray imputation accuracy. The HAIimpute methods consistently improve on widely used methods such as KNN and LLS in terms of normalized root mean squared error (NRMSE). Meanwhile, the genes imputed by HAIimpute methods are more correlated with the original complete genes in terms of Pearson correlation coefficients. Furthermore, the proposed methods also outperform GOimpute, which is one of the existing related methods that use functional similarity as the external information. Conclusion: We demonstrated that the use of histone acetylation information can greatly improve the performance of imputation, especially at high missing percentages. This idea can be generalized to various imputation methods to improve their performance. Moreover, as more knowledge accumulates on gene regulatory mechanisms in addition to histone acetylation, the performance of our approach can be further improved and verified.
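
    For orientation, here is a minimal sketch of the plain KNN baseline and of an NRMSE score. It does not reproduce HAIimpute: no histone acetylation information is used. The tiny expression matrix is invented, the neighbours are averaged without weights, and NRMSE is taken here as the root of the mean squared error divided by the variance of the true values, which is one common convention and an assumption about the paper's exact definition.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries with the mean of the k nearest rows (genes) that have a value."""
    completed = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_index, other in enumerate(matrix):
                if other_index == i or other[j] is None:
                    continue
                # Compare the two genes only on columns observed in both.
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                distance = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((distance, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            nearest = candidates[:k]
            if nearest:
                completed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return completed


def nrmse(imputed_values, true_values):
    """sqrt(mean squared error / variance of the true values)."""
    n = len(true_values)
    mse = sum((p - t) ** 2 for p, t in zip(imputed_values, true_values)) / n
    mean_true = sum(true_values) / n
    variance = sum((t - mean_true) ** 2 for t in true_values) / n
    return math.sqrt(mse / variance)


# Invented data: rows are genes, columns are arrays. Two entries are hidden.
truth = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 3.1, 4.1],
    [0.9, 1.9, 2.9, 3.9],
    [4.0, 3.0, 2.0, 1.0],
]
hidden = [(0, 3), (1, 0)]
observed = [row[:] for row in truth]
for i, j in hidden:
    observed[i][j] = None

completed = knn_impute(observed, k=2)
score = nrmse([completed[i][j] for i, j in hidden], [truth[i][j] for i, j in hidden])
print(completed[0][3], completed[1][0], score)
```

    A method that adds external information, as the abstract describes, would change how neighbours are chosen or how their values are combined; the scoring step would stay the same.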

  19. Comparative genomic analysis of Brucella abortus vaccine strain 104M reveals a set of candidate genes associated with its virulence attenuation.

    Science.gov (United States)

    Yu, Dong; Hui, Yiming; Zai, Xiaodong; Xu, Junjie; Liang, Long; Wang, Bingxiang; Yue, Junjie; Li, Shanhu

    2015-01-01

    The Brucella abortus strain 104M, a spontaneously attenuated strain, has been used as a vaccine strain in humans against brucellosis for 6 decades in China. Despite many studies, the molecular mechanisms that cause the attenuation are still unclear. Here, we determined the whole-genome sequence of 104M and conducted a comprehensive comparative analysis against the whole genome sequences of the virulent strain, A13334, and other reference strains. This analysis revealed a highly similar genome structure between 104M and A13334. The further comparative genomic analysis between 104M and A13334 revealed a set of genes missing in 104M. Some of these genes were identified to be directly or indirectly associated with virulence. Similarly, a set of mutations in the virulence-related genes was also identified, which may be related to virulence alteration. This study provides a set of candidate genes associated with virulence attenuation in B. abortus vaccine strain 104M.

  20. Transcriptional regulatory network refinement and quantification through kinetic modeling, gene expression microarray data and information theory

    Science.gov (United States)

    Sayyed-Ahmad, Abdallah; Tuncay, Kagan; Ortoleva, Peter J

    2007-01-01

    Background: Gene expression microarray and other multiplex data hold promise for addressing the challenges of cellular complexity, refined diagnoses and the discovery of well-targeted treatments. A new approach to the construction and quantification of transcriptional regulatory networks (TRNs) is presented that integrates gene expression microarray data and cell modeling through information theory. Given a partial TRN and time series data, a probability density is constructed that is a functional of the time course of transcription factor (TF) thermodynamic activities at the site of gene control, and is a function of mRNA degradation and transcription rate coefficients, and equilibrium constants for TF/gene binding. Results: Our approach yields more physicochemical information that complements the results of network structure delineation methods, and thereby can serve as an element of a comprehensive TRN discovery/quantification system. The most probable TF time courses and values of the aforementioned parameters are obtained by maximizing the probability obtained through entropy maximization. Observed time delays between mRNA expression and activity are accounted for implicitly since the time course of the activity of a TF is coupled by probability functional maximization, and is not assumed to be proportional to expression level of the mRNA type that translates into the TF. This allows one to investigate post-translational and TF activation mechanisms of gene regulation. Accuracy and robustness of the method are evaluated. A kinetic formulation is used to facilitate the analysis of phenomena with a strongly dynamical character while a physically-motivated regularization of the TF time course is found to overcome difficulties due to omnipresent noise and data sparsity that plague other methods of gene expression data analysis. An application to Escherichia coli is presented. Conclusion: Multiplex time series data can be used for the construction of the network of

  1. Transcriptional regulatory network refinement and quantification through kinetic modeling, gene expression microarray data and information theory

    Directory of Open Access Journals (Sweden)

    Tuncay Kagan

    2007-01-01

    Abstract. Background: Gene expression microarray and other multiplex data hold promise for addressing the challenges of cellular complexity, refined diagnoses and the discovery of well-targeted treatments. A new approach to the construction and quantification of transcriptional regulatory networks (TRNs) is presented that integrates gene expression microarray data and cell modeling through information theory. Given a partial TRN and time series data, a probability density is constructed that is a functional of the time course of transcription factor (TF) thermodynamic activities at the site of gene control, and is a function of mRNA degradation and transcription rate coefficients, and equilibrium constants for TF/gene binding. Results: Our approach yields more physicochemical information that complements the results of network structure delineation methods, and thereby can serve as an element of a comprehensive TRN discovery/quantification system. The most probable TF time courses and values of the aforementioned parameters are obtained by maximizing the probability obtained through entropy maximization. Observed time delays between mRNA expression and activity are accounted for implicitly since the time course of the activity of a TF is coupled by probability functional maximization, and is not assumed to be proportional to expression level of the mRNA type that translates into the TF. This allows one to investigate post-translational and TF activation mechanisms of gene regulation. Accuracy and robustness of the method are evaluated. A kinetic formulation is used to facilitate the analysis of phenomena with a strongly dynamical character while a physically-motivated regularization of the TF time course is found to overcome difficulties due to omnipresent noise and data sparsity that plague other methods of gene expression data analysis. An application to Escherichia coli is presented. Conclusion: Multiplex time series data can be used for the

  2. Adaptive modelling of gene regulatory network using Bayesian information criterion-guided sparse regression approach.

    Science.gov (United States)

    Shi, Ming; Shen, Weiming; Wang, Hong-Qiang; Chong, Yanwen

    2016-12-01

    Inferring gene regulatory networks (GRNs) from microarray expression data is an important but challenging issue in systems biology. In this study, the authors propose a Bayesian information criterion (BIC)-guided sparse regression approach for GRN reconstruction. This approach can adaptively model GRNs by optimising the l1-norm regularisation of sparse regression based on a modified version of BIC. The regularisation strategy keeps the inferred GRNs as sparse as natural networks, while the modified BIC allows prior knowledge on expression regulation to be incorporated and thus avoids the usual overestimation of the number of expression regulators. In particular, the proposed method provides a clear interpretation of combinatorial regulation of gene expression by optimally extracting regulation coordination for a given target gene. Experimental results on both simulation data and real-world microarray data demonstrate the competent performance of the method in discovering regulatory relationships in GRN reconstruction.
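
    The general idea of letting BIC choose the l1 penalty can be sketched as follows. This is not the authors' method: the abstract mentions a modified BIC that incorporates prior knowledge, whereas the sketch uses the standard Gaussian BIC, n*ln(RSS/n) + k*ln(n). The synthetic data, the penalty grid and all function names are assumptions for illustration, and the regression has no intercept because the simulated data are centred around zero.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso(columns, y, lam, sweeps=200):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + lam*||w||_1, no intercept."""
    n = len(y)
    weights = [0.0] * len(columns)
    residual = list(y)
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            scale = sum(v * v for v in col) / n
            rho = sum(v * (r + weights[j] * v) for v, r in zip(col, residual)) / n
            updated = soft_threshold(rho, lam) / scale
            delta = updated - weights[j]
            if delta != 0.0:
                residual = [r - delta * v for r, v in zip(residual, col)]
                weights[j] = updated
    return weights


def bic(columns, y, weights):
    """Standard BIC for a Gaussian model, counting non-zero weights as parameters."""
    n = len(y)
    fitted = [sum(w * col[i] for w, col in zip(weights, columns)) for i in range(n)]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    k = sum(1 for w in weights if w != 0.0)
    return n * math.log(rss / n) + k * math.log(n)


# Synthetic data: five candidate regulators, of which only 0 and 2 drive the target gene.
rng = random.Random(0)
n_samples, n_regulators = 40, 5
columns = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_regulators)]
target = [2.0 * columns[0][i] - 1.5 * columns[2][i] + rng.gauss(0.0, 0.3)
          for i in range(n_samples)]

lambda_grid = [0.01, 0.05, 0.1, 0.3, 0.6, 1.0]
scores = {lam: bic(columns, target, lasso(columns, target, lam)) for lam in lambda_grid}
best_lambda = min(scores, key=scores.get)
best_weights = lasso(columns, target, best_lambda)
selected = [j for j, w in enumerate(best_weights) if w != 0.0]
print(best_lambda, selected)
```

    Repeating this fit with each gene in turn as the target, and reading the non-zero weights as candidate regulators, gives a network; a modified criterion would replace the `bic` function.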

  3. Integration of Known Transcription Factor Binding Site Information and Gene Expression Data to Advance from Co-Expression to Co-Regulation