WorldWideScience

Sample records for large information set

  1. Reducing Information Overload in Large Seismic Data Sets

    Energy Technology Data Exchange (ETDEWEB)

    HAMPTON,JEFFERY W.; YOUNG,CHRISTOPHER J.; MERCHANT,BION J.; CARR,DORTHE B.; AGUILAR-CHANG,JULIO

    2000-08-02

    Event catalogs for seismic data can become very large. Furthermore, as researchers collect multiple catalogs and reconcile them into a single catalog that is stored in a relational database, the reconciled set becomes even larger. The sheer number of these events makes searching for relevant events to compare with events of interest problematic. Information overload in this form can lead to the data sets being under-utilized and/or used incorrectly or inconsistently. Thus, efforts have been initiated to research techniques and strategies for helping researchers to make better use of large data sets. In this paper, the authors present their efforts to do so in two ways: (1) the Event Search Engine, which is a waveform correlation tool, and (2) some content analysis tools, which are a combination of custom-built and commercial off-the-shelf tools for accessing, managing, and querying seismic data stored in a relational database. The current Event Search Engine is based on a hierarchical clustering tool known as the dendrogram tool, which is written as a MatSeis graphical user interface. The dendrogram tool allows the user to build dendrogram diagrams for a set of waveforms by controlling phase windowing, down-sampling, filtering, enveloping, and the clustering method (e.g., single linkage, complete linkage, flexible method). It also allows the clustering to be based on two or more stations simultaneously, which is important to bridge gaps in the sparsely recorded event sets anticipated in such a large reconciled event set. Current efforts are focusing on tools to help the researcher winnow the clusters defined using the dendrogram tool down to the minimum optimal identification set. This will become critical as the number of reference events in the reconciled event set continually grows. The dendrogram tool is part of the MatSeis analysis package, which is available on the Nuclear Explosion Monitoring Research and Engineering Program Web Site. As part of the research
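
    As a rough illustration of the waveform-correlation clustering that the dendrogram tool performs, the sketch below builds a hierarchical clustering from pairwise waveform cross-correlations. It is a minimal example using NumPy/SciPy, not the MatSeis implementation; the synthetic traces, the "1 minus peak correlation" dissimilarity, and the linkage choice are placeholder assumptions.

```python
import numpy as np
from scipy.signal import correlate
from scipy.cluster.hierarchy import linkage, dendrogram

def peak_normalized_xcorr(a, b):
    """Peak magnitude of the normalized cross-correlation of two traces."""
    a = (a - a.mean()) / (a.std() * len(a))
    b = (b - b.mean()) / b.std()
    return np.abs(correlate(a, b, mode="full")).max()

def waveform_dendrogram(traces, method="complete"):
    """Hierarchical clustering of waveforms using 1 - peak correlation."""
    n = len(traces)
    cond = []  # condensed distance vector expected by scipy's linkage
    for i in range(n):
        for j in range(i + 1, n):
            cond.append(1.0 - peak_normalized_xcorr(traces[i], traces[j]))
    return linkage(np.array(cond), method=method)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = np.sin(np.linspace(0, 20 * np.pi, 500))
    traces = [base + 0.1 * rng.standard_normal(500) for _ in range(4)]
    traces += [rng.standard_normal(500) for _ in range(2)]  # unrelated "events"
    Z = waveform_dendrogram(traces)
    print(dendrogram(Z, no_plot=True)["ivl"])  # leaf order of the six events
```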

  2. Querying Large Physics Data Sets Over an Information Grid

    CERN Document Server

    Baker, N; Kovács, Z; Le Goff, J M; McClatchey, R

    2001-01-01

    Optimising use of the Web (WWW) for LHC data analysis is a complex problem and illustrates the challenges arising from the integration of and computation across massive amounts of information distributed worldwide. Finding the right piece of information can, at times, be extremely time-consuming, if not impossible. So-called Grids have been proposed to facilitate LHC computing and many groups have embarked on studies of data replication, data migration and networking philosophies. Other aspects such as the role of 'middleware' for Grids are emerging as requiring research. This paper positions the need for appropriate middleware that enables users to resolve physics queries across massive data sets. It identifies the role of meta-data for query resolution and the importance of Information Grids for high-energy physics analysis rather than just Computational or Data Grids. This paper identifies software that is being implemented at CERN to enable the querying of very large collaborating HEP data-sets, initially...

  3. SAR matrices: automated extraction of information-rich SAR tables from large compound data sets.

    Science.gov (United States)

    Wassermann, Anne Mai; Haebel, Peter; Weskamp, Nils; Bajorath, Jürgen

    2012-07-23

    We introduce the SAR matrix data structure that is designed to elucidate SAR patterns produced by groups of structurally related active compounds, which are extracted from large data sets. SAR matrices are systematically generated and sorted on the basis of SAR information content. Matrix generation is computationally efficient and enables processing of large compound sets. The matrix format is reminiscent of SAR tables, and SAR patterns revealed by different categories of matrices are easily interpretable. The structural organization underlying matrix formation is more flexible than standard R-group decomposition schemes. Hence, the resulting matrices capture SAR information in a comprehensive manner.

  4. Looking at large data sets using binned data plots

    Energy Technology Data Exchange (ETDEWEB)

    Carr, D.B.

    1990-04-01

    This report addresses the monumental challenge of developing exploratory analysis methods for large data sets. The goals of the report are to increase awareness of large data set problems and to contribute simple graphical methods that address some of the problems. The graphical methods focus on two- and three-dimensional data and common tasks such as finding outliers and tail structure, assessing central structure and comparing central structures. The methods handle large sample size problems through binning, incorporate information from statistical models and adapt image processing algorithms. Examples demonstrate the application of methods to a variety of publicly available large data sets. The most novel application addresses the "too many plots to examine" problem by using cognostics, computer guiding diagnostics, to prioritize plots. The particular application prioritizes views of computational fluid dynamics solution sets on the fly. That is, as each time step of a solution set is generated on a parallel processor the cognostics algorithms assess virtual plots based on the previous time step. Work in such areas is in its infancy and the examples suggest numerous challenges that remain. 35 refs., 15 figs.
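
    The binning idea can be illustrated with a generic hexagonal-binning plot: cell counts replace individual points, so both the dense core and the sparse tail of a large sample remain visible. This is a minimal sketch using Matplotlib's hexbin rather than the report's own methods; the simulated data, grid size, and output file name are arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

# Simulate a "large" bivariate sample: a dense correlated core plus a heavy tail.
rng = np.random.default_rng(1)
core = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=200_000)
tail = rng.standard_t(df=2, size=(2_000, 2)) * 3
xy = np.vstack([core, tail])

# Hexagonal binning: each cell's colour encodes the log count, so central
# structure and tail structure can be assessed in one view.
fig, ax = plt.subplots(figsize=(5, 4))
hb = ax.hexbin(xy[:, 0], xy[:, 1], gridsize=60, bins="log", mincnt=1)
fig.colorbar(hb, ax=ax, label="log10(count)")
ax.set_xlabel("x")
ax.set_ylabel("y")
fig.savefig("binned_plot.png", dpi=150)
```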

  5. Discovering highly informative feature set over high dimensions

    KAUST Repository

    Zhang, Chongsheng; Masseglia, Florent; Zhang, Xiangliang

    2012-01-01

    For many textual collections, the number of features is often overly large. These features can be very redundant; it is therefore desirable to have a small, succinct, yet highly informative collection of features that describes the key characteristics of a dataset. Information theory is one tool for obtaining such a feature collection. In this paper, we mainly contribute to improving the efficiency of selecting the most informative feature set over high-dimensional unlabeled data. We propose a heuristic theory for informative feature set selection from high-dimensional data. Moreover, we design data structures that enable us to compute the entropies of the candidate feature sets efficiently. We also develop a simple pruning strategy that eliminates hopeless candidates at each forward selection step. We test our method through experiments on real-world data sets, showing that our proposal is very efficient. © 2012 IEEE.
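
    A toy sketch of the greedy, entropy-driven forward selection described above: grow the feature set by the candidate that adds the most joint entropy, and stop when the gain falls below a threshold. The paper's data structures and pruning strategy are more elaborate; the threshold, the feature limit, and the synthetic data are placeholders.

```python
import numpy as np
from collections import Counter

def joint_entropy(X, cols):
    """Empirical joint entropy (in bits) of the selected discrete columns."""
    counts = np.array(list(Counter(map(tuple, X[:, cols])).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def forward_select(X, max_features=5, min_gain=0.05):
    """Greedy forward selection of an informative feature set."""
    selected, remaining, current = [], list(range(X.shape[1])), 0.0
    while remaining and len(selected) < max_features:
        gains = {f: joint_entropy(X, selected + [f]) - current for f in remaining}
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:       # prune: no candidate adds enough information
            break
        selected.append(best)
        remaining.remove(best)
        current += gains[best]
    return selected

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    informative = rng.integers(0, 4, size=(1000, 2))
    redundant = informative[:, [0]]              # a copy of feature 0 adds no entropy
    noise = rng.integers(0, 2, size=(1000, 1))
    X = np.hstack([informative, redundant, noise])
    print(forward_select(X))                     # typically selects features 0, 1 and 3
```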

  6. Discovering highly informative feature set over high dimensions

    KAUST Repository

    Zhang, Chongsheng

    2012-11-01

    For many textual collections, the number of features is often overly large. These features can be very redundant; it is therefore desirable to have a small, succinct, yet highly informative collection of features that describes the key characteristics of a dataset. Information theory is one tool for obtaining such a feature collection. In this paper, we mainly contribute to improving the efficiency of selecting the most informative feature set over high-dimensional unlabeled data. We propose a heuristic theory for informative feature set selection from high-dimensional data. Moreover, we design data structures that enable us to compute the entropies of the candidate feature sets efficiently. We also develop a simple pruning strategy that eliminates hopeless candidates at each forward selection step. We test our method through experiments on real-world data sets, showing that our proposal is very efficient. © 2012 IEEE.

  7. Large Data Set Mining

    NARCIS (Netherlands)

    Leemans, I.B.; Broomhall, Susan

    2017-01-01

    Digital emotion research has yet to make history. Until now large data set mining has not been a very active field of research in early modern emotion studies. This is indeed surprising since first, the early modern field has such rich, copyright-free, digitized data sets and second, emotion studies

  8. Visualization of diversity in large multivariate data sets.

    Science.gov (United States)

    Pham, Tuan; Hess, Rob; Ju, Crystal; Zhang, Eugene; Metoyer, Ronald

    2010-01-01

    Understanding the diversity of a set of multivariate objects is an important problem in many domains, including ecology, college admissions, investing, machine learning, and others. However, to date, very little work has been done to help users achieve this kind of understanding. Visual representation is especially appealing for this task because it offers the potential to allow users to efficiently observe the objects of interest in a direct and holistic way. Thus, in this paper, we attempt to formalize the problem of visualizing the diversity of a large (more than 1000 objects), multivariate (more than 5 attributes) data set as one worth deeper investigation by the information visualization community. In doing so, we contribute a precise definition of diversity, a set of requirements for diversity visualizations based on this definition, and a formal user study design intended to evaluate the capacity of a visual representation for communicating diversity information. Our primary contribution, however, is a visual representation, called the Diversity Map, for visualizing diversity. An evaluation of the Diversity Map using our study design shows that users can judge elements of diversity consistently and as or more accurately than when using the only other representation specifically designed to visualize diversity.

  9. Multidimensional scaling for large genomic data sets

    Directory of Open Access Journals (Sweden)

    Lu Henry

    2008-04-01

    Background: Multi-dimensional scaling (MDS) aims to represent high-dimensional data in a low-dimensional space while preserving the similarities between data points. This reduction in dimensionality is crucial for analyzing and revealing the genuine structure hidden in the data. For noisy data, dimension reduction can effectively reduce the effect of noise on the embedded structure. For large data sets, dimension reduction can effectively reduce information retrieval complexity. Thus, MDS techniques are used in many applications of data mining and gene network research. However, although there have been a number of studies that applied MDS techniques to genomics research, the number of analyzed data points was restricted by the high computational complexity of MDS. In general, a non-metric MDS method is faster than a metric MDS, but it does not preserve the true relationships. The computational complexity of most metric MDS methods is over O(N^2), so it is difficult to process a data set with a large number of genes N, such as whole-genome microarray data. Results: We developed a new rapid metric MDS method with low computational complexity, making metric MDS applicable to large data sets. Computer simulation showed that the new method of split-and-combine MDS (SC-MDS) is fast, accurate and efficient. Our empirical studies using microarray data on the yeast cell cycle showed that the performance of K-means in the reduced dimensional space is similar to or slightly better than that of K-means in the original space, but the clustering results are obtained about three times faster. Our clustering results using SC-MDS are more stable than those in the original space. Hence, the proposed SC-MDS is useful for analyzing whole-genome data. Conclusion: Our new method reduces the computational complexity from O(N^3) to O(N) when the dimension of the feature space is far less than the number of genes N, and it successfully
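
    The split-and-combine idea can be sketched as follows: embed overlapping blocks of points independently and stitch each block onto the previous one through the shared points (a Procrustes rotation plus a translation). This is a simplified illustration built on scikit-learn's generic metric MDS, not the authors' SC-MDS implementation; block size, overlap, and the alignment details are assumptions.

```python
import numpy as np
from sklearn.manifold import MDS
from scipy.linalg import orthogonal_procrustes

def sc_mds(X, n_components=2, block=300, overlap=50, seed=0):
    """Embed overlapping blocks with metric MDS and align them via the overlap."""
    n = len(X)
    step = block - overlap
    coords = np.zeros((n, n_components))
    prev_end, start = None, 0
    while start < n:
        idx = np.arange(start, min(start + block, n))
        emb = MDS(n_components=n_components, random_state=seed).fit_transform(X[idx])
        if prev_end is not None:
            m = prev_end - start + 1                     # points shared with previous block
            shared_prev, shared_new = coords[start:prev_end + 1], emb[:m]
            mu_p, mu_n = shared_prev.mean(0), shared_new.mean(0)
            # Best rotation mapping the new overlap onto the already-aligned one.
            R, _ = orthogonal_procrustes(shared_new - mu_n, shared_prev - mu_p)
            emb = (emb - mu_n) @ R + mu_p
        coords[idx] = emb
        prev_end = idx[-1]
        start += step
    return coords

if __name__ == "__main__":
    X = np.random.default_rng(0).standard_normal((1200, 20))
    print(sc_mds(X).shape)   # (1200, 2)
```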

  10. Shortest triplet clustering: reconstructing large phylogenies using representative sets

    Directory of Open Access Journals (Sweden)

    Sy Vinh Le

    2005-04-01

    Background: Understanding the evolutionary relationships among species based on their genetic information is one of the primary objectives in phylogenetic analysis. Reconstructing phylogenies for large data sets is still a challenging task in bioinformatics. Results: We propose a new distance-based clustering method, the shortest triplet clustering algorithm (STC), to reconstruct phylogenies. The main idea is the introduction of a natural definition of so-called k-representative sets. Based on k-representative sets, shortest triplets are reconstructed and serve as building blocks for the STC algorithm to agglomerate sequences for tree reconstruction in O(n^2) time for n sequences. Simulations show that STC gives better topological accuracy than other tested methods that also build a first starting tree. STC appears to be a very good method to start the tree reconstruction. However, all tested methods give similar results if balanced nearest neighbor interchange (BNNI) is applied as a post-processing step. BNNI leads to an improvement in all instances. The program is available at http://www.bi.uni-duesseldorf.de/software/stc/. Conclusion: The results demonstrate that the new approach efficiently reconstructs phylogenies for large data sets. We found that BNNI boosts the topological accuracy of all methods including STC; therefore, one should use BNNI as a post-processing step to get better topological accuracy.

  11. Metastrategies in large-scale bargaining settings

    NARCIS (Netherlands)

    Hennes, D.; Jong, S. de; Tuyls, K.; Gal, Y.

    2015-01-01

    This article presents novel methods for representing and analyzing a special class of multiagent bargaining settings that feature multiple players, large action spaces, and a relationship among players' goals, tasks, and resources. We show how to reduce these interactions to a set of bilateral

  12. The development of the Older Persons and Informal Caregivers Survey Minimum DataSet (TOPICS-MDS): a large-scale data sharing initiative.

    Science.gov (United States)

    Lutomski, Jennifer E; Baars, Maria A E; Schalk, Bianca W M; Boter, Han; Buurman, Bianca M; den Elzen, Wendy P J; Jansen, Aaltje P D; Kempen, Gertrudis I J M; Steunenberg, Bas; Steyerberg, Ewout W; Olde Rikkert, Marcel G M; Melis, René J F

    2013-01-01

    In 2008, the Ministry of Health, Welfare and Sport commissioned the National Care for the Elderly Programme. While numerous research projects in older persons' health care were to be conducted under this national agenda, the Programme further advocated the development of The Older Persons and Informal Caregivers Survey Minimum DataSet (TOPICS-MDS), which would be integrated into all funded research protocols. In this context, we describe the TOPICS data sharing initiative (www.topics-mds.eu). A working group drafted the TOPICS-MDS prototype, which was subsequently approved by a multidisciplinary panel. Using instruments validated for older populations, information was collected on demographics, morbidity, quality of life, functional limitations, mental health, social functioning and health service utilisation. For informal caregivers, information was collected on demographics, hours of informal care and quality of life (including subjective care-related burden). Between 2010 and 2013, a total of 41 research projects contributed data to TOPICS-MDS, resulting in preliminary data available for 32,310 older persons and 3,940 informal caregivers. The majority of studies sampled were from primary care settings and inclusion criteria differed across studies. TOPICS-MDS is a public data repository which contains essential data to better understand health challenges experienced by older persons and informal caregivers. Such findings are relevant for countries where increasing health-related expenditure has necessitated the evaluation of contemporary health care delivery. Although open sharing of data can be difficult to achieve in practice, proactively addressing issues of data protection, conflicting data analysis requests and funding limitations during the TOPICS-MDS developmental phase has fostered a data sharing culture. To date, TOPICS-MDS has been successfully incorporated into 41 research projects, thus supporting the feasibility of constructing a large (>30,000 observations

  13. Using Content-Specific Lyrics to Familiar Tunes in a Large Lecture Setting

    Science.gov (United States)

    McLachlin, Derek T.

    2009-01-01

    Music can be used in lectures to increase student engagement and help students retain information. In this paper, I describe my use of biochemistry-related lyrics written to the tune of the theme to the television show, The Flintstones, in a large class setting (400-800 students). To determine student perceptions, the class was surveyed several…

  14. Using SETS to find minimal cut sets in large fault trees

    International Nuclear Information System (INIS)

    Worrell, R.B.; Stack, D.W.

    1978-01-01

    An efficient algebraic algorithm for finding the minimal cut sets for a large fault tree was defined and a new procedure which implements the algorithm was added to the Set Equation Transformation System (SETS). The algorithm includes the identification and separate processing of independent subtrees, the coalescing of consecutive gates of the same kind, the creation of additional independent subtrees, and the derivation of the fault tree stem equation in stages. The computer time required to determine the minimal cut sets using these techniques is shown to be substantially less than the computer time required to determine the minimal cut sets when these techniques are not employed. It is shown for a given example that the execution time required to determine the minimal cut sets can be reduced from 7,686 seconds to 7 seconds when all of these techniques are employed
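
    The core of minimal cut set generation can be sketched as a top-down expansion of the fault tree with absorption (dropping supersets) applied at each step. This toy version omits the techniques the abstract highlights, such as independent subtrees, gate coalescing and staged derivation, and is not the SETS algebra; the example tree is invented.

```python
from itertools import product

# A fault tree as a dict: gate name -> ("AND" | "OR", [children]);
# anything not listed as a gate is a basic event.
def cut_sets(node, gates):
    if node not in gates:
        return [frozenset([node])]
    kind, children = gates[node]
    child_sets = [cut_sets(c, gates) for c in children]
    if kind == "OR":
        combined = [cs for sets in child_sets for cs in sets]
    else:  # AND: union one cut set from each child, in every combination
        combined = [frozenset().union(*combo) for combo in product(*child_sets)]
    return minimize(combined)

def minimize(sets):
    """Absorption: drop any cut set that contains another cut set."""
    minimal = []
    for s in sorted(set(sets), key=len):
        if not any(m <= s for m in minimal):
            minimal.append(s)
    return minimal

if __name__ == "__main__":
    gates = {
        "TOP": ("OR", ["G1", "G2"]),
        "G1": ("AND", ["A", "B"]),
        "G2": ("AND", ["A", "G3"]),
        "G3": ("OR", ["B", "C"]),
    }
    for cs in cut_sets("TOP", gates):
        print(sorted(cs))   # -> ['A', 'B'] and ['A', 'C']
```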

  15. The utility of imputed matched sets. Analyzing probabilistically linked databases in a low information setting.

    Science.gov (United States)

    Thomas, A M; Cook, L J; Dean, J M; Olson, L M

    2014-01-01

    To compare results from high probability matched sets versus imputed matched sets across differing levels of linkage information. A series of linkages with varying amounts of available information were performed on two simulated datasets derived from multiyear motor vehicle crash (MVC) and hospital databases, where true matches were known. Distributions of high probability and imputed matched sets were compared against the true match population for occupant age, MVC county, and MVC hour. Regression models were fit to simulated log hospital charges and hospitalization status. High probability and imputed matched sets were not significantly different from occupant age, MVC county, and MVC hour in high information settings (p > 0.999). In low information settings, high probability matched sets were significantly different from occupant age and MVC county, whereas imputed matched sets were not (p > 0.493). High information settings saw no significant differences in inference of simulated log hospital charges and hospitalization status between the two methods. High probability and imputed matched sets were significantly different from the outcomes in low information settings; however, imputed matched sets were more robust. The level of information available to a linkage is an important consideration. High probability matched sets are suitable for high to moderate information settings and for situations involving case-specific analysis. Conversely, imputed matched sets are preferable for low information settings when conducting population-based analyses.

  16. Study on default setting for risk-informed regulation

    International Nuclear Information System (INIS)

    Jang, S.C.; Ha, J.J.; Jung, W.D.; Jeong, K.S.; Han, S.H.

    1998-12-01

    Both performing and validating a detailed risk analysis of a complex system are costly and time-consuming undertakings. With the increased use of probabilistic safety analysis (PSA) in regulatory decision making, both regulated parties and regulators have generally favored the use of defaults, because they can greatly facilitate the process of performing a PSA in the first place as well as the process of reviewing and verifying the PSA. The use of defaults may also ensure more uniform standards of PSA quality. However, regulatory agencies differ in their approaches to the use of default values, and the implications of these differences are not yet well understood. Moreover, large heterogeneity among licensees makes it difficult to set suitable defaults. This study focuses on the development of a model for setting defaults in order to make risk-informed regulation more applicable. In particular, we explore the effects of different levels of conservatism in setting defaults and their implications for the crafting of regulatory incentives. (author). 17 refs., 1 tab

  17. Zebrafish Expression Ontology of Gene Sets (ZEOGS): A Tool to Analyze Enrichment of Zebrafish Anatomical Terms in Large Gene Sets

    Science.gov (United States)

    Marsico, Annalisa

    2013-01-01

    The zebrafish (Danio rerio) is an established model organism for developmental and biomedical research. It is frequently used for high-throughput functional genomics experiments, such as genome-wide gene expression measurements, to systematically analyze molecular mechanisms. However, the use of whole embryos or larvae in such experiments leads to a loss of the spatial information. To address this problem, we have developed a tool called Zebrafish Expression Ontology of Gene Sets (ZEOGS) to assess the enrichment of anatomical terms in large gene sets. ZEOGS uses gene expression pattern data from several sources: first, in situ hybridization experiments from the Zebrafish Model Organism Database (ZFIN); second, it uses the Zebrafish Anatomical Ontology, a controlled vocabulary that describes connected anatomical structures; and third, the available connections between expression patterns and anatomical terms contained in ZFIN. Upon input of a gene set, ZEOGS determines which anatomical structures are overrepresented in the input gene set. ZEOGS allows one for the first time to look at groups of genes and to describe them in terms of shared anatomical structures. To establish ZEOGS, we first tested it on random gene selections and on two public microarray datasets with known tissue-specific gene expression changes. These tests showed that ZEOGS could reliably identify the tissues affected, whereas few or no enriched terms were found in the random gene sets. Next we applied ZEOGS to microarray datasets of 24 and 72 h postfertilization zebrafish embryos treated with beclomethasone, a potent glucocorticoid. This analysis resulted in the identification of several anatomical terms related to glucocorticoid-responsive tissues, some of which were stage-specific. Our studies highlight the ability of ZEOGS to extract spatial information from datasets derived from whole embryos, indicating that ZEOGS could be a useful tool to automatically analyze gene

  18. Zebrafish Expression Ontology of Gene Sets (ZEOGS): a tool to analyze enrichment of zebrafish anatomical terms in large gene sets.

    Science.gov (United States)

    Prykhozhij, Sergey V; Marsico, Annalisa; Meijsing, Sebastiaan H

    2013-09-01

    The zebrafish (Danio rerio) is an established model organism for developmental and biomedical research. It is frequently used for high-throughput functional genomics experiments, such as genome-wide gene expression measurements, to systematically analyze molecular mechanisms. However, the use of whole embryos or larvae in such experiments leads to a loss of the spatial information. To address this problem, we have developed a tool called Zebrafish Expression Ontology of Gene Sets (ZEOGS) to assess the enrichment of anatomical terms in large gene sets. ZEOGS uses gene expression pattern data from several sources: first, in situ hybridization experiments from the Zebrafish Model Organism Database (ZFIN); second, it uses the Zebrafish Anatomical Ontology, a controlled vocabulary that describes connected anatomical structures; and third, the available connections between expression patterns and anatomical terms contained in ZFIN. Upon input of a gene set, ZEOGS determines which anatomical structures are overrepresented in the input gene set. ZEOGS allows one for the first time to look at groups of genes and to describe them in terms of shared anatomical structures. To establish ZEOGS, we first tested it on random gene selections and on two public microarray datasets with known tissue-specific gene expression changes. These tests showed that ZEOGS could reliably identify the tissues affected, whereas few or no enriched terms were found in the random gene sets. Next we applied ZEOGS to microarray datasets of 24 and 72 h postfertilization zebrafish embryos treated with beclomethasone, a potent glucocorticoid. This analysis resulted in the identification of several anatomical terms related to glucocorticoid-responsive tissues, some of which were stage-specific. Our studies highlight the ability of ZEOGS to extract spatial information from datasets derived from whole embryos, indicating that ZEOGS could be a useful tool to automatically analyze gene expression
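
    The enrichment step such a tool performs can be sketched as a one-sided hypergeometric test per anatomical term: how surprising is the overlap between the input gene set and the genes annotated to that term? The statistic, the gene identifiers, and the term-to-gene mapping below are illustrative assumptions, not ZEOGS's actual implementation.

```python
from scipy.stats import hypergeom

def term_enrichment(input_genes, term_to_genes, background):
    """Hypergeometric enrichment of anatomical terms in a gene set."""
    background = set(background)
    hits = set(input_genes) & background
    N, n = len(background), len(hits)
    results = []
    for term, genes in term_to_genes.items():
        annotated = set(genes) & background
        K, k = len(annotated), len(annotated & hits)
        p = hypergeom.sf(k - 1, N, K, n)   # P(overlap >= k) by chance
        results.append((term, k, K, p))
    return sorted(results, key=lambda r: r[-1])

if __name__ == "__main__":
    background = [f"g{i}" for i in range(1000)]
    term_to_genes = {
        "heart": [f"g{i}" for i in range(0, 50)],
        "liver": [f"g{i}" for i in range(50, 120)],
    }
    input_genes = [f"g{i}" for i in range(0, 30)] + ["g500", "g501"]
    for term, k, K, p in term_enrichment(input_genes, term_to_genes, background):
        print(f"{term}: {k}/{K} annotated genes in the input set, p = {p:.2e}")
```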

  19. Influences of large sets of environmental exposures on immune responses in healthy adult men.

    Science.gov (United States)

    Yi, Buqing; Rykova, Marina; Jäger, Gundula; Feuerecker, Matthias; Hörl, Marion; Matzel, Sandra; Ponomarev, Sergey; Vassilieva, Galina; Nichiporuk, Igor; Choukèr, Alexander

    2015-08-26

    Environmental factors have long been known to influence immune responses. In particular, clinical studies about the association between migration and increased risk of atopy/asthma have provided important information on the role of migration-associated large sets of environmental exposures in the development of allergic diseases. However, investigations of environmental effects on immune responses are mostly limited to candidate environmental exposures, such as air pollution. The influences of large sets of environmental exposures on immune responses are still largely unknown. A simulated 520-d Mars mission provided an opportunity to investigate this topic. Six healthy males lived in a closed habitat simulating a spacecraft for 520 days. When they exited their "spacecraft" after the mission, the scenario was similar to that of migration, involving exposure to a new set of environmental pollutants and allergens. We measured multiple immune parameters with blood samples at chosen time points after the mission. At the early adaptation stage, highly enhanced cytokine responses were observed upon ex vivo antigen stimulations. For cell population frequencies, we found the subjects displayed increased neutrophils. These results presumably represent the immune changes that occur in healthy humans when migrating, indicating that large sets of environmental exposures may trigger aberrant immune activity.

  20. Analysis Methods for Extracting Knowledge from Large-Scale WiFi Monitoring to Inform Building Facility Planning

    DEFF Research Database (Denmark)

    Ruiz-Ruiz, Antonio; Blunck, Henrik; Prentow, Thor Siiger

    2014-01-01

    The optimization of logistics in large building complexes with many resources, such as hospitals, requires realistic facility management and planning. Current planning practices rely foremost on manual observations or coarse unverified assumptions and therefore do not properly scale or provide realistic data to inform facility planning. In this paper, we propose analysis methods to extract knowledge from large sets of network-collected WiFi traces to better inform facility management and planning in large building complexes. The analysis methods, which build on a rich set of temporal and spatial... Spatio-temporal visualization tools built on top of these methods enable planners to inspect and explore extracted information to inform facility-planning activities. To evaluate the methods, we present results for a large hospital complex covering more than 10 hectares. The evaluation is based on Wi...

  1. Information overload or search-amplified risk? Set size and order effects on decisions from experience.

    Science.gov (United States)

    Hills, Thomas T; Noguchi, Takao; Gibbert, Michael

    2013-10-01

    How do changes in choice-set size influence information search and subsequent decisions? Moreover, does information overload influence information processing with larger choice sets? We investigated these questions by letting people freely explore sets of gambles before choosing one of them, with the choice sets either increasing or decreasing in number for each participant (from two to 32 gambles). Set size influenced information search, with participants taking more samples overall, but sampling a smaller proportion of gambles and taking fewer samples per gamble, when set sizes were larger. The order of choice sets also influenced search, with participants sampling from more gambles and taking more samples overall if they started with smaller as opposed to larger choice sets. Inconsistent with information overload, information processing appeared consistent across set sizes and choice order conditions, reliably favoring gambles with higher sample means. Despite the lack of evidence for information overload, changes in information search did lead to systematic changes in choice: People who started with smaller choice sets were more likely to choose gambles with the highest expected values, but only for small set sizes. For large set sizes, the increase in total samples increased the likelihood of encountering rare events at the same time that the reduction in samples per gamble amplified the effect of these rare events when they occurred, which we call search-amplified risk. This led to riskier choices for individuals whose choices most closely followed the sample mean.

  2. Core information sets for informed consent to surgical interventions: baseline information of importance to patients and clinicians.

    Science.gov (United States)

    Main, Barry G; McNair, Angus G K; Huxtable, Richard; Donovan, Jenny L; Thomas, Steven J; Kinnersley, Paul; Blazeby, Jane M

    2017-04-26

    Consent remains a crucial, yet challenging, cornerstone of clinical practice. The ethical, legal and professional understandings of this construct have evolved away from a doctor-centred act to a patient-centred process that encompasses the patient's values, beliefs and goals. This alignment of consent with the philosophy of shared decision-making was affirmed in a recent high-profile Supreme Court ruling in England. The communication of information is central to this model of health care delivery but it can be difficult for doctors to gauge the information needs of the individual patient. The aim of this paper is to describe 'core information sets', which are defined as a minimum set of consensus-derived information about a given procedure to be discussed with all patients. Importantly, they are intended to catalyse discussion of subjective importance to individuals. The model described in this paper applies health services research and Delphi consensus-building methods to an idea originally proposed 30 years ago. The hypothesis is that, first, large amounts of potentially important information are distilled down to discrete information domains. These are then, second, rated by key stakeholders in multiple iterations, so that core information of agreed importance can be defined. We argue that this scientific approach is key to identifying information important to all stakeholders, which may otherwise be communicated poorly or omitted from discussions entirely. Our methods apply systematic review, qualitative, survey and consensus-building techniques to define this 'core information'. We propose that such information addresses the 'reasonable patient' standard for information disclosure but, more importantly, can serve as a springboard for high-value discussion of importance to the individual patient. The application of established research methods can define information of core importance to informed consent. Further work will establish how best to incorporate

  3. Spatial occupancy models for large data sets

    Science.gov (United States)

    Johnson, Devin S.; Conn, Paul B.; Hooten, Mevin B.; Ray, Justina C.; Pond, Bruce A.

    2013-01-01

    Since its development, occupancy modeling has become a popular and useful tool for ecologists wishing to learn about the dynamics of species occurrence over time and space. Such models require presence–absence data to be collected at spatially indexed survey units. However, only recently have researchers recognized the need to correct for spatially induced overdispersion by explicitly accounting for spatial autocorrelation in occupancy probability. Previous efforts to incorporate such autocorrelation have largely focused on logit-normal formulations for occupancy, with spatial autocorrelation induced by a random effect within a hierarchical modeling framework. Although useful, computational time generally limits such an approach to relatively small data sets, and there are often problems with algorithm instability, yielding unsatisfactory results. Further, recent research has revealed a hidden form of multicollinearity in such applications, which may lead to parameter bias if not explicitly addressed. Combining several techniques, we present a unifying hierarchical spatial occupancy model specification that is particularly effective over large spatial extents. This approach employs a probit mixture framework for occupancy and can easily accommodate a reduced-dimensional spatial process to resolve issues with multicollinearity and spatial confounding while improving algorithm convergence. Using open-source software, we demonstrate this new model specification using a case study involving occupancy of caribou (Rangifer tarandus) over a set of 1080 survey units spanning a large contiguous region (108 000 km2) in northern Ontario, Canada. Overall, the combination of a more efficient specification and open-source software allows for a facile and stable implementation of spatial occupancy models for large data sets.
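
    For orientation, the building block that the spatial specification extends is the single-season occupancy likelihood with occupancy probability psi and detection probability p. The sketch below fits only that non-spatial baseline by maximum likelihood; the probit mixture, the spatial random effects and the reduced-dimensional spatial process of the paper are not shown, and the simulated survey sizes are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def neg_log_lik(theta, y):
    """Single-season occupancy model: psi = occupancy prob., p = detection prob."""
    psi, p = expit(theta)                  # optimise on the logit scale
    J = y.shape[1]
    d = y.sum(axis=1)
    site_lik = psi * p**d * (1 - p)**(J - d)
    site_lik = np.where(d > 0, site_lik, site_lik + (1 - psi))   # never-detected sites
    return -np.log(site_lik).sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_sites, n_visits, true_psi, true_p = 1080, 4, 0.4, 0.3
    z = rng.binomial(1, true_psi, n_sites)                         # latent occupancy
    y = rng.binomial(1, true_p, (n_sites, n_visits)) * z[:, None]  # detection histories
    fit = minimize(neg_log_lik, x0=np.zeros(2), args=(y,), method="Nelder-Mead")
    print("psi_hat, p_hat =", np.round(expit(fit.x), 3))
```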

  4. Information Measures of Roughness of Knowledge and Rough Sets for Incomplete Information Systems

    Institute of Scientific and Technical Information of China (English)

    LIANG Ji-ye; QU Kai-she

    2001-01-01

    In this paper we address information measures of the roughness of knowledge and of rough sets for incomplete information systems. The definition of the rough entropy of knowledge and its important properties are given. In particular, the relationship between the rough entropy of knowledge and the Hartley measure of uncertainty is established. We show that the rough entropy of knowledge decreases monotonously as the granularity of information becomes smaller. This gives an information interpretation for the roughness of knowledge. Based on the rough entropy of knowledge and the roughness of a rough set, a definition of the rough entropy of a rough set is proposed, and we show that the rough entropy of a rough set decreases monotonously as the granularity of information becomes smaller. This gives a more accurate measure for the roughness of a rough set.
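
    A small numeric illustration, assuming the rough entropy of knowledge takes the form E(R) = sum_i (|X_i|/|U|) * log2|X_i| over the blocks X_i of the partition induced by R: the coarsest partition recovers the Hartley measure log2|U|, and the value falls to zero as the granularity becomes smaller, matching the monotonicity described above.

```python
import numpy as np

def rough_entropy(partition, universe_size):
    """E(R) = sum over blocks of (|X_i| / |U|) * log2(|X_i|)."""
    sizes = np.array([len(block) for block in partition], dtype=float)
    assert sizes.sum() == universe_size, "partition must cover the universe"
    return float(np.sum(sizes / universe_size * np.log2(sizes)))

if __name__ == "__main__":
    U = list(range(8))
    print(rough_entropy([U], 8))                  # 3.0 = log2(8), the Hartley measure
    print(rough_entropy([U[:4], U[4:]], 8))       # 2.0, finer partition
    print(rough_entropy([[x] for x in U], 8))     # 0.0, finest partition
```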

  5. Parallel clustering algorithm for large-scale biological data sets.

    Science.gov (United States)

    Wang, Minchao; Zhang, Wu; Ding, Wang; Dai, Dongbo; Zhang, Huiran; Xie, Hao; Chen, Luonan; Guo, Yike; Xie, Jiang

    2014-01-01

    The recent explosion of biological data brings a great challenge for traditional clustering algorithms. With the increasing scale of data sets, much larger memory and longer runtime are required for cluster identification problems. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied in biological research. However, its time and space complexity become a great bottleneck when handling large-scale data sets. Moreover, the similarity matrix, whose construction takes a long runtime, is required before running the affinity propagation algorithm, since the algorithm clusters data sets based on the similarities between data pairs. Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix construction and the affinity propagation algorithm. The shared-memory architecture is used to construct the similarity matrix, and the distributed system is used for the affinity propagation algorithm because of its large memory size and great computing capacity. An appropriate scheme of data partition and reduction is designed in our method in order to minimize the global communication cost among processes. A speedup of 100 is gained with 128 cores. The runtime is reduced from several hours to a few seconds, which indicates that the parallel algorithm is capable of handling large-scale data sets effectively. The parallel affinity propagation also achieves good performance when clustering large-scale gene data (microarray) and detecting families in large protein superfamilies.
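
    The overall flow (similarity matrix construction followed by affinity propagation) can be sketched with the single-process scikit-learn implementation; the paper's contribution is parallelising both steps across shared-memory and distributed architectures, which is not shown here. The data, damping factor and similarity metric are placeholders.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics import pairwise_distances

# Step 1: build the similarity matrix (negative squared Euclidean distance).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc, 0.5, size=(200, 10)) for loc in (-3, 0, 3)])
S = -pairwise_distances(X, metric="sqeuclidean")

# Step 2: run affinity propagation on the precomputed similarities.
ap = AffinityPropagation(affinity="precomputed", damping=0.7, random_state=0)
labels = ap.fit_predict(S)
print("clusters found:", len(ap.cluster_centers_indices_))
```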

  6. Irreducible descriptive sets of attributes for information systems

    KAUST Repository

    Moshkov, Mikhail

    2010-01-01

    The maximal consistent extension Ext(S) of a given information system S consists of all objects corresponding to attribute values from S which are consistent with all true and realizable rules extracted from the original information system S. An irreducible descriptive set for the considered information system S is a minimal (relative to the inclusion) set B of attributes which defines exactly the set Ext(S) by means of true and realizable rules constructed over attributes from the considered set B. We show that there exists only one irreducible descriptive set of attributes. We present a polynomial algorithm for this set construction. We also study relationships between the cardinality of irreducible descriptive set of attributes and the number of attributes in S. The obtained results will be useful for the design of concurrent data models from experimental data. © 2010 Springer-Verlag.

  7. Large Sets in Boolean and Non-Boolean Groups and Topology

    Directory of Open Access Journals (Sweden)

    Ol’ga V. Sipacheva

    2017-10-01

    Various notions of large sets in groups, including the classical notions of thick, syndetic, and piecewise syndetic sets and the new notion of vast sets in groups, are studied with emphasis on the interplay between such sets in Boolean groups. Natural topologies closely related to vast sets are considered; as a byproduct, interesting relations between vast sets and ultrafilters are revealed.

  8. Caught you: threats to confidentiality due to the public release of large-scale genetic data sets.

    Science.gov (United States)

    Wjst, Matthias

    2010-12-29

    Large-scale genetic data sets are frequently shared with other research groups and even released on the Internet to allow for secondary analysis. Study participants are usually not informed about such data sharing because data sets are assumed to be anonymous after stripping off personal identifiers. The assumption of anonymity of genetic data sets, however, is tenuous because genetic data are intrinsically self-identifying. Two types of re-identification are possible: the "Netflix" type and the "profiling" type. The "Netflix" type needs another small genetic data set, usually with less than 100 SNPs but including a personal identifier. This second data set might originate from another clinical examination, a study of leftover samples or forensic testing. When merged to the primary, unidentified set it will re-identify all samples of that individual. Even with no second data set at hand, a "profiling" strategy can be developed to extract as much information as possible from a sample collection. Starting with the identification of ethnic subgroups along with predictions of body characteristics and diseases, the asthma kids case as a real-life example is used to illustrate that approach. Depending on the degree of supplemental information, there is a good chance that at least a few individuals can be identified from an anonymized data set. Any re-identification, however, may potentially harm study participants because it will release individual genetic disease risks to the public.
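
    A toy illustration of the "Netflix"-type linkage described above: a small identified SNP panel is matched against an anonymized genotype matrix by counting concordant genotypes. All names, sizes and the concordance threshold are invented for the example.

```python
import numpy as np

def reidentify(anon_genotypes, probe_indices, probe_genotypes, min_match=0.95):
    """Indices of anonymized rows that agree with the probe at (almost) all SNPs."""
    concordance = (anon_genotypes[:, probe_indices] == probe_genotypes).mean(axis=1)
    return np.flatnonzero(concordance >= min_match)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    anon = rng.integers(0, 3, size=(5000, 500))          # 5000 subjects x 500 SNPs (0/1/2)
    # A second, identified data set: ~60 SNPs taken from subject 1234.
    probe_indices = rng.choice(500, size=60, replace=False)
    probe_genotypes = anon[1234, probe_indices]
    print(reidentify(anon, probe_indices, probe_genotypes))   # -> [1234]
```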

  9. Caught you: threats to confidentiality due to the public release of large-scale genetic data sets

    Directory of Open Access Journals (Sweden)

    Wjst Matthias

    2010-12-01

    Background: Large-scale genetic data sets are frequently shared with other research groups and even released on the Internet to allow for secondary analysis. Study participants are usually not informed about such data sharing because data sets are assumed to be anonymous after stripping off personal identifiers. Discussion: The assumption of anonymity of genetic data sets, however, is tenuous because genetic data are intrinsically self-identifying. Two types of re-identification are possible: the "Netflix" type and the "profiling" type. The "Netflix" type needs another small genetic data set, usually with less than 100 SNPs but including a personal identifier. This second data set might originate from another clinical examination, a study of leftover samples or forensic testing. When merged to the primary, unidentified set it will re-identify all samples of that individual. Even with no second data set at hand, a "profiling" strategy can be developed to extract as much information as possible from a sample collection. Starting with the identification of ethnic subgroups along with predictions of body characteristics and diseases, the asthma kids case as a real-life example is used to illustrate that approach. Summary: Depending on the degree of supplemental information, there is a good chance that at least a few individuals can be identified from an anonymized data set. Any re-identification, however, may potentially harm study participants because it will release individual genetic disease risks to the public.

  10. Operational Aspects of Dealing with the Large BaBar Data Set

    Energy Technology Data Exchange (ETDEWEB)

    Trunov, Artem G

    2003-06-13

    To date, the BaBar experiment has stored over 0.7PB of data in an Objectivity/DB database. Approximately half this data-set comprises simulated data of which more than 70% has been produced at more than 20 collaborating institutes outside of SLAC. The operational aspects of managing such a large data set and providing access to the physicists in a timely manner is a challenging and complex problem. We describe the operational aspects of managing such a large distributed data-set as well as importing and exporting data from geographically spread BaBar collaborators. We also describe problems common to dealing with such large datasets.

  11. Irreducible descriptive sets of attributes for information systems

    KAUST Repository

    Moshkov, Mikhail; Skowron, Andrzej; Suraj, Zbigniew

    2010-01-01

    An irreducible descriptive set for the considered information system S is a minimal (relative to the inclusion) set B of attributes which defines exactly the set Ext(S) by means of true and realizable rules constructed over attributes from the considered set B.

  12. Informed consent comprehension in African research settings.

    Science.gov (United States)

    Afolabi, Muhammed O; Okebe, Joseph U; McGrath, Nuala; Larson, Heidi J; Bojang, Kalifa; Chandramohan, Daniel

    2014-06-01

    Previous reviews on participants' comprehension of informed consent information have focused on developed countries. Experience has shown that ethical standards developed on Western values may not be appropriate for African settings where research concepts are unfamiliar. We undertook this review to describe how informed consent comprehension is defined and measured in African research settings. We conducted a comprehensive search involving five electronic databases: Medline, Embase, Global Health, EthxWeb and Bioethics Literature Database (BELIT). We also examined African Index Medicus and Google Scholar for relevant publications on informed consent comprehension in clinical studies conducted in sub-Saharan Africa. 29 studies satisfied the inclusion criteria; meta-analysis was possible in 21 studies. We further conducted a direct comparison of participants' comprehension on domains of informed consent in all eligible studies. Comprehension of key concepts of informed consent varies considerably from country to country and depends on the nature and complexity of the study. Meta-analysis showed that 47% of a total of 1633 participants across four studies demonstrated comprehension about randomisation (95% CI 13.9-80.9%). Similarly, 48% of 3946 participants in six studies had understanding about placebo (95% CI 19.0-77.5%), while only 30% of 753 participants in five studies understood the concept of therapeutic misconception (95% CI 4.6-66.7%). Measurement tools for informed consent comprehension were developed with little or no validation. Assessment of comprehension was carried out at variable times after disclosure of study information. No uniform definition of informed consent comprehension exists to form the basis for development of an appropriate tool to measure comprehension in African participants. Comprehension of key concepts of informed consent is poor among study participants across Africa. There is a vital need to develop a uniform definition for

  13. Impact of problem-based learning in a large classroom setting: student perception and problem-solving skills.

    Science.gov (United States)

    Klegeris, Andis; Hurren, Heather

    2011-12-01

    Problem-based learning (PBL) can be described as a learning environment where the problem drives the learning. This technique usually involves learning in small groups, which are supervised by tutors. It is becoming evident that PBL in a small-group setting has a robust positive effect on student learning and skills, including better problem-solving skills and an increase in overall motivation. However, very little research has been done on the educational benefits of PBL in a large classroom setting. Here, we describe a PBL approach (using tutorless groups) that was introduced as a supplement to standard didactic lectures in University of British Columbia Okanagan undergraduate biochemistry classes consisting of 45-85 students. PBL was chosen as an effective method to assist students in learning biochemical and physiological processes. By monitoring student attendance and using informal and formal surveys, we demonstrated that PBL has a significant positive impact on student motivation to attend and participate in the course work. Student responses indicated that PBL is superior to traditional lecture format with regard to the understanding of course content and retention of information. We also demonstrated that student problem-solving skills are significantly improved, but additional controlled studies are needed to determine how much PBL exercises contribute to this improvement. These preliminary data indicated several positive outcomes of using PBL in a large classroom setting, although further studies aimed at assessing student learning are needed to further justify implementation of this technique in courses delivered to large undergraduate classes.

  14. MiniWall Tool for Analyzing CFD and Wind Tunnel Large Data Sets

    Science.gov (United States)

    Schuh, Michael J.; Melton, John E.; Stremel, Paul M.

    2017-01-01

    It is challenging to review and assimilate large data sets created by Computational Fluid Dynamics (CFD) simulations and wind tunnel tests. Over the past 10 years, NASA Ames Research Center has developed and refined a software tool dubbed the MiniWall to increase productivity in reviewing and understanding large CFD-generated data sets. Under the recent NASA ERA project, the application of the tool expanded to enable rapid comparison of experimental and computational data. The MiniWall software is browser based so that it runs on any computer or device that can display a web page. It can also be used remotely and securely by using web server software such as the Apache HTTP server. The MiniWall software has recently been rewritten and enhanced to make it even easier for analysts to review large data sets and extract knowledge and understanding from these data sets. This paper describes the MiniWall software and demonstrates how the different features are used to review and assimilate large data sets.

  15. Iterative dictionary construction for compression of large DNA data sets.

    Science.gov (United States)

    Kuruppu, Shanika; Beresford-Smith, Bryan; Conway, Thomas; Zobel, Justin

    2012-01-01

    Genomic repositories increasingly include individual as well as reference sequences, which tend to share long identical and near-identical strings of nucleotides. However, the sequential processing used by most compression algorithms, and the volumes of data involved, mean that these long-range repetitions are not detected. An order-insensitive, disk-based dictionary construction method can detect this repeated content and use it to compress collections of sequences. We explore a dictionary construction method that improves repeat identification in large DNA data sets. Our adaptation, COMRAD, of an existing disk-based method identifies exact repeated content in collections of sequences with similarities within and across the set of input sequences. COMRAD compresses the data over multiple passes, which is an expensive process, but allows COMRAD to compress large data sets within reasonable time and space. COMRAD allows for random access to individual sequences and subsequences without decompressing the whole data set. COMRAD has no competitor in terms of the size of data sets that it can compress (extending to many hundreds of gigabytes) and, even for smaller data sets, the results are competitive compared to alternatives; as an example, 39 S. cerevisiae genomes compressed to 0.25 bits per base.
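
    A crude stand-in for the dictionary idea: collect k-mers that repeat across the collection, then encode each sequence as dictionary references plus literals. COMRAD's multi-pass, disk-based construction and its random-access format are far more involved; k, the thresholds and the toy sequences below are assumptions.

```python
from collections import Counter

def build_dictionary(sequences, k=16, min_count=2, max_entries=1000):
    """Collect k-mers that occur at least min_count times across the collection."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    repeats = [kmer for kmer, c in counts.most_common(max_entries) if c >= min_count]
    return {kmer: idx for idx, kmer in enumerate(repeats)}

def encode(seq, dictionary, k=16):
    """Replace dictionary k-mers by ('ref', id) tokens; emit single-base literals otherwise."""
    out, i = [], 0
    while i < len(seq):
        kmer = seq[i:i + k]
        if len(kmer) == k and kmer in dictionary:
            out.append(("ref", dictionary[kmer]))
            i += k
        else:
            out.append(("lit", seq[i]))
            i += 1
    return out

if __name__ == "__main__":
    seqs = ["ACGTACGTACGTACGTTTTT" * 3, "GGGGACGTACGTACGTACGT" * 2]
    d = build_dictionary(seqs)
    tokens = encode(seqs[0], d)
    refs = sum(1 for t in tokens if t[0] == "ref")
    print(f"{refs} dictionary references out of {len(tokens)} tokens")
```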

  16. Data Programming: Creating Large Training Sets, Quickly

    Science.gov (United States)

    Ratner, Alexander; De Sa, Christopher; Wu, Sen; Selsam, Daniel; Ré, Christopher

    2018-01-01

    Large labeled training sets are the critical building blocks of supervised learning methods and are key enablers of deep learning techniques. For some applications, creating labeled training sets is the most time-consuming and expensive part of applying machine learning. We therefore propose a paradigm for the programmatic creation of training sets called data programming in which users express weak supervision strategies or domain heuristics as labeling functions, which are programs that label subsets of the data, but that are noisy and may conflict. We show that by explicitly representing this training set labeling process as a generative model, we can “denoise” the generated training set, and establish theoretically that we can recover the parameters of these generative models in a handful of settings. We then show how to modify a discriminative loss function to make it noise-aware, and demonstrate our method over a range of discriminative models including logistic regression and LSTMs. Experimentally, on the 2014 TAC-KBP Slot Filling challenge, we show that data programming would have led to a new winning score, and also show that applying data programming to an LSTM model leads to a TAC-KBP score almost 6 F1 points over a state-of-the-art LSTM baseline (and into second place in the competition). Additionally, in initial user studies we observed that data programming may be an easier way for non-experts to create machine learning models when training data is limited or unavailable. PMID:29872252
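
    A minimal sketch of labeling functions and their combination into probabilistic training labels. The actual data programming approach learns each function's accuracy and correlations with a generative model; here a fixed accuracy is assumed for every function, which reduces the combination to a weighted vote. The example functions and documents are invented.

```python
import numpy as np

# Noisy labeling functions over toy text snippets:
# each returns +1 (positive), -1 (negative) or 0 (abstain).
def lf_keyword_good(x):  return 1 if "good" in x else 0
def lf_keyword_bad(x):   return -1 if "bad" in x else 0
def lf_exclaim(x):       return 1 if x.endswith("!") else 0

LFS = [lf_keyword_good, lf_keyword_bad, lf_exclaim]

def label_matrix(examples):
    return np.array([[lf(x) for lf in LFS] for x in examples])

def probabilistic_labels(L, lf_accuracy=0.8):
    """Combine votes into P(y = +1), assuming every LF is equally accurate."""
    w = np.log(lf_accuracy / (1 - lf_accuracy))   # log-odds weight per vote
    score = (L * w).sum(axis=1)
    return 1 / (1 + np.exp(-score))

if __name__ == "__main__":
    docs = ["a good movie!", "really bad acting", "bad plot but good music", "meh"]
    for doc, p in zip(docs, probabilistic_labels(label_matrix(docs))):
        print(f"{p:.2f}  {doc}")
```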

  17. Accelerated EM-based clustering of large data sets

    NARCIS (Netherlands)

    Verbeek, J.J.; Nunnink, J.R.J.; Vlassis, N.

    2006-01-01

    Motivated by the poor performance (linear complexity) of the EM algorithm in clustering large data sets, and inspired by the successful accelerated versions of related algorithms like k-means, we derive an accelerated variant of the EM algorithm for Gaussian mixtures that: (1) offers speedups that
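
    For reference, the baseline that such accelerated variants speed up is plain EM for a Gaussian mixture. The sketch below implements standard EM for a spherical mixture, with no acceleration or caching of sufficient statistics; the component count, iteration budget and synthetic data are placeholders.

```python
import numpy as np

def em_gmm(X, k=3, n_iter=50, seed=0):
    """Standard EM for a spherical Gaussian mixture."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)]      # initial means picked from the data
    var = np.full(k, X.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities under spherical Gaussians.
        sq = ((X[:, None, :] - mu[None]) ** 2).sum(-1)                   # (n, k)
        log_r = np.log(pi) - 0.5 * d * np.log(2 * np.pi * var) - sq / (2 * var)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weights, means, then variances around the new means.
        nk = r.sum(axis=0)
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
        var = (r * sq).sum(axis=0) / (d * nk)
    return pi, mu, var

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(c, 0.3, size=(400, 2)) for c in (-2, 0, 2)])
    pi, mu, var = em_gmm(X)
    print(np.round(mu, 2))    # approximately the three cluster centres
```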

  18. Developing the Role of a Health Information Professional in a Clinical Research Setting

    Directory of Open Access Journals (Sweden)

    Helen M. Seeley

    2010-06-01

    Objective - This paper examines the role of a health information professional in a large multidisciplinary project to improve services for head injury. Methods - An action research approach was taken, with the information professional acting as co-ordinator. Change management processes were guided by theory and evidence. The health information professional was responsible for an ongoing literature review on knowledge management (clinical and political issues), data collection and analysis (from patient records), collating and comparing data (to help develop standards), and devising appropriate dissemination strategies. Results - Important elements of the health information management role proved to be (1) co-ordination; (2) setting up mechanisms for collaborative learning through information sharing; and (3) using the theoretical frameworks (identified from the literature review) to help guide implementation. The role that emerged here has some similarities to the informationist role that stresses domain knowledge, continuous learning and working in context (embedding). This project also emphasised the importance of co-ordination, and the ability to work across traditional library information analysis (research literature discovery and appraisal) and information analysis of patient data sets (the information management role). Conclusion - Experience with this project indicates that health information professionals will need to be prepared to work with patient record data and synthesis of that data, design systems to co-ordinate patient data collection, as well as critically appraise external evidence.

  19. Making sense of large data sets without annotations: analyzing age-related correlations from lung CT scans

    Science.gov (United States)

    Dicente Cid, Yashin; Mamonov, Artem; Beers, Andrew; Thomas, Armin; Kovalev, Vassili; Kalpathy-Cramer, Jayashree; Müller, Henning

    2017-03-01

    The analysis of large data sets can help to gain knowledge about specific organs or specific diseases, just as big data analysis does in many non-medical areas. This article aims to gain information from 3D volumes, i.e., the visual content of lung CT scans of a large number of patients. In the case of the described data set, only little annotation is available: the patients were all part of an ongoing screening program, and besides age and gender no information on the patients or the findings was available for this work. This is a scenario that can happen regularly as image data sets are produced and become available in increasingly large quantities, but manual annotations are often not available and clinical data such as text reports are often harder to share. We extracted a set of visual features from 12,414 CT scans of 9,348 patients that had CT scans of the lung taken in the context of a national lung screening program in Belarus. Lung fields were segmented by two segmentation algorithms, and only cases where both algorithms were able to find the left and right lung and had a Dice coefficient above 0.95 were analyzed. This ensures that only segmentations of good quality were used to extract features of the lung. Patients ranged in age from 0 to 106 years. Data analysis shows that age can be predicted with fairly high accuracy for persons under 15 years. Relatively good results were also obtained between 30 and 65 years, where a steady trend is seen. For young adults and older people the results are not as good, as variability is very high in these groups. Several visualizations of the data show the evolution patterns of the lung texture, size and density with age. The experiments allow learning the evolution of the lung, and the results show that even with limited metadata we can extract interesting information from large-scale visual data. These age-related changes (for example of the lung volume, the density histogram of the tissue) can also be

  20. A large scale analysis of information-theoretic network complexity measures using chemical structures.

    Directory of Open Access Journals (Sweden)

    Matthias Dehmer

    Full Text Available This paper aims to investigate information-theoretic network complexity measures which have already been intensely used in mathematical- and medicinal chemistry including drug design. Numerous such measures have been developed so far but many of them lack a meaningful interpretation, e.g., we want to examine which kind of structural information they detect. Therefore, our main contribution is to shed light on the relatedness between some selected information measures for graphs by performing a large scale analysis using chemical networks. Starting from several sets containing real and synthetic chemical structures represented by graphs, we study the relatedness between a classical (partition-based complexity measure called the topological information content of a graph and some others inferred by a different paradigm leading to partition-independent measures. Moreover, we evaluate the uniqueness of network complexity measures numerically. Generally, a high uniqueness is an important and desirable property when designing novel topological descriptors having the potential to be applied to large chemical databases.
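
    The classical partition-based measure mentioned above, the topological information content, is the Shannon entropy of a partition of a graph's vertices. A minimal sketch follows; it groups vertices by degree as a crude stand-in for the automorphism-orbit partition used in the literature, and it assumes the networkx package is available.

```python
import math
from collections import Counter
import networkx as nx

def partition_entropy(class_sizes):
    """Shannon entropy (bits) of a partition of n vertices into classes."""
    n = sum(class_sizes)
    return -sum((s / n) * math.log2(s / n) for s in class_sizes)

# Crude stand-in for the orbit partition: group vertices by degree.
# (The classical topological information content uses automorphism orbits;
# degree classes only approximate them and are used here for brevity.)
G = nx.path_graph(4)
degree_classes = Counter(dict(G.degree()).values())
print(partition_entropy(degree_classes.values()))  # 1.0 bit: two classes of equal size
```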

  1. Information retrieval pathways for health information exchange in multiple care settings

    DEFF Research Database (Denmark)

    Kierkegaard, Patrick; Kaushal, Rainu; Vest, Joshua R.

    2014-01-01

    Objectives To determine which health information exchange (HIE) technologies and information retrieval pathways healthcare professionals relied on to meet their information needs in the context of laboratory test results, radiological images and reports, and medication histories. Study Design...... The study reveals that healthcare professionals used a complex combination of information retrieval pathways for HIE to obtain clinical information from external organizations. The choice for each approach was setting- and information-specific, but was also highly dynamic across users and their information...... needs. Conclusions Our findings about the complex nature of information sharing in healthcare provide insights for informatics professionals about the usage of information; indicate the need for managerial support within each organization; and suggest approaches to improve systems for organizations...

  2. Rough Standard Neutrosophic Sets: An Application on Standard Neutrosophic Information Systems

    Directory of Open Access Journals (Sweden)

    Nguyen Xuan Thao

    2016-12-01

    Full Text Available A rough fuzzy set is the result of the approximation of a fuzzy set with respect to a crisp approximation space. It is a mathematical tool for knowledge discovery in fuzzy information systems. In this paper, we introduce the concepts of rough standard neutrosophic sets and the standard neutrosophic information system, and give some results on knowledge discovery in standard neutrosophic information systems based on rough standard neutrosophic sets.

  3. Entropy Based Feature Selection for Fuzzy Set-Valued Information Systems

    Science.gov (United States)

    Ahmed, Waseem; Sufyan Beg, M. M.; Ahmad, Tanvir

    2018-06-01

    In Set-valued Information Systems (SIS), some objects contain more than one value for certain attributes. The tolerance relation used for handling SIS sometimes leads to loss of information. To surmount this problem, the fuzzy rough model was introduced. However, in some cases, SIS may contain real or continuous set-values. Therefore, the existing fuzzy rough model for handling information systems with fuzzy set-values needs some changes. In this paper, the Fuzzy Set-valued Information System (FSIS) is proposed and a fuzzy similarity relation for FSIS is defined. Yager's relative conditional entropy is used to find the significance measure of a candidate attribute of the FSIS. Using these significance values, three greedy forward algorithms are discussed for finding the reduct and relative reduct of the proposed FSIS. An experiment was conducted on a sample population of a real dataset, and a comparison of classification accuracies of the proposed FSIS with the existing SIS and single-valued Fuzzy Information Systems was made, demonstrating the effectiveness of the proposed FSIS.
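
    The greedy forward selection step described above can be sketched generically: repeatedly add the attribute that most increases a significance measure of the selected subset. The sketch below assumes an abstract significance function (in the paper this is built from Yager's relative conditional entropy over the fuzzy similarity relation); it is not the authors' implementation.

```python
def greedy_forward_reduct(attributes, significance, tol=1e-9):
    """Greedy forward selection: repeatedly add the attribute that most
    increases the significance of the selected subset.

    `significance(subset)` is assumed to return a number that is larger for
    more informative attribute subsets (e.g., an entropy-based measure).
    """
    selected, best = [], significance(frozenset())
    remaining = set(attributes)
    while remaining:
        candidate, cand_score = max(
            ((a, significance(frozenset(selected + [a]))) for a in remaining),
            key=lambda pair: pair[1],
        )
        if cand_score <= best + tol:
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best = cand_score
    return selected
```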

  4. Non-local setting and outcome information for violation of Bell's inequality

    International Nuclear Information System (INIS)

    Pawlowski, Marcin; Kofler, Johannes; Paterek, Tomasz; Brukner, Caslav; Seevinck, Michael

    2010-01-01

    Bell's theorem is a no-go theorem stating that quantum mechanics cannot be reproduced by a physical theory based on realism, freedom to choose experimental settings and two locality conditions: setting (SI) and outcome (OI) independence. We provide a novel analysis of what it takes to violate Bell's inequality within the framework in which both realism and freedom of choice are assumed, by showing that it is impossible to model a violation without having information in one laboratory about both the setting and the outcome at the distant one. While it is possible that outcome information can be revealed from shared hidden variables, the assumed experimenter's freedom to choose the settings ensures that the setting information must be non-locally transferred even when the SI condition is obeyed. The amount of transmitted information about the setting that is sufficient to violate the CHSH inequality up to its quantum mechanical maximum is 0.736 bits.

  5. Management of a Large Qualitative Data Set: Establishing Trustworthiness of the Data

    Directory of Open Access Journals (Sweden)

    Debbie Elizabeth White RN, PhD

    2012-07-01

    Full Text Available Health services research is multifaceted and impacted by the multiple contexts and stakeholders involved. Hence, large data sets are necessary to fully understand the complex phenomena (e.g., scope of nursing practice being studied. The management of these large data sets can lead to numerous challenges in establishing trustworthiness of the study. This article reports on strategies utilized in data collection and analysis of a large qualitative study to establish trustworthiness. Specific strategies undertaken by the research team included training of interviewers and coders, variation in participant recruitment, consistency in data collection, completion of data cleaning, development of a conceptual framework for analysis, consistency in coding through regular communication and meetings between coders and key research team members, use of N6™ software to organize data, and creation of a comprehensive audit trail with internal and external audits. Finally, we make eight recommendations that will help ensure rigour for studies with large qualitative data sets: organization of the study by a single person; thorough documentation of the data collection and analysis process; attention to timelines; the use of an iterative process for data collection and analysis; internal and external audits; regular communication among the research team; adequate resources for timely completion; and time for reflection and diversion. Following these steps will enable researchers to complete a rigorous, qualitative research study when faced with large data sets to answer complex health services research questions.

  6. Information retrieval pathways for health information exchange in multiple care settings.

    Science.gov (United States)

    Kierkegaard, Patrick; Kaushal, Rainu; Vest, Joshua R

    2014-11-01

    To determine which health information exchange (HIE) technologies and information retrieval pathways healthcare professionals relied on to meet their information needs in the context of laboratory test results, radiological images and reports, and medication histories. Primary data was collected over a 2-month period across 3 emergency departments, 7 primary care practices, and 2 public health clinics in New York state. Qualitative research methods were used to collect and analyze data from semi-structured interviews and participant observation. The study reveals that healthcare professionals used a complex combination of information retrieval pathways for HIE to obtain clinical information from external organizations. The choice for each approach was setting- and information-specific, but was also highly dynamic across users and their information needs. Our findings about the complex nature of information sharing in healthcare provide insights for informatics professionals about the usage of information; indicate the need for managerial support within each organization; and suggest approaches to improve systems for organizations and agencies working to expand HIE adoption.

  7. Information sets as permutation cycles for quadratic residue codes

    Directory of Open Access Journals (Sweden)

    Richard A. Jenson

    1982-01-01

    Full Text Available The two cases p=7 and p=23 are the only known cases where the automorphism group of the [p+1, (p+1)/2] extended binary quadratic residue code, Q(p), properly contains PSL(2,p). These codes have some of their information sets represented as permutation cycles from Aut(Q(p)). Analysis proves that all information sets of Q(7) are so represented but those of Q(23) are not.

  8. A database paradigm for the management of DICOM-RT structure sets using a geographic information system

    International Nuclear Information System (INIS)

    Shao, Weber; Kupelian, Patrick A; Wang, Jason; Low, Daniel A; Ruan, Dan

    2014-01-01

    We devise a paradigm for representing the DICOM-RT structure sets in a database management system, in such a way that secondary calculations of geometric information can be performed quickly from the existing contour definitions. The implementation of this paradigm is achieved using the PostgreSQL database system and the PostGIS extension, a geographic information system commonly used for encoding geographical map data. The proposed paradigm eliminates the overhead of retrieving large data records from the database, as well as the need to implement various numerical and data parsing routines, when additional information related to the geometry of the anatomy is desired.
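
    The kind of secondary geometric calculation the paradigm supports can also be illustrated outside the database. The sketch below computes the area (shoelace formula) and centroid of a single axial contour in plain Python; it is only an illustration of the geometry involved, not the PostGIS-based implementation described in the record.

```python
import numpy as np

def contour_area_and_centroid(points):
    """Area (shoelace formula) and centroid of a closed planar contour.

    `points` is an (N, 2) array of (x, y) vertices of one axial DICOM-RT
    contour; the polygon is assumed simple (non-self-intersecting).
    """
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    x_next, y_next = np.roll(x, -1), np.roll(y, -1)
    cross = x * y_next - x_next * y
    signed_area = 0.5 * cross.sum()
    cx = ((x + x_next) * cross).sum() / (6.0 * signed_area)
    cy = ((y + y_next) * cross).sum() / (6.0 * signed_area)
    return abs(signed_area), (cx, cy)

# Unit square: area 1.0, centroid (0.5, 0.5)
print(contour_area_and_centroid([(0, 0), (1, 0), (1, 1), (0, 1)]))
```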

  9. A database paradigm for the management of DICOM-RT structure sets using a geographic information system

    Science.gov (United States)

    Shao, Weber; Kupelian, Patrick A.; Wang, Jason; Low, Daniel A.; Ruan, Dan

    2014-03-01

    We devise a paradigm for representing the DICOM-RT structure sets in a database management system, in such a way that secondary calculations of geometric information can be performed quickly from the existing contour definitions. The implementation of this paradigm is achieved using the PostgreSQL database system and the PostGIS extension, a geographic information system commonly used for encoding geographical map data. The proposed paradigm eliminates the overhead of retrieving large data records from the database, as well as the need to implement various numerical and data parsing routines, when additional information related to the geometry of the anatomy is desired.

  10. A full scale approximation of covariance functions for large spatial data sets

    KAUST Repository

    Sang, Huiyan

    2011-10-10

    Gaussian process models have been widely used in spatial statistics but face tremendous computational challenges for very large data sets. The model fitting and spatial prediction of such models typically require O(n^3) operations for a data set of size n. Various approximations of the covariance functions have been introduced to reduce the computational cost. However, most existing approximations cannot simultaneously capture both the large- and the small-scale spatial dependence. A new approximation scheme is developed to provide a high quality approximation to the covariance function at both the large and the small spatial scales. The new approximation is the summation of two parts: a reduced rank covariance and a compactly supported covariance obtained by tapering the covariance of the residual of the reduced rank approximation. Whereas the former part mainly captures the large-scale spatial variation, the latter part captures the small-scale, local variation that is unexplained by the former part. By combining the reduced rank representation and sparse matrix techniques, our approach allows for efficient computation for maximum likelihood estimation, spatial prediction and Bayesian inference. We illustrate the new approach with simulated and real data sets. © 2011 Royal Statistical Society.
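
    The summation structure described above (a reduced-rank covariance plus a tapered residual covariance) can be sketched with dense matrices for a small example. The exponential covariance, the spherical taper and the one-dimensional locations below are illustrative choices, not those of the paper, and a practical implementation would exploit the sparsity of the tapered part.

```python
import numpy as np

def exp_cov(d, range_=1.0, sill=1.0):
    """Exponential covariance, used here as a stand-in for the true covariance."""
    return sill * np.exp(-d / range_)

def spherical_taper(d, taper_range=0.5):
    """Compactly supported taper: positive inside taper_range, zero outside."""
    t = np.clip(d / taper_range, 0.0, 1.0)
    return (1.0 - 1.5 * t + 0.5 * t ** 3) * (d < taper_range)

def full_scale_approx(locs, knots, range_=1.0, taper_range=0.5):
    """Reduced-rank term (via knots) plus tapered residual covariance."""
    d_ll = np.abs(locs[:, None] - locs[None, :])      # 1-D locations for brevity
    d_lk = np.abs(locs[:, None] - knots[None, :])
    d_kk = np.abs(knots[:, None] - knots[None, :])

    C_lk = exp_cov(d_lk, range_)
    C_kk_inv = np.linalg.inv(exp_cov(d_kk, range_) + 1e-10 * np.eye(len(knots)))
    reduced_rank = C_lk @ C_kk_inv @ C_lk.T            # large-scale part

    residual = (exp_cov(d_ll, range_) - reduced_rank) * spherical_taper(d_ll, taper_range)
    return reduced_rank + residual                      # full-scale approximation

locs = np.linspace(0.0, 10.0, 200)
knots = np.linspace(0.0, 10.0, 15)
C_approx = full_scale_approx(locs, knots)
```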

  11. A full scale approximation of covariance functions for large spatial data sets

    KAUST Repository

    Sang, Huiyan; Huang, Jianhua Z.

    2011-01-01

    Gaussian process models have been widely used in spatial statistics but face tremendous computational challenges for very large data sets. The model fitting and spatial prediction of such models typically require O(n^3) operations for a data set of size n. Various approximations of the covariance functions have been introduced to reduce the computational cost. However, most existing approximations cannot simultaneously capture both the large- and the small-scale spatial dependence. A new approximation scheme is developed to provide a high quality approximation to the covariance function at both the large and the small spatial scales. The new approximation is the summation of two parts: a reduced rank covariance and a compactly supported covariance obtained by tapering the covariance of the residual of the reduced rank approximation. Whereas the former part mainly captures the large-scale spatial variation, the latter part captures the small-scale, local variation that is unexplained by the former part. By combining the reduced rank representation and sparse matrix techniques, our approach allows for efficient computation for maximum likelihood estimation, spatial prediction and Bayesian inference. We illustrate the new approach with simulated and real data sets. © 2011 Royal Statistical Society.

  12. Data-Driven Derivation of an "Informer Compound Set" for Improved Selection of Active Compounds in High-Throughput Screening.

    Science.gov (United States)

    Paricharak, Shardul; IJzerman, Adriaan P; Jenkins, Jeremy L; Bender, Andreas; Nigsch, Florian

    2016-09-26

    Despite the usefulness of high-throughput screening (HTS) in drug discovery, for some systems, low assay throughput or high screening cost can prohibit the screening of large numbers of compounds. In such cases, iterative cycles of screening involving active learning (AL) are employed, creating the need for smaller "informer sets" that can be routinely screened to build predictive models for selecting compounds from the screening collection for follow-up screens. Here, we present a data-driven derivation of an informer compound set with improved predictivity of active compounds in HTS, and we validate its benefit over randomly selected training sets on 46 PubChem assays comprising at least 300,000 compounds and covering a wide range of assay biology. The informer compound set showed improvement in BEDROC(α = 100), PRAUC, and ROCAUC values averaged over all assays of 0.024, 0.014, and 0.016, respectively, compared to randomly selected training sets, all with paired t-test p-values agnostic fashion. This approach led to a consistent improvement in hit rates in follow-up screens without compromising scaffold retrieval. The informer set is adjustable in size depending on the number of compounds one intends to screen, as performance gains are realized for sets with more than 3,000 compounds, and this set is therefore applicable to a variety of situations. Finally, our results indicate that random sampling may not adequately cover descriptor space, drawing attention to the importance of the composition of the training set for predicting actives.
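
    The iterative screening idea that motivates the informer set can be sketched as a single active-learning step: train a model on the screened informer compounds, score the unscreened collection, and select the top-scoring compounds for the follow-up screen. The sketch below uses scikit-learn's random forest as a stand-in model and hypothetical feature matrices; the derivation of the informer set itself, which is the paper's contribution, is not reproduced.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def follow_up_selection(X_informer, y_informer, X_pool, n_follow_up=1000):
    """Score the unscreened pool with a model trained on the informer set
    and return indices of the compounds to screen next.

    Assumes y_informer contains both active (1) and inactive (0) examples.
    """
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_informer, y_informer)
    activity_prob = model.predict_proba(X_pool)[:, 1]   # probability of being active
    return np.argsort(activity_prob)[::-1][:n_follow_up]
```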

  13. The Generalization of Mutual Information as the Information between a Set of Variables: The Information Correlation Function Hierarchy and the Information Structure of Multi-Agent Systems

    Science.gov (United States)

    Wolf, David R.

    2004-01-01

    The topic of this paper is a hierarchy of information-like functions, here named the information correlation functions, where each function of the hierarchy may be thought of as the information between the variables it depends upon. The information correlation functions are particularly suited to the description of the emergence of complex behaviors due to many-body or many-agent processes. They are particularly well suited to the quantification of the decomposition of the information carried among a set of variables or agents, and its subsets. In more graphical language, they provide the information theoretic basis for understanding the synergistic and non-synergistic components of a system, and as such should serve as a forceful toolkit for the analysis of the complexity structure of complex many-agent systems. The information correlation functions are the natural generalization to an arbitrary number of sets of variables of the sequence starting with the entropy function (one set of variables) and the mutual information function (two sets). We start by describing the traditional measures of information (entropy) and mutual information.
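
    The first members of the hierarchy are easy to write down for discrete samples: the entropy of one variable, the mutual information of two, and a three-variable term. The sketch below uses the co-information sign convention for the three-variable case and a plain-counting estimator; it is for illustration only.

```python
import math
from collections import Counter

def entropy(samples):
    """Shannon entropy (bits) of a list of hashable outcomes."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

def interaction_information(xs, ys, zs):
    """Three-variable member of the hierarchy (co-information sign convention):
    I(X;Y;Z) = H(X)+H(Y)+H(Z) - H(X,Y) - H(X,Z) - H(Y,Z) + H(X,Y,Z)."""
    return (entropy(xs) + entropy(ys) + entropy(zs)
            - entropy(list(zip(xs, ys))) - entropy(list(zip(xs, zs)))
            - entropy(list(zip(ys, zs))) + entropy(list(zip(xs, ys, zs))))
```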

  14. Envision: An interactive system for the management and visualization of large geophysical data sets

    Science.gov (United States)

    Searight, K. R.; Wojtowicz, D. P.; Walsh, J. E.; Pathi, S.; Bowman, K. P.; Wilhelmson, R. B.

    1995-01-01

    Envision is a software project at the University of Illinois and Texas A&M, funded by NASA's Applied Information Systems Research Project. It provides researchers in the geophysical sciences convenient ways to manage, browse, and visualize large observed or model data sets. Envision integrates data management, analysis, and visualization of geophysical data in an interactive environment. It employs commonly used standards in data formats, operating systems, networking, and graphics. It also attempts, wherever possible, to integrate with existing scientific visualization and analysis software. Envision has an easy-to-use graphical interface, distributed process components, and an extensible design. It is a public domain package, freely available to the scientific community.

  15. Obtaining and providing health information in the community pharmacy setting.

    Science.gov (United States)

    Iwanowicz, Susan L; Marciniak, Macary Weck; Zeolla, Mario M

    2006-06-15

    Community pharmacists are a valuable information resource for patients and other healthcare providers. The advent of new information technology, most notably the Internet, coupled with the rapid availability of new healthcare information, has fueled this demand. Pharmacy students must receive training that enables them to meet this need. Community advanced pharmacy practice experiences (APPEs) provide an excellent opportunity for students to develop and master drug information skills in a real-world setting. Preceptors must ensure that students are familiar with drug information resources and can efficiently identify the most useful resource for a given topic. Students must also be trained to assess the quality of resources and use this information to effectively respond to drug or health information inquiries. This article will discuss key aspects of providing drug information in the community pharmacy setting and can serve as a guide and resource for APPE preceptors.

  16. Optimizing distance-based methods for large data sets

    Science.gov (United States)

    Scholl, Tobias; Brenner, Thomas

    2015-10-01

    Distance-based methods for measuring spatial concentration of industries have become increasingly popular in the spatial econometrics community. However, a limiting factor for using these methods is their computational complexity, since both their memory requirements and running times are in O(n^2). In this paper, we present an algorithm with constant memory requirements and shorter running time, enabling distance-based methods to deal with large data sets. We discuss three recent distance-based methods in spatial econometrics: the D&O-Index by Duranton and Overman (Rev Econ Stud 72(4):1077-1106, 2005), the M-function by Marcon and Puech (J Econ Geogr 10(5):745-762, 2010) and the Cluster-Index by Scholl and Brenner (Reg Stud (ahead-of-print):1-15, 2014). Finally, we present an alternative calculation for the latter index that allows the use of data sets with millions of firms.
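
    The constant-memory idea can be illustrated by accumulating a histogram of pairwise distances block by block instead of materializing the full n x n distance matrix. This is a generic sketch of that strategy, not the authors' algorithm; the block size and bin edges are arbitrary choices.

```python
import numpy as np

def pairwise_distance_histogram(coords, bin_edges, block=1000):
    """Histogram of all pairwise distances without building the n x n matrix.

    `coords` is an (n, 2) array of firm locations; memory use is bounded by
    the block size, not by n**2.
    """
    n = len(coords)
    hist = np.zeros(len(bin_edges) - 1, dtype=np.int64)
    for i in range(0, n, block):
        a = coords[i:i + block]
        for j in range(i, n, block):
            b = coords[j:j + block]
            d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(-1))
            if i == j:                      # same block: keep each pair once
                d = d[np.triu_indices_from(d, k=1)]
            hist += np.histogram(d.ravel(), bins=bin_edges)[0]
    return hist
```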

  17. Considerations for Observational Research Using Large Data Sets in Radiation Oncology

    Energy Technology Data Exchange (ETDEWEB)

    Jagsi, Reshma, E-mail: rjagsi@med.umich.edu [Department of Radiation Oncology, University of Michigan, Ann Arbor, Michigan (United States); Bekelman, Justin E. [Departments of Radiation Oncology and Medical Ethics and Health Policy, University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania (United States); Chen, Aileen [Department of Radiation Oncology, Harvard Medical School, Boston, Massachusetts (United States); Chen, Ronald C. [Department of Radiation Oncology, University of North Carolina at Chapel Hill School of Medicine, Chapel Hill, North Carolina (United States); Hoffman, Karen [Department of Radiation Oncology, Division of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Tina Shih, Ya-Chen [Department of Medicine, Section of Hospital Medicine, The University of Chicago, Chicago, Illinois (United States); Smith, Benjamin D. [Department of Radiation Oncology, Division of Radiation Oncology, and Department of Health Services Research, The University of Texas MD Anderson Cancer Center, Houston, Texas (United States); Yu, James B. [Yale School of Medicine, New Haven, Connecticut (United States)

    2014-09-01

    The radiation oncology community has witnessed growing interest in observational research conducted using large-scale data sources such as registries and claims-based data sets. With the growing emphasis on observational analyses in health care, the radiation oncology community must possess a sophisticated understanding of the methodological considerations of such studies in order to evaluate evidence appropriately to guide practice and policy. Because observational research has unique features that distinguish it from clinical trials and other forms of traditional radiation oncology research, the International Journal of Radiation Oncology, Biology, Physics assembled a panel of experts in health services research to provide a concise and well-referenced review, intended to be informative for the lay reader, as well as for scholars who wish to embark on such research without prior experience. This review begins by discussing the types of research questions relevant to radiation oncology that large-scale databases may help illuminate. It then describes major potential data sources for such endeavors, including information regarding access and insights regarding the strengths and limitations of each. Finally, it provides guidance regarding the analytical challenges that observational studies must confront, along with discussion of the techniques that have been developed to help minimize the impact of certain common analytical issues in observational analysis. Features characterizing a well-designed observational study include clearly defined research questions, careful selection of an appropriate data source, consultation with investigators with relevant methodological expertise, inclusion of sensitivity analyses, caution not to overinterpret small but significant differences, and recognition of limitations when trying to evaluate causality. This review concludes that carefully designed and executed studies using observational data that possess these qualities hold

  18. Considerations for Observational Research Using Large Data Sets in Radiation Oncology

    International Nuclear Information System (INIS)

    Jagsi, Reshma; Bekelman, Justin E.; Chen, Aileen; Chen, Ronald C.; Hoffman, Karen; Tina Shih, Ya-Chen; Smith, Benjamin D.; Yu, James B.

    2014-01-01

    The radiation oncology community has witnessed growing interest in observational research conducted using large-scale data sources such as registries and claims-based data sets. With the growing emphasis on observational analyses in health care, the radiation oncology community must possess a sophisticated understanding of the methodological considerations of such studies in order to evaluate evidence appropriately to guide practice and policy. Because observational research has unique features that distinguish it from clinical trials and other forms of traditional radiation oncology research, the International Journal of Radiation Oncology, Biology, Physics assembled a panel of experts in health services research to provide a concise and well-referenced review, intended to be informative for the lay reader, as well as for scholars who wish to embark on such research without prior experience. This review begins by discussing the types of research questions relevant to radiation oncology that large-scale databases may help illuminate. It then describes major potential data sources for such endeavors, including information regarding access and insights regarding the strengths and limitations of each. Finally, it provides guidance regarding the analytical challenges that observational studies must confront, along with discussion of the techniques that have been developed to help minimize the impact of certain common analytical issues in observational analysis. Features characterizing a well-designed observational study include clearly defined research questions, careful selection of an appropriate data source, consultation with investigators with relevant methodological expertise, inclusion of sensitivity analyses, caution not to overinterpret small but significant differences, and recognition of limitations when trying to evaluate causality. This review concludes that carefully designed and executed studies using observational data that possess these qualities hold

  19. A summarization approach for Affymetrix GeneChip data using a reference training set from a large, biologically diverse database

    Directory of Open Access Journals (Sweden)

    Tripputi Mark

    2006-10-01

    Full Text Available Abstract Background Many of the most popular pre-processing methods for Affymetrix expression arrays, such as RMA, gcRMA, and PLIER, simultaneously analyze data across a set of predetermined arrays to improve precision of the final measures of expression. One problem associated with these algorithms is that expression measurements for a particular sample are highly dependent on the set of samples used for normalization and results obtained by normalization with a different set may not be comparable. A related problem is that an organization producing and/or storing large amounts of data in a sequential fashion will need to either re-run the pre-processing algorithm every time an array is added or store them in batches that are pre-processed together. Furthermore, pre-processing of large numbers of arrays requires loading all the feature-level data into memory which is a difficult task even with modern computers. We utilize a scheme that produces all the information necessary for pre-processing using a very large training set that can be used for summarization of samples outside of the training set. All subsequent pre-processing tasks can be done on an individual array basis. We demonstrate the utility of this approach by defining a new version of the Robust Multi-chip Averaging (RMA) algorithm, which we refer to as refRMA. Results We assess performance based on multiple sets of samples processed over HG U133A Affymetrix GeneChip® arrays. We show that the refRMA workflow, when used in conjunction with a large, biologically diverse training set, results in the same general characteristics as that of RMA in its classic form when comparing overall data structure, sample-to-sample correlation, and variation. Further, we demonstrate that the refRMA workflow and reference set can be robustly applied to naïve organ types and to benchmark data where its performance indicates respectable results. Conclusion Our results indicate that a biologically diverse
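
    One ingredient of such a reference-based scheme is quantile normalization against a frozen reference distribution, which lets each new array be processed individually. The sketch below shows only that step under simplifying assumptions (equal vector lengths, ties ignored); background correction and probe-set summarization, which refRMA also handles, are omitted.

```python
import numpy as np

def reference_quantile_normalize(new_array, reference_quantiles):
    """Map one new array onto a frozen reference distribution.

    `reference_quantiles` is the sorted reference intensity vector learned from
    the large training set (same length as `new_array`); each new array is
    normalized individually against it, so previously processed arrays never
    need to be re-run.
    """
    order = np.argsort(new_array)
    normalized = np.empty_like(reference_quantiles, dtype=float)
    normalized[order] = reference_quantiles   # rank i gets the i-th reference value
    return normalized
```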

  20. A conceptual analysis of standard setting in large-scale assessments

    NARCIS (Netherlands)

    van der Linden, Willem J.

    1994-01-01

    Elements of arbitrariness in the standard setting process are explored, and an alternative to the use of cut scores is presented. The first part of the paper analyzes the use of cut scores in large-scale assessments, discussing three different functions: (1) cut scores define the qualifications used

  1. Teaching the Assessment of Normality Using Large Easily-Generated Real Data Sets

    Science.gov (United States)

    Kulp, Christopher W.; Sprechini, Gene D.

    2016-01-01

    A classroom activity is presented, which can be used in teaching students statistics with an easily generated, large, real world data set. The activity consists of analyzing a video recording of an object. The colour data of the recorded object can then be used as a data set to explore variation in the data using graphs including histograms,…

  2. Simultaneous identification of long similar substrings in large sets of sequences

    Directory of Open Access Journals (Sweden)

    Wittig Burghardt

    2007-05-01

    Full Text Available Abstract Background Sequence comparison faces new challenges today, with many complete genomes and large libraries of transcripts known. Gene annotation pipelines match these sequences in order to identify genes and their alternative splice forms. However, the software currently available cannot simultaneously compare sets of sequences as large as necessary especially if errors must be considered. Results We therefore present a new algorithm for the identification of almost perfectly matching substrings in very large sets of sequences. Its implementation, called ClustDB, is considerably faster and can handle 16 times more data than VMATCH, the most memory efficient exact program known today. ClustDB simultaneously generates large sets of exactly matching substrings of a given minimum length as seeds for a novel method of match extension with errors. It generates alignments of maximum length with a considered maximum number of errors within each overlapping window of a given size. Such alignments are not optimal in the usual sense but faster to calculate and often more appropriate than traditional alignments for genomic sequence comparisons, EST and full-length cDNA matching, and genomic sequence assembly. The method is used to check the overlaps and to reveal possible assembly errors for 1377 Medicago truncatula BAC-size sequences published at http://www.medicago.org/genome/assembly_table.php?chr=1. Conclusion The program ClustDB proves that window alignment is an efficient way to find long sequence sections of homogenous alignment quality, as expected in case of random errors, and to detect systematic errors resulting from sequence contaminations. Such inserts are systematically overlooked in long alignments controlled by only tuning penalties for mismatches and gaps. ClustDB is freely available for academic use.
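
    The seed-generation phase described above (exactly matching substrings of a minimum length shared across sequences) can be sketched with a hash of k-mers. This is an illustration of the idea only; ClustDB's disk-based data structures and the subsequent windowed extension with errors are not reproduced.

```python
from collections import defaultdict

def shared_seeds(sequences, k=20, min_seq=2):
    """Exact substrings of length k found in at least `min_seq` sequences.

    A dictionary of k-mers stands in for the seed-generation phase; the
    windowed match extension with errors is not shown here.
    """
    occurrences = defaultdict(set)
    for seq_id, seq in enumerate(sequences):
        for pos in range(len(seq) - k + 1):
            occurrences[seq[pos:pos + k]].add(seq_id)
    return {kmer: ids for kmer, ids in occurrences.items() if len(ids) >= min_seq}
```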

  3. Requirements and principles for the implementation and construction of large-scale geographic information systems

    Science.gov (United States)

    Smith, Terence R.; Menon, Sudhakar; Star, Jeffrey L.; Estes, John E.

    1987-01-01

    This paper provides a brief survey of the history, structure and functions of 'traditional' geographic information systems (GIS), and then suggests a set of requirements that large-scale GIS should satisfy, together with a set of principles for their satisfaction. These principles, which include the systematic application of techniques from several subfields of computer science to the design and implementation of GIS and the integration of techniques from computer vision and image processing into standard GIS technology, are discussed in some detail. In particular, the paper provides a detailed discussion of questions relating to appropriate data models, data structures and computational procedures for the efficient storage, retrieval and analysis of spatially-indexed data.

  4. Settings and artefacts relevant for Doppler ultrasound in large vessel vasculitis

    DEFF Research Database (Denmark)

    Terslev, L; Diamantopoulos, A P; Døhn, U Møller

    2017-01-01

    Ultrasound is used increasingly for diagnosing large vessel vasculitis (LVV). The application of Doppler in LVV is very different from in arthritic conditions. This paper aims to explain the most important Doppler parameters, including spectral Doppler, and how the settings differ from those used...

  5. Secondary data analysis of large data sets in urology: successes and errors to avoid.

    Science.gov (United States)

    Schlomer, Bruce J; Copp, Hillary L

    2014-03-01

    Secondary data analysis is the use of data collected for research by someone other than the investigator. In the last several years there has been a dramatic increase in the number of these studies being published in urological journals and presented at urological meetings, especially involving secondary data analysis of large administrative data sets. Along with this expansion, skepticism for secondary data analysis studies has increased for many urologists. In this narrative review we discuss the types of large data sets that are commonly used for secondary data analysis in urology, and discuss the advantages and disadvantages of secondary data analysis. A literature search was performed to identify urological secondary data analysis studies published since 2008 using commonly used large data sets, and examples of high quality studies published in high impact journals are given. We outline an approach for performing a successful hypothesis or goal driven secondary data analysis study and highlight common errors to avoid. More than 350 secondary data analysis studies using large data sets have been published on urological topics since 2008 with likely many more studies presented at meetings but never published. Nonhypothesis or goal driven studies have likely constituted some of these studies and have probably contributed to the increased skepticism of this type of research. However, many high quality, hypothesis driven studies addressing research questions that would have been difficult to conduct with other methods have been performed in the last few years. Secondary data analysis is a powerful tool that can address questions which could not be adequately studied by another method. Knowledge of the limitations of secondary data analysis and of the data sets used is critical for a successful study. There are also important errors to avoid when planning and performing a secondary data analysis study. Investigators and the urological community need to strive to use

  6. Security Optimization for Distributed Applications Oriented on Very Large Data Sets

    Directory of Open Access Journals (Sweden)

    Mihai DOINEA

    2010-01-01

    Full Text Available The paper presents the main characteristics of applications which are working with very large data sets and the issues related to security. First section addresses the optimization process and how it is approached when dealing with security. The second section describes the concept of very large datasets management while in the third section the risks related are identified and classified. Finally, a security optimization schema is presented with a cost-efficiency analysis upon its feasibility. Conclusions are drawn and future approaches are identified.

  7. Informal Language Learning Setting: Technology or Social Interaction?

    Science.gov (United States)

    Bahrani, Taher; Sim, Tam Shu

    2012-01-01

    Based on the informal language learning theory, language learning can occur outside the classroom setting unconsciously and incidentally through interaction with the native speakers or exposure to authentic language input through technology. However, an EFL context lacks the social interaction which naturally occurs in an ESL context. To explore…

  8. Psychology of Agenda-Setting Effects. Mapping the Paths of Information Processing

    Directory of Open Access Journals (Sweden)

    Maxwell McCombs

    2014-01-01

    Full Text Available The concept of Need for Orientation introduced in the early years of agenda-setting research provided a psychological explanation for why agenda-setting effects occur in terms of what individuals bring to the media experience that determines the strength of these effects. Until recently, there had been no significant additions to our knowledge about the psychology of agenda-setting effects. However, the concept of Need for Orientation is only one part of the answer to the question about why agenda setting occurs. Recent research outlines a second way to answer the why question by describing the psychological process through which these effects occur. In this review, we integrate four contemporary studies that explicate dual psychological paths that lead to agenda-setting effects at the first and second levels. We then examine how information preferences and selective exposure can be profitably included in the agenda-setting framework. Complementing these new models of information processing and varying attention to media content and presentation cues, an expanded concept of psychological relevance, motivated reasoning goals (accuracy versus directional goals, and issue publics are discussed.

  9. Climate change adaptation in informal settings: Understanding and ...

    International Development Research Centre (IDRC) Digital Library (Canada)

    People living in informal urban settings in Latin America and the Caribbean are ... by formal institutions in small- and medium-sized cities in Latin America and the ... the protection of humans and the built environment from water, and income ...

  10. Environmental settings for selected U.S. Department of Energy installations - support information for the Programmatic Environmental Impact Statement

    International Nuclear Information System (INIS)

    Holdren, G.R.; Glantz, C.S.; Berg, L.K.; Delinger, K.; Goodwin, S.M.; Rustad, J.R.; Schalla, R.; Schramke, J.A.

    1994-12-01

    This report contains the environmental setting information developed for 20 U.S. Department of Energy (DOE) installations in support of the DOE's Programmatic Environmental Impact Study (PEIS). The objective of the PEIS is to provide the public with information about the types of radiological and hazardous wastes and environmental contamination problems associated with major DOE facilities across the country, and to assess the relative risks that these wastes pose to the public, onsite workers, and the environment. Environmental setting information consists of the site-specific data required to model (using the Multimedia Environmental Pollutant Assessment System) the atmospheric, groundwater, and surface water transport of contaminants within and near the boundaries of the installations. The environmental settings data describes the climate, atmospheric dispersion, hydrogeology, and surface water characteristics of the installations. The number of discrete environmental settings established for each installation was governed by two competing requirements: (1) the risks posed by contaminants released from numerous waste sites were to be modeled as accurately as possible, and (2) the modeling required for numerous release sites and a large number of contaminants had to be completed within the limits imposed by the PEIS schedule. The final product is the result of attempts to balance these competing concerns in a way that minimizes the number of settings per installation in order to meet the project schedule while at the same time providing adequate, if sometimes highly simplified, representations of the different areas within an installation. Environmental settings were developed in conjunction with installation experts in the fields of meteorology, geology, hydrology, and geochemistry. When possible, local experts participated in the initial development, fine tuning, and final review of the PEIS environmental settings

  11. Polish Phoneme Statistics Obtained On Large Set Of Written Texts

    Directory of Open Access Journals (Sweden)

    Bartosz Ziółko

    2009-01-01

    Full Text Available The phonetic statistics were collected from several Polish corpora. The paper is a summary of the data, which are phoneme n-grams, and of some phenomena observed in the statistics. Triphone statistics use context-dependent speech units, which play an important role in speech recognition systems and had never been calculated for a large set of Polish written texts. The standard phonetic alphabet for Polish, SAMPA, and methods of providing phonetic transcriptions are described.
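
    Once transcriptions are available, collecting triphone (phoneme trigram) statistics is a counting exercise. The sketch below assumes the input is already a list of SAMPA phoneme sequences; grapheme-to-phoneme conversion, which the corpus work requires, is not shown.

```python
from collections import Counter

def phoneme_ngrams(transcriptions, n=3):
    """Count phoneme n-grams (triphones for n=3) over SAMPA transcriptions.

    `transcriptions` is assumed to be a list of phoneme lists, e.g.
    [['tS', 'e', 'S', 'tS'], ...].
    """
    counts = Counter()
    for phones in transcriptions:
        counts.update(tuple(phones[i:i + n]) for i in range(len(phones) - n + 1))
    return counts
```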

  12. Cancer survival classification using integrated data sets and intermediate information.

    Science.gov (United States)

    Kim, Shinuk; Park, Taesung; Kon, Mark

    2014-09-01

    Although numerous studies related to cancer survival have been published, increasing the prediction accuracy of survival classes still remains a challenge. Integration of different data sets, such as microRNA (miRNA) and mRNA, might increase the accuracy of survival class prediction. Therefore, we suggested a machine learning (ML) approach to integrate different data sets, and developed a novel method based on feature selection with Cox proportional hazard regression model (FSCOX) to improve the prediction of cancer survival time. FSCOX provides us with intermediate survival information, which is usually discarded when separating survival into 2 groups (short- and long-term), and allows us to perform survival analysis. We used an ML-based protocol for feature selection, integrating information from miRNA and mRNA expression profiles at the feature level. To predict survival phenotypes, we used the following classifiers, first, existing ML methods, support vector machine (SVM) and random forest (RF), second, a new median-based classifier using FSCOX (FSCOX_median), and third, an SVM classifier using FSCOX (FSCOX_SVM). We compared these methods using 3 types of cancer tissue data sets: (i) miRNA expression, (ii) mRNA expression, and (iii) combined miRNA and mRNA expression. The latter data set included features selected either from the combined miRNA/mRNA profile or independently from miRNAs and mRNAs profiles (IFS). In the ovarian data set, the accuracy of survival classification using the combined miRNA/mRNA profiles with IFS was 75% using RF, 86.36% using SVM, 84.09% using FSCOX_median, and 88.64% using FSCOX_SVM with a balanced 22 short-term and 22 long-term survivor data set. These accuracies are higher than those using miRNA alone (70.45%, RF; 75%, SVM; 75%, FSCOX_median; and 75%, FSCOX_SVM) or mRNA alone (65.91%, RF; 63.64%, SVM; 72.73%, FSCOX_median; and 70.45%, FSCOX_SVM). Similarly in the glioblastoma multiforme data, the accuracy of miRNA/mRNA using IFS
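
    The general idea, ranking features from each platform by a Cox proportional hazards criterion and then training a classifier on the combined selection, can be sketched as follows. This is a simplified stand-in for FSCOX, assuming the lifelines and scikit-learn packages and hypothetical data frames of miRNA and mRNA expression; it uses univariate Cox p-values rather than the authors' selection procedure.

```python
import pandas as pd
from lifelines import CoxPHFitter
from sklearn.svm import SVC

def cox_feature_scores(expr: pd.DataFrame, time, event):
    """Univariate Cox p-value per feature (smaller = more survival-relevant)."""
    scores = {}
    for feature in expr.columns:
        df = pd.DataFrame({"x": expr[feature], "time": time, "event": event})
        cph = CoxPHFitter()
        cph.fit(df, duration_col="time", event_col="event")
        scores[feature] = cph.summary.loc["x", "p"]
    return pd.Series(scores)

def integrated_survival_classifier(mirna, mrna, time, event, labels, k=20):
    """Select top-k features from each platform by Cox score, then fit an SVM
    on the concatenated matrix to predict short- vs long-term survival."""
    top_mirna = cox_feature_scores(mirna, time, event).nsmallest(k).index
    top_mrna = cox_feature_scores(mrna, time, event).nsmallest(k).index
    X = pd.concat([mirna[top_mirna], mrna[top_mrna]], axis=1)
    return SVC(kernel="linear").fit(X, labels)
```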

  13. A scalable method for identifying frequent subtrees in sets of large phylogenetic trees.

    Science.gov (United States)

    Ramu, Avinash; Kahveci, Tamer; Burleigh, J Gordon

    2012-10-03

    We consider the problem of finding the maximum frequent agreement subtrees (MFASTs) in a collection of phylogenetic trees. Existing methods for this problem often do not scale beyond datasets with around 100 taxa. Our goal is to address this problem for datasets with over a thousand taxa and hundreds of trees. We develop a heuristic solution that aims to find MFASTs in sets of many, large phylogenetic trees. Our method works in multiple phases. In the first phase, it identifies small candidate subtrees from the set of input trees which serve as the seeds of larger subtrees. In the second phase, it combines these small seeds to build larger candidate MFASTs. In the final phase, it performs a post-processing step that ensures that we find a frequent agreement subtree that is not contained in a larger frequent agreement subtree. We demonstrate that this heuristic can easily handle data sets with 1000 taxa, greatly extending the estimation of MFASTs beyond current methods. Although this heuristic does not guarantee to find all MFASTs or the largest MFAST, it found the MFAST in all of our synthetic datasets where we could verify the correctness of the result. It also performed well on large empirical data sets. Its performance is robust to the number and size of the input trees. Overall, this method provides a simple and fast way to identify strongly supported subtrees within large phylogenetic hypotheses.

  14. An investigation of children's levels of inquiry in an informal science setting

    Science.gov (United States)

    Clark-Thomas, Beth Anne

    Elementary school students' understanding of both science content and processes is enhanced by the higher level thinking associated with inquiry-based science investigations. Informal science setting personnel, elementary school teachers, and curriculum specialists charged with designing inquiry-based investigations would be well served by an understanding of the varying influence of certain present factors upon the students' willingness and ability to delve into such higher level inquiries. This study examined young children's use of inquiry-based materials and factors which may influence the level of inquiry they engaged in during informal science activities. An informal science setting was selected as the context for the examination of student inquiry behaviors because of the rich inquiry-based environment present at the site and the benefits previously noted in the research regarding the impact of informal science settings upon the construction of knowledge in science. The study revealed several patterns of behavior among children when they are engaged in inquiry-based activities at informal science exhibits. These repeated behaviors varied in the children's apparent purposeful use of the materials at the exhibits. These levels of inquiry behavior were taxonomically defined as high/medium/low within this study utilizing a researcher-developed tool. Furthermore, in this study adult interventions, questions, or prompting were found to impact the level of inquiry engaged in by the children. This study revealed that higher levels of inquiry were preceded by task directed and physical feature prompts. Moreover, the levels of inquiry behaviors were halted, even lowered, when preceded by a prompt that focused on a science content or concept question. Results of this study have implications for the enhancement of inquiry-based science activities in elementary schools as well as in informal science settings. These findings have significance for all science educators

  15. SUPPORT Tools for evidence-informed health Policymaking (STP) 3: Setting priorities for supporting evidence-informed policymaking.

    Science.gov (United States)

    Lavis, John N; Oxman, Andrew D; Lewin, Simon; Fretheim, Atle

    2009-12-16

    This article is part of a series written for people responsible for making decisions about health policies and programmes and for those who support these decision makers. Policymakers have limited resources for developing--or supporting the development of--evidence-informed policies and programmes. These required resources include staff time, staff infrastructural needs (such as access to a librarian or journal article purchasing), and ongoing professional development. They may therefore prefer instead to contract out such work to independent units with more suitably skilled staff and appropriate infrastructure. However, policymakers may only have limited financial resources to do so. Regardless of whether the support for evidence-informed policymaking is provided in-house or contracted out, or whether it is centralised or decentralised, resources always need to be used wisely in order to maximise their impact. Examples of undesirable practices in a priority-setting approach include timelines to support evidence-informed policymaking being negotiated on a case-by-case basis (instead of having clear norms about the level of support that can be provided for each timeline), implicit (rather than explicit) criteria for setting priorities, ad hoc (rather than systematic and explicit) priority-setting process, and the absence of both a communications plan and a monitoring and evaluation plan. In this article, we suggest questions that can guide those setting priorities for finding and using research evidence to support evidence-informed policymaking. These are: 1. Does the approach to prioritisation make clear the timelines that have been set for addressing high-priority issues in different ways? 2. Does the approach incorporate explicit criteria for determining priorities? 3. Does the approach incorporate an explicit process for determining priorities? 4. Does the approach incorporate a communications strategy and a monitoring and evaluation plan?

  16. Rough Set Approach to Incomplete Multiscale Information System

    Science.gov (United States)

    Yang, Xibei; Qi, Yong; Yu, Dongjun; Yu, Hualong; Song, Xiaoning; Yang, Jingyu

    2014-01-01

    Multiscale information system is a new knowledge representation system for expressing the knowledge with different levels of granulations. In this paper, by considering the unknown values, which can be seen everywhere in real world applications, the incomplete multiscale information system is firstly investigated. The descriptor technique is employed to construct rough sets at different scales for analyzing the hierarchically structured data. The problem of unravelling decision rules at different scales is also addressed. Finally, the reduct descriptors are formulated to simplify decision rules, which can be derived from different scales. Some numerical examples are employed to substantiate the conceptual arguments. PMID:25276852
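
    The rough-set machinery underlying such systems is easy to illustrate for a single scale: with unknown values handled through a tolerance relation, the lower approximation of a target set contains the objects whose tolerance class lies entirely inside it, and the upper approximation those whose class merely intersects it. The sketch below shows this on a toy incomplete table; the descriptor construction and multiscale structure of the paper are not reproduced.

```python
def tolerant(x, y, unknown="*"):
    """Two objects are tolerant if their known attribute values never conflict."""
    return all(a == b or a == unknown or b == unknown for a, b in zip(x, y))

def rough_approximations(objects, target):
    """Lower/upper approximations of `target` (a set of object indices)
    under the tolerance relation on an incomplete information table."""
    lower, upper = set(), set()
    for i, x in enumerate(objects):
        tol_class = {j for j, y in enumerate(objects) if tolerant(x, y)}
        if tol_class <= target:
            lower.add(i)
        if tol_class & target:
            upper.add(i)
    return lower, upper

# Toy incomplete table: rows are objects, '*' marks a missing value.
table = [("a", 1), ("a", "*"), ("b", 2)]
print(rough_approximations(table, target={0, 1}))   # ({0, 1}, {0, 1}): exactly definable
```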

  17. Efficient algorithms for collaborative decision making for large scale settings

    DEFF Research Database (Denmark)

    Assent, Ira

    2011-01-01

    to bring about more effective and more efficient retrieval systems that support the users' decision making process. We sketch promising research directions for more efficient algorithms for collaborative decision making, especially for large scale systems.......Collaborative decision making is a successful approach in settings where data analysis and querying can be done interactively. In large scale systems with huge data volumes or many users, collaboration is often hindered by impractical runtimes. Existing work on improving collaboration focuses...... on avoiding redundancy for users working on the same task. While this improves the effectiveness of the user work process, the underlying query processing engine is typically considered a "black box" and left unchanged. Research in multiple query processing, on the other hand, ignores the application...

  18. On the choice of an optimal value-set of qualitative attributes for information retrieval in databases

    International Nuclear Information System (INIS)

    Ryjov, A.; Loginov, D.

    1994-01-01

    The problem of choosing an optimal set of significances of qualitative attributes for information retrieval in databases is addressed. Given a particular database, a set of significances is called optimal if it minimizes the losses of information and the information noise for information retrieval in the database. Obviously, such a set of significances depends on the statistical parameters of the database. Software is described that, on the basis of the statistical parameters of the given database, calculates the losses of information and the information noise for arbitrary sets of significances of qualitative attributes. The software also permits the comparison of various sets of significances of qualitative attributes and the choice of the optimal set of significances.

  19. A guide to innovation in informal settings | IDRC - International ...

    International Development Research Centre (IDRC) Digital Library (Canada)

    2012-11-06

    Nov 6, 2012 ... Innovation in Informal Settings: A Research Agenda by Susan Cozzens and Judith Sutz presents a framework that can be used by researchers, ... will find this theoretical guide useful as it explores the five criteria of innovation: ...

  20. Collaboration and Virtualization in Large Information Systems Projects

    Directory of Open Access Journals (Sweden)

    Stefan Ioan NITCHI

    2009-01-01

    Full Text Available A project evolves through different phases, from idea and conception to experiments, implementation and maintenance. Globalization, the Internet, the Web and mobile computing have changed many human activities, including the realization of Information System (IS) projects. Projects are growing, teams are geographically distributed, and users are heterogeneous. In this respect, the realization of large Information Technology (IT) projects requires collaborative technologies. The distribution of the team, the users' heterogeneity and the project complexity drive the virtualization. This paper is an overview of these aspects for large IT projects. It briefly presents a general framework developed by the authors for collaborative systems in general, adapted here to collaborative project management. The general considerations are illustrated with the case of a large IT project in which the authors were involved.

  1. Knowledge discovery: Extracting usable information from large amounts of data

    International Nuclear Information System (INIS)

    Whiteson, R.

    1998-01-01

    The threat of nuclear weapons proliferation is a problem of worldwide concern. Safeguards are the key to nuclear nonproliferation, and data is the key to safeguards. The safeguards community has access to a huge and steadily growing volume of data. The advantages of this data-rich environment are obvious: there is a great deal of information which can be utilized. The challenge is to effectively apply proven and developing technologies to find and extract usable information from that data. That information must then be assessed and evaluated to produce the knowledge needed for crucial decision making. Efficient and effective analysis of safeguards data will depend on utilizing technologies to interpret the large, heterogeneous data sets that are available from diverse sources. With an order-of-magnitude increase in the amount of data from a wide variety of technical, textual, and historical sources, there is a vital need to apply advanced computer technologies to support all-source analysis. There are techniques of data warehousing, data mining, and data analysis that can provide analysts with tools that will expedite their extracting usable information from the huge amounts of data to which they have access. Computerized tools can aid analysts by integrating heterogeneous data, evaluating diverse data streams, automating retrieval of database information, prioritizing inputs, reconciling conflicting data, doing preliminary interpretations, discovering patterns or trends in data, and automating some of the simpler prescreening tasks that are time-consuming and tedious. Thus knowledge discovery technologies can provide a foundation of support for the analyst. Rather than spending time sifting through often irrelevant information, analysts could use their specialized skills in a focused, productive fashion. This would allow them to make their analytical judgments with more confidence and spend more of their time doing what they do best.

  2. Information Technology and Accounting Information Systems’ Quality in Croatian Middle and Large Companies

    Directory of Open Access Journals (Sweden)

    Ivana Mamić Sačer

    2013-12-01

    Full Text Available An accounting information system is of great importance for preparing quality accounting information for a wide range of users. The study elaborates the impact of information technology on the accounting process and, as a consequence, on accounting information systems quality. This paper analyzes the basic characteristics of accounting information systems quality and discusses a model for measuring AIS quality. The perception of the quality of accounting information systems by accountants in medium and large companies in Croatia is also presented. The paper gives a historical overview of AIS quality based on three empirical studies conducted in 2001, 2008 and 2012.

  3. Combining RP and SP data while accounting for large choice sets and travel mode

    DEFF Research Database (Denmark)

    Abildtrup, Jens; Olsen, Søren Bøye; Stenger, Anne

    2015-01-01

    set used for site selection modelling when the actual choice set considered is potentially large and unknown to the analyst. Easy access to forests also implies that around half of the visitors walk or bike to the forest. We apply an error-component mixed-logit model to simultaneously model the travel...

  4. Large margin image set representation and classification

    KAUST Repository

    Wang, Jim Jing-Yan; Alzahrani, Majed A.; Gao, Xin

    2014-01-01

    In this paper, we propose a novel image set representation and classification method by maximizing the margin of image sets. The margin of an image set is defined as the difference between the distance to its nearest image set from a different class and the distance to its nearest image set of the same class. By modeling the image sets using both their image samples and their affine hull models, and maximizing the margins of the image sets, the image set representation parameter learning problem is formulated as a minimization problem, which is further optimized by an expectation-maximization (EM) strategy with accelerated proximal gradient (APG) optimization in an iterative algorithm. To classify a given test image set, we assign it to the class which could provide the largest margin. Experiments on two applications of video-sequence-based face recognition demonstrate that the proposed method significantly outperforms state-of-the-art image set classification methods in terms of both effectiveness and efficiency.
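
    A minimal Python sketch of the margin idea described above may help make it concrete. It is illustrative only and is not the authors' method: each image set is summarized here by its sample mean and the set-to-set distance is the Euclidean distance between means, whereas the paper models sets with their samples plus affine hulls and learns the representation with an EM strategy and accelerated proximal gradient optimization.

```python
# Hedged sketch: margin-based image-set classification with mean-image distances.
import numpy as np

def set_distance(A, B):
    # Distance between two image sets, approximated by the distance between their mean images.
    return np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))

def margin(i, sets, labels):
    # Margin of set i: nearest different-class distance minus nearest same-class distance.
    d_same = min(set_distance(sets[i], sets[j]) for j in range(len(sets))
                 if j != i and labels[j] == labels[i])
    d_diff = min(set_distance(sets[i], sets[j]) for j in range(len(sets))
                 if labels[j] != labels[i])
    return d_diff - d_same

def classify(test_set, sets, labels):
    # Assign the test set to the class that yields the largest margin.
    margins = {}
    for c in set(labels):
        d_same = min(set_distance(test_set, s) for s, l in zip(sets, labels) if l == c)
        d_diff = min(set_distance(test_set, s) for s, l in zip(sets, labels) if l != c)
        margins[c] = d_diff - d_same
    return max(margins, key=margins.get)

rng = np.random.default_rng(0)
sets = [rng.normal(loc=c, size=(20, 64)) for c in (0.0, 0.0, 3.0, 3.0)]  # four toy "video" sets
labels = [0, 0, 1, 1]
print("margin of set 0:", round(margin(0, sets, labels), 2))
print("predicted class:", classify(rng.normal(loc=3.0, size=(15, 64)), sets, labels))
```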

  5. Large margin image set representation and classification

    KAUST Repository

    Wang, Jim Jing-Yan

    2014-07-06

    In this paper, we propose a novel image set representation and classification method by maximizing the margin of image sets. The margin of an image set is defined as the difference between the distance to its nearest image set from a different class and the distance to its nearest image set of the same class. By modeling the image sets using both their image samples and their affine hull models, and maximizing the margins of the image sets, the image set representation parameter learning problem is formulated as a minimization problem, which is further optimized by an expectation-maximization (EM) strategy with accelerated proximal gradient (APG) optimization in an iterative algorithm. To classify a given test image set, we assign it to the class which could provide the largest margin. Experiments on two applications of video-sequence-based face recognition demonstrate that the proposed method significantly outperforms state-of-the-art image set classification methods in terms of both effectiveness and efficiency.

  6. Improving probe set selection for microbial community analysis by leveraging taxonomic information of training sequences

    Directory of Open Access Journals (Sweden)

    Jiang Tao

    2011-10-01

    Full Text Available Background: Population levels of microbial phylotypes can be examined using a hybridization-based method that utilizes a small set of computationally-designed DNA probes targeted to a gene common to all. Our previous algorithm attempts to select a set of probes such that each training sequence manifests a unique theoretical hybridization pattern (a binary fingerprint) to a probe set. It does so without taking into account similarity between training gene sequences or their putative taxonomic classifications, however. We present an improved algorithm for probe set selection that utilizes the available taxonomic information of training gene sequences and attempts to choose probes such that the resultant binary fingerprints cluster into real taxonomic groups. Results: Gene sequences manifesting identical fingerprints with probes chosen by the new algorithm are more likely to be from the same taxonomic group than with probes chosen by the previous algorithm. In cases where they are from different taxonomic groups, underlying DNA sequences of identical fingerprints are more similar to each other in probe sets made with the new versus the previous algorithm. Complete removal of large taxonomic groups from training data does not greatly decrease the ability of probe sets to distinguish those groups. Conclusions: Probe sets made from the new algorithm create fingerprints that more reliably cluster into biologically meaningful groups. The method can readily distinguish microbial phylotypes that were excluded from the training sequences, suggesting novel microbes can also be detected.

  7. Improving probe set selection for microbial community analysis by leveraging taxonomic information of training sequences.

    Science.gov (United States)

    Ruegger, Paul M; Della Vedova, Gianluca; Jiang, Tao; Borneman, James

    2011-10-10

    Population levels of microbial phylotypes can be examined using a hybridization-based method that utilizes a small set of computationally-designed DNA probes targeted to a gene common to all. Our previous algorithm attempts to select a set of probes such that each training sequence manifests a unique theoretical hybridization pattern (a binary fingerprint) to a probe set. It does so without taking into account similarity between training gene sequences or their putative taxonomic classifications, however. We present an improved algorithm for probe set selection that utilizes the available taxonomic information of training gene sequences and attempts to choose probes such that the resultant binary fingerprints cluster into real taxonomic groups. Gene sequences manifesting identical fingerprints with probes chosen by the new algorithm are more likely to be from the same taxonomic group than with probes chosen by the previous algorithm. In cases where they are from different taxonomic groups, underlying DNA sequences of identical fingerprints are more similar to each other in probe sets made with the new versus the previous algorithm. Complete removal of large taxonomic groups from training data does not greatly decrease the ability of probe sets to distinguish those groups. Probe sets made from the new algorithm create fingerprints that more reliably cluster into biologically meaningful groups. The method can readily distinguish microbial phylotypes that were excluded from the training sequences, suggesting novel microbes can also be detected.
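
    A small illustrative sketch of the fingerprint idea, not the authors' optimization algorithm: a probe is assumed to "hybridize" when it occurs as an exact substring of a training sequence, and sequences are grouped by identical binary fingerprints so the taxonomic purity of each group can be checked. The sequences, taxa and probes are invented for illustration.

```python
# Hedged sketch: binary hybridization fingerprints and grouping by fingerprint.
from collections import defaultdict

def fingerprint(sequence, probes):
    # 1 if the probe matches (here: exact substring), 0 otherwise.
    return tuple(int(p in sequence) for p in probes)

def group_by_fingerprint(sequences, probes):
    groups = defaultdict(list)
    for name, (seq, taxon) in sequences.items():
        groups[fingerprint(seq, probes)].append((name, taxon))
    return groups

# Hypothetical toy gene fragments with taxonomic labels.
sequences = {
    "seqA": ("ACGTGGCCTA", "GroupI"),
    "seqB": ("ACGTGGCTTA", "GroupI"),
    "seqC": ("TTGACCGGAA", "GroupII"),
}
probes = ["ACGTGG", "CCGGAA"]

for fp, members in group_by_fingerprint(sequences, probes).items():
    taxa = {t for _, t in members}
    print(fp, members, "taxonomically pure" if len(taxa) == 1 else "mixed")
```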

  8. Informational support of the investment process in a large city economy

    Directory of Open Access Journals (Sweden)

    Tamara Zurabovna Chargazia

    2016-12-01

    Full Text Available Large cities possess sufficient potential to participate in investment processes at both the national and international levels. A potential investor’s awareness of the possibilities and prospects of a city’s development is of great importance for making a decision. By providing a potential investor with relevant, concise and reliable information, local authorities can increase the intensity of the investment process in the city economy, and vice versa. The hypothesis is that a large city administration can substantially activate the investment processes in the economy of the corresponding territorial entity by using information provision tools. The purpose of this article is to develop measures for improving the investment portal of a large city as an important instrument of information provision, which will make it possible to stimulate investment processes at the level under analysis. The reasons for the unsatisfactory information provision on the investment process in a large city economy are analyzed in depth; national and international experience in this sphere is studied; advantages and disadvantages of the information provision for the investment process in the economy of the city of Makeyevka are considered; and the investment portals of different cities are compared. Technical approaches for improving the investment portal of a large city are suggested. The research results can be used to improve the investment policy of large cities.

  9. Knowledge and theme discovery across very large biological data sets using distributed queries: a prototype combining unstructured and structured data.

    Directory of Open Access Journals (Sweden)

    Uma S Mudunuri

    Full Text Available As the discipline of biomedical science continues to apply new technologies capable of producing unprecedented volumes of noisy and complex biological data, it has become evident that available methods for deriving meaningful information from such data are simply not keeping pace. In order to achieve useful results, researchers require methods that consolidate, store and query combinations of structured and unstructured data sets efficiently and effectively. As we move towards personalized medicine, the need to combine unstructured data, such as medical literature, with large amounts of highly structured and high-throughput data such as human variation or expression data from very large cohorts, is especially urgent. For our study, we investigated a likely biomedical query using the Hadoop framework. We ran queries using native MapReduce tools we developed as well as other open source and proprietary tools. Our results suggest that the available technologies within the Big Data domain can reduce the time and effort needed to utilize and apply distributed queries over large datasets in practical clinical applications in the life sciences domain. The methodologies and technologies discussed in this paper set the stage for a more detailed evaluation that investigates how various data structures and data models are best mapped to the proper computational framework.

  10. Informal Leadership in the Clinical Setting: Occupational Therapist Perspectives

    Directory of Open Access Journals (Sweden)

    Clark Patrick Heard

    2018-04-01

    Full Text Available Background: Leadership is vital to clinical, organizational, and professional success. This has compelled a high volume of research primarily related to formal leadership concepts. However, as organizations flatten, eliminate departmental structures, or decentralize leadership structures, the relevance of informal leaders has markedly increased. Methods: Using a qualitative phenomenological methodology consistent with interpretative phenomenological analysis, this study examines the impact of informal leadership in the clinical setting for occupational therapists. Data were collected through the completion of semi-structured interviews with 10 peer-identified informal occupational therapy leaders in Ontario, Canada. Collected data were transcribed verbatim and coded for themes by multiple coders. Several methods were employed to support trustworthiness. Results: The results identify that informal leaders are collaborative, accessible, and considered the “go to” staff. They demonstrate professional competence, knowledge, experience, and accountability, and are inspirational and creative. Practically, informal leaders organically shape the practice environment while building strength and capacity among their peers. Conclusion: Recommendations for supporting informal leaders include acknowledgement of the role and its centrality, enabling informal leaders time to undertake the role, and supporting consideration of informal leadership concepts at the curriculum and professional level.

  11. Third generation participatory design in health informatics--making user participation applicable to large-scale information system projects.

    Science.gov (United States)

    Pilemalm, Sofie; Timpka, Toomas

    2008-04-01

    Participatory Design (PD) methods in the field of health informatics have mainly been applied to the development of small-scale systems with homogeneous user groups in local settings. Meanwhile, health service organizations are becoming increasingly large and complex in character, making it necessary to extend the scope of the systems that are used for managing data, information and knowledge. This study reports participatory action research on the development of a PD framework for large-scale system design. The research was conducted in a public health informatics project aimed at developing a system for 175,000 users. A renewed PD framework was developed in response to six major limitations experienced with the existing methods. The resulting framework preserves the theoretical grounding, but extends the toolbox to suit applications in networked health service organizations. Future research should involve evaluations of the framework in other health service settings where comprehensive health information systems (HISs) are developed.

  12. Large and small sets with respect to homomorphisms and products of groups

    Directory of Open Access Journals (Sweden)

    Riccardo Gusso

    2002-10-01

    Full Text Available We study the behaviour of large, small and medium subsets with respect to homomorphisms and products of groups. Then we introduce the definition of a P-small set in abelian groups and we investigate the relations between this kind of smallness and the previous one, giving some examples that distinguish them.

  13. Teaching Children to Organise and Represent Large Data Sets in a Histogram

    Science.gov (United States)

    Nisbet, Steven; Putt, Ian

    2004-01-01

    Although some bright students in primary school are able to organise numerical data into classes, most attend to the characteristics of individuals rather than the group, and "see the trees rather than the forest". How can teachers in upper primary and early high school teach students to organise large sets of data with widely varying…

  14. A Large Group Decision Making Approach Based on TOPSIS Framework with Unknown Weights Information

    Directory of Open Access Journals (Sweden)

    Li Yupeng

    2017-01-01

    Full Text Available Large group decision making considering multiple attributes is imperative in many decision areas. The weights of the decision makers (DMs) are difficult to obtain because of the large number of DMs. To cope with this issue, an integrated multiple-attribute large group decision making framework is proposed in this article. The fuzziness and hesitation of the linguistic decision variables are described by interval-valued intuitionistic fuzzy sets. The weights of the DMs are optimized by constructing a non-linear programming model, in which the original decision matrices are aggregated by using the interval-valued intuitionistic fuzzy weighted average operator. By solving the non-linear programming model with MATLAB®, the weights of the DMs and the fuzzy comprehensive decision matrix are determined. Then the weights of the criteria are calculated based on information entropy theory. Finally, the TOPSIS framework is employed to establish the decision process. The divergence between interval-valued intuitionistic fuzzy numbers is calculated by interval-valued intuitionistic fuzzy cross entropy. A real-world case study is constructed to demonstrate the feasibility and effectiveness of the proposed methodology.
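
    A simplified sketch of the entropy-weighting and TOPSIS steps on a crisp decision matrix may clarify the flow. The interval-valued intuitionistic fuzzy aggregation, the cross-entropy divergence and the non-linear programming model for the DMs' weights are omitted, and the matrix below is invented, so this is only an outline of the final two stages.

```python
# Hedged sketch: entropy-based criteria weights followed by a TOPSIS ranking.
import numpy as np

X = np.array([[7.0, 9.0, 8.0],   # aggregated decision matrix: alternatives x criteria
              [8.0, 6.0, 9.0],
              [6.0, 8.0, 7.0]])

# Criteria weights from information entropy (all criteria treated as benefit criteria).
P = X / X.sum(axis=0)
E = -(P * np.log(P)).sum(axis=0) / np.log(X.shape[0])
w = (1.0 - E) / (1.0 - E).sum()

# TOPSIS: closeness to the ideal solution.
V = w * (X / np.linalg.norm(X, axis=0))
ideal, anti_ideal = V.max(axis=0), V.min(axis=0)
d_plus = np.linalg.norm(V - ideal, axis=1)
d_minus = np.linalg.norm(V - anti_ideal, axis=1)
closeness = d_minus / (d_plus + d_minus)
print("criteria weights:", np.round(w, 3))
print("alternatives ranked best first:", np.argsort(-closeness))
```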

  15. Information behavior versus communication: application models in multidisciplinary settings

    Directory of Open Access Journals (Sweden)

    Cecília Morena Maria da Silva

    2015-05-01

    Full Text Available This paper deals with information behavior as support for models of communication design in the areas of Information Science, Library Science and Music. The proposed communication models are based on the models of Tubbs and Moss (2003), Garvey and Griffith (1972), adapted by Hurd (1996), and Wilson (1999). Therefore, the questions arose: (i) what informational skills are required of librarians who act as mediators in the scholarly communication process, and what is the informational behavior of users in the educational environment?; (ii) what are the needs of music-related researchers, and how do they produce, seek, use and access the scientific knowledge of their area?; and (iii) how do the contexts involved in scientific collaboration processes influence the scientific production of the information science field in Brazil? The article includes a literature review on information behavior and its place in scientific communication, considering the influence of the context and/or situation of the objects involved in the motivating issues. The hypothesis is that user information behavior in different contexts and situations influences the definition of a scientific communication model. Finally, it is concluded that the same concept or set of concepts can be used from different perspectives, thus reaching different results.

  16. Parallel analysis tools and new visualization techniques for ultra-large climate data set

    Energy Technology Data Exchange (ETDEWEB)

    Middleton, Don [National Center for Atmospheric Research, Boulder, CO (United States); Haley, Mary [National Center for Atmospheric Research, Boulder, CO (United States)

    2014-12-10

    ParVis was a project funded under LAB 10-05: “Earth System Modeling: Advanced Scientific Visualization of Ultra-Large Climate Data Sets”. Argonne was the lead lab with partners at PNNL, SNL, NCAR and UC-Davis. This report covers progress from January 1st, 2013 through Dec 1st, 2014. Two previous reports covered the period from Summer, 2010, through September 2011 and October 2011 through December 2012, respectively. While the project was originally planned to end on April 30, 2013, personnel and priority changes allowed many of the institutions to continue work through FY14 using existing funds. A primary focus of ParVis was introducing parallelism to climate model analysis to greatly reduce the time-to-visualization for ultra-large climate data sets. Work in the first two years was conducted on two tracks with different time horizons: one track to provide immediate help to climate scientists already struggling to apply their analysis to existing large data sets and another focused on building a new data-parallel library and tool for climate analysis and visualization that will give the field a platform for performing analysis and visualization on ultra-large datasets for the foreseeable future. In the final 2 years of the project, we focused mostly on the new data-parallel library and associated tools for climate analysis and visualization.

  17. Development of estrogen receptor beta binding prediction model using large sets of chemicals.

    Science.gov (United States)

    Sakkiah, Sugunadevi; Selvaraj, Chandrabose; Gong, Ping; Zhang, Chaoyang; Tong, Weida; Hong, Huixiao

    2017-11-03

    We developed an ERβ binding prediction model to facilitate, together with our previously developed ERα binding model, the identification of chemicals that specifically bind ERβ or ERα. Decision Forest was used to train the ERβ binding prediction model based on a large set of compounds obtained from EADB. Model performance was estimated through 1000 iterations of 5-fold cross-validation. Prediction confidence was analyzed using predictions from the cross-validations. Informative chemical features for ERβ binding were identified through analysis of the frequency data of chemical descriptors used in the models in the 5-fold cross-validations. 1000 permutations were conducted to assess the chance correlation. The average accuracy of the 5-fold cross-validations was 93.14% with a standard deviation of 0.64%. Prediction confidence analysis indicated that the higher the prediction confidence, the more accurate the predictions. Permutation testing results revealed that the prediction model is unlikely to have been generated by chance. Eighteen informative descriptors were identified as important to ERβ binding prediction. Application of the prediction model to data from the ToxCast project yielded a very high sensitivity of 90-92%. Our results demonstrated that ERβ binding of chemicals could be accurately predicted using the developed model. Coupled with our previously developed ERα prediction model, this model could be expected to facilitate drug development through identification of chemicals that specifically bind ERβ or ERα.
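
    The evaluation protocol (repeated stratified 5-fold cross-validation with probability-based prediction confidence) can be sketched as follows. RandomForestClassifier is used here only as a stand-in for the Decision Forest method, and the descriptors and binding labels are synthetic, so the numbers it prints are not those of the paper.

```python
# Hedged sketch: repeated 5-fold cross-validation with a probability threshold.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 18))               # 18 synthetic "chemical descriptors"
y = (X[:, :3].sum(axis=1) > 0).astype(int)   # synthetic binder / non-binder label

accuracies = []
for repeat in range(10):                     # the study uses 1000 repeats
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=repeat)
    for train, test in cv.split(X, y):
        clf = RandomForestClassifier(n_estimators=100, random_state=repeat)
        clf.fit(X[train], y[train])
        proba = clf.predict_proba(X[test])[:, 1]     # prediction confidence per compound
        pred = (proba >= 0.5).astype(int)
        accuracies.append(float((pred == y[test]).mean()))

print(f"mean accuracy {np.mean(accuracies):.3f} +/- {np.std(accuracies):.3f}")
```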

  18. 75 FR 62686 - Health Information Technology: Revisions to Initial Set of Standards, Implementation...

    Science.gov (United States)

    2010-10-13

    ... Health Information Technology: Revisions to Initial Set of Standards, Implementation Specifications, and... Health Information Technology (ONC), Department of Health and Human Services. ACTION: Interim final rule... Coordinator for Health Information Technology, Attention: Steven Posnack, Hubert H. Humphrey Building, Suite...

  19. Practical characterization of large networks using neighborhood information

    KAUST Repository

    Wang, Pinghui; Zhao, Junzhou; Ribeiro, Bruno; Lui, John C. S.; Towsley, Don; Guan, Xiaohong

    2018-01-01

    querying a node also reveals partial structural information about its neighbors. Our methods are optimized for NoSQL graph databases (if the database can be accessed directly), or utilize Web APIs available on most major large networks for graph sampling

  20. Practical characterization of large networks using neighborhood information

    KAUST Repository

    Wang, Pinghui

    2018-02-14

    Characterizing large complex networks such as online social networks through node querying is a challenging task. Network service providers often impose severe constraints on the query rate, hence limiting the sample size to a small fraction of the total network of interest. Various ad hoc subgraph sampling methods have been proposed, but many of them give biased estimates and no theoretical basis on the accuracy. In this work, we focus on developing sampling methods for large networks where querying a node also reveals partial structural information about its neighbors. Our methods are optimized for NoSQL graph databases (if the database can be accessed directly), or utilize Web APIs available on most major large networks for graph sampling. We show that our sampling method has provable convergence guarantees on being an unbiased estimator, and it is more accurate than state-of-the-art methods. We also explore methods to uncover shortest paths between a subset of nodes and detect high degree nodes by sampling only a small fraction of the network of interest. Our results demonstrate that utilizing neighborhood information yields methods that are two orders of magnitude faster than state-of-the-art methods.
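
    A toy sketch of the query model may help: each API call reveals a node's neighbor list, and a simple random walk over those calls estimates a global property with far fewer queries than a full crawl. The harmonic-mean degree estimator below is a standard random-walk bias correction, not the specific estimators proposed in the paper, and the Barabási-Albert graph stands in for a real social network.

```python
# Hedged sketch: random-walk sampling where each query returns a node's neighbors.
import random
import networkx as nx

def query(graph, node):
    # Simulate one API call: return the neighbors of `node`.
    return list(graph.neighbors(node))

def random_walk_avg_degree(graph, start, num_queries=2000, seed=0):
    random.seed(seed)
    node, inv_degree_sum = start, 0.0
    for _ in range(num_queries):
        neighbors = query(graph, node)
        inv_degree_sum += 1.0 / len(neighbors)
        node = random.choice(neighbors)          # move to a uniformly random neighbor
    return num_queries / inv_degree_sum          # harmonic mean corrects the degree bias

G = nx.barabasi_albert_graph(10000, 5, seed=1)   # stand-in for a large online network
print("estimated average degree:", round(random_walk_avg_degree(G, start=0), 2))
print("true average degree:     ", round(2 * G.number_of_edges() / G.number_of_nodes(), 2))
```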

  1. An Ethnographically Informed Participatory Design of Primary Healthcare Information Technology in a Developing Country Setting.

    Science.gov (United States)

    Shidende, Nima Herman; Igira, Faraja Teddy; Mörtberg, Christina Margaret

    2017-01-01

    Ethnography, with its emphasis on understanding activities where they occur, and its use of qualitative data gathering techniques rich in description, has a long tradition in Participatory Design (PD). Yet there are limited methodological insights into its application in developing countries. This paper proposes an ethnographically informed PD approach, which can be applied when designing Primary Healthcare Information Technology (PHIT). We use findings from a larger multidisciplinary project, the Health Information Systems Project (HISP), to elaborate how ethnography can be used to facilitate participation of health practitioners in developing-country settings, as well as to indicate the importance of an ethnographic approach to participatory Health Information Technology (HIT) designers. Furthermore, the paper discusses the pros and cons of using an ethnographic approach in designing HIT.

  2. Large data sets in finance and marketing: introduction by the special issue editor

    NARCIS (Netherlands)

    Ph.H.B.F. Franses (Philip Hans)

    1998-01-01

    On December 18 and 19 of 1997, a small conference on the "Statistical Analysis of Large Data Sets in Business Economics" was organized by the Rotterdam Institute for Business Economic Studies. Eleven presentations were delivered in plenary sessions, which were attended by about 90

  3. A Ranking Approach on Large-Scale Graph With Multidimensional Heterogeneous Information.

    Science.gov (United States)

    Wei, Wei; Gao, Bin; Liu, Tie-Yan; Wang, Taifeng; Li, Guohui; Li, Hang

    2016-04-01

    Graph-based ranking has been extensively studied and frequently applied in many applications, such as webpage ranking. It aims at mining potentially valuable information from the raw graph-structured data. Recently, with the proliferation of rich heterogeneous information (e.g., node/edge features and prior knowledge) available in many real-world graphs, how to effectively and efficiently leverage all information to improve the ranking performance becomes a new challenging problem. Previous methods only utilize part of such information and attempt to rank graph nodes according to link-based methods, whose ranking performance is severely affected by several well-known issues, e.g., over-fitting or high computational complexity, especially when the scale of the graph is very large. In this paper, we address the large-scale graph-based ranking problem and focus on how to effectively exploit rich heterogeneous information of the graph to improve the ranking performance. Specifically, we propose an innovative and effective semi-supervised PageRank (SSP) approach to parameterize the derived information within a unified semi-supervised learning framework (SSLF-GR), and then simultaneously optimize the parameters and the ranking scores of graph nodes. Experiments on real-world large-scale graphs demonstrate that our method significantly outperforms the algorithms that consider such graph information only partially.
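
    The graph-based ranking core that such methods build on can be sketched as a personalized PageRank power iteration. The sketch below is only that baseline: the SSP approach additionally parameterizes the teleport and transition terms with node/edge features and learns those parameters semi-supervisedly, which is not shown here, and the tiny adjacency matrix is invented.

```python
# Hedged sketch: personalized PageRank by power iteration (baseline graph ranking only).
import numpy as np

def personalized_pagerank(A, prior, alpha=0.85, iters=100):
    # A: adjacency matrix (rows = out-links); prior: preference vector from seed nodes.
    out_deg = A.sum(axis=1, keepdims=True)
    P = np.divide(A, out_deg, out=np.zeros_like(A, dtype=float), where=out_deg > 0)
    r = np.full(A.shape[0], 1.0 / A.shape[0])
    prior = prior / prior.sum()
    for _ in range(iters):
        r = alpha * (P.T @ r) + (1.0 - alpha) * prior
    return r

A = np.array([[0, 1, 1, 0],
              [0, 0, 1, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]], dtype=float)
prior = np.array([1.0, 0.0, 0.0, 0.0])    # preference mass on a labeled "relevant" node
scores = personalized_pagerank(A, prior)
print("nodes ranked by score:", np.argsort(-scores))
```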

  4. mmpdb: An Open-Source Matched Molecular Pair Platform for Large Multiproperty Data Sets.

    Science.gov (United States)

    Dalke, Andrew; Hert, Jérôme; Kramer, Christian

    2018-05-29

    Matched molecular pair analysis (MMPA) enables the automated and systematic compilation of medicinal chemistry rules from compound/property data sets. Here we present mmpdb, an open-source matched molecular pair (MMP) platform to create, compile, store, retrieve, and use MMP rules. mmpdb is suitable for the large data sets typically found in pharmaceutical and agrochemical companies and provides new algorithms for fragment canonicalization and stereochemistry handling. The platform is written in Python and based on the RDKit toolkit. It is freely available from https://github.com/rdkit/mmpdb .

  5. PeptideNavigator: An interactive tool for exploring large and complex data sets generated during peptide-based drug design projects.

    Science.gov (United States)

    Diller, Kyle I; Bayden, Alexander S; Audie, Joseph; Diller, David J

    2018-01-01

    There is growing interest in peptide-based drug design and discovery. Due to their relatively large size, polymeric nature, and chemical complexity, the design of peptide-based drugs presents an interesting "big data" challenge. Here, we describe an interactive computational environment, PeptideNavigator, for naturally exploring the tremendous amount of information generated during a peptide drug design project. The purpose of PeptideNavigator is the presentation of large and complex experimental and computational data sets, particularly 3D data, so as to enable multidisciplinary scientists to make optimal decisions during a peptide drug discovery project. PeptideNavigator provides users with numerous viewing options, such as scatter plots, sequence views, and sequence frequency diagrams. These views allow for the collective visualization and exploration of many peptides and their properties, ultimately enabling the user to focus on a small number of peptides of interest. To drill down into the details of individual peptides, PeptideNavigator provides users with a Ramachandran plot viewer and a fully featured 3D visualization tool. Each view is linked, allowing the user to seamlessly navigate from collective views of large peptide data sets to the details of individual peptides with promising property profiles. Two case studies, based on MHC-1A activating peptides and MDM2 scaffold design, are presented to demonstrate the utility of PeptideNavigator in the context of disparate peptide-design projects. Copyright © 2017 Elsevier Ltd. All rights reserved.

  6. 75 FR 44589 - Health Information Technology: Initial Set of Standards, Implementation Specifications, and...

    Science.gov (United States)

    2010-07-28

    ... Part III Department of Health and Human Services 45 CFR Part 170 Health Information Technology... Secretary 45 CFR Part 170 RIN 0991-AB58 Health Information Technology: Initial Set of Standards... of the National Coordinator for Health Information Technology (ONC), Department of Health and Human...

  7. How to Fully Represent Expert Information about Imprecise Properties in a Computer System – Random Sets, Fuzzy Sets, and Beyond: An Overview

    Science.gov (United States)

    Nguyen, Hung T.; Kreinovich, Vladik

    2014-01-01

    To help computers make better decisions, it is desirable to describe all our knowledge in computer-understandable terms. This is easy for knowledge described in terms of numerical values: we simply store the corresponding numbers in the computer. This is also easy for knowledge about precise (well-defined) properties which are either true or false for each object: we simply store the corresponding “true” and “false” values in the computer. The challenge is how to store information about imprecise properties. In this paper, we overview different ways to fully store the expert information about imprecise properties. We show that in the simplest case, when the only source of imprecision is disagreement between different experts, a natural way to store all the expert information is to use random sets; we also show how fuzzy sets naturally appear in such a random-set representation. We then show how the random-set representation can be extended to the general (“fuzzy”) case when, in addition to disagreements, experts are also unsure whether some objects satisfy certain properties or not. PMID:25386045
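
    A toy illustration of the random-set view described above: when each expert supplies a crisp set (here an interval) for an imprecise property, the induced fuzzy membership of a value can be read off as the fraction of expert sets containing it. The intervals are invented for illustration.

```python
# Hedged sketch: fuzzy membership as the fraction of experts whose set contains a value.
from fractions import Fraction

# Each expert's crisp interval for the imprecise property "warm temperature" (degrees C).
expert_sets = [(18, 26), (20, 28), (19, 25), (21, 27)]

def membership(x, sets):
    return Fraction(sum(lo <= x <= hi for lo, hi in sets), len(sets))

for t in (17, 20, 23, 27, 30):
    print(f"membership of {t} C in 'warm': {membership(t, expert_sets)}")
```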

  8. Scalability of Findability: Decentralized Search and Retrieval in Large Information Networks

    Science.gov (United States)

    Ke, Weimao

    2010-01-01

    Amid the rapid growth of information today is the increasing challenge for people to survive and navigate its magnitude. Dynamics and heterogeneity of large information spaces such as the Web challenge information retrieval in these environments. Collection of information in advance and centralization of IR operations are hardly possible because…

  9. The Current Mind-Set of Federal Information Security Decision-Makers on the Value of Governance: An Informative Study

    Science.gov (United States)

    Stroup, Jay Walter

    2014-01-01

    Understanding the mind-set or perceptions of organizational leaders and decision-makers is important to ascertaining the trends and priorities in policy and governance of the organization. This study finds that a significant shift in the mind-set of government IT and information security leaders has started and will likely result in placing a…

  10. The higher infinite large cardinals in set theory from their beginnings

    CERN Document Server

    Kanamori, Akihiro

    2003-01-01

    The theory of large cardinals is currently a broad mainstream of modern set theory, the main area of investigation for the analysis of the relative consistency of mathematical propositions and possible new axioms for mathematics. The first of a projected multi-volume series, this book provides a comprehensive account of the theory of large cardinals from its beginnings and some of the direct outgrowths leading to the frontiers of contemporary research. A "genetic" approach is taken, presenting the subject in the context of its historical development. With hindsight, the consequential avenues are pursued and the most elegant or accessible expositions given. With open questions and speculations provided throughout, the reader should not only come to appreciate the scope and coherence of the overall enterprise but also become prepared to pursue research in several specific areas by studying the relevant sections.

  11. Optimum detection for extracting maximum information from symmetric qubit sets

    International Nuclear Information System (INIS)

    Mizuno, Jun; Fujiwara, Mikio; Sasaki, Masahide; Akiba, Makoto; Kawanishi, Tetsuya; Barnett, Stephen M.

    2002-01-01

    We demonstrate a class of optimum detection strategies for extracting the maximum information from sets of equiprobable real symmetric qubit states of a single photon. These optimum strategies have been predicted by Sasaki et al. [Phys. Rev. A 59, 3325 (1999)]. The peculiar aspect is that detections with at least three outputs suffice for optimum extraction of information, regardless of the number of signal elements. The cases of ternary (or trine), quinary, and septenary polarization signals are studied, where a standard von Neumann detection (a projection onto a binary orthogonal basis) fails to access the maximum information. Our experiments demonstrate that it is possible with present technologies to attain about 96% of the theoretical limit.

  12. Large-scale Health Information Database and Privacy Protection.

    Science.gov (United States)

    Yamamoto, Ryuichi

    2016-09-01

    Japan was once progressive in the digitalization of healthcare fields but unfortunately has fallen behind in terms of the secondary use of data for public interest. There has recently been a trend to establish large-scale health databases in the nation, and a conflict between data use for public interest and privacy protection has surfaced as this trend has progressed. Databases for health insurance claims or for specific health checkups and guidance services were created according to the law that aims to ensure healthcare for the elderly; however, there is no mention in the act about using these databases for public interest in general. Thus, an initiative for such use must proceed carefully and attentively. The PMDA projects that collect a large amount of medical record information from large hospitals and the health database development project that the Ministry of Health, Labour and Welfare (MHLW) is working on will soon begin to operate according to a general consensus; however, the validity of this consensus can be questioned if issues of anonymity arise. The likelihood that researchers conducting a study for public interest would intentionally invade the privacy of their subjects is slim. However, patients could develop a sense of distrust about their data being used since legal requirements are ambiguous. Nevertheless, without using patients' medical records for public interest, progress in medicine will grind to a halt. Proper legislation that is clear for both researchers and patients will therefore be highly desirable. A revision of the Act on the Protection of Personal Information is currently in progress. In reality, however, privacy is not something that laws alone can protect; it will also require guidelines and self-discipline. We now live in an information capitalization age. I will introduce the trends in legal reform regarding healthcare information and discuss some basics to help people properly face the issue of health big data and privacy

  13. Information Management for a Large Multidisciplinary Project

    Science.gov (United States)

    Jones, Kennie H.; Randall, Donald P.; Cronin, Catherine K.

    1992-01-01

    In 1989, NASA's Langley Research Center (LaRC) initiated the High-Speed Airframe Integration Research (HiSAIR) Program to develop and demonstrate an integrated environment for high-speed aircraft design using advanced multidisciplinary analysis and optimization procedures. The major goals of this program were to evolve the interactions among disciplines and promote sharing of information, to provide a timely exchange of information among aeronautical disciplines, and to increase the awareness of the effects each discipline has upon other disciplines. LaRC historically has emphasized the advancement of analysis techniques. HiSAIR was founded to synthesize these advanced methods into a multidisciplinary design process emphasizing information feedback among disciplines and optimization. Crucial to the development of such an environment are the definition of the required data exchanges and the methodology for both recording the information and providing the exchanges in a timely manner. These requirements demand extensive use of data management techniques, graphic visualization, and interactive computing. HiSAIR represents the first attempt at LaRC to promote interdisciplinary information exchange on a large scale using advanced data management methodologies combined with state-of-the-art, scientific visualization techniques on graphics workstations in a distributed computing environment. The subject of this paper is the development of the data management system for HiSAIR.

  14. From Visualisation to Data Mining with Large Data Sets

    CERN Document Server

    Adelmann, Andreas; Shalf, John M; Siegerist, Cristina

    2005-01-01

    In 3D particle simulations, the generated 6D phase space data can be very large due to the need for accurate statistics, sufficient noise attenuation in the field solver and tracking of many turns in ring machines or accelerators. There is a need for distributed applications that allow users to peruse these extremely large remotely located datasets with the same ease as locally downloaded data. This paper will show concepts and a prototype tool to extract useful physical information out of 6D raw phase space data. ParViT allows the user to project 6D data into 3D space by selecting which dimensions will be represented spatially and which dimensions are represented as particle attributes, and it supports the construction of complex transfer functions for representing the particle attributes. It also allows management of time-series data. An HDF5-based parallel-I/O library, with C++, C and Fortran bindings, simplifies the interface with a variety of codes. A number of hooks in ParViT will allow it to connect with a para...

  15. Records for radioactive waste management up to repository closure: Managing the primary level information (PLI) set

    International Nuclear Information System (INIS)

    2004-07-01

    The objective of this publication is to highlight the importance of the early establishment of a comprehensive records system to manage primary level information (PLI) as an integrated set of information, not merely as a collection of information, throughout all the phases of radioactive waste management. Early establishment of a comprehensive records system to manage Primary Level Information as an integrated set of information throughout all phases of radioactive waste management is important. In addition to the information described in the waste inventory record keeping system (WIRKS), the PLI of a radioactive waste repository consists of the entire universe of information, data and records related to any aspect of the repository's life cycle. It is essential to establish PLI requirements based on an integrated set of needs from Regulators and Waste Managers involved in the waste management chain and to update these requirements as needs change over time. Information flow for radioactive waste management should be back-end driven. Identification of an Authority that will oversee the management of PLI throughout all phases of the radioactive waste management life cycle would guarantee the information flow to future generations. The long-term protection of information essential to future generations can only be assured by the timely establishment of a comprehensive and effective RMS capable of capturing, indexing and evaluating all PLI. The loss of intellectual control over the PLI will make it very difficult to subsequently identify the ILI and HLI information sets. At all times prior to the closure of a radioactive waste repository, there should be an identifiable entity with a legally enforceable financial and management responsibility for the continued operation of a PLI Records Management System. The information presented in this publication will assist Member States in ensuring that waste and repository records, relevant for retention after repository closure

  16. Informing Instruction of Students with Autism in Public School Settings

    Science.gov (United States)

    Kuo, Nai-Cheng

    2016-01-01

    The number of applied behavior analysis (ABA) classrooms for students with autism is increasing in K-12 public schools. To inform instruction of students with autism in public school settings, this study examined the relation between performance on mastery learning assessments and standardized achievement tests for students with autism spectrum…

  17. Information-Theoretic Inference of Large Transcriptional Regulatory Networks

    Directory of Open Access Journals (Sweden)

    Meyer Patrick

    2007-01-01

    Full Text Available The paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR), an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle consists in selecting, among the least redundant variables, the ones that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for large (up to several thousands of genes) network inference. Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.

  18. Information-Theoretic Inference of Large Transcriptional Regulatory Networks

    Directory of Open Access Journals (Sweden)

    Patrick E. Meyer

    2007-06-01

    Full Text Available The paper presents MRNET, an original method for inferring genetic networks from microarray data. The method is based on maximum relevance/minimum redundancy (MRMR), an effective information-theoretic technique for feature selection in supervised learning. The MRMR principle consists in selecting, among the least redundant variables, the ones that have the highest mutual information with the target. MRNET extends this feature selection principle to networks in order to infer gene-dependence relationships from microarray data. The paper assesses MRNET by benchmarking it against RELNET, CLR, and ARACNE, three state-of-the-art information-theoretic methods for large (up to several thousands of genes) network inference. Experimental results on thirty synthetically generated microarray datasets show that MRNET is competitive with these methods.
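
    The MRMR selection step at the heart of MRNET can be sketched as follows for a single target gene: candidate regulators are added greedily by maximizing their mutual information with the target minus their average mutual information with the regulators already selected. The crude histogram-based mutual information estimator and the synthetic expression data below are simplifications; MRNET itself uses more careful estimators and repeats the selection for every gene to assemble the network.

```python
# Hedged sketch: greedy MRMR selection of candidate regulators for one target gene.
import numpy as np

def mutual_information(x, y, bins=8):
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return float((pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])).sum())

def mrmr_select(expr, target, k=2):
    candidates = [g for g in range(expr.shape[1]) if g != target]
    relevance = {g: mutual_information(expr[:, g], expr[:, target]) for g in candidates}
    selected = []
    while len(selected) < min(k, len(candidates)):
        def score(g):
            redundancy = (np.mean([mutual_information(expr[:, g], expr[:, s]) for s in selected])
                          if selected else 0.0)
            return relevance[g] - redundancy
        best = max((g for g in candidates if g not in selected), key=score)
        selected.append(best)
    return selected

rng = np.random.default_rng(0)
g0, g1 = rng.normal(size=500), rng.normal(size=500)
g2 = g0 + 0.1 * rng.normal(size=500)           # gene 2 driven by gene 0
expr = np.column_stack([g0, g1, g2])
print("predicted regulators of gene 2:", mrmr_select(expr, target=2, k=1))
```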

  19. Large scale mapping of groundwater resources using a highly integrated set of tools

    DEFF Research Database (Denmark)

    Søndergaard, Verner; Auken, Esben; Christiansen, Anders Vest

    large areas with information from an optimum number of new investigation boreholes, existing boreholes, logs and water samples to get an integrated and detailed description of the groundwater resources and their vulnerability.Development of more time efficient and airborne geophysical data acquisition...... platforms (e.g. SkyTEM) have made large-scale mapping attractive and affordable in the planning and administration of groundwater resources. The handling and optimized use of huge amounts of geophysical data covering large areas has also required a comprehensive database, where data can easily be stored...

  20. Reconstructing Information in Large-Scale Structure via Logarithmic Mapping

    Science.gov (United States)

    Szapudi, Istvan

    We propose to develop a new method to extract information from large-scale structure data combining two-point statistics and non-linear transformations; before, this information was available only with substantially more complex higher-order statistical methods. Initially, most of the cosmological information in large-scale structure lies in two-point statistics. With non-linear evolution, some of that useful information leaks into higher-order statistics. The PI and group have shown in a series of theoretical investigations how that leakage occurs, and explained the Fisher information plateau at smaller scales. This plateau means that even as more modes are added to the measurement of the power spectrum, the total cumulative information (loosely speaking, the inverse error bar) is not increasing. Recently we have shown in Neyrinck et al. (2009, 2010) that a logarithmic (and a related Gaussianization or Box-Cox) transformation on the non-linear dark matter or galaxy field reconstructs a surprisingly large fraction of this missing Fisher information of the initial conditions. This was predicted by the earlier wave mechanical formulation of gravitational dynamics by Szapudi & Kaiser (2003). The present proposal is focused on working out the theoretical underpinning of the method to a point that it can be used in practice to analyze data. In particular, one needs to deal with the usual real-life issues of galaxy surveys, such as complex geometry, discrete sampling (Poisson or sub-Poisson noise), bias (linear or non-linear, deterministic or stochastic), redshift distortions, projection effects for 2D samples, and the effects of photometric redshift errors. We will develop methods for weak lensing and Sunyaev-Zeldovich power spectra as well, the latter specifically targeting Planck. In addition, we plan to investigate the question of residual higher-order information after the non-linear mapping, and possible applications for cosmology. Our aim will be to work out
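
    A one-dimensional toy version of the logarithmic mapping may help illustrate the proposal. The overdensity field below is a made-up lognormal realization rather than survey data, and only the Gaussianizing effect of delta -> log(1 + delta) is shown (the skewness drops towards zero); the full Fisher-information analysis, survey geometry and bias treatment are beyond this sketch.

```python
# Hedged sketch: log-mapping a toy lognormal overdensity field approximately Gaussianizes it.
import numpy as np

rng = np.random.default_rng(0)
white = rng.normal(size=4096)
kernel = np.exp(-0.5 * (np.arange(-32, 33) / 8.0) ** 2)
gauss = np.convolve(white, kernel / kernel.sum(), mode="same")   # correlated Gaussian field
delta = np.exp(gauss - gauss.var() / 2.0) - 1.0                  # lognormal overdensity, mean ~ 0

def skewness(x):
    x = x - x.mean()
    return float(np.mean(x**3) / np.mean(x**2) ** 1.5)

print("skewness of delta:         ", round(skewness(delta), 2))
print("skewness of log(1 + delta):", round(skewness(np.log(1.0 + delta)), 2))
```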

  1. The gradient boosting algorithm and random boosting for genome-assisted evaluation in large data sets.

    Science.gov (United States)

    González-Recio, O; Jiménez-Montero, J A; Alenda, R

    2013-01-01

    In the next few years, with the advent of high-density single nucleotide polymorphism (SNP) arrays and genome sequencing, genomic evaluation methods will need to deal with a large number of genetic variants and an increasing sample size. The boosting algorithm is a machine-learning technique that may alleviate the drawbacks of dealing with such large data sets. This algorithm combines different predictors in a sequential manner with some shrinkage on them; each predictor is applied consecutively to the residuals from the committee formed by the previous ones to form a final prediction based on a subset of covariates. Here, a detailed description is provided and examples using a toy data set are included. A modification of the algorithm called "random boosting" was proposed to increase predictive ability and decrease computation time of genome-assisted evaluation in large data sets. Random boosting uses a random selection of markers to add a subsequent weak learner to the predictive model. These modifications were applied to a real data set composed of 1,797 bulls genotyped for 39,714 SNP. Deregressed proofs of 4 yield traits and 1 type trait from January 2009 routine evaluations were used as dependent variables. A 2-fold cross-validation scenario was implemented. Sires born before 2005 were used as a training sample (1,576 and 1,562 for production and type traits, respectively), whereas younger sires were used as a testing sample to evaluate predictive ability of the algorithm on yet-to-be-observed phenotypes. Comparison with the original algorithm was provided. The predictive ability of the algorithm was measured as Pearson correlations between observed and predicted responses. Further, estimated bias was computed as the average difference between observed and predicted phenotypes. The results showed that the modification of the original boosting algorithm could be run in 1% of the time used with the original algorithm and with negligible differences in accuracy
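
    A stripped-down sketch of the random boosting idea may be useful: at each iteration a weak learner is fitted to the current residuals using only a random subset of the SNP markers, and added to the committee with shrinkage. Ridge-regularized least squares stands in here for whatever weak learner the authors use, and the genotypes and deregressed proofs are simulated, so this is an outline of the algorithmic pattern rather than their implementation.

```python
# Hedged sketch: boosting on residuals with random marker subsets and shrinkage.
import numpy as np

rng = np.random.default_rng(1)
n_animals, n_snps = 500, 2000
X = rng.binomial(2, 0.3, size=(n_animals, n_snps)).astype(float)   # SNP genotypes coded 0/1/2
true_effects = np.zeros(n_snps)
true_effects[:20] = rng.normal(size=20)
y = X @ true_effects + rng.normal(scale=2.0, size=n_animals)       # simulated deregressed proofs

def random_boosting(X, y, n_iters=200, subset=50, shrinkage=0.1, ridge=1.0):
    pred = np.zeros(len(y))
    committee = []
    for _ in range(n_iters):
        residual = y - pred
        cols = rng.choice(X.shape[1], size=subset, replace=False)   # random marker subset
        Xs = X[:, cols]
        beta = np.linalg.solve(Xs.T @ Xs + ridge * np.eye(subset), Xs.T @ residual)
        pred += shrinkage * (Xs @ beta)                             # add the shrunken weak learner
        committee.append((cols, shrinkage * beta))
    return pred, committee

pred, _ = random_boosting(X, y)
print("correlation of predicted and observed:", round(float(np.corrcoef(pred, y)[0, 1]), 3))
```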

  2. Informational and emotional elements in online support groups: a Bayesian approach to large-scale content analysis.

    Science.gov (United States)

    Deetjen, Ulrike; Powell, John A

    2016-05-01

    This research examines the extent to which informational and emotional elements are employed in online support forums for 14 purposively sampled chronic medical conditions and the factors that influence whether posts are of a more informational or emotional nature. Large-scale qualitative data were obtained from Dailystrength.org. Based on a hand-coded training dataset, all posts were classified into informational or emotional using a Bayesian classification algorithm to generalize the findings. Posts that could not be classified with a probability of at least 75% were excluded. The overall tendency toward emotional posts differs by condition: mental health (depression, schizophrenia) and Alzheimer's disease consist of more emotional posts, while informational posts relate more to nonterminal physical conditions (irritable bowel syndrome, diabetes, asthma). There is no gender difference across conditions, although prostate cancer forums are oriented toward informational support, whereas breast cancer forums rather feature emotional support. Across diseases, the best predictors for emotional content are lower age and a higher number of overall posts by the support group member. The results are in line with previous empirical research and unify empirical findings from single/2-condition research. Limitations include the analytical restriction to predefined categories (informational, emotional) through the chosen machine-learning approach. Our findings provide an empirical foundation for building theory on informational versus emotional support across conditions, give insights for practitioners to better understand the role of online support groups for different patients, and show the usefulness of machine-learning approaches to analyze large-scale qualitative health data from online settings. © The Author 2016. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
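
    The classification step can be sketched as follows: a Bayesian text classifier is trained on a small hand-coded sample and only posts assigned to a class with at least 75% probability are retained. The example posts are invented, not Dailystrength.org data, and multinomial naive Bayes is used here as a generic Bayesian classifier; the study does not specify this exact variant.

```python
# Hedged sketch: Bayesian classification of posts with a 75% confidence threshold.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_posts = ["which medication dose worked for you",
               "what test did the doctor order",
               "I feel so scared and alone",
               "sending hugs you are not alone"]
train_labels = ["informational", "informational", "emotional", "emotional"]

vectorizer = CountVectorizer()
clf = MultinomialNB().fit(vectorizer.fit_transform(train_posts), train_labels)

new_posts = ["what medication dose should I ask the doctor about",
             "feeling scared and alone tonight",
             "thanks everyone"]
for post, probs in zip(new_posts, clf.predict_proba(vectorizer.transform(new_posts))):
    best = probs.argmax()
    if probs[best] >= 0.75:
        print(f"{post!r} -> {clf.classes_[best]} ({probs[best]:.2f})")
    else:
        print(f"{post!r} -> excluded (classified below 75% probability)")
```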

  3. Breeding and Genetics Symposium: really big data: processing and analysis of very large data sets.

    Science.gov (United States)

    Cole, J B; Newman, S; Foertter, F; Aguilar, I; Coffey, M

    2012-03-01

    Modern animal breeding data sets are large and getting larger, due in part to recent availability of high-density SNP arrays and cheap sequencing technology. High-performance computing methods for efficient data warehousing and analysis are under development. Financial and security considerations are important when using shared clusters. Sound software engineering practices are needed, and it is better to use existing solutions when possible. Storage requirements for genotypes are modest, although full-sequence data will require greater storage capacity. Storage requirements for intermediate and results files for genetic evaluations are much greater, particularly when multiple runs must be stored for research and validation studies. The greatest gains in accuracy from genomic selection have been realized for traits of low heritability, and there is increasing interest in new health and management traits. The collection of sufficient phenotypes to produce accurate evaluations may take many years, and high-reliability proofs for older bulls are needed to estimate marker effects. Data mining algorithms applied to large data sets may help identify unexpected relationships in the data, and improved visualization tools will provide insights. Genomic selection using large data requires a lot of computing power, particularly when large fractions of the population are genotyped. Theoretical improvements have made possible the inversion of large numerator relationship matrices, permitted the solving of large systems of equations, and produced fast algorithms for variance component estimation. Recent work shows that single-step approaches combining BLUP with a genomic relationship (G) matrix have similar computational requirements to traditional BLUP, and the limiting factor is the construction and inversion of G for many genotypes. A naïve algorithm for creating G for 14,000 individuals required almost 24 h to run, but custom libraries and parallel computing reduced that to
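
    The construction and inversion of G that the text identifies as the limiting factor can be sketched with the commonly used VanRaden-style genomic relationship matrix, G = ZZ' / (2 Σ p(1-p)). The genotypes and allele frequencies below are simulated and the matrix is small; production single-step evaluations work with tens of thousands of genotyped animals and rely on specialized parallel libraries rather than a plain NumPy inverse.

```python
# Hedged sketch: building and inverting a small VanRaden-style genomic relationship matrix.
import numpy as np

rng = np.random.default_rng(0)
n_animals, n_snps = 1000, 5000
p = rng.uniform(0.05, 0.5, size=n_snps)                            # allele frequencies
M = rng.binomial(2, p, size=(n_animals, n_snps)).astype(float)     # genotype codes 0/1/2

Z = M - 2.0 * p                                                    # center each marker by 2p
G = (Z @ Z.T) / (2.0 * np.sum(p * (1.0 - p)))

# Blend with a small identity fraction so the matrix is safely invertible, then invert.
G_star = 0.95 * G + 0.05 * np.eye(n_animals)
G_inv = np.linalg.inv(G_star)
print("mean diagonal of G:", round(float(np.diag(G).mean()), 3))
print("G inverse computed:", G_inv.shape)
```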

  4. Handling Large and Complex Data in a Photovoltaic Research Institution Using a Custom Laboratory Information Management System

    Energy Technology Data Exchange (ETDEWEB)

    White, Robert R.; Munch, Kristin

    2014-01-01

    Twenty-five years ago the desktop computer started becoming ubiquitous in the scientific lab. Researchers were delighted with its ability to both control instrumentation and acquire data on a single system, but they were not completely satisfied. There were often gaps in knowledge that they thought might be gained if they just had more data and they could get the data faster. Computer technology has evolved in keeping with Moore’s Law meeting those desires; however those improvements have of late become both a boon and bane for researchers. Computers are now capable of producing high speed data streams containing terabytes of information; capabilities that evolved faster than envisioned last century. Software to handle large scientific data sets has not kept up. How much information might be lost through accidental mismanagement or how many discoveries are missed through data overload are now vital questions. An important new task in most scientific disciplines involves developing methods to address those issues and to create the software that can handle large data sets with an eye towards scalability. This software must create archived, indexed, and searchable data from heterogeneous instrumentation for the implementation of a strong data-driven materials development strategy. At the National Center for Photovoltaics in the National Renewable Energy Laboratory, we began development a few years ago on a Laboratory Information Management System (LIMS) designed to handle lab-wide scientific data acquisition, management, processing and mining needs for physics and materials science data, and with a specific focus towards future scalability for new equipment or research focuses. We will present the decisions, processes, and problems we went through while building our LIMS system for materials research, its current operational state and our steps for future development.

  5. Service Quality: A Main Determinant Factor for Health Information System Success in Low-resource Settings.

    Science.gov (United States)

    Tilahun, Binyam; Fritz, Fleur

    2015-01-01

    With the increasing implementation of different health information systems in developing countries, there is a growing need to measure the main determinants of their success. The results of this evaluation study on the determinants of HIS success in five low-resource-setting hospitals show that service quality is the main determinant factor for information system success in such settings.

  6. Setting a new paradigm in cognitive science information: contributions to the process of knowing the information professional

    Directory of Open Access Journals (Sweden)

    Paula Regina Dal' Evedove

    2013-05-01

    Full Text Available Introduction: Studies of human cognition represent a relevant perspective in information science, considering the subjective actions of information professionals and the dialogic process that should permeate the activity of subjects dealing with the organization and representation of information. Objective: To explore the cognitive perspective in information science and its reconfiguration by contemporary information needs, in order to reflect on the process of knowing of the information professional in light of the social reality that permeates information contexts. Methodology: Reflection on theoretical aspects of cognitive development in order to discuss the implications of the cognitive approach in information science and its evolution in the scope of the representation and processing of information. Results: Research in information science must consider the cognitive and social issues that underlie information processing and the process of knowing of the information professional, since knowledge structures must be explained from the social context of the knowing subjects. Conclusions: There is a need to investigate the process of knowing of the information professional from a socio-cognitive approach, seeking new elements for understanding the relationship with information (cognitive manifestations) and its implications for the social dimension.

  7. Modeling study of solute transport in the unsaturated zone. Information and data sets. Volume 1

    International Nuclear Information System (INIS)

    Polzer, W.L.; Fuentes, H.R.; Springer, E.P.; Nyhan, J.W.

    1986-05-01

    The Environmental Science Group (HSE-12) is conducting a study to compare various approaches of modeling water and solute transport in porous media. Various groups representing different approaches will model a common set of transport data so that the state of the art in modeling and field experimentation can be discussed in a positive framework with an assessment of current capabilities and future needs in this area of research. This paper provides information and sets of data that will be useful to the modelers in meeting the objectives of the modeling study. The information and data sets include: (1) a description of the experimental design and methods used in obtaining solute transport data, (2) supporting data that may be useful in modeling the data set of interest, and (3) the data set to be modeled

  8. Cosmological parameters from large scale structure - geometric versus shape information

    CERN Document Server

    Hamann, Jan; Lesgourgues, Julien; Rampf, Cornelius; Wong, Yvonne Y Y

    2010-01-01

    The matter power spectrum as derived from large scale structure (LSS) surveys contains two important and distinct pieces of information: an overall smooth shape and the imprint of baryon acoustic oscillations (BAO). We investigate the separate impact of these two types of information on cosmological parameter estimation, and show that for the simplest cosmological models, the broad-band shape information currently contained in the SDSS DR7 halo power spectrum (HPS) is by far superseded by geometric information derived from the baryonic features. An immediate corollary is that contrary to popular beliefs, the upper limit on the neutrino mass m_\

  9. 77 FR 38634 - Request for Information: Collection and Use of Patient Work Information in the Clinical Setting...

    Science.gov (United States)

    2012-06-28

    ... (specialty) health care: At your clinical facility, how is the patient's work information collected... the Clinical Setting: Electronic Health Records AGENCY: The National Institute for Occupational Safety... Occupational Safety and Health (NIOSH) of the Centers for Disease Control and Prevention (CDC), Department of...

  10. An effective filter for IBD detection in large data sets.

    KAUST Repository

    Huang, Lin

    2014-03-25

    Identity by descent (IBD) inference is the task of computationally detecting genomic segments that are shared between individuals by means of common familial descent. Accurate IBD detection plays an important role in various genomic studies, ranging from mapping disease genes to exploring ancient population histories. The majority of recent work in the field has focused on improving the accuracy of inference, targeting shorter genomic segments that originate from a more ancient common ancestor. The accuracy of these methods, however, is achieved at the expense of high computational cost, resulting in a prohibitively long running time when applied to large cohorts. To enable the study of large cohorts, we introduce SpeeDB, a method that facilitates fast IBD detection in large unphased genotype data sets. Given a target individual and a database of individuals that potentially share IBD segments with the target, SpeeDB applies an efficient opposite-homozygous filter, which excludes chromosomal segments from the database that are highly unlikely to be IBD with the corresponding segments from the target individual. The remaining segments can then be evaluated by any IBD detection method of choice. When examining simulated individuals sharing 4 cM IBD regions, SpeeDB filtered out 99.5% of genomic regions from consideration while retaining 99% of the true IBD segments. Applying the SpeeDB filter prior to detecting IBD in simulated fourth cousins resulted in an overall running time that was 10,000x faster than inferring IBD without the filter and retained 99% of the true IBD segments in the output.
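
    The core of the filter is easy to illustrate. The following sketch assumes a simple 0/1/2 alternate-allele encoding and an arbitrary window size rather than anything from the published SpeeDB code: it excludes windows that contain an opposite-homozygous site and passes the rest on to a full IBD detector.

```python
# Minimal sketch of an opposite-homozygous filter in the spirit of SpeeDB.
# Genotypes are encoded as alternate-allele counts (0, 1, 2); the encoding,
# window size, and function name are illustrative assumptions, not the
# published implementation.
import numpy as np

def opposite_homozygous_filter(target, candidate, window=500):
    """Return (start, end) index ranges that may still be IBD between two samples.

    A window is excluded as soon as it contains a site where one sample is
    homozygous reference (0) and the other homozygous alternate (2), because
    such a site cannot lie on a shared IBD haplotype.
    """
    target = np.asarray(target)
    candidate = np.asarray(candidate)
    opposite = ((target == 0) & (candidate == 2)) | ((target == 2) & (candidate == 0))
    surviving = []
    for start in range(0, len(target), window):
        if not opposite[start:start + window].any():
            surviving.append((start, min(start + window, len(target))))
    return surviving

# Toy usage: only the surviving windows are handed to a full IBD detector.
rng = np.random.default_rng(0)
a = rng.integers(0, 3, size=5000)
b = rng.integers(0, 3, size=5000)
print(opposite_homozygous_filter(a, b)[:3])
```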

  11. An effective filter for IBD detection in large data sets.

    KAUST Repository

    Huang, Lin; Bercovici, Sivan; Rodriguez, Jesse M; Batzoglou, Serafim

    2014-01-01

    Identity by descent (IBD) inference is the task of computationally detecting genomic segments that are shared between individuals by means of common familial descent. Accurate IBD detection plays an important role in various genomic studies, ranging from mapping disease genes to exploring ancient population histories. The majority of recent work in the field has focused on improving the accuracy of inference, targeting shorter genomic segments that originate from a more ancient common ancestor. The accuracy of these methods, however, is achieved at the expense of high computational cost, resulting in a prohibitively long running time when applied to large cohorts. To enable the study of large cohorts, we introduce SpeeDB, a method that facilitates fast IBD detection in large unphased genotype data sets. Given a target individual and a database of individuals that potentially share IBD segments with the target, SpeeDB applies an efficient opposite-homozygous filter, which excludes chromosomal segments from the database that are highly unlikely to be IBD with the corresponding segments from the target individual. The remaining segments can then be evaluated by any IBD detection method of choice. When examining simulated individuals sharing 4 cM IBD regions, SpeeDB filtered out 99.5% of genomic regions from consideration while retaining 99% of the true IBD segments. Applying the SpeeDB filter prior to detecting IBD in simulated fourth cousins resulted in an overall running time that was 10,000x faster than inferring IBD without the filter and retained 99% of the true IBD segments in the output.

  12. Developing Archive Information Packages for Data Sets: Early Experiments with Digital Library Standards

    Science.gov (United States)

    Duerr, R. E.; Yang, M.; Gooyabadi, M.; Lee, C.

    2008-12-01

    The key to interoperability between systems is often metadata, yet metadata standards in the digital library and data center communities have evolved separately. In the data center world NASA's Directory Interchange Format (DIF), the Content Standard for Digital Geospatial Metadata (CSDGM), and most recently the international Geographic Information: Metadata (ISO 19115:2003) are used for descriptive metadata at the data set level to allow catalog interoperability; but use of anything other than repository- based metadata standards for the individual files that comprise a data set is rare, making true interoperability, at the data rather than data set level, across archives difficult. While the Open Archival Information Systems (OAIS) Reference Model with its call for creating Archive Information Packages (AIP) containing not just descriptive metadata but also preservation metadata is slowly being adopted in the community, the PREservation Metadata Implementation Strategies (PREMIS) standard, the only extant OAIS- compliant preservation metadata standard, has scarcely even been recognized as being applicable to the community. The digital library community in the meantime has converged upon the Metadata Encoding and Transmission Standard (METS) for interoperability between systems as evidenced by support for the standard by digital library systems such as Fedora and Greenstone. METS is designed to allow inclusion of other XML-based standards as descriptive and administrative metadata components. A recent Stanford study suggests that a combination of METS with included FGDC and PREMIS metadata could work well for individual granules of a data set. However, some of the lessons learned by the data center community over the last 30+ years of dealing with digital data are 1) that data sets as a whole need to be preserved and described and 2) that discovery and access mechanisms need to be hierarchical. Only once a user has reviewed a data set description and determined

  13. Setting up Information Literacy Workshops in School Libraries: Imperatives, Principles and Methods

    Directory of Open Access Journals (Sweden)

    Reza Mokhtarpour

    2010-09-01

    Full Text Available While much of the professional literature has talked at length about the importance of information literacy in school libraries in the ICT-dominated era, few works have dealt with the nature and mode of implementation or offered a road map. The strategy emphasized in this paper is to hold information literacy sessions through effective workshops. After explaining why such workshops are essential for enhancing information literacy skills, the paper offers the most important principles and stages for setting up such workshops in a step-by-step manner.

  14. Ubiquitous information for ubiquitous computing: expressing clinical data sets with openEHR archetypes.

    Science.gov (United States)

    Garde, Sebastian; Hovenga, Evelyn; Buck, Jasmin; Knaup, Petra

    2006-01-01

    Ubiquitous computing requires ubiquitous access to information and knowledge. With the release of openEHR Version 1.0 there is a common model available to solve some of the problems related to accessing information and knowledge by improving semantic interoperability between clinical systems. Considerable work has been undertaken by various bodies to standardise Clinical Data Sets. Notwithstanding their value, several problems remain unsolved with Clinical Data Sets without the use of a common model underpinning them. This paper outlines these problems, such as incompatible basic data types and overlapping or incompatible definitions of clinical content. A solution based on openEHR archetypes is motivated, and an approach to transform existing Clinical Data Sets into archetypes is presented. To avoid significant overlaps and unnecessary effort, archetype development needs to be coordinated nationwide and beyond, and also across the various health professions, in a formalized process.

  15. How Did the Information Flow in the #AlphaGo Hashtag Network? A Social Network Analysis of the Large-Scale Information Network on Twitter.

    Science.gov (United States)

    Kim, Jinyoung

    2017-12-01

    As it becomes common for Internet users to use hashtags when posting and searching information on social media, it is important to understand who builds a hashtag network and how information is circulated within the network. This article focused on unlocking the potential of the #AlphaGo hashtag network by addressing the following questions. First, the current study examined whether traditional opinion leadership (i.e., the influentials hypothesis) or grassroot participation by the public (i.e., the interpersonal hypothesis) drove dissemination of information in the hashtag network. Second, several unique patterns of information distribution by key users were identified. Finally, the association between attributes of key users who exerted great influence on information distribution (i.e., the number of followers and follows) and their central status in the network was tested. To answer the proffered research questions, a social network analysis was conducted using a large-scale hashtag network data set from Twitter (n = 21,870). The results showed that the leading actors in the network were actively receiving information from their followers rather than serving as intermediaries between the original information sources and the public. Moreover, the leading actors played several roles (i.e., conversation starters, influencers, and active engagers) in the network. Furthermore, the number of their follows and followers were significantly associated with their central status in the hashtag network. Based on the results, the current research explained how the information was exchanged in the hashtag network by proposing the reciprocal model of information flow.

  16. Informational and linguistic analysis of large genomic sequence collections via efficient Hadoop cluster algorithms.

    Science.gov (United States)

    Ferraro Petrillo, Umberto; Roscigno, Gianluca; Cattaneo, Giuseppe; Giancarlo, Raffaele

    2018-06-01

    Information theoretic and compositional/linguistic analysis of genomes has a central role in bioinformatics, even more so since the associated methodologies are becoming very valuable also for epigenomic and meta-genomic studies. The kernel of those methods is based on the collection of k-mer statistics, i.e. how many times each k-mer in {A,C,G,T}^k occurs in a DNA sequence. Although this problem is computationally very simple and efficiently solvable on a conventional computer, the sheer amount of data available now in applications demands resorting to parallel and distributed computing. Indeed, such algorithms have been developed to collect k-mer statistics in the realm of genome assembly. However, they are so specialized to this domain that they do not extend easily to the computation of informational and linguistic indices, concurrently on sets of genomes. Following the approach, well established in many disciplines and increasingly successful in bioinformatics, of resorting to MapReduce and Hadoop to deal with 'Big Data' problems, we present KCH, the first set of MapReduce algorithms able to perform informational and linguistic analysis concurrently on large collections of genomic sequences on a Hadoop cluster. The benchmarking of KCH that we provide indicates that it is quite effective and versatile. It is also competitive with respect to the parallel and distributed algorithms highly specialized to k-mer statistics collection for genome assembly problems. In conclusion, KCH is a much-needed addition to the growing number of algorithms and tools that use MapReduce for bioinformatics core applications. The software, including instructions for running it over Amazon AWS, as well as the datasets are available at http://www.di-srv.unisa.it/KCH. umberto.ferraro@uniroma1.it. Supplementary data are available at Bioinformatics online.
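
    The k-mer statistics at the kernel of such analyses can be illustrated with a small serial sketch; KCH distributes the equivalent computation over a Hadoop cluster with MapReduce, which is not reproduced here, and the entropy index shown is only one example of an informational measure.

```python
# Serial sketch of the k-mer statistics underlying informational indices;
# KCH runs the equivalent computation distributed over a Hadoop cluster.
from collections import Counter
from math import log2

def kmer_counts(sequence, k):
    """Count occurrences of every k-mer over {A,C,G,T}^k in a DNA sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1)
                   if set(sequence[i:i + k]) <= set("ACGT"))

def empirical_entropy(counts):
    """Shannon entropy (bits) of the k-mer distribution, a basic informational index."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

counts = kmer_counts("ACGTACGTGGCCAATT", k=3)
print(counts.most_common(3), round(empirical_entropy(counts), 3))
```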

  17. Large Pelagic Logbook Set Survey (Vessels)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — This data set contains catch and effort for fishing trips that are taken by vessels with a Federal permit issued for the swordfish and sharks under the Highly...

  18. Development of a Minimum Data Set (MDS) for C-Section Anesthesia Information Management System (AIMS).

    Science.gov (United States)

    Sheykhotayefeh, Mostafa; Safdari, Reza; Ghazisaeedi, Marjan; Khademi, Seyed Hossein; Seyed Farajolah, Seyedeh Sedigheh; Maserat, Elham; Jebraeily, Mohamad; Torabi, Vahid

    2017-04-01

    Caesarean section, also known as C-section, is a very common procedure worldwide. A minimum data set (MDS) is defined as a set of data elements holding information regarding a series of target entities to provide a basis for planning, management, and performance evaluation. MDS has found great use in health care information systems. It can also be considered a basis for medical information management and has shown great potential for contributing to the provision of high-quality care and disease control measures. The principal aim of this research was to determine an MDS and the required capabilities for an anesthesia information management system (AIMS) for C-section in Iran. Data items collected from several selected AIMS were studied to establish an initial set of data. The study population, composed of 115 anesthesiologists, was asked to review the proposed data elements and score them in order of importance using a five-point Likert scale. The items scored as important or highly important by at least 75% of the experts were included in the final minimum data set. Overall, 8 classes of data (consisting of 81 key data elements) were determined as the final set. The most important required capabilities were related to airway management and to hypertension and hypotension management. In developing an information system (IS) based on an MDS, because of the broad involvement of users, IS capabilities must focus on the users' needs to form a successful system. Therefore, it is essential to assess the MDS carefully by considering the planned uses of the data. The IS should also have the essential capabilities to meet the needs of its users.

  19. Large-scale Health Information Database and Privacy Protection*1

    Science.gov (United States)

    YAMAMOTO, Ryuichi

    2016-01-01

    Japan was once progressive in the digitalization of healthcare fields but unfortunately has fallen behind in terms of the secondary use of data for public interest. There has recently been a trend to establish large-scale health databases in the nation, and a conflict between data use for public interest and privacy protection has surfaced as this trend has progressed. Databases for health insurance claims or for specific health checkups and guidance services were created according to the law that aims to ensure healthcare for the elderly; however, there is no mention in the act about using these databases for public interest in general. Thus, an initiative for such use must proceed carefully and attentively. The PMDA*2 projects that collect a large amount of medical record information from large hospitals and the health database development project that the Ministry of Health, Labour and Welfare (MHLW) is working on will soon begin to operate according to a general consensus; however, the validity of this consensus can be questioned if issues of anonymity arise. The likelihood that researchers conducting a study for public interest would intentionally invade the privacy of their subjects is slim. However, patients could develop a sense of distrust about their data being used since legal requirements are ambiguous. Nevertheless, without using patients’ medical records for public interest, progress in medicine will grind to a halt. Proper legislation that is clear for both researchers and patients will therefore be highly desirable. A revision of the Act on the Protection of Personal Information is currently in progress. In reality, however, privacy is not something that laws alone can protect; it will also require guidelines and self-discipline. We now live in an information capitalization age. I will introduce the trends in legal reform regarding healthcare information and discuss some basics to help people properly face the issue of health big data and privacy

  20. Modeling large data sets in marketing

    NARCIS (Netherlands)

    Balasubramanian, S; Gupta, S; Kamakura, W; Wedel, M

    In the last two decades, marketing databases have grown significantly in terms of size and richness of available information. The analysis of these databases raises several information-related and statistical issues. We aim at providing an overview of a selection of issues related to the analysis of

  1. Information Power Grid: Distributed High-Performance Computing and Large-Scale Data Management for Science and Engineering

    Science.gov (United States)

    Johnston, William E.; Gannon, Dennis; Nitzberg, Bill

    2000-01-01

    We use the term "Grid" to refer to distributed, high performance computing and data handling infrastructure that incorporates geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. This infrastructure includes: (1) Tools for constructing collaborative, application oriented Problem Solving Environments / Frameworks (the primary user interfaces for Grids); (2) Programming environments, tools, and services providing various approaches for building applications that use aggregated computing and storage resources, and federated data sources; (3) Comprehensive and consistent set of location independent tools and services for accessing and managing dynamic collections of widely distributed resources: heterogeneous computing systems, storage systems, real-time data sources and instruments, human collaborators, and communications systems; (4) Operational infrastructure including management tools for distributed systems and distributed resources, user services, accounting and auditing, strong and location independent user authentication and authorization, and overall system security services The vision for NASA's Information Power Grid - a computing and data Grid - is that it will provide significant new capabilities to scientists and engineers by facilitating routine construction of information based problem solving environments / frameworks. Such Grids will knit together widely distributed computing, data, instrument, and human resources into just-in-time systems that can address complex and large-scale computing and data analysis problems. Examples of these problems include: (1) Coupled, multidisciplinary simulations too large for single systems (e.g., multi-component NPSS turbomachine simulation); (2) Use of widely distributed, federated data archives (e.g., simultaneous access to metrological, topological, aircraft performance, and flight path scheduling databases supporting a National Air Space Simulation systems}; (3

  2. Makerspaces in Informal Settings

    Science.gov (United States)

    Brown, Ryan A.; Antink-Meyer, Allison

    2017-01-01

    The maker movement, with its focus on hands-on learning through creating with familiar materials, has seen growth in both formal and informal learning spaces. This article examines three informal learning spaces that have redesigned their space (or a portion of it) to accommodate several tenets of the maker movement and in various ways have become…

  3. Environmental settings for selected US Department of Energy installations - support information for the programmatic environmental impact statement and the baseline environmental management report

    Energy Technology Data Exchange (ETDEWEB)

    Holdren, G.R.; Glantz, C.S.; Berg, L.K.; Delinger, K.; Fosmire, C.J.; Goodwin, S.M.; Rustad, J.R.; Schalla, R.; Schramke, J.A.

    1995-05-01

    This report contains the environmental setting information developed for 25 U.S. Department of Energy (DOE) installations in support of the DOE's Programmatic Environmental Impact Study (PEIS) and the Baseline Environmental Management Report (BEMR). The common objective of the PEIS and the BEMR is to provide the public with information about the environmental contamination problems associated with major DOE facilities across the country, and to assess the relative risks that radiological and hazardous contaminants pose to the public, onsite workers, and the environment. Environmental setting information consists of the site-specific data required to model (using the Multimedia Environmental Pollutant Assessment System) the atmospheric, groundwater, and surface water transport of contaminants within and near the boundaries of the installations. The environmental settings data describes the climate, atmospheric dispersion, hydrogeology, and surface water characteristics of the installations. The number of discrete environmental settings established for each installation was governed by two competing requirements: (1) the risks posed by contaminants released from numerous waste sites were to be modeled as accurately as possible, and (2) the modeling required for numerous release sites and a large number of contaminants had to be completed within the limits imposed by the PEIS and BEMR schedule. The final product is the result of attempts to balance these competing concerns in a way that minimizes the number of settings per installation in order to meet the project schedule while at the same time providing adequate, if sometimes highly simplified, representations of the different areas within an installation. Environmental settings were developed in conjunction with installation experts in the fields of meteorology, geology, hydrology, and geochemistry.

  4. Environmental settings for selected US Department of Energy installations - support information for the programmatic environmental impact statement and the baseline environmental management report

    International Nuclear Information System (INIS)

    Holdren, G.R.; Glantz, C.S.; Berg, L.K.; Delinger, K.; Fosmire, C.J.; Goodwin, S.M.; Rustad, J.R.; Schalla, R.; Schramke, J.A.

    1995-05-01

    This report contains the environmental setting information developed for 25 U.S. Department of Energy (DOE) installations in support of the DOE's Programmatic Environmental Impact Study (PEIS) and the Baseline Environmental Management Report (BEMR). The common objective of the PEIS and the BEMR is to provide the public with information about the environmental contamination problems associated with major DOE facilities across the country, and to assess the relative risks that radiological and hazardous contaminants pose to the public, onsite workers, and the environment. Environmental setting information consists of the site-specific data required to model (using the Multimedia Environmental Pollutant Assessment System) the atmospheric, groundwater, and surface water transport of contaminants within and near the boundaries of the installations. The environmental settings data describes the climate, atmospheric dispersion, hydrogeology, and surface water characteristics of the installations. The number of discrete environmental settings established for each installation was governed by two competing requirements: (1) the risks posed by contaminants released from numerous waste sites were to be modeled as accurately as possible, and (2) the modeling required for numerous release sites and a large number of contaminants had to be completed within the limits imposed by the PEIS and BEMR schedule. The final product is the result of attempts to balance these competing concerns in a way that minimizes the number of settings per installation in order to meet the project schedule while at the same time providing adequate, if sometimes highly simplified, representations of the different areas within an installation. Environmental settings were developed in conjunction with installation experts in the fields of meteorology, geology, hydrology, and geochemistry.

  5. Lessons from a large-scale assessment: Results from conceptual inventories

    Directory of Open Access Journals (Sweden)

    Beth Thacker

    2014-07-01

    Full Text Available We report conceptual inventory results of a large-scale assessment project at a large university. We studied the introduction of materials and instructional methods informed by physics education research (PER-informed materials) into a department where most instruction has previously been traditional and a significant number of faculty are hesitant, ambivalent, or even resistant to the introduction of such reforms. Data were collected in all of the sections of both the large algebra- and calculus-based introductory courses for a number of years employing commonly used conceptual inventories. Results from a small PER-informed, inquiry-based, laboratory-based class are also reported. Results suggest that when PER-informed materials are introduced in the labs and recitations, independent of the lecture style, there is an increase in students’ conceptual inventory gains. There is also an increase in the results on conceptual inventories if PER-informed instruction is used in the lecture. The highest conceptual inventory gains were achieved by the combination of PER-informed lectures and laboratories in large class settings and by the hands-on, laboratory-based, inquiry-based course taught in a small class setting.

  6. The communication process in clinical settings.

    Science.gov (United States)

    Mathews, J J

    1983-01-01

    The communication of information in clinical settings is fraught with problems despite avowed common aims of practitioners and patients. Some reasons for the problematic nature of clinical communication are incongruent frames of reference about what information ought to be shared, sociolinguistic differences and social distance between practitioners and patients. Communication between doctors and nurses is also problematic, largely due to differences in ideology between the professions about what ought to be communicated to patients about their illness and who is ratified to give such information. Recent social changes, such as the Patient Bill of Rights and informed consent which assure access to information, and new conceptualizations of the nurse's role, warrant continued study of the communication process especially in regard to what constitutes appropriate and acceptable information about a patient's illness and who ought to give such information to patients. The purpose of this paper is to outline characteristics of communication in clinical settings and to provide a literature review of patient and practitioner interaction studies in order to reflect on why information exchange is problematic in clinical settings. A framework for presentation of the problems employs principles from interaction and role theory to investigate clinical communication from three viewpoints: (1) the level of shared knowledge between participants; (2) the effect of status, role and ideology on transactions; and (3) the regulation of communication imposed by features of the institution.

  7. Utilizing Maximal Independent Sets as Dominating Sets in Scale-Free Networks

    Science.gov (United States)

    Derzsy, N.; Molnar, F., Jr.; Szymanski, B. K.; Korniss, G.

    Dominating sets provide key solution to various critical problems in networked systems, such as detecting, monitoring, or controlling the behavior of nodes. Motivated by graph theory literature [Erdos, Israel J. Math. 4, 233 (1966)], we studied maximal independent sets (MIS) as dominating sets in scale-free networks. We investigated the scaling behavior of the size of MIS in artificial scale-free networks with respect to multiple topological properties (size, average degree, power-law exponent, assortativity), evaluated its resilience to network damage resulting from random failure or targeted attack [Molnar et al., Sci. Rep. 5, 8321 (2015)], and compared its efficiency to previously proposed dominating set selection strategies. We showed that, despite its small set size, MIS provides very high resilience against network damage. Using extensive numerical analysis on both synthetic and real-world (social, biological, technological) network samples, we demonstrate that our method effectively satisfies four essential requirements of dominating sets for their practical applicability on large-scale real-world systems: 1.) small set size, 2.) minimal network information required for their construction scheme, 3.) fast and easy computational implementation, and 4.) resiliency to network damage. Supported by DARPA, DTRA, and NSF.
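
    The property exploited above, that any maximal independent set is also a dominating set, can be checked directly on a synthetic scale-free graph; the sketch below uses networkx with arbitrary graph parameters and is not the analysis pipeline of the study.

```python
# Small illustration of the core fact used above: any maximal independent
# set of a graph is also a dominating set. Uses networkx on a synthetic
# scale-free (Barabasi-Albert) graph; the parameters are arbitrary.
import networkx as nx

G = nx.barabasi_albert_graph(n=1000, m=3, seed=42)
mis = nx.maximal_independent_set(G, seed=42)

assert nx.is_dominating_set(G, mis)   # every node is in the MIS or adjacent to it
print(f"MIS size: {len(mis)} of {G.number_of_nodes()} nodes")
```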

  8. Patent challenges for standard-setting in the global economy : lessons from information and communication industry

    NARCIS (Netherlands)

    Maskus, K.; Merrill, S.A.; Bekkers, R.N.A.; Sandy Block, Marc; Contreras, Jorge; Gilbert, Richard; Goodman, David; Marasco, Amy; Simcoe, Tim; Smoot, Oliver; Suttmeier, Richard; Updegrove, Andrew

    2014-01-01

    Patent Challenges for Standard-Setting in the Global Economy: Lessons from Information and Communication Technology examines how leading national and multinational standard-setting organizations (SSOs) address patent disclosures, licensing terms, transfers of patent ownership, and other issues that

  9. Analyzing large data sets from XGC1 magnetic fusion simulations using apache spark

    Energy Technology Data Exchange (ETDEWEB)

    Churchill, R. Michael [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States)

    2016-11-21

    Apache Spark is explored as a tool for analyzing large data sets from the magnetic fusion simulation code XGC1. Implementation details of Apache Spark on the NERSC Edison supercomputer are discussed, including binary file reading and parameter setup. Here, an unsupervised machine learning algorithm, k-means clustering, is applied to XGC1 particle distribution function data, showing that highly turbulent spatial regions do not have common coherent structures, but rather broad, ring-like structures in velocity space.
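
    A minimal PySpark pipeline of the kind described is sketched below. Reading the actual XGC1 binary particle files is omitted, and the velocity-space column names are hypothetical stand-ins for the particle distribution function data.

```python
# Illustrative PySpark k-means pipeline; reading the actual XGC1 binary
# particle files is omitted, and the feature column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("xgc1-kmeans-sketch").getOrCreate()

# Stand-in for the particle distribution function data (velocity-space coordinates).
df = spark.createDataFrame(
    [(0.1, -0.3), (0.12, -0.28), (2.0, 1.9), (2.1, 2.0)],
    ["v_parallel", "v_perp"],
)

features = VectorAssembler(inputCols=["v_parallel", "v_perp"], outputCol="features")
model = KMeans(k=2, seed=1, featuresCol="features").fit(features.transform(df))
print(model.clusterCenters())
spark.stop()
```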

  10. Detecting clinically relevant new information in clinical notes across specialties and settings.

    Science.gov (United States)

    Zhang, Rui; Pakhomov, Serguei V S; Arsoniadis, Elliot G; Lee, Janet T; Wang, Yan; Melton, Genevieve B

    2017-07-05

    Automated methods for identifying clinically relevant new versus redundant information in electronic health record (EHR) clinical notes are useful for clinicians and researchers involved in patient care and clinical research, respectively. We evaluated methods to automatically identify clinically relevant new information in clinical notes, and compared the quantity of redundant information across specialties and clinical settings. Statistical language models augmented with semantic similarity measures were evaluated as a means to detect and quantify clinically relevant new and redundant information over longitudinal clinical notes for a given patient. A corpus of 591 progress notes over 40 inpatient admissions was annotated for new information longitudinally by physicians to generate a reference standard. Note redundancy between various specialties was evaluated on 71,021 outpatient notes and 64,695 inpatient notes from 500 solid organ transplant patients (April 2015 through August 2015). Our best method achieved a performance of 0.87 recall, 0.62 precision, and 0.72 F-measure. Addition of semantic similarity metrics compared to baseline improved recall but otherwise resulted in similar performance. While outpatient and inpatient notes had relatively similar levels of high redundancy (61% and 68%, respectively), redundancy differed by author specialty with mean redundancy of 75%, 66%, 57%, and 55% observed in pediatric, internal medicine, psychiatry and surgical notes, respectively. Automated techniques with statistical language models for detecting redundant versus clinically relevant new information in clinical notes do not improve with the addition of semantic similarity measures. While levels of redundancy seem relatively similar in the inpatient and ambulatory settings within Fairview Health Services, clinical note redundancy appears to vary significantly with different medical specialties.
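
    A simplified stand-in for this kind of redundancy scoring is sketched below: where the study evaluates statistical language models augmented with semantic similarity, the sketch scores each sentence of a new note by its best TF-IDF cosine similarity against prior notes, with an arbitrary threshold.

```python
# Simplified stand-in for redundancy detection: the study uses statistical
# language models plus semantic similarity, whereas this sketch scores a new
# note's sentences by TF-IDF cosine similarity against all prior notes.
# The 0.8 threshold is an arbitrary illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def redundancy_fraction(prior_sentences, new_sentences, threshold=0.8):
    vec = TfidfVectorizer().fit(prior_sentences + new_sentences)
    prior, new = vec.transform(prior_sentences), vec.transform(new_sentences)
    best_match = cosine_similarity(new, prior).max(axis=1)
    return float((best_match >= threshold).mean())

prior = ["Patient denies chest pain.", "Started metoprolol 25 mg daily."]
new = ["Patient denies chest pain.", "Creatinine rising, will recheck tomorrow."]
print(redundancy_fraction(prior, new))   # 0.5: one redundant sentence, one new
```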

  11. CUDA based Level Set Method for 3D Reconstruction of Fishes from Large Acoustic Data

    DEFF Research Database (Denmark)

    Sharma, Ojaswa; Anton, François

    2009-01-01

    Acoustic images present views of underwater dynamics, even in high depths. With multi-beam echo sounders (SONARs), it is possible to capture series of 2D high resolution acoustic images. 3D reconstruction of the water column and subsequent estimation of fish abundance and fish species identificat...... of suppressing threshold and show its convergence as the evolution proceeds. We also present a GPU based streaming computation of the method using NVIDIA's CUDA framework to handle large volume data-sets. Our implementation is optimised for memory usage to handle large volumes....

  12. Inflation and the Great Moderation: Evidence from a Large Panel Data Set

    OpenAIRE

    Georgios Karras

    2013-01-01

    This paper investigates the relationship between the Great Moderation and two measures of inflation performance: trend inflation and inflation volatility. Using annual data from 1970 to 2011 for a large panel of 180 developed and developing economies, the results show that, as expected, both measures are positively correlated with output volatility. When the two measures are jointly considered, however, and there is sufficient information to identify their effects separately, our empirical...

  13. Poster Abstract: Towards NILM for Industrial Settings

    DEFF Research Database (Denmark)

    Holmegaard, Emil; Kjærgaard, Mikkel Baun

    2015-01-01

    Industry consumes a large share of the worldwide electricity consumption. Disaggregated information about electricity consumption enables better decision-making and feedback tools to optimize electricity consumption. In industrial settings electricity loads consist of a variety of equipment, whic...... consumption for six months, at an industrial site. In this poster abstract we provide initial results for how industrial equipment challenge NILM algorithms. These results thereby open up for evaluating the use of NILM in industrial settings....

  14. Barriers in implementing evidence-informed health decisions in rural rehabilitation settings: a mixed methods pilot study.

    Science.gov (United States)

    Prakash, V; Hariohm, K; Balaganapathy, M

    2014-08-01

    Literature on the barriers to implementing research findings in physiotherapy practice is often urban-centric, relying on self-report based on hypothetical patient scenarios. The objective of this study was to investigate the barriers encountered by physiotherapists trained in evidence-informed practice in the management of "real world" patients in rural rehabilitation settings. A mixed-methods research design was used. Physiotherapists working in rural outpatient rehabilitation settings participated in the study. In the first phase, we asked all participants (N = 5) to maintain a log book for a 4-week period to record questions that arose during their routine clinical encounters, and also asked them to follow the first four of the five steps of evidence-informed practice (ask, access, appraise and apply). In the second phase (after 4 weeks), we conducted semistructured, direct interviews with the participants, exploring their experiences in implementing evidence-informed clinical decisions made during the study period. At the end of 4 weeks, 30 questions were recorded. For 17 questions, the participants found evidence but applied that evidence in their practice in only 9 instances. Among these generalist practitioners, a lack of patient-specific outcome measures was reported as a greater barrier to implementing evidence-informed practice than time constraints. Practice setting, a lack of patient-centered research, and the evidence-informed practice competency of physiotherapists can be significant barriers to implementing evidence-informed health decisions in rural rehabilitation settings. © 2014 Chinese Cochrane Center, West China Hospital of Sichuan University and Wiley Publishing Asia Pty Ltd.

  15. GenoSets: visual analytic methods for comparative genomics.

    Directory of Open Access Journals (Sweden)

    Aurora A Cain

    Full Text Available Many important questions in biology are, fundamentally, comparative, and this extends to our analysis of a growing number of sequenced genomes. Existing genomic analysis tools are often organized around literal views of genomes as linear strings. Even when information is highly condensed, these views grow cumbersome as larger numbers of genomes are added. Data aggregation and summarization methods from the field of visual analytics can provide abstracted comparative views, suitable for sifting large multi-genome datasets to identify critical similarities and differences. We introduce a software system for visual analysis of comparative genomics data. The system automates the process of data integration, and provides the analysis platform to identify and explore features of interest within these large datasets. GenoSets borrows techniques from business intelligence and visual analytics to provide a rich interface of interactive visualizations supported by a multi-dimensional data warehouse. In GenoSets, visual analytic approaches are used to enable querying based on orthology, functional assignment, and taxonomic or user-defined groupings of genomes. GenoSets links this information together with coordinated, interactive visualizations for both detailed and high-level categorical analysis of summarized data. GenoSets has been designed to simplify the exploration of multiple genome datasets and to facilitate reasoning about genomic comparisons. Case examples are included showing the use of this system in the analysis of 12 Brucella genomes. GenoSets software and the case study dataset are freely available at http://genosets.uncc.edu. We demonstrate that the integration of genomic data using a coordinated multiple view approach can simplify the exploration of large comparative genomic data sets, and facilitate reasoning about comparisons and features of interest.

  16. Ssecrett and neuroTrace: Interactive visualization and analysis tools for large-scale neuroscience data sets

    KAUST Repository

    Jeong, Wonki; Beyer, Johanna; Hadwiger, Markus; Blue, Rusty; Law, Charles; Vá zquez Reina, Amelio; Reid, Rollie Clay; Lichtman, Jeff W M D; Pfister, Hanspeter

    2010-01-01

    Recent advances in optical and electron microscopy let scientists acquire extremely high-resolution images for neuroscience research. Data sets imaged with modern electron microscopes can range between tens of terabytes to about one petabyte. These large data sizes and the high complexity of the underlying neural structures make it very challenging to handle the data at reasonably interactive rates. To provide neuroscientists flexible, interactive tools, the authors introduce Ssecrett and NeuroTrace, two tools they designed for interactive exploration and analysis of large-scale optical- and electron-microscopy images to reconstruct complex neural circuits of the mammalian nervous system. © 2010 IEEE.

  17. Ssecrett and neuroTrace: Interactive visualization and analysis tools for large-scale neuroscience data sets

    KAUST Repository

    Jeong, Wonki

    2010-05-01

    Recent advances in optical and electron microscopy let scientists acquire extremely high-resolution images for neuroscience research. Data sets imaged with modern electron microscopes can range between tens of terabytes to about one petabyte. These large data sizes and the high complexity of the underlying neural structures make it very challenging to handle the data at reasonably interactive rates. To provide neuroscientists flexible, interactive tools, the authors introduce Ssecrett and NeuroTrace, two tools they designed for interactive exploration and analysis of large-scale optical- and electron-microscopy images to reconstruct complex neural circuits of the mammalian nervous system. © 2010 IEEE.

  18. Marketing Library and Information Services: Comparing Experiences at Large Institutions.

    Science.gov (United States)

    Noel, Robert; Waugh, Timothy

    This paper explores some of the similarities and differences between publicizing information services within the academic and corporate environments, comparing the marketing experiences of Abbot Laboratories (Illinois) and Indiana University. It shows some innovative online marketing tools, including an animated gif model of a large, integrated…

  19. MUSI: an integrated system for identifying multiple specificity from very large peptide or nucleic acid data sets.

    Science.gov (United States)

    Kim, Taehyung; Tyndel, Marc S; Huang, Haiming; Sidhu, Sachdev S; Bader, Gary D; Gfeller, David; Kim, Philip M

    2012-03-01

    Peptide recognition domains and transcription factors play crucial roles in cellular signaling. They bind linear stretches of amino acids or nucleotides, respectively, with high specificity. Experimental techniques that assess the binding specificity of these domains, such as microarrays or phage display, can retrieve thousands of distinct ligands, providing detailed insight into binding specificity. In particular, the advent of next-generation sequencing has recently increased the throughput of such methods by several orders of magnitude. These advances have helped reveal the presence of distinct binding specificity classes that co-exist within a set of ligands interacting with the same target. Here, we introduce a software system called MUSI that can rapidly analyze very large data sets of binding sequences to determine the relevant binding specificity patterns. Our pipeline provides two major advances. First, it can detect previously unrecognized multiple specificity patterns in any data set. Second, it offers integrated processing of very large data sets from next-generation sequencing machines. The results are visualized as multiple sequence logos describing the different binding preferences of the protein under investigation. We demonstrate the performance of MUSI by analyzing recent phage display data for human SH3 domains as well as microarray data for mouse transcription factors.
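
    MUSI itself infers the specificity classes with a mixture-model approach; the sketch below shows only the downstream step of turning one already-clustered group of equal-length peptides into a position frequency matrix, the data behind a sequence logo. The example peptides are invented.

```python
# Downstream step only: convert one (already clustered) group of equal-length
# peptides into a position frequency matrix, the data behind a sequence logo.
# MUSI's mixture-model inference of the specificity classes is not shown.
from collections import Counter

def position_frequency_matrix(peptides):
    length = len(peptides[0])
    assert all(len(p) == length for p in peptides)
    matrix = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        total = sum(counts.values())
        matrix.append({aa: c / total for aa, c in counts.items()})
    return matrix

class_1 = ["RPLPPLP", "RPLPALP", "RALPPLP"]   # hypothetical SH3-binding class
for pos, freqs in enumerate(position_frequency_matrix(class_1), start=1):
    print(pos, freqs)
```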

  20. Monitoring and Information Fusion for Search and Rescue Operations in Large-Scale Disasters

    National Research Council Canada - National Science Library

    Nardi, Daniele

    2002-01-01

    ... for information fusion with application to search-and-rescue and large scale disaster relief. The objective is to develop and to deploy tools to support the monitoring activities in an intervention caused by a large-scale disaster...

  1. Designing Patient-facing Health Information Technologies for the Outpatient Settings: A Literature Review

    OpenAIRE

    Yushi Yang; Onur Asan

    2016-01-01

    Introduction: The implementation of health information technologies (HITs) has changed the dynamics of doctor–patient communication in outpatient settings. Designing patient-facing HITs provides patients with easy access to healthcare information during the visit and has the potential to enhance the patient-centred care.   Objectives: The objectives of this study are to systematically review how the designs of patient-facing HITs have been suggested and evaluated, and how they may pot...

  2. The Molecule Cloud - compact visualization of large collections of molecules

    Directory of Open Access Journals (Sweden)

    Ertl Peter

    2012-07-01

    Full Text Available Abstract. Background: Analysis and visualization of large collections of molecules is one of the most frequent challenges cheminformatics experts in the pharmaceutical industry are facing. Various sophisticated methods are available to perform this task, including clustering, dimensionality reduction or scaffold frequency analysis. In any case, however, viewing and analyzing large tables with molecular structures is necessary. We present a new visualization technique, providing basic information about the composition of molecular data sets at a single glance. Summary: A method is presented here allowing visual representation of the most common structural features of chemical databases in the form of a cloud diagram. The frequency of molecules containing a particular substructure is indicated by the size of the respective structural image. The method is useful to quickly perceive the most prominent structural features present in the data set. This approach was inspired by popular word cloud diagrams that are used to visualize textual information in a compact form. Therefore we call this approach “Molecule Cloud”. The method also supports visualization of additional information, for example the biological activity of molecules containing a scaffold or the protein target class typical for particular scaffolds, by color coding. A detailed description of the algorithm is provided, allowing easy implementation of the method by any cheminformatics toolkit. The layout algorithm is available as open source Java code. Conclusions: Visualization of large molecular data sets using the Molecule Cloud approach allows scientists to get information about the composition of molecular databases and their most frequent structural features easily. The method may be used in areas where analysis of large molecular collections is needed, for example processing of high throughput screening results, virtual screening or compound purchasing. Several example visualizations of large
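
    The central mapping from substructure frequency to depiction size is simple to sketch; the fragment below takes precomputed scaffold SMILES strings as input and is not the published Java layout algorithm, which also places and depicts the structures.

```python
# Sketch of the core idea behind the Molecule Cloud: map the frequency of each
# common substructure to a display size. Scaffolds are given here as
# precomputed SMILES strings; layout and structure depiction are omitted.
from collections import Counter

scaffolds = ["c1ccccc1", "c1ccncc1", "c1ccccc1", "C1CCNCC1", "c1ccccc1", "c1ccncc1"]
counts = Counter(scaffolds)

min_size, max_size = 20, 120                     # arbitrary pixel range
lo, hi = min(counts.values()), max(counts.values())
for smiles, n in counts.most_common():
    size = min_size if hi == lo else min_size + (max_size - min_size) * (n - lo) / (hi - lo)
    print(f"{smiles:12s} frequency={n}  depiction size={size:.0f}px")
```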

  3. 77 FR 73671 - Agency Information Collection Activities: Deferral of Duty on Large Yachts Imported for Sale

    Science.gov (United States)

    2012-12-11

    ... Activities: Deferral of Duty on Large Yachts Imported for Sale AGENCY: U.S. Customs and Border Protection... of Duty on Large Yachts Imported for Sale. This is a proposed extension of an information collection... information collection: Title: Deferral of Duty on Large Yachts Imported for Sale. OMB Number: 1651-0080. Form...

  4. Analysis of large databases in vascular surgery.

    Science.gov (United States)

    Nguyen, Louis L; Barshes, Neal R

    2010-09-01

    Large databases can be a rich source of clinical and administrative information on broad populations. These datasets are characterized by demographic and clinical data for over 1000 patients from multiple institutions. Since they are often collected and funded for other purposes, their use for secondary analysis increases their utility at relatively low costs. Advantages of large databases as a source include the very large numbers of available patients and their related medical information. Disadvantages include lack of detailed clinical information and absence of causal descriptions. Researchers working with large databases should also be mindful of data structure design and inherent limitations to large databases, such as treatment bias and systemic sampling errors. Notwithstanding these limitations, several important studies have been published in vascular care using large databases. They represent timely, "real-world" analyses of questions that may be too difficult or costly to address using prospective randomized methods. Large databases will be an increasingly important analytical resource as we focus on improving national health care efficacy in the setting of limited resources.

  5. Towards better segmentation of large floating point 3D astronomical data sets : first results

    NARCIS (Netherlands)

    Moschini, Ugo; Teeninga, Paul; Wilkinson, Michael; Giese, Nadine; Punzo, Davide; van der Hulst, Jan M.; Trager, Scott

    2014-01-01

    In any image segmentation task, noise must be separated from the actual information and the relevant pixels grouped into objects of interest, on which measures can later be applied. This should be done efficiently on large astronomical surveys with floating point datasets with resolution of the

  6. The Viking viewer for connectomics: scalable multi-user annotation and summarization of large volume data sets.

    Science.gov (United States)

    Anderson, J R; Mohammed, S; Grimm, B; Jones, B W; Koshevoy, P; Tasdizen, T; Whitaker, R; Marc, R E

    2011-01-01

    Modern microscope automation permits the collection of vast amounts of continuous anatomical imagery in both two and three dimensions. These large data sets present significant challenges for data storage, access, viewing, annotation and analysis. The cost and overhead of collecting and storing the data can be extremely high. Large data sets quickly exceed an individual's capability for timely analysis and present challenges in efficiently applying transforms, if needed. Finally annotated anatomical data sets can represent a significant investment of resources and should be easily accessible to the scientific community. The Viking application was our solution created to view and annotate a 16.5 TB ultrastructural retinal connectome volume and we demonstrate its utility in reconstructing neural networks for a distinctive retinal amacrine cell class. Viking has several key features. (1) It works over the internet using HTTP and supports many concurrent users limited only by hardware. (2) It supports a multi-user, collaborative annotation strategy. (3) It cleanly demarcates viewing and analysis from data collection and hosting. (4) It is capable of applying transformations in real-time. (5) It has an easily extensible user interface, allowing addition of specialized modules without rewriting the viewer. © 2010 The Authors Journal of Microscopy © 2010 The Royal Microscopical Society.

  7. A Large Group Decision Making Approach Based on TOPSIS Framework with Unknown Weights Information

    OpenAIRE

    Li Yupeng; Lian Xiaozhen; Lu Cheng; Wang Zhaotong

    2017-01-01

    Large group decision making considering multiple attributes is imperative in many decision areas. The weights of the decision makers (DMs) are difficult to obtain because of the large number of DMs. To cope with this issue, an integrated multiple-attributes large group decision making framework is proposed in this article. The fuzziness and hesitation of the linguistic decision variables are described by interval-valued intuitionistic fuzzy sets. The weights of the DMs are optimized by constructing a...
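
    For reference, classical crisp TOPSIS is sketched below; the article above extends this framework with interval-valued intuitionistic fuzzy sets and optimized decision-maker weights, which are not reproduced here, and the decision matrix and weights are invented.

```python
# Classical crisp TOPSIS for reference; the article extends this with
# interval-valued intuitionistic fuzzy sets and optimized decision-maker
# weights, which are not reproduced here. Matrix and weights are invented.
import numpy as np

def topsis(decision_matrix, weights, benefit):
    X = np.asarray(decision_matrix, dtype=float)
    w = np.asarray(weights, dtype=float)
    norm = X / np.linalg.norm(X, axis=0)           # vector-normalize each criterion
    V = norm * w
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))
    d_plus = np.linalg.norm(V - ideal, axis=1)
    d_minus = np.linalg.norm(V - anti, axis=1)
    return d_minus / (d_plus + d_minus)            # closeness coefficient, higher is better

scores = topsis([[7, 9, 9], [8, 7, 8], [9, 6, 8]],
                weights=[0.5, 0.3, 0.2],
                benefit=[True, True, False])
print(scores.round(3))
```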

  8. 12 November 1991-Ministerial Order setting up a Commission for assessing information in the nuclear field

    International Nuclear Information System (INIS)

    1991-01-01

    The Commission set up by this Order must ensure that the public is kept informed about the technical, health, ecological, economic and financial aspects of nuclear energy; it advises the Secretary of State for Energy on the conditions for informing the public and proposes methods for disseminating such information. (NEA)

  9. A large set of potential past, present and future hydro-meteorological time series for the UK

    Science.gov (United States)

    Guillod, Benoit P.; Jones, Richard G.; Dadson, Simon J.; Coxon, Gemma; Bussi, Gianbattista; Freer, James; Kay, Alison L.; Massey, Neil R.; Sparrow, Sarah N.; Wallom, David C. H.; Allen, Myles R.; Hall, Jim W.

    2018-01-01

    Hydro-meteorological extremes such as drought and heavy precipitation can have large impacts on society and the economy. With potentially increasing risks associated with such events due to climate change, properly assessing the associated impacts and uncertainties is critical for adequate adaptation. However, the application of risk-based approaches often requires large sets of extreme events, which are not commonly available. Here, we present such a large set of hydro-meteorological time series for recent past and future conditions for the United Kingdom based on weather@home 2, a modelling framework consisting of a global climate model (GCM) driven by observed or projected sea surface temperature (SST) and sea ice which is downscaled to 25 km over the European domain by a regional climate model (RCM). Sets of 100 time series are generated for each of (i) a historical baseline (1900-2006), (ii) five near-future scenarios (2020-2049) and (iii) five far-future scenarios (2070-2099). The five scenarios in each future time slice all follow the Representative Concentration Pathway 8.5 (RCP8.5) and sample the range of sea surface temperature and sea ice changes from CMIP5 (Coupled Model Intercomparison Project Phase 5) models. Validation of the historical baseline highlights good performance for temperature and potential evaporation, but substantial seasonal biases in mean precipitation, which are corrected using a linear approach. For extremes in low precipitation over a long accumulation period ( > 3 months) and shorter-duration high precipitation (1-30 days), the time series generally represents past statistics well. Future projections show small precipitation increases in winter but large decreases in summer on average, leading to an overall drying, consistently with the most recent UK Climate Projections (UKCP09) but larger in magnitude than the latter. Both drought and high-precipitation events are projected to increase in frequency and intensity in most regions
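
    The seasonal bias correction is described only as a linear approach; a generic multiplicative monthly scaling of the kind commonly used for precipitation is sketched below as an illustrative stand-in, with invented climatological values, rather than the exact weather@home 2 procedure.

```python
# Generic monthly scaling correction of mean precipitation bias; the abstract
# states only that a "linear approach" is used, so treat this as an
# illustrative stand-in rather than the weather@home 2 procedure.
import numpy as np

def monthly_scaling_factors(model_clim, obs_clim):
    """Per-month factors mapping modelled climatological means onto observed ones."""
    return np.asarray(obs_clim, dtype=float) / np.asarray(model_clim, dtype=float)

def apply_correction(precip, months, factors):
    """Scale each value by the factor of its calendar month (1-12)."""
    precip = np.asarray(precip, dtype=float)
    months = np.asarray(months)
    return precip * factors[months - 1]

model_clim = [90, 75, 70, 60, 55, 50, 48, 52, 60, 80, 95, 100]   # invented mm/month
obs_clim = [110, 85, 78, 62, 58, 55, 60, 65, 70, 95, 110, 120]
factors = monthly_scaling_factors(model_clim, obs_clim)
print(apply_correction([3.0, 0.0, 12.5], [1, 7, 12], factors))
```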

  10. Participatory Design of Large-Scale Information Systems

    DEFF Research Database (Denmark)

    Simonsen, Jesper; Hertzum, Morten

    2008-01-01

    into a PD process model that (1) emphasizes PD experiments as transcending traditional prototyping by evaluating fully integrated systems exposed to real work practices; (2) incorporates improvisational change management including anticipated, emergent, and opportunity-based change; and (3) extends initial...... design and development into a sustained and ongoing stepwise implementation that constitutes an overall technology-driven organizational change. The process model is presented through a largescale PD experiment in the Danish healthcare sector. We reflect on our experiences from this experiment......In this article we discuss how to engage in large-scale information systems development by applying a participatory design (PD) approach that acknowledges the unique situated work practices conducted by the domain experts of modern organizations. We reconstruct the iterative prototyping approach...

  11. Enterprise Information Systems Outsourcing

    DEFF Research Database (Denmark)

    Svejvig, Per; Pries-Heje, Jan

    2009-01-01

    Outsourcing is now a feasible means for Enterprise Information Systems (EIS) cost savings, but it increases complexity substantially when many organizations are involved. We set out to study EIS outsourcing with many interorganizational partners in a large Scandinavian high-tech organization… the rational cost-saving explanation; but then, with a more careful analysis focusing on institutional factors, other explanations "behind the curtain" were revealed, such as management consultants with a "best practice" agenda, people promoting outsourcing and thereby being promoted themselves, and outside…

  12. Enterprise Information Systems Outsourcing

    DEFF Research Database (Denmark)

    Pries-Heje, Jan; Svejvig, Per

    2009-01-01

    Outsourcing is now a feasible means for Enterprise Information Systems (EIS) cost savings, but it increases complexity substantially when many organizations are involved. We set out to study EIS outsourcing with many interorganizational partners in a large Scandinavian high-tech organization… the rational cost-saving explanation; but then, with a more careful analysis focusing on institutional factors, other explanations "behind the curtain" were revealed, such as management consultants with a "best practice" agenda, people promoting outsourcing and thereby being promoted themselves, and outside…

  13. Time series clustering in large data sets

    Directory of Open Access Journals (Sweden)

    Jiří Fejfar

    2011-01-01

    Full Text Available The clustering of time series is a widely researched area. There are many methods for dealing with this task. We are currently using the Self-organizing map (SOM) with an unsupervised learning algorithm for clustering time series. After the first experiment (Fejfar, Weinlichová, Šťastný, 2009) it seems that the whole concept of the clustering algorithm is correct, but that we have to perform time series clustering on a much larger dataset to obtain more accurate results and to find the correlation between configured parameters and results more precisely. The second requirement arose from the need for a well-defined evaluation of results. It seems useful to use sound recordings as instances of time series again: there are many recordings available in digital libraries, and many interesting features and patterns can be found in this area. In this experiment we search for recordings with a similar development of information density, which can be used for musical form investigation, cover song detection and many other applications. The objective of the presented paper is to compare clustering results obtained with different parameters of the feature vectors and of the SOM itself. We describe the time series in a simplistic way, evaluating standard deviations for separate parts of the recordings. The resulting feature vectors are clustered with the SOM in batch training mode with different topologies, varying from a few neurons to large maps. Other algorithms usable for finding similarities between time series are discussed, and conclusions for further research are presented. We also present an overview of the related current literature and projects.
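
    The pipeline described above (per-segment standard deviations as features, clustered with a SOM) can be illustrated with a short, self-contained sketch. This is not the authors' implementation: the NumPy-only SOM below, the segment count, the map size and the learning schedule are all illustrative assumptions.

```python
import numpy as np

def segment_std_features(series, n_segments=20):
    """Describe a recording by the standard deviation of each of its equal-length segments."""
    segments = np.array_split(np.asarray(series, dtype=float), n_segments)
    return np.array([seg.std() for seg in segments])

def train_som(data, rows=5, cols=5, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Tiny self-organizing map trained with the classic online update rule."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        bmu = np.unravel_index(np.argmin(((weights - x) ** 2).sum(-1)), (rows, cols))
        lr, sigma = lr0 * np.exp(-t / n_iter), sigma0 * np.exp(-t / n_iter)
        dist2 = ((grid - np.array(bmu)) ** 2).sum(-1)
        h = np.exp(-dist2 / (2 * sigma ** 2))[..., None]   # neighbourhood function around the winner
        weights += lr * h * (x - weights)                  # pull nearby units towards the sample
    return weights

def best_matching_units(data, weights):
    shape = weights.shape[:2]
    return [np.unravel_index(np.argmin(((weights - x) ** 2).sum(-1)), shape) for x in data]

# toy usage with synthetic "recordings" of different loudness dynamics
rng = np.random.default_rng(1)
recordings = [rng.normal(scale=s, size=4000) for s in (0.5, 0.5, 2.0, 2.0)]
features = np.array([segment_std_features(r) for r in recordings])
som = train_som(features)
print(best_matching_units(features, som))   # similar recordings should land on the same or nearby units
```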

  14. Implementation of a large-scale hospital information infrastructure for multi-unit health-care services.

    Science.gov (United States)

    Yoo, Sun K; Kim, Dong Keun; Kim, Jung C; Park, Youn Jung; Chang, Byung Chul

    2008-01-01

    With the increase in demand for high quality medical services, the need for an innovative hospital information system has become essential. An improved system has been implemented in all hospital units of the Yonsei University Health System. Interoperability between multi-units required appropriate hardware infrastructure and software architecture. This large-scale hospital information system encompassed PACS (Picture Archiving and Communications Systems), EMR (Electronic Medical Records) and ERP (Enterprise Resource Planning). It involved two tertiary hospitals and 50 community hospitals. The monthly data production rate by the integrated hospital information system is about 1.8 TByte and the total quantity of data produced so far is about 60 TByte. Large scale information exchange and sharing will be particularly useful for telemedicine applications.

  15. Building and calibrating a large-extent and high resolution coupled groundwater-land surface model using globally available data-sets

    Science.gov (United States)

    Sutanudjaja, E. H.; Van Beek, L. P.; de Jong, S. M.; van Geer, F.; Bierkens, M. F.

    2012-12-01

    Three calibration strategies were compared: using discharge observations, using time series of remotely sensed soil moisture fields (ERS Soil Water Index from TU Vienna), and a combination of both. Note that both sources of information are globally available. Each calibration strategy was subsequently validated using over 4000 groundwater head measurement time series. Comparison of the calibration strategies shows that remotely sensed soil moisture data can be used for the calibration of the upper soil hydraulic conductivities that determine groundwater recharge. However, discharge measurements should be included to calibrate the complete model, specifically to constrain aquifer transmissivities and runoff-infiltration partitioning processes. The combined approach using both remotely sensed soil moisture data and discharge observations yielded a model that was able to fit both soil moisture and discharge reasonably well, while also predicting the dynamics of groundwater heads with acceptable accuracy. However, absolute levels of groundwater head are only accurate in regions with shallow groundwater tables. Even though there is room for improvement, our study shows that with the global data-sets that are currently available, large-extent groundwater modeling in data-poor environments is certainly within reach.
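
    The combined calibration idea (soil-moisture dynamics constrain the upper-soil parameter, discharge constrains the aquifer parameter) can be sketched with a deliberately toy two-store model and a weighted objective. This is purely schematic: it is not the coupled groundwater-land surface model of the study, and the parameters, forcing and weights are made up.

```python
import numpy as np
from scipy.optimize import minimize

def toy_model(params, forcing):
    """Placeholder two-store model: k_soil controls recharge, k_aq controls discharge."""
    k_soil, k_aq = params
    soil = gw = 0.0
    sm, q = [], []
    for p in forcing:
        soil += p
        recharge = k_soil * soil
        soil -= recharge
        gw += recharge
        outflow = k_aq * gw
        gw -= outflow
        sm.append(soil)
        q.append(outflow)
    return np.array(sm), np.array(q)

def combined_objective(params, forcing, sm_obs, q_obs, w_q=0.7):
    """Weighted misfit: correlation for soil-moisture dynamics, normalized RMSE for discharge."""
    sm_sim, q_sim = toy_model(params, forcing)
    sm_term = 1.0 - np.corrcoef(sm_sim, sm_obs)[0, 1]
    q_term = np.sqrt(np.mean((q_sim - q_obs) ** 2)) / (q_obs.std() + 1e-9)
    return w_q * q_term + (1.0 - w_q) * sm_term

rng = np.random.default_rng(0)
forcing = rng.gamma(2.0, 1.0, size=365)
sm_obs, q_obs = toy_model((0.3, 0.1), forcing)            # synthetic "observations"
result = minimize(combined_objective, x0=[0.5, 0.5], args=(forcing, sm_obs, q_obs),
                  bounds=[(0.01, 0.99), (0.01, 0.99)], method="L-BFGS-B")
print(result.x)   # the optimum should lie near the synthetic truth (0.3, 0.1)
```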

  16. Foundations of Large-Scale Multimedia Information Management and Retrieval

    CERN Document Server

    Chang, Edward Y

    2011-01-01

    "Foundations of Large-Scale Multimedia Information Management and Retrieval - Mathematics of Perception" covers knowledge representation and semantic analysis of multimedia data and scalability in signal extraction, data mining, and indexing. The book is divided into two parts: Part I - Knowledge Representation and Semantic Analysis focuses on the key components of mathematics of perception as it applies to data management and retrieval. These include feature selection/reduction, knowledge representation, semantic analysis, distance function formulation for measuring similarity, and

  17. Evaluation of digital soil mapping approaches with large sets of environmental covariates

    Science.gov (United States)

    Nussbaum, Madlene; Spiess, Kay; Baltensweiler, Andri; Grob, Urs; Keller, Armin; Greiner, Lucie; Schaepman, Michael E.; Papritz, Andreas

    2018-01-01

    The spatial assessment of soil functions requires maps of basic soil properties. Unfortunately, these are either missing for many regions or are not available at the desired spatial resolution or down to the required soil depth. The field-based generation of large soil datasets and conventional soil maps remains costly. Meanwhile, legacy soil data and comprehensive sets of spatial environmental data are available for many regions. Digital soil mapping (DSM) approaches relating soil data (responses) to environmental data (covariates) face the challenge of building statistical models from large sets of covariates originating, for example, from airborne imaging spectroscopy or multi-scale terrain analysis. We evaluated six approaches for DSM in three study regions in Switzerland (Berne, Greifensee, ZH forest) by mapping the effective soil depth available to plants (SD), pH, soil organic matter (SOM), effective cation exchange capacity (ECEC), clay, silt, gravel content and fine fraction bulk density for four soil depths (totalling 48 responses). Models were built from 300-500 environmental covariates by selecting linear models through (1) grouped lasso and (2) an ad hoc stepwise procedure for robust external-drift kriging (georob). For (3) geoadditive models we selected penalized smoothing spline terms by component-wise gradient boosting (geoGAM). We further used two tree-based methods: (4) boosted regression trees (BRTs) and (5) random forest (RF). Lastly, we computed (6) weighted model averages (MAs) from the predictions obtained from methods 1-5. Lasso, georob and geoGAM successfully selected strongly reduced sets of covariates (subsets of 3-6 % of all covariates). Differences in predictive performance, tested on independent validation data, were mostly small and did not reveal a single best method for 48 responses. Nevertheless, RF was often the best among methods 1-5 (28 of 48 responses), but was outcompeted by MA for 14 of these 28 responses. RF tended to over
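
    Two of the model families mentioned above (lasso selection of a sparse covariate subset and a random forest), combined into a simple weighted model average, can be sketched generically with scikit-learn. The data below are synthetic stand-ins for a soil response and its several hundred covariates; none of the numbers refer to the Swiss study.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, p = 400, 300                                   # a few hundred covariates, as in DSM settings
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=n)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.25, random_state=0)

# lasso selects a strongly reduced covariate subset
lasso = LassoCV(cv=5).fit(X_tr, y_tr)
print(f"lasso kept {np.count_nonzero(lasso.coef_)} of {p} covariates")

# random forest uses all covariates
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)

# weighted model average, with inverse-RMSE weights (ideally computed on separate data)
preds = {"lasso": lasso.predict(X_va), "rf": rf.predict(X_va)}
rmse = {k: float(np.sqrt(np.mean((v - y_va) ** 2))) for k, v in preds.items()}
weights = {k: (1 / r) / sum(1 / r2 for r2 in rmse.values()) for k, r in rmse.items()}
ma = sum(weights[k] * preds[k] for k in preds)
print(rmse, "model average RMSE:", np.sqrt(np.mean((ma - y_va) ** 2)))
```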

  18. Developing and setting up of a nuclear medicine information management system

    International Nuclear Information System (INIS)

    Baghel, N.S.; Asopa, R.; Nayak, U.N.; Rajan, M.G.R.; Subhalakshmi, P.V.; Shailaja, A.; Rajashekharrao, B.; Karunanidhi, Y.R.

    2010-01-01

    Full text: With the advent and progress of information technology in the present decade, high-performance networks are being installed in hospitals to implement effective and reliable Hospital Information Management Systems (HIMS). The Radiation Medicine Centre (RMC) is one of the earliest and largest nuclear medicine centres in India, and several thousand patients undergo diagnostic as well as therapeutic procedures with different radiopharmaceuticals. The evolution towards a fully digital department of nuclear medicine is driven by expectations of not only improved patient management but also a well-defined workflow along with prompt and high-quality patient services. The aim was to develop and set up a practical and utility-based Nuclear Medicine Information Management System (NMIMS) for various functional procedures at RMC. A customised NMIMS was developed with M/s ECIL using ASP.NET and SQL server technology, facilitated by an IBM x3650 M3 server, 18 thin clients/desktop PCs, the Windows 2008 server operating system and MS-SQL 2005 server software. Various modules have been developed to meet the requirements of different activities pertaining to patient appointment and scheduling, clinical assessment, radiopharmacy procedures, imaging and non-imaging studies and protocols, in-vitro laboratory tests, in-patient and out-patient treatment procedures, radiation protection and regulatory aspects, and other routine operational procedures associated with patient management at RMC. The menus are developed as per the scheduled workflow (SWF) in the department. The various aspects of the SWF have been designed to ensure smooth, easy and trouble-free patient management. Presently, the NMIMS excludes imaging data, and we are in the process of setting up a Picture Archiving and Communication System (PACS) integrated with the existing database system, which will archive and provide imaging data in DICOM format in order to make a paperless department. The developed NMIMS…

  19. Tailoring Healthy Workplace Interventions to Local Healthcare Settings: A Complexity Theory-Informed Workplace of Well-Being Framework.

    Science.gov (United States)

    Brand, Sarah L; Fleming, Lora E; Wyatt, Katrina M

    2015-01-01

    Many healthy workplace interventions have been developed for healthcare settings to address the consistently low scores of healthcare professionals on assessments of mental and physical well-being. Complex healthcare settings present challenges for the scale-up and spread of successful interventions from one setting to another. Despite general agreement regarding the importance of the local setting in affecting intervention success across different settings, there is no consensus on what it is about a local setting that needs to be taken into account to design healthy workplace interventions appropriate for different local settings. Complexity theory principles were used to understand a workplace as a complex adaptive system and to create a framework of eight domains (system characteristics) that affect the emergence of system-level behaviour. This Workplace of Well-being (WoW) framework is responsive and adaptive to local settings and allows a shared understanding of the enablers and barriers to behaviour change by capturing local information for each of the eight domains. We use the results of applying the WoW framework to one workplace, a UK National Health Service ward, to describe the utility of this approach in informing design of setting-appropriate healthy workplace interventions that create workplaces conducive to healthy behaviour change.

  20. Tailoring Healthy Workplace Interventions to Local Healthcare Settings: A Complexity Theory-Informed Workplace of Well-Being Framework

    Directory of Open Access Journals (Sweden)

    Sarah L. Brand

    2015-01-01

    Full Text Available Many healthy workplace interventions have been developed for healthcare settings to address the consistently low scores of healthcare professionals on assessments of mental and physical well-being. Complex healthcare settings present challenges for the scale-up and spread of successful interventions from one setting to another. Despite general agreement regarding the importance of the local setting in affecting intervention success across different settings, there is no consensus on what it is about a local setting that needs to be taken into account to design healthy workplace interventions appropriate for different local settings. Complexity theory principles were used to understand a workplace as a complex adaptive system and to create a framework of eight domains (system characteristics) that affect the emergence of system-level behaviour. This Workplace of Well-being (WoW) framework is responsive and adaptive to local settings and allows a shared understanding of the enablers and barriers to behaviour change by capturing local information for each of the eight domains. We use the results of applying the WoW framework to one workplace, a UK National Health Service ward, to describe the utility of this approach in informing design of setting-appropriate healthy workplace interventions that create workplaces conducive to healthy behaviour change.

  1. Tailoring Healthy Workplace Interventions to Local Healthcare Settings: A Complexity Theory-Informed Workplace of Well-Being Framework

    Science.gov (United States)

    Brand, Sarah L.; Fleming, Lora E.; Wyatt, Katrina M.

    2015-01-01

    Many healthy workplace interventions have been developed for healthcare settings to address the consistently low scores of healthcare professionals on assessments of mental and physical well-being. Complex healthcare settings present challenges for the scale-up and spread of successful interventions from one setting to another. Despite general agreement regarding the importance of the local setting in affecting intervention success across different settings, there is no consensus on what it is about a local setting that needs to be taken into account to design healthy workplace interventions appropriate for different local settings. Complexity theory principles were used to understand a workplace as a complex adaptive system and to create a framework of eight domains (system characteristics) that affect the emergence of system-level behaviour. This Workplace of Well-being (WoW) framework is responsive and adaptive to local settings and allows a shared understanding of the enablers and barriers to behaviour change by capturing local information for each of the eight domains. We use the results of applying the WoW framework to one workplace, a UK National Health Service ward, to describe the utility of this approach in informing design of setting-appropriate healthy workplace interventions that create workplaces conducive to healthy behaviour change. PMID:26380358

  2. Setting up and Running a School Library. Information Collection and Exchange Publication No. ED204

    Science.gov (United States)

    Baird, Nicola

    2012-01-01

    This book explains how teachers can set up and run a successful school library. In it you will find advice and information on how to: (1) set up a small library and build bookshelves; (2) select books for your library; (3) make a written record of your school's books, pamphlets and other library stock such as newspapers, magazines, audio tapes and…

  3. Support of an Active Science Project by a Large Information System: Lessons for the EOS Era

    Science.gov (United States)

    Angelici, Gary L.; Skiles, J. W.; Popovici, Lidia Z.

    1993-01-01

    The ability of large information systems to support the changing data requirements of active science projects is being tested in a NASA collaborative study. This paper briefly profiles both the active science project and the large information system involved in this effort and offers some observations about the effectiveness of the project support. This is followed by lessons that are important for those participating in large information systems that need to support active science projects or that make available the valuable data produced by these projects. We learned in this work that it is difficult for a large information system focused on long term data management to satisfy the requirements of an on-going science project. For example, in order to provide the best service, it is important for all information system staff to keep focused on the needs and constraints of the scientists in the development of appropriate services. If the lessons learned in this and other science support experiences are not applied by those involved with large information systems of the EOS (Earth Observing System) era, then the final data products produced by future science projects may not be robust or of high quality, thereby making the conduct of the project science less efficacious and reducing the value of these unique suites of data for future research.

  4. Decomposing wage distributions on a large data set - a quantile regression analysis of the gender wage gap

    DEFF Research Database (Denmark)

    Albæk, Karsten; Brink Thomsen, Lars

    This paper presents and implements a procedure that makes it possible to decompose wage distributions on large data sets. We replace bootstrap sampling in the standard Machado-Mata procedure with ‘non-replacement subsampling’, which is more suitable for the linked employer-employee data applied in this paper. Decompositions show that most of the glass ceiling is related to segregation in the form of either composition effects or different returns to males and females. A counterfactual wage distribution without differences in the constant terms (or ‘discrimination’) implies substantial changes in gender wage differences in the lower part of the wage distribution.
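
    A compact, synthetic illustration of the Machado-Mata idea with non-replacement subsampling, using statsmodels' quantile regression: coefficients estimated on one group are combined with characteristics drawn from the other group to build a counterfactual wage distribution. The two groups, sample sizes and wage equation below are invented and do not reproduce the paper's linked employer-employee data or results.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

def simulate_group(n, intercept, slope):
    x = rng.normal(size=n)
    return sm.add_constant(x.reshape(-1, 1)), intercept + slope * x + rng.normal(scale=0.5, size=n)

X_m, w_m = simulate_group(20000, intercept=3.0, slope=0.8)   # group A ("male") wages
X_f, w_f = simulate_group(20000, intercept=2.7, slope=0.8)   # group B ("female") wages

def machado_mata(X_coef, w_coef, X_chars, n_draws=300, subsample=2000):
    """Counterfactual wages: coefficients of one group priced at the other group's characteristics.
    Each quantile regression is fitted on a subsample drawn without replacement."""
    draws = []
    for q in rng.uniform(0.05, 0.95, size=n_draws):
        idx = rng.choice(len(w_coef), size=subsample, replace=False)
        beta = sm.QuantReg(w_coef[idx], X_coef[idx]).fit(q=q).params
        draws.append(X_chars[rng.integers(len(X_chars))] @ beta)
    return np.array(draws)

cf = machado_mata(X_m, w_m, X_f)   # female characteristics, male coefficients
for q in (0.1, 0.5, 0.9):
    raw = np.quantile(w_m, q) - np.quantile(w_f, q)
    chars = np.quantile(w_m, q) - np.quantile(cf, q)    # part explained by characteristics
    coefs = np.quantile(cf, q) - np.quantile(w_f, q)    # part explained by coefficients
    print(f"q={q}: raw gap {raw:.2f} = characteristics {chars:.2f} + coefficients {coefs:.2f}")
```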

  5. Recruiting Science Majors into Secondary Science Teaching: Paid Internships in Informal Science Settings

    Science.gov (United States)

    Worsham, Heather M.; Friedrichsen, Patricia; Soucie, Marilyn; Barnett, Ellen; Akiba, Motoko

    2014-01-01

    Despite the importance of recruiting highly qualified individuals into the science teaching profession, little is known about the effectiveness of particular recruitment strategies. Over 3 years, 34 college science majors and undecided students were recruited into paid internships in informal science settings to consider secondary science teaching…

  6. Stereoscopy in Static Scientific Imagery in an Informal Education Setting: Does It Matter?

    Science.gov (United States)

    Price, C. Aaron; Lee, H.-S.; Malatesta, K.

    2014-01-01

    Stereoscopic technology (3D) is rapidly becoming ubiquitous across research, entertainment and informal educational settings. Children of today may grow up never knowing a time when movies, television and video games were not available stereoscopically. Despite this rapid expansion, the field's understanding of the impact of stereoscopic…

  7. Visual attention mitigates information loss in small- and large-scale neural codes

    Science.gov (United States)

    Sprague, Thomas C; Saproo, Sameer; Serences, John T

    2015-01-01

    The visual system transforms complex inputs into robust and parsimonious neural codes that efficiently guide behavior. Because neural communication is stochastic, the amount of encoded visual information necessarily decreases with each synapse. This constraint requires processing sensory signals in a manner that protects information about relevant stimuli from degradation. Such selective processing – or selective attention – is implemented via several mechanisms, including neural gain and changes in tuning properties. However, examining each of these effects in isolation obscures their joint impact on the fidelity of stimulus feature representations by large-scale population codes. Instead, large-scale activity patterns can be used to reconstruct representations of relevant and irrelevant stimuli, providing a holistic understanding about how neuron-level modulations collectively impact stimulus encoding. PMID:25769502

  8. Cost (and Quality and Value) of Information Technology Support in Large Research Universities.

    Science.gov (United States)

    Peebles, Christopher S.; Antolovic, Laurie

    1999-01-01

    Shows how financial and quality measures associated with the Balanced Scorecard (developed by Kaplan and Norton to measure organizational performance) can be applied to information technology (IT) user education and support in large research universities. Focuses on University Information Technology Services that has measured the quality of IT…

  9. Constraining new resonant physics with top spin polarisation information

    Energy Technology Data Exchange (ETDEWEB)

    Englert, Christoph; Nordstroem, Karl [University of Glasgow, SUPA, School of Physics and Astronomy, Glasgow (United Kingdom); Ferrando, James [DESY Hamburg, Hamburg (Germany)

    2017-06-15

    We provide a comprehensive analysis of the power of including top quark-polarisation information to kinematically challenging top pair resonance searches, for which ATLAS and CMS start losing sensitivity. Following the general modelling and analysis strategies pursued by the experiments, we analyse the semi-leptonic and the di-lepton channels and show that including polarisation information can lead to large improvements in the limit setting procedures with large data sets. This will allow us to set stronger limits for parameter choices where sensitivity from the invariant mass of the top pair is not sufficient. This highlights the importance of spin observables as part of a more comprehensive set of observables to gain sensitivity to BSM resonance searches. (orig.)

  10. Isotropic gates in large gamma detector arrays versus angular distributions

    International Nuclear Information System (INIS)

    Iacob, V.E.; Duchene, G.

    1997-01-01

    The quality of the angular distribution information extracted from high-fold gamma-gamma coincidence events is analyzed. It is shown that a correct quasi-isotropic gate setting, available at the modern large gamma-ray detector arrays, essentially preserves the quality of the angular information. (orig.)

  11. MATLAB-SIMULINK BASED INFORMATION SUPPORT FOR DIGITAL OVERCURRENT PROTECTION TEST SETS

    Directory of Open Access Journals (Sweden)

    I. V. Novash

    2017-01-01

    Full Text Available The implementation of information support for PC-based and hardware-software-based test sets for digital overcurrent protection devices and their models, using the MatLab-Simulink environment, is considered. It is demonstrated that the mathematical modeling of a part of the power system, viz. the generalized electric power object, can be based on rigid and flexible models. Rigid models, implemented on the basis of a mathematical description of the electrical and magnetic circuits of a power system, can be considered as a reference against which simulation results obtained with another simulation system are compared. It is proposed to implement flexible models of the generalized electric power object in the MatLab-Simulink environment, which includes the SimPowerSystems component library targeted at power system modeling. The features of the parameter calculation for the SimPowerSystems library blocks from which the power system model is formed are considered. From the standard Simulink blocks, models of wye-connected current transformers were composed, as well as the digital overcurrent protection missing from the component library. A comparison of simulation results for one and the same generalized electric power object implemented in various PC-based software packages was undertaken. The divergence of simulation results did not exceed 3 %, which allows us to recommend the MatLab-Simulink environment for creating information support for hardware-software-based test sets for digital overcurrent protection devices. The structure of a hardware-software-based test set for digital overcurrent protection devices using the Omicron CMC 356 has been suggested. A time-to-trip comparison between the real digital protection device МР 801 and a model whose parameters exactly match those of the prototype device was carried out using identical test inputs. The results of the tests…

  12. A Mine of Information: Can Sports Analytics Provide Wisdom From Your Data?

    Science.gov (United States)

    Passfield, Louis; Hopker, James G

    2017-08-01

    This paper explores the notion that the availability and analysis of large data sets have the capacity to improve practice and change the nature of science in the sport and exercise setting. The increasing use of data and information technology in sport is giving rise to this change. Web sites hold large data repositories, and the development of wearable technology, mobile phone applications, and related instruments for monitoring physical activity, training, and competition provide large data sets of extensive and detailed measurements. Innovative approaches conceived to more fully exploit these large data sets could provide a basis for more objective evaluation of coaching strategies and new approaches to how science is conducted. An emerging discipline, sports analytics, could help overcome some of the challenges involved in obtaining knowledge and wisdom from these large data sets. Examples of where large data sets have been analyzed, to evaluate the career development of elite cyclists and to characterize and optimize the training load of well-trained runners, are discussed. Careful verification of large data sets is time consuming and imperative before useful conclusions can be drawn. Consequently, it is recommended that prospective studies be preferred over retrospective analyses of data. It is concluded that rigorous analysis of large data sets could enhance our knowledge in the sport and exercise sciences, inform competitive strategies, and allow innovative new research and findings.

  13. Risk-based optimization of pipe inspections in large underground networks with imprecise information

    International Nuclear Information System (INIS)

    Mancuso, A.; Compare, M.; Salo, A.; Zio, E.; Laakso, T.

    2016-01-01

    In this paper, we present a novel risk-based methodology for optimizing the inspections of large underground infrastructure networks in the presence of incomplete information about the network features and parameters. The methodology employs Multi Attribute Value Theory to assess the risk of each pipe in the network, whereafter the optimal inspection campaign is built with Portfolio Decision Analysis (PDA). Specifically, Robust Portfolio Modeling (RPM) is employed to identify Pareto-optimal portfolios of pipe inspections. The proposed methodology is illustrated by reporting a real case study on the large-scale maintenance optimization of the sewerage network in Espoo, Finland. - Highlights: • Risk-based approach to optimize pipe inspections on large underground networks. • Reasonable computational effort to select efficient inspection portfolios. • Possibility to accommodate imprecise expert information. • Feasibility of the approach shown by Espoo water system case study.

  14. Predictive information speeds up visual awareness in an individuation task by modulating threshold setting, not processing efficiency.

    Science.gov (United States)

    De Loof, Esther; Van Opstal, Filip; Verguts, Tom

    2016-04-01

    Theories on visual awareness claim that predicted stimuli reach awareness faster than unpredicted ones. In the current study, we disentangle whether prior information about the upcoming stimulus affects visual awareness of stimulus location (i.e., individuation) by modulating processing efficiency or threshold setting. Analogous research on stimulus identification revealed that prior information modulates threshold setting. However, as identification and individuation are two functionally and neurally distinct processes, the mechanisms underlying identification cannot simply be extrapolated directly to individuation. The goal of this study was therefore to investigate how individuation is influenced by prior information about the upcoming stimulus. To do so, a drift diffusion model was fitted to estimate the processing efficiency and threshold setting for predicted versus unpredicted stimuli in a cued individuation paradigm. Participants were asked to locate a picture, following a cue that was congruent, incongruent or neutral with respect to the picture's identity. Pictures were individuated faster in the congruent and neutral conditions than in the incongruent condition. In the diffusion model analysis, processing efficiency was not significantly different across conditions. However, the threshold setting was significantly higher following an incongruent cue compared to both congruent and neutral cues. Our results indicate that predictive information about the upcoming stimulus influences visual awareness by shifting the threshold for individuation rather than by enhancing processing efficiency. Copyright © 2016 Elsevier Ltd. All rights reserved.
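
    The two diffusion-model quantities contrasted above (drift rate as processing efficiency, boundary separation as threshold setting) can be made concrete with a small simulation; the parameter values and the mapping to cue conditions below are illustrative only and are not the fitted values from the study.

```python
import numpy as np

def simulate_ddm(drift, threshold, n_trials=2000, dt=0.002, noise=1.0, seed=0):
    """Accumulate noisy evidence until it hits +threshold (correct) or -threshold (error)."""
    rng = np.random.default_rng(seed)
    rts, correct = [], []
    for _ in range(n_trials):
        x, t = 0.0, 0.0
        while abs(x) < threshold:
            x += drift * dt + noise * np.sqrt(dt) * rng.normal()
            t += dt
        rts.append(t)
        correct.append(x >= threshold)
    return np.array(rts), np.array(correct)

# identical processing efficiency (drift), different threshold settings
for label, thr in [("congruent/neutral cue (lower threshold)", 0.8),
                   ("incongruent cue (higher threshold)", 1.2)]:
    rts, correct = simulate_ddm(drift=1.5, threshold=thr)
    print(f"{label}: mean RT {rts.mean():.3f} s, accuracy {correct.mean():.2f}")
# raising only the threshold slows responses (and raises accuracy) without touching efficiency
```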

  15. Calculations of safe collimator settings and β^{*} at the CERN Large Hadron Collider

    Directory of Open Access Journals (Sweden)

    R. Bruce

    2015-06-01

    Full Text Available The first run of the Large Hadron Collider (LHC) at CERN was very successful and resulted in important physics discoveries. One way of increasing the luminosity in a collider, which gave a very significant contribution to the LHC performance in the first run and can be used even if the beam intensity cannot be increased, is to decrease the transverse beam size at the interaction points by reducing the optical function β*. However, when doing so, the beam becomes larger in the final focusing system, which could expose its aperture to beam losses. For the LHC, which is designed to store beams with a total energy of 362 MJ, this is critical, since the loss of even a small fraction of the beam could cause a magnet quench or even damage. Therefore, the machine aperture has to be protected by the collimation system. The settings of the collimators constrain the maximum beam size that can be tolerated and therefore impose a lower limit on β*. In this paper, we present calculations to determine safe collimator settings and the resulting limit on β*, based on available aperture and operational stability of the machine. Our model was used to determine the LHC configurations in 2011 and 2012 and it was found that β* could be decreased significantly compared to the conservative model used in 2010. The gain in luminosity resulting from the decreased margins between collimators was more than a factor 2, and a further contribution from the use of realistic aperture estimates based on measurements was almost as large. This has played an essential role in the rapid and successful accumulation of experimental data in the LHC.

  16. Calculations of safe collimator settings and β* at the CERN Large Hadron Collider

    Science.gov (United States)

    Bruce, R.; Assmann, R. W.; Redaelli, S.

    2015-06-01

    The first run of the Large Hadron Collider (LHC) at CERN was very successful and resulted in important physics discoveries. One way of increasing the luminosity in a collider, which gave a very significant contribution to the LHC performance in the first run and can be used even if the beam intensity cannot be increased, is to decrease the transverse beam size at the interaction points by reducing the optical function β*. However, when doing so, the beam becomes larger in the final focusing system, which could expose its aperture to beam losses. For the LHC, which is designed to store beams with a total energy of 362 MJ, this is critical, since the loss of even a small fraction of the beam could cause a magnet quench or even damage. Therefore, the machine aperture has to be protected by the collimation system. The settings of the collimators constrain the maximum beam size that can be tolerated and therefore impose a lower limit on β*. In this paper, we present calculations to determine safe collimator settings and the resulting limit on β*, based on available aperture and operational stability of the machine. Our model was used to determine the LHC configurations in 2011 and 2012 and it was found that β* could be decreased significantly compared to the conservative model used in 2010. The gain in luminosity resulting from the decreased margins between collimators was more than a factor 2, and a further contribution from the use of realistic aperture estimates based on measurements was almost as large. This has played an essential role in the rapid and successful accumulation of experimental data in the LHC.

  17. A large set of potential past, present and future hydro-meteorological time series for the UK

    Directory of Open Access Journals (Sweden)

    B. P. Guillod

    2018-01-01

    Full Text Available Hydro-meteorological extremes such as drought and heavy precipitation can have large impacts on society and the economy. With potentially increasing risks associated with such events due to climate change, properly assessing the associated impacts and uncertainties is critical for adequate adaptation. However, the application of risk-based approaches often requires large sets of extreme events, which are not commonly available. Here, we present such a large set of hydro-meteorological time series for recent past and future conditions for the United Kingdom based on weather@home 2, a modelling framework consisting of a global climate model (GCM) driven by observed or projected sea surface temperature (SST) and sea ice which is downscaled to 25 km over the European domain by a regional climate model (RCM). Sets of 100 time series are generated for each of (i) a historical baseline (1900–2006), (ii) five near-future scenarios (2020–2049) and (iii) five far-future scenarios (2070–2099). The five scenarios in each future time slice all follow the Representative Concentration Pathway 8.5 (RCP8.5) and sample the range of sea surface temperature and sea ice changes from CMIP5 (Coupled Model Intercomparison Project Phase 5) models. Validation of the historical baseline highlights good performance for temperature and potential evaporation, but substantial seasonal biases in mean precipitation, which are corrected using a linear approach. For extremes in low precipitation over a long accumulation period (> 3 months) and shorter-duration high precipitation (1–30 days), the time series generally represents past statistics well. Future projections show small precipitation increases in winter but large decreases in summer on average, leading to an overall drying, consistently with the most recent UK Climate Projections (UKCP09) but larger in magnitude than the latter. Both drought and high-precipitation events are projected to increase in frequency and

  18. Visual attention mitigates information loss in small- and large-scale neural codes.

    Science.gov (United States)

    Sprague, Thomas C; Saproo, Sameer; Serences, John T

    2015-04-01

    The visual system transforms complex inputs into robust and parsimonious neural codes that efficiently guide behavior. Because neural communication is stochastic, the amount of encoded visual information necessarily decreases with each synapse. This constraint requires that sensory signals are processed in a manner that protects information about relevant stimuli from degradation. Such selective processing--or selective attention--is implemented via several mechanisms, including neural gain and changes in tuning properties. However, examining each of these effects in isolation obscures their joint impact on the fidelity of stimulus feature representations by large-scale population codes. Instead, large-scale activity patterns can be used to reconstruct representations of relevant and irrelevant stimuli, thereby providing a holistic understanding about how neuron-level modulations collectively impact stimulus encoding. Copyright © 2015 Elsevier Ltd. All rights reserved.

  19. Efficient One-click Browsing of Large Trajectory Sets

    DEFF Research Database (Denmark)

    Krogh, Benjamin Bjerre; Andersen, Ove; Lewis-Kelham, Edwin

    2014-01-01

    This paper presents a novel query type called sheaf, where users can browse trajectory data sets using a single mouse click. Sheaves are very versatile and can be used for location-based advertising, travel-time analysis, intersection analysis, and reachability analysis (isochrones). A novel in-memory trajectory index compresses the data by a factor of 12.4 and enables execution of sheaf queries in 40 ms. This is up to 2 orders of magnitude faster than existing work. We demonstrate the simplicity, versatility, and efficiency of sheaf queries using a real-world trajectory set consisting of 2.7 million…

  20. Uncovering gender discrimination cues in a realistic setting.

    Science.gov (United States)

    Dupuis-Roy, Nicolas; Fortin, Isabelle; Fiset, Daniel; Gosselin, Frédéric

    2009-02-10

    Which face cues do we use for gender discrimination? Few studies have tried to answer this question, and the few that have typically used only a small set of grayscale stimuli, often distorted and presented a large number of times. Here, we reassessed the importance of facial cues for gender discrimination in a more realistic setting. We applied Bubbles, a technique that minimizes bias toward specific facial features and does not necessitate the distortion of stimuli, to a set of 300 color photographs of Caucasian faces, each presented only once to 30 participants. Results show that the region of the eyes and the eyebrows, probably in the light-dark channel, is the most important facial cue for accurate gender discrimination, and that the mouth region drives fast correct responses (but not fast incorrect responses); the gender discrimination information in the mouth region is concentrated in the red-green color channel. Together, these results suggest that, when color is informative in the mouth region, humans use it and respond rapidly; and, when it is not informative, they have to rely on the more robust but more sluggish luminance information in the eye-eyebrow region.

  1. PREP KITT, System Reliability by Fault Tree Analysis. PREP, Min Path Set and Min Cut Set for Fault Tree Analysis, Monte-Carlo Method. KITT, Component and System Reliability Information from Kinetic Fault Tree Theory

    International Nuclear Information System (INIS)

    Vesely, W.E.; Narum, R.E.

    1997-01-01

    1 - Description of problem or function: The PREP/KITT computer program package obtains system reliability information from a system fault tree. The PREP program finds the minimal cut sets and/or the minimal path sets of the system fault tree. (A minimal cut set is a smallest set of components such that if all the components are simultaneously failed the system is failed. A minimal path set is a smallest set of components such that if all of the components are simultaneously functioning the system is functioning.) The KITT programs determine reliability information for the components of each minimal cut or path set, for each minimal cut or path set, and for the system. Exact, time-dependent reliability information is determined for each component and for each minimal cut set or path set. For the system, reliability results are obtained by upper bound approximations or by a bracketing procedure in which various upper and lower bounds may be obtained as close to one another as desired. The KITT programs can handle independent components which are non-repairable or which have a constant repair time. Any assortment of non-repairable components and components having constant repair times can be considered. Any inhibit conditions having constant probabilities of occurrence can be handled. The failure intensity of each component is assumed to be constant with respect to time. The KITT2 program can also handle components which during different time intervals, called phases, may have different reliability properties. 2 - Method of solution: The PREP program obtains minimal cut sets by either direct deterministic testing or by an efficient Monte Carlo algorithm. The minimal path sets are obtained using the Monte Carlo algorithm. The reliability information is obtained by the KITT programs from numerical solution of the simple integral balance equations of kinetic tree theory. 3 - Restrictions on the complexity of the problem: The PREP program will obtain the minimal cut and
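
    The minimal-cut-set idea can be illustrated with a few lines of top-down expansion over a toy fault tree (OR gates take the union of their children's cut sets, AND gates combine them, and supersets are pruned). This is a didactic sketch, not the PREP algorithms, and the example tree is made up.

```python
from itertools import product

# Hypothetical fault tree: gate -> (type, inputs); names not in TREE are basic events.
TREE = {
    "TOP": ("AND", ["PUMP", "G1"]),
    "G1": ("OR", ["VALVE", "POWER"]),
}

def cut_sets(node):
    """Return the family of cut sets of a node as a set of frozensets of basic events."""
    if node not in TREE:                       # basic event
        return {frozenset([node])}
    kind, children = TREE[node]
    child_sets = [cut_sets(c) for c in children]
    if kind == "OR":                           # any child's cut set fails the gate
        return set().union(*child_sets)
    # AND: one cut set from every child must occur simultaneously
    return {frozenset().union(*combo) for combo in product(*child_sets)}

def keep_minimal(sets):
    """Discard cut sets that are supersets of another cut set."""
    return {s for s in sets if not any(other < s for other in sets)}

print(sorted(sorted(cs) for cs in keep_minimal(cut_sets("TOP"))))
# expected: [['POWER', 'PUMP'], ['PUMP', 'VALVE']]
```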

  2. Challenges in the use of the mental health information system in a resource-limited setting: lessons from Ghana.

    Science.gov (United States)

    Kpobi, Lily; Swartz, Leslie; Ofori-Atta, Angela L

    2018-02-08

    One of the most successful modes of record-keeping and data collection is the use of health management information systems, where patient information and management plans are uniformly entered into a database to streamline the information and for ease of further patient management. For mental healthcare, a Mental Health Information System (MHIS) has been found most successful since a properly established and operational MHIS is helpful for developing equitable and appropriate mental health care systems. Until 2010, the system of keeping patient records and information in the Accra Psychiatric Hospital of Ghana was old and outdated. In light of this and other factors, a complete reforming of the mental health information systems in three psychiatric hospitals in Ghana was undertaken in 2010. Four years after its implementation, we explored user experiences with the new system, and report here the challenges that were identified with use of the new MHIS. Individual semi-structured interviews were conducted with nine clinical and administrative staff of the Accra Psychiatric Hospital to examine their experiences with the new MHIS. Participants in the study were in three categories: clinical staff, administrator, and records clerk. Participants' knowledge of the system and its use, as well as the challenges they had experienced in its use were explored using an interpretative phenomenological approach. The data suggest that optimal use of the current MHIS had faced significant implementation challenges in a number of areas. Central challenges reported by users included increased workload, poor staff involvement and training, and absence of logistic support to keep the system running. Setting up a new system does not guarantee its success. As important as it is to have a mental health information system, its usefulness is largely dependent on proper implementation and maintenance. Further, the system can facilitate policy transformation only when the place of mental

  3. An ancestry informative marker set for determining continental origin: validation and extension using human genome diversity panels

    Directory of Open Access Journals (Sweden)

    Gregersen Peter K

    2009-07-01

    Full Text Available Abstract. Background: Case-control genetic studies of complex human diseases can be confounded by population stratification. This issue can be addressed using panels of ancestry informative markers (AIMs) that can provide substantial population substructure information. Previously, we described a panel of 128 SNP AIMs that were designed as a tool for ascertaining the origins of subjects from Europe, Sub-Saharan Africa, the Americas, and East Asia. Results: In this study, genotypes from Human Genome Diversity Panel populations were used to further evaluate a 93 SNP AIM panel, a subset of the 128 AIM set, for distinguishing continental origins. Using both model-based and relatively model-independent methods, we here confirm the ability of this AIM set to distinguish diverse population groups that were not previously evaluated. This study included multiple population groups from Oceania, South Asia, East Asia, Sub-Saharan Africa, North and South America, and Europe. In addition, the 93 AIM set provides population substructure information that can, for example, distinguish Arab and Ashkenazi from Northern European population groups and Pygmy from other Sub-Saharan African population groups. Conclusion: These data provide additional support for using the 93 AIM set to efficiently identify continental subject groups for genetic studies, to identify study population outliers, and to control for admixture in association studies.
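
    A minimal, model-independent illustration of how such a marker panel separates population groups is a principal component analysis of the 0/1/2 genotype matrix. The allele frequencies, group sizes and labels below are invented for the sketch and bear no relation to the HGDP data.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_aims = 93

def sample_group(n_individuals, freqs):
    """Genotypes coded as 0/1/2 copies of the reference allele, binomial draws per SNP."""
    return rng.binomial(2, freqs, size=(n_individuals, len(freqs)))

# two invented populations with diverged allele frequencies at the AIMs
freq_a = rng.uniform(0.2, 0.8, n_aims)
freq_b = np.clip(freq_a + rng.choice([-0.3, 0.3], size=n_aims), 0.02, 0.98)

genotypes = np.vstack([sample_group(60, freq_a), sample_group(60, freq_b)])
labels = np.array(["A"] * 60 + ["B"] * 60)

standardized = (genotypes - genotypes.mean(0)) / (genotypes.std(0) + 1e-9)
pcs = PCA(n_components=2).fit_transform(standardized)
print("PC1 mean, group A:", pcs[labels == "A", 0].mean(),
      "group B:", pcs[labels == "B", 0].mean())   # the two groups separate along PC1
```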

  4. A Core Set Based Large Vector-Angular Region and Margin Approach for Novelty Detection

    Directory of Open Access Journals (Sweden)

    Jiusheng Chen

    2016-01-01

    Full Text Available A large vector-angular region and margin (LARM) approach is presented for novelty detection based on imbalanced data. The key idea is to construct the largest vector-angular region in the feature space to separate normal training patterns and, meanwhile, to maximize the vector-angular margin between the surface of this optimal vector-angular region and the abnormal training patterns. In order to improve the generalization performance of LARM, the vector-angular distribution is optimized by maximizing the vector-angular mean and minimizing the vector-angular variance, which separates the normal and abnormal examples well. However, the inherent computation of the quadratic programming (QP) solver takes O(n^3) training time and at least O(n^2) space, which might be computationally prohibitive for large-scale problems. Using a (1+ε)- and (1-ε)-approximation algorithm, a core set based LARM algorithm is proposed for fast training of the LARM problem. Experimental results based on imbalanced datasets have validated the favorable efficiency of the proposed approach in novelty detection.
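
    The LARM optimization itself is a specialized quadratic program; purely as a point of reference for the task it addresses, the sketch below runs a standard one-class SVM (an off-the-shelf novelty detector, not the LARM method) on a synthetic imbalanced problem with abundant normal and rare abnormal examples.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
normal_train = rng.normal(loc=0.0, scale=1.0, size=(500, 10))    # abundant normal class
normal_test = rng.normal(loc=0.0, scale=1.0, size=(100, 10))
abnormal_test = rng.normal(loc=4.0, scale=1.0, size=(10, 10))    # rare abnormal class

detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05).fit(normal_train)
print("normal accepted:", (detector.predict(normal_test) == 1).mean())
print("abnormal flagged:", (detector.predict(abnormal_test) == -1).mean())
```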

  5. Informal Music Education: The Nature of a Young Child's Engagement in an Individual Piano Lesson Setting

    Science.gov (United States)

    Kooistra, Lauren

    2016-01-01

    The purpose of this study was to gain insight into the nature of a young child's engagement in an individual music lesson setting based on principles of informal learning. The informal educational space allowed the child to observe, explore, and interact with a musical environment as a process of enculturation and development (Gordon, 2013;…

  6. Numerical Estimation of Information Theoretic Measures for Large Data Sets

    Science.gov (United States)

    2013-01-30

  7. Priority Setting for Universal Health Coverage: We Need Evidence-Informed Deliberative Processes, Not Just More Evidence on Cost-Effectiveness

    Directory of Open Access Journals (Sweden)

    Rob Baltussen

    2016-11-01

    Full Text Available Priority setting of health interventions is generally considered a valuable approach to support low- and middle-income countries (LMICs) in their striving for universal health coverage (UHC). However, present initiatives on priority setting are mainly geared towards the development of more cost-effectiveness information, and this evidence does not sufficiently support countries in making optimal choices. The reason is that priority setting is in reality a value-laden political process in which multiple criteria beyond cost-effectiveness are important, and stakeholders often justifiably disagree about the relative importance of these criteria. Here, we propose the use of ‘evidence-informed deliberative processes’ as an approach that does explicitly recognise priority setting as a political process and an intrinsically complex task. In these processes, deliberation between stakeholders is crucial to identify, reflect on and learn about the meaning and importance of values, informed by evidence on these values. Such processes then result in the use of a broader range of explicit criteria that can be seen as the product of both international learning (‘core’ criteria, which include e.g. cost-effectiveness, priority to the worse off, and financial protection) and learning among local stakeholders (‘contextual’ criteria). We believe that, with these evidence-informed deliberative processes in place, priority setting can provide a more meaningful contribution to achieving UHC.

  8. An Automated Medical Information Management System (OpScan-MIMS) in a Clinical Setting

    Science.gov (United States)

    Margolis, S.; Baker, T.G.; Ritchey, M.G.; Alterescu, S.; Friedman, C.

    1981-01-01

    This paper describes an automated medical information management system within a clinic setting. The system includes an optically scanned data entry system (OpScan), a generalized, interactive retrieval and storage software system(Medical Information Management System, MIMS) and the use of time-sharing. The system has the advantages of minimal hardware purchase and maintenance, rapid data entry and retrieval, user-created programs, no need for user knowledge of computer language or technology and is cost effective. The OpScan-MIMS system has been operational for approximately 16 months in a sexually transmitted disease clinic. The system's application to medical audit, quality assurance, clinic management and clinical training are demonstrated.

  9. Direction of information flow in large-scale resting-state networks is frequency-dependent.

    Science.gov (United States)

    Hillebrand, Arjan; Tewarie, Prejaas; van Dellen, Edwin; Yu, Meichen; Carbo, Ellen W S; Douw, Linda; Gouw, Alida A; van Straaten, Elisabeth C W; Stam, Cornelis J

    2016-04-05

    Normal brain function requires interactions between spatially separated, and functionally specialized, macroscopic regions, yet the directionality of these interactions in large-scale functional networks is unknown. Magnetoencephalography was used to determine the directionality of these interactions, where directionality was inferred from time series of beamformer-reconstructed estimates of neuronal activation, using a recently proposed measure of phase transfer entropy. We observed well-organized posterior-to-anterior patterns of information flow in the higher-frequency bands (alpha1, alpha2, and beta band), dominated by regions in the visual cortex and posterior default mode network. Opposite patterns of anterior-to-posterior flow were found in the theta band, involving mainly regions in the frontal lobe that were sending information to a more distributed network. Many strong information senders in the theta band were also frequent receivers in the alpha2 band, and vice versa. Our results provide evidence that large-scale resting-state patterns of information flow in the human brain form frequency-dependent reentry loops that are dominated by flow from parieto-occipital cortex to integrative frontal areas in the higher-frequency bands, which is mirrored by a theta band anterior-to-posterior flow.
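
    Directed information flow of this kind is usually quantified with transfer entropy. The sketch below uses a rough histogram (plug-in) estimator on two toy signals to show the asymmetry; it is a didactic simplification, not the phase transfer entropy computed on beamformer-reconstructed MEG time series in the study.

```python
import numpy as np

def transfer_entropy(source, target, bins=8, lag=1):
    """Plug-in estimate of TE(source -> target) = I(target_t ; source_{t-lag} | target_{t-lag})."""
    s = np.digitize(source, np.histogram_bin_edges(source, bins))[:-lag]
    y_past = np.digitize(target, np.histogram_bin_edges(target, bins))[:-lag]
    y_now = np.digitize(target, np.histogram_bin_edges(target, bins))[lag:]

    def entropy(*cols):
        _, counts = np.unique(np.stack(cols, axis=1), axis=0, return_counts=True)
        p = counts / counts.sum()
        return -(p * np.log2(p)).sum()

    return entropy(y_now, y_past) + entropy(y_past, s) - entropy(y_past) - entropy(y_now, y_past, s)

rng = np.random.default_rng(0)
x = rng.normal(size=20000)
y = np.roll(x, 1) + 0.5 * rng.normal(size=20000)   # y is driven by x with a one-sample delay
print("TE x->y:", transfer_entropy(x, y))          # substantially positive
print("TE y->x:", transfer_entropy(y, x))          # close to zero (up to estimator bias)
```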

  10. Large-scale modeling of condition-specific gene regulatory networks by information integration and inference.

    Science.gov (United States)

    Ellwanger, Daniel Christian; Leonhardt, Jörn Florian; Mewes, Hans-Werner

    2014-12-01

    Understanding how regulatory networks globally coordinate the response of a cell to changing conditions, such as perturbations by shifting environments, is an elementary challenge in systems biology which has yet to be met. Genome-wide gene expression measurements are high dimensional as these are reflecting the condition-specific interplay of thousands of cellular components. The integration of prior biological knowledge into the modeling process of systems-wide gene regulation enables the large-scale interpretation of gene expression signals in the context of known regulatory relations. We developed COGERE (http://mips.helmholtz-muenchen.de/cogere), a method for the inference of condition-specific gene regulatory networks in human and mouse. We integrated existing knowledge of regulatory interactions from multiple sources to a comprehensive model of prior information. COGERE infers condition-specific regulation by evaluating the mutual dependency between regulator (transcription factor or miRNA) and target gene expression using prior information. This dependency is scored by the non-parametric, nonlinear correlation coefficient η(2) (eta squared) that is derived by a two-way analysis of variance. We show that COGERE significantly outperforms alternative methods in predicting condition-specific gene regulatory networks on simulated data sets. Furthermore, by inferring the cancer-specific gene regulatory network from the NCI-60 expression study, we demonstrate the utility of COGERE to promote hypothesis-driven clinical research. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
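
    The η² dependency score can be illustrated as a correlation ratio: the share of target-gene expression variance explained by a discretized regulator. The sketch below is a one-way simplification of the two-way analysis of variance used by the tool, and the gene expression data are synthetic.

```python
import numpy as np

def eta_squared(regulator, target, bins=4):
    """Correlation ratio: between-group sum of squares over total, with groups
    defined by binning the regulator's expression at its quantiles."""
    edges = np.quantile(regulator, np.linspace(0, 1, bins + 1)[1:-1])
    groups = np.digitize(regulator, edges)
    overall = target.mean()
    ss_between = sum(np.sum(groups == g) * (target[groups == g].mean() - overall) ** 2
                     for g in np.unique(groups))
    return ss_between / ((target - overall) ** 2).sum()

rng = np.random.default_rng(0)
tf = rng.normal(size=500)                                 # regulator expression (e.g. a TF)
target_regulated = 0.8 * tf + 0.3 * rng.normal(size=500)
target_unrelated = rng.normal(size=500)
print(eta_squared(tf, target_regulated))   # high score: expression depends on the regulator
print(eta_squared(tf, target_unrelated))   # low score: no dependency
```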

  11. Large-scale weakly supervised object localization via latent category learning.

    Science.gov (United States)

    Chong Wang; Kaiqi Huang; Weiqiang Ren; Junge Zhang; Maybank, Steve

    2015-04-01

    Localizing objects in cluttered backgrounds is challenging under large-scale weakly supervised conditions. Due to the cluttered image conditions, objects usually have large ambiguity with backgrounds. Besides, there is also a lack of effective algorithms for large-scale weakly supervised localization in cluttered backgrounds. However, backgrounds contain useful latent information, e.g., the sky in the aeroplane class. If this latent information can be learned, object-background ambiguity can be largely reduced and background can be suppressed effectively. In this paper, we propose latent category learning (LCL) for large-scale cluttered conditions. LCL is an unsupervised learning method which requires only image-level class labels. First, we use latent semantic analysis with a semantic object representation to learn the latent categories, which represent objects, object parts or backgrounds. Second, to determine which category contains the target object, we propose a category selection strategy that evaluates each category's discrimination. Finally, we propose an online LCL for use in large-scale conditions. Evaluation on the challenging PASCAL Visual Object Classes (VOC) 2007 and the large-scale ImageNet Large Scale Visual Recognition Challenge 2013 detection data sets shows that the method can improve annotation precision by 10% over previous methods. More importantly, we achieve detection precision that outperforms previous results by a large margin and is competitive with the supervised deformable part model 5.0 baseline on both data sets.

  12. The Impact of Ranking Information on Students’ Behavior and Performance in Peer Review Settings

    DEFF Research Database (Denmark)

    Papadopoulos, Pantelis M.; Lagkas, Thomas D.; Demetriadis, Stavros N.

    2015-01-01

    The paper explores the potential of usage and ranking information in increasing student engagement in a double-blinded peer review setting, where students are allowed to select freely which/how many peer works to review. The study employed 56 volunteering sophomore students majoring in Informatics...... and Telecommunications Engineering. We performed a controlled experiment, grouping students into 3 study conditions: control, usage data, usage and ranking data. Students in the control condition did not receive additional information. Students in the next two conditions were able to see their usage data (logins, peer...

  13. Augmenting Reality and Formality of Informal and Non-Formal Settings to Enhance Blended Learning

    Science.gov (United States)

    Pérez-Sanagustin, Mar; Hernández-Leo, Davinia; Santos, Patricia; Kloos, Carlos Delgado; Blat, Josep

    2014-01-01

    Visits to museums and city tours have been part of higher and secondary education curriculum activities for many years. However these activities are typically considered "less formal" when compared to those carried out in the classroom, mainly because they take place in informal or non-formal settings. Augmented Reality (AR) technologies…

  14. A comparison of social and spatial determinants of health between formal and informal settlements in a large metropolitan setting in Brazil.

    Science.gov (United States)

    Snyder, Robert E; Jaimes, Guillermo; Riley, Lee W; Faerstein, Eduardo; Corburn, Jason

    2014-06-01

    Urban informal settlements are often under-recognized in national and regional surveys. A lack of quality intra-urban data frequently contributes to a one-size-fits-all public health intervention and clinical strategies that rarely address the variegated socioeconomic disparities across and within different informal settlements in a city. The 2010 Brazilian census gathered detailed population and place-based data across the country's informal settlements. Here, we examined key socio-demographic and infrastructure characteristics that are associated with health outcomes in Rio de Janeiro with the census tract as the unit of analysis. Many of the city's residents (1.39 million people, 22 % of the population) live in informal settlements. Residents of census tracts in Rio de Janeiro's urban informal areas are younger (median age of 26 versus 35 years in formal settlements), and have less access to adequate water (96 versus 99 % of informal households), sanitation (86 versus 96 %), and electricity (67 versus 92 %). Average per household income in informal settlement census tracts is less than one third that of non-informal tracts (US$708 versus US$2362). Even among informal settlements in different planning areas in the same city, there is marked variation in these characteristics. Public health interventions, clinical management, and urban planning policies aiming to improve the living conditions of the people residing in informal settlements, including government strategies currently underway, must consider the differences that exist between and within informal settlements that shape place-based physical and social determinants of health.

  15. Factors affecting smartphone adoption for accessing information in medical settings.

    Science.gov (United States)

    Tahamtan, Iman; Pajouhanfar, Sara; Sedghi, Shahram; Azad, Mohsen; Roudbari, Masoud

    2017-06-01

    This study aimed to acquire knowledge about the factors affecting smartphone adoption for accessing information in medical settings in Iranian Hospitals. A qualitative and quantitative approach was used to conduct this study. Semi-structured interviews were conducted with 21 medical residents and interns in 2013 to identify determinant factors for smartphone adoption. Afterwards, nine relationships were hypothesised. We developed a questionnaire to test these hypotheses and to evaluate the importance of each factor. Structural equation modelling was used to analyse the causal relations between model parameters and to accurately identify determinant factors. Eight factors were identified in the qualitative phase of the study, including perceived usefulness, perceived ease of use, training, internal environment, personal experience, social impacts, observability and job related characteristics. Among the studied factors, perceived usefulness, personal experience and job related characteristics were significantly associated with attitude to use a smartphone which accounted for 64% of the variance in attitude. Perceived usefulness had the strongest impact on attitude to use a smartphone. The factors that emerged from interviews were consistent with the Technology Acceptance Model (TAM) and some previous studies. TAM is a reliable model for understanding the factors of smartphone acceptance in medical settings. © 2017 Health Libraries Group.

  16. Choosing the appropriate treatment setting: which information and decision-making needs do adult inpatients with mental disorders have? A qualitative interview study.

    Science.gov (United States)

    Kivelitz, Laura; Härter, Martin; Mohr, Jil; Melchior, Hanne; Goetzmann, Lutz; Warnke, Max Holger; Kleinschmidt, Silke; Dirmaier, Jörg

    2018-01-01

    Decisions on medical treatment setting are perceived as important but often difficult to make for patients with mental disorders. Shared decision-making as a strategy to decrease decisional conflict has been recommended, but is not yet widely implemented. This study aimed to investigate the information needs and the decision-making preferences of patients with mental disorders prior to the decision for a certain treatment setting. The results will serve as a prerequisite for the development of a high-quality patient decision aid (PtDA) regarding the treatment setting decision. We conducted retrospective individual semi-structured interviews with n=24 patients with mental disorders in three psychotherapeutic inpatient care units. The interviews were audiotaped, transcribed, coded, and content-analyzed. The majority of the patients wanted to be involved in the decision-making process. They reported high information needs regarding treatment options in order to feel empowered to participate adequately in the decision for a certain treatment setting. However, some patients did not want to participate or receive information, for example, because of their high burden of mental disorder. Whereas the majority were satisfied with the extent they were involved in the decision, few participants felt sufficiently informed about treatment options. Most patients reported that a decision aid regarding an appropriate treatment setting would have been helpful for them. Important information that should be included in a PtDA was general information about mental illness, effective treatment options, specific information about the different treatment settings, and access to treatment. The identified information and decision-making needs provide a valuable basis for the development of a PtDA aiming to support patients and caregivers regarding the decision for an adequate treatment setting. As preferences for participation vary among patients and also depend on the current mental state

  17. New set-up for high-quality soft-X-ray absorption spectroscopy of large organic molecules in the gas phase

    Energy Technology Data Exchange (ETDEWEB)

    Holch, Florian; Huebner, Dominique [Universitaet Wuerzburg, Experimentelle Physik VII and Roentgen Research Center for Complex Materials (RCCM), Am Hubland, 97074 Wuerzburg (Germany); Fink, Rainer [Universitaet Erlangen-Nuernberg, ICMM and CENEM, Egerlandstrasse 3, 91058 Erlangen (Germany); Schoell, Achim, E-mail: achim.schoell@physik.uni-wuerzburg.de [Universitaet Wuerzburg, Experimentelle Physik VII and Roentgen Research Center for Complex Materials (RCCM), Am Hubland, 97074 Wuerzburg (Germany); Umbach, Eberhard [Karlsruhe Institute of Technology, 76021 Karlsruhe (Germany)]

    2011-11-15

    Highlights: → We present a new set-up for X-ray absorption (NEXAFS) on large molecules in the gas phase. → The cell has a confined volume and can be heated. → The spectra can be acquired fast and are of very high quality with respect to signal-to-noise ratio and energy resolution. → This allows the analysis of spectroscopic details (e.g. solid state effects by comparing gas- and condensed-phase data). - Abstract: We present a new experimental set-up for the investigation of large (>128 amu) organic molecules in the gas phase by means of near-edge X-ray absorption fine structure spectroscopy in the soft X-ray range. Our approach uses a gas cell, which is sealed off against the surrounding vacuum and which can be heated above the sublimation temperature of the respective molecular compound. Using a confined volume rather than a molecular beam yields short acquisition times and intense signals due to the high molecular density, which can be tuned by the container temperature. In turn, the resulting spectra are of very high quality with respect to signal-to-noise ratio and energy resolution, which are the essential aspects for the analysis of fine spectroscopic details. Using the examples of ANQ, NTCDA, and PTCDA, specific challenges of gas-phase measurements on large organic molecules with high sublimation temperatures are addressed in detail with respect to the presented set-up and possible ways to tackle them are outlined.

  18. Social Work Involvement in Advance Care Planning: Findings from a Large Survey of Social Workers in Hospice and Palliative Care Settings.

    Science.gov (United States)

    Stein, Gary L; Cagle, John G; Christ, Grace H

    2017-03-01

    Few data are available describing the involvement and activities of social workers in advance care planning (ACP). We sought to provide data about (1) social worker involvement and leadership in ACP conversations with patients and families; and (2) the extent of functions and activities when these discussions occur. We conducted a large web-based survey of social workers employed in hospice, palliative care, and related settings to explore their role, participation, and self-rated competency in facilitating ACP discussions. Respondents were recruited through the Social Work Hospice and Palliative Care Network and the National Hospice and Palliative Care Organization. Descriptive analyses were conducted on the full sample of respondents (N = 641) and a subsample of clinical social workers (N = 456). Responses were analyzed to explore differences in ACP involvement by practice setting. Most clinical social workers (96%) reported that social workers in their department are conducting ACP discussions with patients/families. Majorities also participate in, and lead, ACP discussions (69% and 60%, respectively). Most respondents report that social workers are responsible for educating patients/families about ACP options (80%) and are the team members responsible for documenting ACP (68%). Compared with other settings, oncology and inpatient palliative care social workers were less likely to be responsible for ensuring that patients/families are informed of ACP options and documenting ACP preferences. Social workers are prominently involved in facilitating, leading, and documenting ACP discussions. Policy-makers, administrators, and providers should incorporate the vital contributions of social work professionals in policies and programs supporting ACP.

  19. Social Attitudes on Gender Equality and Firms' Discriminatory Pay-Setting

    OpenAIRE

    Janssen, Simon; Tuor Sartore, Simone N.; Backes-Gellner, Uschi

    2014-01-01

    We analyze the relationship between social attitudes on gender equality and firms' pay-setting behavior by combining information about regional votes relative to gender equality laws with a large data set of multi-branch firms and workers. The results show that multi-branch firms pay more discriminatory wages in branches located in regions with a higher social acceptance of gender inequality than in branches located in regions with a lower acceptance. The results are similar for different sub...

  20. The Classification of Complementary Information Set Codes of Lengths 14 and 16

    OpenAIRE

    Freibert, Finley

    2012-01-01

    In the paper "A new class of codes for Boolean masking of cryptographic computations," Carlet, Gaborit, Kim, and Solé defined a new class of rate one-half binary codes called complementary information set (or CIS) codes. The authors then classified all CIS codes of length less than or equal to 12. CIS codes have relations to classical Coding Theory as they are a generalization of self-dual codes. As stated in the paper, CIS codes also have important practical applications as they m...
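
    A small illustration of the defining property, under the common systematic convention that the generator matrix has the form G = [I | A]: the code is CIS exactly when the second coordinate half is also an information set, i.e. when A is invertible over GF(2). The example matrix is hypothetical.

    # Hedged sketch: checking the CIS property of a rate one-half [2n, n] binary
    # code given as G = [I | A]; both halves are information sets iff A is
    # invertible over GF(2). Matrix A below is illustrative only.
    import numpy as np

    def gf2_rank(mat):
        """Rank over GF(2) by Gaussian elimination."""
        m = mat.copy() % 2
        rank, rows, cols = 0, m.shape[0], m.shape[1]
        for col in range(cols):
            pivot = next((r for r in range(rank, rows) if m[r, col]), None)
            if pivot is None:
                continue
            m[[rank, pivot]] = m[[pivot, rank]]        # swap pivot row into place
            for r in range(rows):
                if r != rank and m[r, col]:
                    m[r] ^= m[rank]                    # eliminate column entries
            rank += 1
        return rank

    def is_cis(A):
        """Both halves of [I | A] are information sets iff A is invertible mod 2."""
        return gf2_rank(A) == A.shape[0]

    A = np.array([[1, 1, 0],
                  [0, 1, 1],
                  [1, 1, 1]], dtype=np.uint8)          # hypothetical right-hand block
    print(is_cis(A))                                    # True: [I | A] gives a length-6 CIS code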

  1. How information systems should support the information needs of general dentists in clinical settings: suggestions from a qualitative study

    Directory of Open Access Journals (Sweden)

    Wali Teena

    2010-02-01

    Full Text Available Abstract Background A major challenge in designing useful clinical information systems in dentistry is to incorporate clinical evidence based on dentists' information needs and then integrate the system seamlessly into the complex clinical workflow. However, little is known about the actual information needs of dentists during treatment sessions. The purpose of this study is to identify general dentists' information needs and the information sources they use to meet those needs in clinical settings so as to inform the design of dental information systems. Methods A semi-structured interview was conducted with a convenience sample of 18 general dentists in the Pittsburgh area during clinical hours. One hundred and five patient cases were reported by these dentists. Interview transcripts were coded and analyzed using thematic analysis with a constant comparative method to identify categories and themes regarding information needs and information source use patterns. Results Two top-level categories of information needs were identified: foreground and background information needs. To meet these needs, dentists used four types of information sources: clinical information/tasks, administrative tasks, patient education and professional development. Major themes of dentists' unmet information needs include: (1) timely access to information on various subjects; (2) better visual representations of dental problems; (3) access to patient-specific evidence-based information; and (4) accurate, complete and consistent documentation of patient records. Resource use patterns include: (1) dentists' information needs matched information source use; (2) little use of electronic sources took place during treatment; (3) source use depended on the nature and complexity of the dental problems; and (4) dentists routinely practiced cross-referencing to verify patient information. Conclusions Dentists have various information needs at the point of care. Among them, the needs

  2. Treatment of severe pulmonary hypertension in the setting of the large patent ductus arteriosus.

    Science.gov (United States)

    Niu, Mary C; Mallory, George B; Justino, Henri; Ruiz, Fadel E; Petit, Christopher J

    2013-05-01

    Treatment of the large patent ductus arteriosus (PDA) in the setting of pulmonary hypertension (PH) is challenging. Left patent, the large PDA can result in irreversible pulmonary vascular disease. Occlusion, however, may lead to right ventricular failure for certain patients with severe PH. Our center has adopted a staged management strategy using medical management, noninvasive imaging, and invasive cardiac catheterization to treat PH in the presence of a large PDA. This approach determines the safety of ductal closure but also leverages medical therapy to create an opportunity for safe PDA occlusion. We reviewed our experience with this approach. Patients with both severe PH and PDAs were studied. PH treatment history and hemodynamic data obtained during catheterizations were reviewed. Repeat catheterizations, echocardiograms, and clinical status at latest follow-up were also reviewed. Seven patients had both PH and large, unrestrictive PDAs. At baseline, all patients had near-systemic right ventricular pressures. Nine catheterizations were performed. Two patients underwent 2 catheterizations each due to poor initial response to balloon test occlusion. Six of 7 patients exhibited subsystemic pulmonary pressures during test occlusion and underwent successful PDA occlusion. One patient did not undergo PDA occlusion. In follow-up, 2 additional catheterizations were performed after successful PDA occlusion for subsequent hemodynamic assessment. At the latest follow-up, the 6 patients who underwent PDA occlusion are well, with continued improvement in PH. Five patients remain on PH treatment. A staged approach to PDA closure for patients with severe PH is an effective treatment paradigm. Aggressive treatment of PH creates a window of opportunity for PDA occlusion, echocardiography assists in identifying the timing for closure, and balloon test occlusion during cardiac catheterization is critical in determining safety of closure. By safely eliminating the large PDA

  3. Evaluating usability of the Halden Reactor Large Screen Display. Is the Information Rich Design concept suitable for real-world installations?

    International Nuclear Information System (INIS)

    Braseth, Alf Ove

    2013-01-01

    Large Screen Displays (LSDs) are beginning to supplement desktop displays in modern control rooms, having the potential to display the big picture of complex processes. Information Rich Design (IRD) is a LSD concept used in many real-life installations in the petroleum domain, and more recently in nuclear research applications. The objectives of IRD are to provide the big picture, avoiding keyhole related problems while supporting fast visual perception of larger data sets. Two LSDs based on the IRD concept have been developed for large-scale nuclear simulators for research purposes; they have however suffered from unsatisfying user experience. The new Halden Reactor LSD, used to monitor a nuclear research reactor, was designed according to recent proposed Design Principles compiled in this paper to mitigate previously experienced problems. This paper evaluates the usability of the Halden Reactor LSD, comparing usability data with the replaced analogue panel, and data for an older IRD large screen display. The results suggest that the IRD concept is suitable for use in real-life applications from a user experience point of view, and that the recently proposed Design Principles have had a positive effect on usability. (author)

  4. WebViz:A Web-based Collaborative Interactive Visualization System for large-Scale Data Sets

    Science.gov (United States)

    Yuen, D. A.; McArthur, E.; Weiss, R. M.; Zhou, J.; Yao, B.

    2010-12-01

    WebViz is a web-based application designed to conduct collaborative, interactive visualizations of large data sets for multiple users, allowing researchers situated all over the world to utilize the visualization services offered by the University of Minnesota’s Laboratory for Computational Sciences and Engineering (LCSE). This ongoing project has been built upon over the last 3 1/2 years. The motivation behind WebViz lies primarily in the need to parse through an increasing amount of data produced by the scientific community as a result of larger and faster multicore and massively parallel computers coming to the market, including the use of general purpose GPU computing. WebViz allows these large data sets to be visualized online by anyone with an account. The application allows users to save time and resources by visualizing data ‘on the fly’, wherever he or she may be located. By leveraging AJAX via the Google Web Toolkit (http://code.google.com/webtoolkit/), we are able to provide users with a remote, web portal to LCSE's (http://www.lcse.umn.edu) large-scale interactive visualization system already in place at the University of Minnesota. LCSE’s custom hierarchical volume rendering software provides high resolution visualizations on the order of 15 million pixels and has been employed for visualizing data primarily from simulations in astrophysics to geophysical fluid dynamics. In the current version of WebViz, we have implemented a highly extensible back-end framework built around HTTP "server push" technology. The web application is accessible via a variety of devices including netbooks, iPhones, and other web and javascript-enabled cell phones. Features in the current version include the ability for users to (1) securely log in, (2) launch multiple visualizations, (3) conduct collaborative visualization sessions, (4) delegate control aspects of a visualization to others, and (5) engage in collaborative chats with other users within the user interface

  5. Validation of the Care-Related Quality of Life Instrument in different study settings: findings from The Older Persons and Informal Caregivers Survey Minimum DataSet (TOPICS-MDS).

    Science.gov (United States)

    Lutomski, J E; van Exel, N J A; Kempen, G I J M; Moll van Charante, E P; den Elzen, W P J; Jansen, A P D; Krabbe, P F M; Steunenberg, B; Steyerberg, E W; Olde Rikkert, M G M; Melis, R J F

    2015-05-01

    Validity is a contextual aspect of a scale which may differ across sample populations and study protocols. The objective of our study was to validate the Care-Related Quality of Life Instrument (CarerQol) across two different study design features, sampling framework (general population vs. different care settings) and survey mode (interview vs. written questionnaire). Data were extracted from The Older Persons and Informal Caregivers Minimum DataSet (TOPICS-MDS, www.topics-mds.eu ), a pooled public-access data set with information on >3,000 informal caregivers throughout the Netherlands. Meta-correlations and linear mixed models between the CarerQol's seven dimensions (CarerQol-7D) and caregiver's level of happiness (CarerQol-VAS) and self-rated burden (SRB) were performed. The CarerQol-7D dimensions were correlated to the CarerQol-VAS and SRB in the pooled data set and the subgroups. The strength of correlations between CarerQol-7D dimensions and SRB was weaker among caregivers who were interviewed versus those who completed a written questionnaire. The directionality of associations between the CarerQol-VAS, SRB and the CarerQol-7D dimensions in the multivariate model supported the construct validity of the CarerQol in the pooled population. Significant interaction terms were observed in several dimensions of the CarerQol-7D across sampling frame and survey mode, suggesting meaningful differences in reporting levels. Although good scientific practice emphasises the importance of re-evaluating instrument properties in individual research studies, our findings support the validity and applicability of the CarerQol instrument in a variety of settings. Due to minor differential reporting, pooling CarerQol data collected using mixed administration modes should be interpreted with caution; for TOPICS-MDS, meta-analytic techniques may be warranted.

  6. Large number discrimination by mosquitofish.

    Directory of Open Access Journals (Sweden)

    Christian Agrillo

    Full Text Available BACKGROUND: Recent studies have demonstrated that fish display rudimentary numerical abilities similar to those observed in mammals and birds. The mechanisms underlying the discrimination of small quantities (<4) were recently investigated while, to date, no study has examined the discrimination of large numerosities in fish. METHODOLOGY/PRINCIPAL FINDINGS: Subjects were trained to discriminate between two sets of small geometric figures using social reinforcement. In the first experiment mosquitofish were required to discriminate 4 from 8 objects with or without experimental control of the continuous variables that co-vary with number (area, space, density, total luminance). Results showed that fish can use the sole numerical information to compare quantities but that they preferentially use cumulative surface area as a proxy of the number when this information is available. A second experiment investigated the influence of the total number of elements to discriminate large quantities. Fish proved to be able to discriminate up to 100 vs. 200 objects, without showing any significant decrease in accuracy compared with the 4 vs. 8 discrimination. The third experiment investigated the influence of the ratio between the numerosities. Performance was found to decrease when decreasing the numerical distance. Fish were able to discriminate numbers when ratios were 1:2 or 2:3 but not when the ratio was 3:4. The performance of a sample of undergraduate students, tested non-verbally using the same sets of stimuli, largely overlapped that of fish. CONCLUSIONS/SIGNIFICANCE: Fish are able to use pure numerical information when discriminating between quantities larger than 4 units. As observed in human and non-human primates, the numerical system of fish appears to have virtually no upper limit while the numerical ratio has a clear effect on performance. These similarities further reinforce the view of a common origin of non-verbal numerical systems in all

  7. What information do people use, trust, and find useful during a disaster? Evidence from five large wildfires

    Science.gov (United States)

    Toddi A. Steelman; Sarah M. McCaffrey; Anne-Lise Knox Velez; Jason Alexander. Briefel

    2015-01-01

    The communication system through which information flows during a disaster can be conceived of as a set of relationships among sources and recipients who are concerned about key information characteristics. The recipient perspective is often neglected within this system. In this article, we explore recipient perspectives related to what information was used, useful,...

  8. Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets

    Science.gov (United States)

    2013-01-01

    Background While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 amino acid descriptor sets have been benchmarked with respect to their ability of establishing bioactivity models. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI, BLOSUM, a novel protein descriptor set (termed ProtFP (4 variants)), and in addition we created and benchmarked three pairs of descriptor combinations. Prediction performance was evaluated in seven structure-activity benchmarks which comprise Angiotensin Converting Enzyme (ACE) dipeptidic inhibitor data, and three proteochemometric data sets, namely (1) GPCR ligands modeled against a GPCR panel, (2) enzyme inhibitors (NNRTIs) with associated bioactivities against a set of HIV enzyme mutants, and (3) enzyme inhibitors (PIs) with associated bioactivities on a large set of HIV enzyme mutants. Results The amino acid descriptor sets compared here show similar performance (set differences ( > 0.3 log units RMSE difference and >0.7 difference in MCC). Combining different descriptor sets generally leads to better modeling performance than utilizing individual sets. The best performers were Z-scales (3) combined with ProtFP (Feature), or Z-Scales (3) combined with an average Z-Scale value for each target, while ProtFP (PCA8), ST-Scales, and ProtFP (Feature) rank last. Conclusions While amino acid descriptor sets capture different aspects of amino acids their ability to be used for bioactivity modeling is still – on average – surprisingly similar. Still, combining sets describing complementary information consistently leads to small but consistent improvement in modeling performance (average MCC 0.01 better, average RMSE 0.01 log units lower). Finally, performance differences exist between the targets compared thereby underlining that
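
    A hedged sketch of the benchmarking idea: fit the same regression model on two different descriptor sets for the same targets and compare RMSE on the continuous bioactivities and MCC after thresholding into active/inactive. The simulated data, the activity threshold, and the model choice are illustrative assumptions, not the study's protocol.

    # Hedged sketch: comparing two descriptor sets by RMSE and MCC with an
    # identical model. All data below are synthetic placeholders.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error, matthews_corrcoef
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    n = 400
    y = rng.normal(6.0, 1.0, size=n)                                       # hypothetical pIC50 values
    X_a = np.column_stack([y + rng.normal(0, 0.4, n) for _ in range(5)])   # descriptor set A
    X_b = np.column_stack([y + rng.normal(0, 0.9, n) for _ in range(5)])   # descriptor set B

    def benchmark(X, y, threshold=6.0):
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        pred = RandomForestRegressor(random_state=0).fit(X_tr, y_tr).predict(X_te)
        rmse = mean_squared_error(y_te, pred) ** 0.5                       # log-unit RMSE
        mcc = matthews_corrcoef(y_te > threshold, pred > threshold)        # after thresholding
        return rmse, mcc

    for name, X in [("descriptor set A", X_a), ("descriptor set B", X_b)]:
        rmse, mcc = benchmark(X, y)
        print(f"{name}: RMSE={rmse:.2f}  MCC={mcc:.2f}")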

  9. Large Scale Self-Organizing Information Distribution System

    National Research Council Canada - National Science Library

    Low, Steven

    2005-01-01

    This project investigates issues in "large-scale" networks. Here "large-scale" refers to networks with a large number of high-capacity nodes and transmission links, shared by a large number of users...

  10. Information contained within the large scale gas injection test (Lasgit) dataset exposed using a bespoke data analysis tool-kit

    International Nuclear Information System (INIS)

    Bennett, D.P.; Thomas, H.R.; Cuss, R.J.; Harrington, J.F.; Vardon, P.J.

    2012-01-01

    spurious result of the applied processing. A brief comparison of key observations made before and after trend removal are presented to evaluate the validity of the tool-kit process with respect to information exposure. A tool-kit has been developed to perform an EDA on large-scale long-term datasets. An analysis on the Lasgit dataset successfully exposes, among other things: information regarding small scale events; non-parametric long-term trend identification; and deterministic quantification of frequency content. The uniformity and mechanical nature with which the information described above is exposed and quantified provides a level of rigour to and removes subjectivity from the resultant tool-kit output by effectively turning observations into measurements. This exposed information can be used to guide and underpin investigation into scientific process within the experimental set-up. While developed specifically for the Lasgit experiment the tool-kit is expected to be generally applicable to long-term, large-scale geotechnical or environmental experimental datasets with time series information
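
    A minimal sketch of the kinds of operations named above: non-parametric long-term trend identification (here a rolling median), trend removal to expose small-scale events, and a deterministic quantification of frequency content via an FFT periodogram. The synthetic series and the window length are illustrative assumptions, not the Lasgit processing chain.

    # Hedged sketch: detrending a long-term time series and quantifying its
    # frequency content. Data are synthetic and purely illustrative.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    t = np.arange(5000)                                    # e.g. hourly samples
    series = (0.002 * t                                    # slow drift
              + 0.5 * np.sin(2 * np.pi * t / 24)           # daily cycle
              + rng.normal(0, 0.1, t.size))
    series[3000:3010] += 1.5                               # a small-scale "event"

    s = pd.Series(series)
    trend = s.rolling(window=241, center=True, min_periods=1).median()   # non-parametric trend
    detrended = s - trend                                  # exposes the small-scale event

    freqs = np.fft.rfftfreq(detrended.size, d=1.0)         # cycles per sample
    power = np.abs(np.fft.rfft(detrended - detrended.mean())) ** 2
    dominant = freqs[np.argmax(power[1:]) + 1]             # skip the DC component
    print("dominant period ≈", round(1 / dominant, 1), "samples")   # ≈ 24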

  11. Goal setting and action planning in the rehabilitation setting: development of a theoretically informed practice framework.

    Science.gov (United States)

    Scobbie, Lesley; Dixon, Diane; Wyke, Sally

    2011-05-01

    Setting and achieving goals is fundamental to rehabilitation practice but has been criticized for being a-theoretical and the key components of replicable goal-setting interventions are not well established. To describe the development of a theory-based goal setting practice framework for use in rehabilitation settings and to detail its component parts. Causal modelling was used to map theories of behaviour change onto the process of setting and achieving rehabilitation goals, and to suggest the mechanisms through which patient outcomes are likely to be affected. A multidisciplinary task group developed the causal model into a practice framework for use in rehabilitation settings through iterative discussion and implementation with six patients. Four components of a goal-setting and action-planning practice framework were identified: (i) goal negotiation, (ii) goal identification, (iii) planning, and (iv) appraisal and feedback. The variables hypothesized to effect change in patient outcomes were self-efficacy and action plan attainment. A theory-based goal setting practice framework for use in rehabilitation settings is described. The framework requires further development and systematic evaluation in a range of rehabilitation settings.

  12. RADIOMETRIC NORMALIZATION OF LARGE AIRBORNE IMAGE DATA SETS ACQUIRED BY DIFFERENT SENSOR TYPES

    Directory of Open Access Journals (Sweden)

    S. Gehrke

    2016-06-01

    HxMap software. It has been successfully applied to large sets of heterogeneous imagery, including the adjustment of original sensor images prior to quality control and further processing as well as radiometric adjustment for ortho-image mosaic generation.

  13. TO BE OR NOT TO BE: AN INFORMATIVE NON-SYMBOLIC NUMERICAL MAGNITUDE PROCESSING STUDY ABOUT SMALL VERSUS LARGE NUMBERS IN INFANTS

    Directory of Open Access Journals (Sweden)

    Annelies CEULEMANS

    2014-03-01

    Full Text Available Many studies tested the association between numerical magnitude processing and mathematical achievement with conflicting findings reported for individuals with mathematical learning disorders. Some of the inconsistencies might be explained by the number of non-symbolic stimuli or dot collections used in studies. It has been hypothesized that there is an object-file system for ‘small’ and an analogue magnitude system for ‘large’ numbers. This two-system account has been supported by the set size limit of the object-file system (three items). A boundary was defined, accordingly, categorizing numbers below four as ‘small’ and from four and above as ‘large’. However, data on ‘small’ number processing and on the ‘boundary’ between small and large numbers are missing. In this contribution we provide data from infants discriminating between the number sets 4 vs. 8 and 1 vs. 4, both containing the number four combined with a small and a large number respectively. Participants were 25 and 26 full term 9-month-olds for 4 vs. 8 and 1 vs. 4 respectively. The stimuli (dots) were controlled for continuous variables. Eye-tracking was combined with the habituation paradigm. The results showed that the infants were successful in discriminating 1 from 4, but failed to discriminate 4 from 8 dots. This finding supports the assumption of the number four as a ‘small’ number and enlarges the object-file system’s limit. This study might help to explain inconsistencies in studies. Moreover, the information may be useful in answering parents’ questions about challenges that vulnerable children with number processing problems, such as children with mathematical learning disorders, might encounter. In addition, the study might give some information on the stimuli that can be used to effectively foster children’s magnitude processing skills.

  14. GSHR, a Web-Based Platform Provides Gene Set-Level Analyses of Hormone Responses in Arabidopsis

    Directory of Open Access Journals (Sweden)

    Xiaojuan Ran

    2018-01-01

    Full Text Available Phytohormones regulate diverse aspects of plant growth and environmental responses. Recent high-throughput technologies have promoted a more comprehensive profiling of genes regulated by different hormones. However, these omics data generally result in large gene lists that make it challenging to interpret the data and extract insights into biological significance. With the rapid accumulation of these large-scale experiments, especially the transcriptomic data available in public databases, a means of using this information to explore the transcriptional networks is needed. Different platforms have different architectures and designs, and even similar studies using the same platform may obtain data with large variances because of the highly dynamic and flexible effects of plant hormones; this makes it difficult to make comparisons across different studies and platforms. Here, we present a web server providing gene set-level analyses of Arabidopsis thaliana hormone responses. GSHR collected 333 RNA-seq and 1,205 microarray datasets from the Gene Expression Omnibus, characterizing transcriptomic changes in Arabidopsis in response to phytohormones including abscisic acid, auxin, brassinosteroids, cytokinins, ethylene, gibberellins, jasmonic acid, salicylic acid, and strigolactones. These data were further processed and organized into 1,368 gene sets regulated by different hormones or hormone-related factors. By comparing input gene lists to these gene sets, GSHR helped to identify gene sets from the input gene list regulated by different phytohormones or related factors. Together, GSHR links prior information regarding transcriptomic changes induced by hormones and related factors to newly generated data and facilitates cross-study and cross-platform comparisons; this helps with the mining of biologically significant information from large-scale datasets. The GSHR is freely available at http://bioinfo.sibs.ac.cn/GSHR/.
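
    A minimal sketch of the gene set-level comparison: score the overlap between an input gene list and each curated hormone-response gene set with a one-sided hypergeometric test. The gene identifiers, background size, and example sets are illustrative placeholders, not GSHR's code.

    # Hedged sketch: hypergeometric enrichment of an input gene list against
    # curated gene sets. All identifiers and set contents are hypothetical.
    from scipy.stats import hypergeom

    background = 27000                        # approx. number of Arabidopsis genes (assumed)
    gene_sets = {                             # hypothetical curated hormone-response sets
        "ABA_up":   {f"AT{i}G{j:05d}" for i in (1, 2) for j in range(100, 400, 10)},
        "auxin_up": {f"AT{i}G{j:05d}" for i in (3, 4) for j in range(100, 400, 10)},
    }
    input_genes = {f"AT1G{j:05d}" for j in range(100, 300, 10)}   # user's gene list

    for name, members in gene_sets.items():
        overlap = len(input_genes & members)
        # P(X >= overlap) with X ~ Hypergeom(background, |set|, |input|)
        p = hypergeom.sf(overlap - 1, background, len(members), len(input_genes))
        print(f"{name}: overlap={overlap}  p={p:.2e}")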

  15. Incorporating Real-time Earthquake Information into Large Enrollment Natural Disaster Course Learning

    Science.gov (United States)

    Furlong, K. P.; Benz, H.; Hayes, G. P.; Villasenor, A.

    2010-12-01

    Although most would agree that the occurrence of natural disaster events such as earthquakes, volcanic eruptions, and floods can provide effective learning opportunities for natural hazards-based courses, implementing compelling materials into the large-enrollment classroom environment can be difficult. These natural hazard events derive much of their learning potential from their real-time nature, and in the modern 24/7 news-cycle where all but the most devastating events are quickly out of the public eye, the shelf life for an event is quite limited. To maximize the learning potential of these events requires that both authoritative information be available and course materials be generated as the event unfolds. Although many events such as hurricanes, flooding, and volcanic eruptions provide some precursory warnings, and thus one can prepare background materials to place the main event into context, earthquakes present a particularly confounding situation of providing no warning, but where context is critical to student learning. Attempting to implement real-time materials into large enrollment classes faces the additional hindrance of limited internet access (for students) in most lecture classrooms. In Earth 101 Natural Disasters: Hollywood vs Reality, taught as a large enrollment (150+ students) general education course at Penn State, we are collaborating with the USGS’s National Earthquake Information Center (NEIC) to develop efficient means to incorporate their real-time products into learning activities in the lecture hall environment. Over time (and numerous events) we have developed a template for presenting USGS-produced real-time information in lecture mode. The event-specific materials can be quickly incorporated and updated, along with key contextual materials, to provide students with up-to-the-minute current information. In addition, we have also developed in-class activities, such as student determination of population exposure to severe ground

  16. Performance of informative priors skeptical of large treatment effects in clinical trials: A simulation study.

    Science.gov (United States)

    Pedroza, Claudia; Han, Weilu; Thanh Truong, Van Thi; Green, Charles; Tyson, Jon E

    2018-01-01

    One of the main advantages of Bayesian analyses of clinical trials is their ability to formally incorporate skepticism about large treatment effects through the use of informative priors. We conducted a simulation study to assess the performance of informative normal, Student-t, and beta distributions in estimating relative risk (RR) or odds ratio (OR) for binary outcomes. Simulation scenarios varied the prior standard deviation (SD; level of skepticism of large treatment effects), outcome rate in the control group, true treatment effect, and sample size. We compared the priors with regard to bias, mean squared error (MSE), and coverage of 95% credible intervals. Simulation results show that the prior SD influenced the posterior to a greater degree than the particular distributional form of the prior. For RR, priors with a 95% interval of 0.50-2.0 performed well in terms of bias, MSE, and coverage under most scenarios. For OR, priors with a wider 95% interval of 0.23-4.35 had good performance. We recommend the use of informative priors that exclude implausibly large treatment effects in analyses of clinical trials, particularly for major outcomes such as mortality.
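
    A hedged sketch of the simulation idea: a normal prior on log(RR) whose 95% interval spans 0.50-2.0 (prior SD = ln(2)/1.96, roughly 0.35), combined with a normal approximation to the log-RR likelihood, and evaluated by bias, MSE, and 95% coverage. The scenario values and the normal-normal shortcut are illustrative assumptions, not the paper's settings.

    # Hedged sketch: performance of a skeptical normal prior on log(RR) in a
    # simulated two-arm binary-outcome trial. Scenario values are illustrative.
    import numpy as np

    rng = np.random.default_rng(0)
    prior_sd = np.log(2) / 1.96          # skeptical of RRs outside 0.50-2.0
    true_rr, p_ctrl, n_per_arm, n_sims = 0.75, 0.30, 200, 5000

    est, cover = [], []
    for _ in range(n_sims):
        x_c = rng.binomial(n_per_arm, p_ctrl)
        x_t = rng.binomial(n_per_arm, p_ctrl * true_rr)
        if min(x_c, x_t) == 0:
            continue                      # skip degenerate draws
        log_rr_hat = np.log((x_t / n_per_arm) / (x_c / n_per_arm))
        se = np.sqrt(1 / x_t - 1 / n_per_arm + 1 / x_c - 1 / n_per_arm)
        # normal-normal update: posterior precision = prior + data precision
        post_var = 1 / (1 / prior_sd**2 + 1 / se**2)
        post_mean = post_var * (log_rr_hat / se**2)          # prior mean = log(1) = 0
        est.append(post_mean)
        lo, hi = post_mean - 1.96 * np.sqrt(post_var), post_mean + 1.96 * np.sqrt(post_var)
        cover.append(lo <= np.log(true_rr) <= hi)

    est = np.array(est)
    print("bias:", round(est.mean() - np.log(true_rr), 3),
          "MSE:", round(((est - np.log(true_rr))**2).mean(), 4),
          "coverage:", round(np.mean(cover), 3))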

  17. FACTORS CONTRIBUTING TO ELDER ABUSE AND NEGLECT IN THE INFORMAL CAREGIVING SETTING

    Directory of Open Access Journals (Sweden)

    Ananias, Janetta

    2014-11-01

    Full Text Available This article provides an overview of factors contributing to elder abuse and neglect within the informal caregiving setting from the perspective of ecological theory. This theory offers a deeper understanding of the complexity of elder abuse by considering the interactions that take place across a number of interrelated systems as well as the multiple risk factors that contribute to elder abuse and neglect. Researchers, policy makers and practitioners need to develop awareness of the risk factors regarding elder abuse and neglect, and to develop appropriate interventions in response to elder abuse and neglect.

  18. PACOM: A Versatile Tool for Integrating, Filtering, Visualizing, and Comparing Multiple Large Mass Spectrometry Proteomics Data Sets.

    Science.gov (United States)

    Martínez-Bartolomé, Salvador; Medina-Aunon, J Alberto; López-García, Miguel Ángel; González-Tejedo, Carmen; Prieto, Gorka; Navajas, Rosana; Salazar-Donate, Emilio; Fernández-Costa, Carolina; Yates, John R; Albar, Juan Pablo

    2018-04-06

    Mass-spectrometry-based proteomics has evolved into a high-throughput technology in which numerous large-scale data sets are generated from diverse analytical platforms. Furthermore, several scientific journals and funding agencies have emphasized the storage of proteomics data in public repositories to facilitate its evaluation, inspection, and reanalysis. (1) As a consequence, public proteomics data repositories are growing rapidly. However, tools are needed to integrate multiple proteomics data sets to compare different experimental features or to perform quality control analysis. Here, we present a new Java stand-alone tool, Proteomics Assay COMparator (PACOM), that is able to import, combine, and simultaneously compare numerous proteomics experiments to check the integrity of the proteomic data as well as verify data quality. With PACOM, the user can detect sources of error that may have been introduced at any step of a proteomics workflow and that influence the final results. Data sets can be easily compared and integrated, and data quality and reproducibility can be visually assessed through a rich set of graphical representations of proteomics data features as well as a wide variety of data filters. Its flexibility and easy-to-use interface make PACOM a unique tool for daily use in a proteomics laboratory. PACOM is available at https://github.com/smdb21/pacom.

  19. Email-Based Informed Consent: Innovative Method for Reaching Large Numbers of Subjects for Data Mining Research

    Science.gov (United States)

    Lee, Lesley R.; Mason, Sara S.; Babiak-Vazquez, Adriana; Ray, Stacie L.; Van Baalen, Mary

    2015-01-01

    Since the 2010 NASA authorization to make the Life Sciences Data Archive (LSDA) and Lifetime Surveillance of Astronaut Health (LSAH) data archives more accessible by the research and operational communities, demand for data has greatly increased. Correspondingly, both the number and scope of requests have increased, from 142 requests fulfilled in 2011 to 224 in 2014, and with some datasets comprising up to 1 million data points. To meet the demand, the LSAH and LSDA Repositories project was launched, which allows active and retired astronauts to authorize full, partial, or no access to their data for research without individual, study-specific informed consent. A one-on-one personal informed consent briefing is required to fully communicate the implications of the several tiers of consent. Due to the need for personal contact to conduct Repositories consent meetings, the rate of consenting has not kept up with demand for individualized, possibly attributable data. As a result, other methods had to be implemented to allow the release of large datasets, such as release of only de-identified data. However the compilation of large, de-identified data sets places a significant resource burden on LSAH and LSDA and may result in diminished scientific usefulness of the dataset. As a result, LSAH and LSDA worked with the JSC Institutional Review Board Chair, Astronaut Office physicians, and NASA Office of General Counsel personnel to develop a "Remote Consenting" process for retrospective data mining studies. This is particularly useful since the majority of the astronaut cohort is retired from the agency and living outside the Houston area. Originally planned as a method to send informed consent briefing slides and consent forms only by mail, Remote Consenting has evolved into a means to accept crewmember decisions on individual studies via their method of choice: email or paper copy by mail. To date, 100 emails have been sent to request participation in eight HRP

  20. Canonical Information Analysis

    DEFF Research Database (Denmark)

    Vestergaard, Jacob Schack; Nielsen, Allan Aasbjerg

    2015-01-01

    Canonical correlation analysis is an established multivariate statistical method in which correlation between linear combinations of multivariate sets of variables is maximized. In canonical information analysis introduced here, linear correlation as a measure of association between variables is replaced by the information theoretical, entropy based measure mutual information, which is a much more general measure of association. We make canonical information analysis feasible for large sample problems, including for example multispectral images, due to the use of a fast kernel density estimator for entropy estimation. Canonical information analysis is applied successfully to (1) simple simulated data to illustrate the basic idea and evaluate performance, (2) fusion of weather radar and optical geostationary satellite data in a situation with heavy precipitation, and (3) change detection in optical...
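
    A speculative sketch of the idea: search for linear combinations of the two variable sets that maximize mutual information rather than correlation. Mutual information is estimated here from a 2-D histogram (the method above uses a fast kernel density estimator), and the crude random search over projection directions, the data, and all symbols (a, b, X, Y) are purely illustrative.

    # Hedged sketch: maximize mutual information between projections a'X and b'Y
    # instead of their correlation. Histogram MI estimate and random search only.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 2000
    X = rng.normal(size=(n, 3))
    Y = np.column_stack([np.sin(X[:, 0] + X[:, 1]),      # nonlinear association
                         rng.normal(size=n)])
    Y += 0.1 * rng.normal(size=Y.shape)

    def mutual_info(u, v, bins=24):
        """MI (nats) from a 2-D histogram estimate of the joint density."""
        pxy, _, _ = np.histogram2d(u, v, bins=bins)
        pxy = pxy / pxy.sum()
        px, py = pxy.sum(1, keepdims=True), pxy.sum(0, keepdims=True)
        nz = pxy > 0
        return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

    best = (-np.inf, None, None)
    for _ in range(2000):                                 # crude random search
        a = rng.normal(size=X.shape[1]); a /= np.linalg.norm(a)
        b = rng.normal(size=Y.shape[1]); b /= np.linalg.norm(b)
        mi = mutual_info(X @ a, Y @ b)
        if mi > best[0]:
            best = (mi, a, b)
    print("max MI (nats):", round(best[0], 3), "a:", np.round(best[1], 2))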

  1. Rehabilitation-Related Research on Disability and Employer Practices Using Individual-Based National and Administrative Data Sets

    Science.gov (United States)

    Nazarov, Zafar E.; Erickson, William A.; Bruyère, Susanne M.

    2014-01-01

    Objective: It is useful to examine workplace factors influencing employment outcomes of individuals with disabilities and the interplay of disability, employment-related, and employer characteristics to inform rehabilitation practice. Design: A number of large national survey and administrative data sets provide information on employers and can…

  2. Set of CAMAC modules on the base of large integrated circuits for an accelerator synchronization system

    International Nuclear Information System (INIS)

    Glejbman, Eh.M.; Pilyar, N.V.

    1986-01-01

    Parameters of functional modules in the CAMAC standard developed for an accelerator synchronization system are presented. They comprise the BZN-8K and BZ-8K digital delay circuits, a timing circuit and a pulse selection circuit. Every module uses three large integrated circuits of the KR 580 VI53 programmable-timer type, circuits interfacing the system bus with the crate bus, data-recording control circuits, two peripheral storage devices, initial-mode-setting circuits, input and output shapers, and circuits for setting and removing blocking in the channels.

  3. Designing Patient-facing Health Information Technologies for the Outpatient Settings: A Literature Review

    Directory of Open Access Journals (Sweden)

    Yushi Yang

    2016-04-01

    Full Text Available Introduction: The implementation of health information technologies (HITs) has changed the dynamics of doctor–patient communication in outpatient settings. Designing patient-facing HITs provides patients with easy access to healthcare information during the visit and has the potential to enhance the patient-centred care.  Objectives: The objectives of this study are to systematically review how the designs of patient-facing HITs have been suggested and evaluated, and how they may potentially affect the doctor–patient communication and patient-centred care.  Method: We conducted an online database search to identify articles published before December 2014 relevant to the objectives of this study. A total of nine papers have been identified and reviewed in this study.  Results: Designing patient-facing HITs is at an early stage. The current literature has been exploring the impact of HITs on doctor–patient communication dynamics. Based on the findings of these studies, there is an emergent need to design more patient-centred HITs. There are also some papers that focus on the usability evaluation of some preliminary prototypes of the patient-facing HITs. The design styles of patient-facing HITs included sharing the health information with the patients on: (1) a separate patient display, (2) a projector, (3) a portable tablet, (4) a touch-based screen and (5) a shared computer display that can be viewed by both doctors and patients. Each of them had the strengths and limitations to facilitate the patient-centred care, and it is worthwhile to make a comparison of them in order to identify future research directions.  Conclusion: The designs of patient-facing HITs in outpatient settings are promising in facilitating the doctor-patient communication and patient engagement. However, their effectiveness and usefulness need to be further evaluated and improved from a systems perspective.

  4. Designing Patient-facing Health Information Technologies for the Outpatient Settings: A Literature Review.

    Science.gov (United States)

    Yang, Yushi; Asan, Onur

    2016-04-06

      The implementation of health information technologies (HITs) has changed the dynamics of doctor-patient communication in outpatient settings. Designing patient-facing HITs provides patients with easy access to healthcare information during the visit and has the potential to enhance the patient-centred care.  The objectives of this study are to systematically review how the designs of patient-facing HITs have been suggested and evaluated, and how they may potentially affect the doctor-patient communication and patient-centred care.  We conducted an online database search to identify articles published before December 2014 relevant to the objectives of this study. A total of nine papers have been identified and reviewed in this study.  Designing patient-facing HITs is at an early stage. The current literature has been exploring the impact of HITs on doctor-patient communication dynamics. Based on the findings of these studies, there is an emergent need to design more patient-centred HITs. There are also some papers that focus on the usability evaluation of some preliminary prototypes of the patient-facing HITs. The design styles of patient-facing HITs included sharing the health information with the patients on: (1) a separate patient display, (2) a projector, (3) a portable tablet, (4) a touch-based screen and (5) a shared computer display that can be viewed by both doctors and patients. Each of them had the strengths and limitations to facilitate the patient-centred care, and it is worthwhile to make a comparison of them in order to identify future research directions.  The designs of patient-facing HITs in outpatient settings are promising in facilitating the doctor-patient communication and patient engagement. However, their effectiveness and usefulness need to be further evaluated and improved from a systems perspective.

  5. Workflow management in large distributed systems

    International Nuclear Information System (INIS)

    Legrand, I; Newman, H; Voicu, R; Dobre, C; Grigoras, C

    2011-01-01

    The MonALISA (Monitoring Agents using a Large Integrated Services Architecture) framework provides a distributed service system capable of controlling and optimizing large-scale, data-intensive applications. An essential part of managing large-scale, distributed data-processing facilities is a monitoring system for computing facilities, storage, networks, and the very large number of applications running on these systems in near realtime. All this monitoring information gathered for all the subsystems is essential for developing the required higher-level services—the components that provide decision support and some degree of automated decisions—and for maintaining and optimizing workflow in large-scale distributed systems. These management and global optimization functions are performed by higher-level agent-based services. We present several applications of MonALISA's higher-level services including optimized dynamic routing, control, data-transfer scheduling, distributed job scheduling, dynamic allocation of storage resource to running jobs and automated management of remote services among a large set of grid facilities.

  6. Workflow management in large distributed systems

    Science.gov (United States)

    Legrand, I.; Newman, H.; Voicu, R.; Dobre, C.; Grigoras, C.

    2011-12-01

    The MonALISA (Monitoring Agents using a Large Integrated Services Architecture) framework provides a distributed service system capable of controlling and optimizing large-scale, data-intensive applications. An essential part of managing large-scale, distributed data-processing facilities is a monitoring system for computing facilities, storage, networks, and the very large number of applications running on these systems in near realtime. All this monitoring information gathered for all the subsystems is essential for developing the required higher-level services—the components that provide decision support and some degree of automated decisions—and for maintaining and optimizing workflow in large-scale distributed systems. These management and global optimization functions are performed by higher-level agent-based services. We present several applications of MonALISA's higher-level services including optimized dynamic routing, control, data-transfer scheduling, distributed job scheduling, dynamic allocation of storage resource to running jobs and automated management of remote services among a large set of grid facilities.

  7. The control gap : the role of budgets, accounting information and (non-) decisions in hospital settings

    OpenAIRE

    Nyland, Kari; Pettersen, Inger Johanne

    2004-01-01

    This paper investigates the link between budgets, accounting information and the decisionmaking processes at both strategic and operational levels in a large Norwegian hospital, as this hospital now is facing the New Public Management reforms which are introduced in Norway. The study has examined the use of budget and accounting information in the management control process. The empirical data are based on interviews with key actors in the decision-making process at all management levels in t...

  8. Implementation of Lifestyle Modification Program Focusing on Physical Activity and Dietary Habits in a Large Group, Community-Based Setting

    Science.gov (United States)

    Stoutenberg, Mark; Falcon, Ashley; Arheart, Kris; Stasi, Selina; Portacio, Francia; Stepanenko, Bryan; Lan, Mary L.; Castruccio-Prince, Catarina; Nackenson, Joshua

    2017-01-01

    Background: Lifestyle modification programs improve several health-related behaviors, including physical activity (PA) and nutrition. However, few of these programs have been expanded to impact a large number of individuals in one setting at one time. Therefore, the purpose of this study was to determine whether a PA- and nutrition-based lifestyle…

  9. Maximising Organisational Information Sharing and Effective Intelligence Analysis in Critical Data Sets. A case study on the information science needs of the Norwegian criminal intelligence and law enforcement community

    OpenAIRE

    Wilhelmsen, Sonja

    2009-01-01

    Organisational information sharing has become more and more important as the amount of information grows. In order to accomplish the most effective and efficient sharing of information, analysis of the information needs and the organisation needs are vital. This dissertation focuses on the information needs sourced through the critical data sets of law enforcement organisations; specifically the Norwegian criminal intelligence and law enforcement community represented by the Na...

  10. The Influence of Company Size on Accounting Information: Evidence in Large Caps and Small Caps Companies Listed on BM&FBovespa

    Directory of Open Access Journals (Sweden)

    Karen Yukari Yokoyama

    2015-09-01

    Full Text Available In this study, the relation between accounting information aspects and the capitalization level of companies listed on the São Paulo Stock Exchange, classified as Large Caps or Small Caps (companies with larger and smaller capitalization, respectively), was investigated between 2010 and 2012. Three accounting information measures were addressed: informativeness, conservatism and relevance, through the application of Easton and Harris' (1991) model of earnings informativeness, Basu's (1997) model of conditional conservatism and the value relevance model based on Ohlson (1995). The results indicated that, although the Large Caps present a higher level of conservatism, their accounting figures were less informative and more relevant when compared to the Small Caps companies. Due to the greater production of private information (pre-disclosure) surrounding larger companies, the market would tend to respond less strongly or with less surprise to the publication of these companies' accounting information, while the lack of anticipated information would make the effect of disclosing these figures more preponderant for the Small Caps companies.
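
    A hedged sketch of the two cited regressions, fitted with statsmodels on simulated data: the Easton and Harris (1991) informativeness regression of returns on earnings levels and changes, and the Basu (1997) conditional conservatism regression of earnings on returns, a bad-news dummy, and their interaction. The panel, coefficients, and variable names are illustrative assumptions, not the study's sample.

    # Hedged sketch: earnings informativeness (Easton & Harris 1991) and
    # conditional conservatism (Basu 1997) regressions on synthetic data.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 600
    df = pd.DataFrame({"ret": rng.normal(0.05, 0.25, n)})          # annual stock return
    df["earn"] = (0.04 + 0.15 * df["ret"] + 0.20 * np.minimum(df["ret"], 0)
                  + rng.normal(0, 0.03, n))                        # earnings scaled by lagged price
    df["earn_lag"] = rng.normal(0.04, 0.03, n)                     # hypothetical prior-year earnings
    df["d_earn"] = df["earn"] - df["earn_lag"]                     # change in earnings
    df["neg"] = (df["ret"] < 0).astype(int)                        # bad-news dummy

    # Easton & Harris (1991): informativeness = earnings response in a returns regression
    informativeness = smf.ols("ret ~ earn + d_earn", data=df).fit()
    # Basu (1997): conservatism = incremental earnings sensitivity to bad news
    conservatism = smf.ols("earn ~ neg + ret + neg:ret", data=df).fit()

    print("earnings response coefficient:", round(informativeness.params["earn"], 2))
    print("Basu asymmetric timeliness:  ", round(conservatism.params["neg:ret"], 2))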

  11. A mine of information: can sports analytics provide wisdom from your data?

    OpenAIRE

    Passfield, Louis; Hopker, James G.

    2017-01-01

    This paper explores the notion that the availability and analysis of large datasets has the capacity to improve practice and change the nature of science in the sport and exercise setting. The increasing use of data and information technology in sport is giving rise to this change. Websites hold large data repositories and the development of wearable technology, mobile phone applications and related instruments for monitoring physical activity, training and competition, provide large data set...

  12. DNMT1 is associated with cell cycle and DNA replication gene sets in diffuse large B-cell lymphoma.

    Science.gov (United States)

    Loo, Suet Kee; Ab Hamid, Suzina Sheikh; Musa, Mustaffa; Wong, Kah Keng

    2018-01-01

    Dysregulation of DNA (cytosine-5)-methyltransferase 1 (DNMT1) is associated with the pathogenesis of various types of cancer. It has been previously shown that DNMT1 is frequently expressed in diffuse large B-cell lymphoma (DLBCL), however its functions remain to be elucidated in the disease. In this study, we gene expression profiled (GEP) shRNA targeting DNMT1(shDNMT1)-treated germinal center B-cell-like DLBCL (GCB-DLBCL)-derived cell line (i.e. HT) compared with non-silencing shRNA (control shRNA)-treated HT cells. Independent gene set enrichment analysis (GSEA) performed using GEPs of shRNA-treated HT cells and primary GCB-DLBCL cases derived from two publicly-available datasets (i.e. GSE10846 and GSE31312) produced three separate lists of enriched gene sets for each gene sets collection from Molecular Signatures Database (MSigDB). Subsequent Venn analysis identified 268, 145 and six consensus gene sets from analyzing gene sets in C2 collection (curated gene sets), C5 sub-collection [gene sets from gene ontology (GO) biological process ontology] and Hallmark collection, respectively to be enriched in positive correlation with DNMT1 expression profiles in shRNA-treated HT cells, GSE10846 and GSE31312 datasets [false discovery rate (FDR) 0.8) with DNMT1 expression and significantly downregulated (log fold-change <-1.35; p<0.05) following DNMT1 silencing in HT cells. These results suggest the involvement of DNMT1 in the activation of cell cycle and DNA replication in DLBCL cells. Copyright © 2017 Elsevier GmbH. All rights reserved.
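
    The consensus step described above, keeping only the gene sets enriched in all three analyses (the shRNA-treated HT cells and the two public cohorts), amounts to a set intersection. A minimal, hypothetical sketch follows; the gene set names are placeholders, not the study's results:

        # Hypothetical sketch of the Venn/consensus step described in the abstract:
        # keep only gene sets enriched in all three analyses. The example lists
        # below are placeholders, not the study's data.

        def consensus_gene_sets(*enriched_lists):
            """Return gene sets common to every input list (order-preserving)."""
            common = set(enriched_lists[0]).intersection(*map(set, enriched_lists[1:]))
            return [gs for gs in enriched_lists[0] if gs in common]

        shrna_ht = ["REACTOME_CELL_CYCLE", "REACTOME_DNA_REPLICATION", "KEGG_P53_SIGNALING"]
        gse10846 = ["REACTOME_CELL_CYCLE", "REACTOME_DNA_REPLICATION", "KEGG_RIBOSOME"]
        gse31312 = ["REACTOME_DNA_REPLICATION", "REACTOME_CELL_CYCLE"]

        print(consensus_gene_sets(shrna_ht, gse10846, gse31312))
        # ['REACTOME_CELL_CYCLE', 'REACTOME_DNA_REPLICATION']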

  13. Terrorist Approach to Information Operations

    Science.gov (United States)

    2003-06-01

    its fold; they also represent an understanding of the value of the information medium by setting the group up as the underdog against the large...up and organized street demonstrations to lobby for civil rights. The Stormont (Irish) government branded the movement a front for the IRA and

  14. NASA's Information Power Grid: Large Scale Distributed Computing and Data Management

    Science.gov (United States)

    Johnston, William E.; Vaziri, Arsi; Hinke, Tom; Tanner, Leigh Ann; Feiereisen, William J.; Thigpen, William; Tang, Harry (Technical Monitor)

    2001-01-01

    Large-scale science and engineering are done through the interaction of people, heterogeneous computing resources, information systems, and instruments, all of which are geographically and organizationally dispersed. The overall motivation for Grids is to facilitate the routine interactions of these resources in order to support large-scale science and engineering. Multi-disciplinary simulations provide a good example of a class of applications that are very likely to require aggregation of widely distributed computing, data, and intellectual resources. Such simulations - e.g. whole system aircraft simulation and whole system living cell simulation - require integrating applications and data that are developed by different teams of researchers, frequently in different locations. The research teams are the only ones that have the expertise to maintain and improve the simulation code and/or the body of experimental data that drives the simulations. This results in an inherently distributed computing and data management environment.

  15. Estimating the similarity of alternative Affymetrix probe sets using transcriptional networks

    Science.gov (United States)

    2013-01-01

    Background The usefulness of the data from Affymetrix microarray analysis depends largely on the reliability of the files describing the correspondence between probe sets, genes and transcripts. Particularly, when a gene is targeted by several probe sets, these files should give information about the similarity of each alternative probe set pair. Transcriptional networks integrate the multiple correlations that exist between all probe sets and supply much more information than a simple correlation coefficient calculated for two series of signals. In this study, we used the PSAWN (Probe Set Assignment With Networks) programme we developed to investigate whether similarity of alternative probe sets resulted in some specific properties. Findings PSAWNpy delivered a full textual description of each probe set and information on the number and properties of secondary targets. PSAWNml calculated the similarity of each alternative probe set pair and allowed finding relationships between similarity and localisation of probes in common transcripts or exons. Similar alternative probe sets had very low negative correlation, high positive correlation and similar neighbourhood overlap. Using these properties, we devised a test that allowed grouping similar probe sets in a given network. By considering several networks, additional information concerning the similarity reproducibility was obtained, which allowed defining the actual similarity of alternative probe set pairs. In particular, we calculated the common localisation of probes in exons and in known transcripts and we showed that similarity was correctly correlated with them. The information collected on all pairs of alternative probe sets in the most popular 3’ IVT Affymetrix chips is available in tabular form at http://bns.crbm.cnrs.fr/download.html. Conclusions These processed data can be used to obtain a finer interpretation when comparing microarray data between biological conditions. They are particularly well

  16. Estimating the similarity of alternative Affymetrix probe sets using transcriptional networks.

    Science.gov (United States)

    Bellis, Michel

    2013-03-21

    The usefulness of the data from Affymetrix microarray analysis depends largely on the reliability of the files describing the correspondence between probe sets, genes and transcripts. Particularly, when a gene is targeted by several probe sets, these files should give information about the similarity of each alternative probe set pair. Transcriptional networks integrate the multiple correlations that exist between all probe sets and supply much more information than a simple correlation coefficient calculated for two series of signals. In this study, we used the PSAWN (Probe Set Assignment With Networks) programme we developed to investigate whether similarity of alternative probe sets resulted in some specific properties. PSAWNpy delivered a full textual description of each probe set and information on the number and properties of secondary targets. PSAWNml calculated the similarity of each alternative probe set pair and allowed finding relationships between similarity and localisation of probes in common transcripts or exons. Similar alternative probe sets had very low negative correlation, high positive correlation and similar neighbourhood overlap. Using these properties, we devised a test that allowed grouping similar probe sets in a given network. By considering several networks, additional information concerning the similarity reproducibility was obtained, which allowed defining the actual similarity of alternative probe set pairs. In particular, we calculated the common localisation of probes in exons and in known transcripts and we showed that similarity was correctly correlated with them. The information collected on all pairs of alternative probe sets in the most popular 3' IVT Affymetrix chips is available in tabular form at http://bns.crbm.cnrs.fr/download.html. These processed data can be used to obtain a finer interpretation when comparing microarray data between biological conditions. They are particularly well adapted for searching 3' alternative
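
    A minimal sketch of the two similarity ingredients named above, the signal correlation of an alternative probe set pair and the overlap of their network neighbourhoods, might look as follows. This is not the PSAWN code; the example arrays and neighbour sets are assumptions:

        import numpy as np

        def pair_similarity(signals_a, signals_b, neigh_a, neigh_b):
            """Correlation of two probe-set signal profiles plus the Jaccard
            overlap of their network neighbourhoods (sets of probe-set ids)."""
            r = np.corrcoef(signals_a, signals_b)[0, 1]
            overlap = len(neigh_a & neigh_b) / max(len(neigh_a | neigh_b), 1)
            return r, overlap

        # Illustrative data: expression signals over 6 arrays and made-up neighbour sets.
        a = np.array([5.1, 6.0, 4.8, 7.2, 5.5, 6.3])
        b = np.array([5.0, 6.2, 4.9, 7.0, 5.6, 6.1])
        neigh_a = {"ps_010", "ps_042", "ps_077", "ps_101"}
        neigh_b = {"ps_010", "ps_042", "ps_055", "ps_101"}

        r, jaccard = pair_similarity(a, b, neigh_a, neigh_b)
        print(f"signal correlation={r:.2f}, neighbourhood overlap={jaccard:.2f}")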

  17. Meeting the Information Needs of Interdisciplinary Scholars: Issues for Administrators of Large University Libraries.

    Science.gov (United States)

    Searing, Susan E.

    1996-01-01

    Provides an overview of administrative issues in supporting interdisciplinary library use at large universities. Topics include information resources; cataloging and classification; library services to users, including library use education and reference services; library organization; the campus context; and the politics of interdisciplinarity.…

  18. Analyzing regional geological setting of DS uranium deposit based on the extensional research of remote sensing information

    International Nuclear Information System (INIS)

    Liu Dechang; Ye Fawang; Zhao Yingjun

    2006-01-01

    Through analysis of remote sensing images, a special geological environment for uranium ore formation in the Dongsheng-Hangjinqi area, consisting of a fault-uplift, the southern margin fault and an annular structure, is identified in this paper. Extensional research on the fault-uplift, the southern margin fault and the annular structure is then carried out by using information-integration technologies to overlay the remote sensing information with other geoscientific information such as geophysics and geology. Finally, the unusual regional geological setting is analyzed from the viewpoint of uranium ore formation, and its influence on the occurrence of the DS uranium deposit is also discussed. (authors)

  19. Annotating gene sets by mining large literature collections with protein networks.

    Science.gov (United States)

    Wang, Sheng; Ma, Jianzhu; Yu, Michael Ku; Zheng, Fan; Huang, Edward W; Han, Jiawei; Peng, Jian; Ideker, Trey

    2018-01-01

    Analysis of patient genomes and transcriptomes routinely recognizes new gene sets associated with human disease. Here we present an integrative natural language processing system which infers common functions for a gene set through automatic mining of the scientific literature with biological networks. This system links genes with associated literature phrases and combines these links with protein interactions in a single heterogeneous network. Multiscale functional annotations are inferred based on network distances between phrases and genes and then visualized as an ontology of biological concepts. To evaluate this system, we predict functions for gene sets representing known pathways and find that our approach achieves substantial improvement over the conventional text-mining baseline method. Moreover, our system discovers novel annotations for gene sets or pathways without previously known functions. Two case studies demonstrate how the system is used in discovery of new cancer-related pathways with ontological annotations.

  20. Transitions into puberty and access to sexual and reproductive health information in two humanitarian settings: a cross-sectional survey of very young adolescents from Somalia and Myanmar.

    Science.gov (United States)

    Kågesten, Anna E; Zimmerman, Linnea; Robinson, Courtland; Lee, Catherine; Bawoke, Tenaw; Osman, Shahd; Schlecht, Jennifer

    2017-01-01

    Very young adolescents (VYA) in humanitarian settings are largely neglected in terms of sexual and reproductive health (SRH). This study describes the characteristics of VYA aged 10-14 years in two humanitarian settings, focusing on transitions into puberty and access to SRH information. Data were collected through a cross-sectional survey with Somali VYA residing in the Kobe refugee camp in Ethiopia ( N  = 406) and VYA from Myanmar residing in the Mae Sot and Phop Phra migrant communities in Thailand ( N  = 399). The average age was 12 years (about half were girls) in both communities. Participants were recruited using multi-stage cluster-based sampling with probability proportional to size in each site. Descriptive statistics were used to describe the sociodemographic, family, peer, and schooling characteristics and to explore transitions into puberty and access to SRH information. Most VYA in both sites reported living with both parents; nine in ten reported feeling that their parents/guardians care about them, and over half said that their parents/guardians monitor how and with whom they spend their free time. High proportions in both sites were currently enrolled in school (91.4% Somali, 87.0% from Myanmar). Few VYA, particularly those aged 10-12, reported starting puberty, although one in four Somali indicated not knowing whether they did so. Most girls from Myanmar who had started menstruating reported access to menstrual hygiene supplies (water, sanitation, cloths/pads). No Somali girls reported access to all these supplies. While over half of respondents in both sites reported learning about body changes, less than 20% had learnt about pregnancy and the majority (87.4% Somali, 78.6% from Myanmar) indicated a need for more information about body changes. Parents/guardians were the most common source of SRH information in both sites, however VYA indicated that they would like more information from friends, siblings, teachers and health workers. This

  1. An Analysis of Information Technology Adoption by IRBs of Large Academic Medical Centers in the United States.

    Science.gov (United States)

    He, Shan; Botkin, Jeffrey R; Hurdle, John F

    2015-02-01

    The clinical research landscape has changed dramatically in recent years in terms of both volume and complexity. This poses new challenges for Institutional Review Boards' (IRBs) review efficiency and quality, especially at large academic medical centers. This article discusses the technical facets of IRB modernization. We analyzed the information technology used by IRBs in large academic institutions across the United States. We found that large academic medical centers have a high electronic IRB adoption rate; however, the capabilities of electronic IRB systems vary greatly. We discuss potential use-cases of a fully exploited electronic IRB system that promise to streamline the clinical research work flow. The key to that approach utilizes a structured and standardized information model for the IRB application. © The Author(s) 2014.

  2. Information science team

    Science.gov (United States)

    Billingsley, F.

    1982-01-01

    Concerns are expressed about the data handling aspects of system design and about enabling technology for data handling and data analysis. The status, contributing factors, critical issues, and recommendations for investigations are listed for data handling, rectification and registration, and information extraction. Potential support is identified for individual P.I. research tasks, for systematic data system design, and for system operation. The need for an airborne spectrometer-class instrument for fundamental research in high spectral and spatial resolution is indicated. Geographic information system formatting and labelling techniques, very large scale integration, and methods for providing multitype data sets must also be developed.

  3. Pythoscape: a framework for generation of large protein similarity networks.

    Science.gov (United States)

    Barber, Alan E; Babbitt, Patricia C

    2012-11-01

    Pythoscape is a framework implemented in Python for processing large protein similarity networks for visualization in other software packages. Protein similarity networks are graphical representations of sequence, structural and other similarities among proteins for which pairwise all-by-all similarity connections have been calculated. Mapping of biological and other information to network nodes or edges enables hypothesis creation about sequence-structure-function relationships across sets of related proteins. Pythoscape provides several options to calculate pairwise similarities for input sequences or structures, applies filters to network edges and defines sets of similar nodes and their associated data as single nodes (termed representative nodes) for compression of network information and output data or formatted files for visualization.
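
    As a rough illustration of the kind of processing described (pairwise similarities, an edge cutoff, and collapsing similar proteins into representative nodes), a toy sketch in plain Python follows. It is not the Pythoscape API; the identity measure and the cutoff are placeholders:

        # Toy sketch of a similarity-network workflow: score all pairs, keep edges
        # above a cutoff, then collapse tightly connected proteins into one
        # representative node. Not the Pythoscape API; scoring is a placeholder.
        from itertools import combinations

        def identity(s1, s2):
            """Fraction of identical positions (placeholder similarity measure)."""
            n = min(len(s1), len(s2))
            return sum(a == b for a, b in zip(s1, s2)) / n if n else 0.0

        def build_network(seqs, cutoff=0.8):
            return [(a, b, identity(seqs[a], seqs[b]))
                    for a, b in combinations(seqs, 2)
                    if identity(seqs[a], seqs[b]) >= cutoff]

        def representative_nodes(names, edges):
            """Merge nodes connected by any retained edge (simple union-find)."""
            parent = {n: n for n in names}
            def find(x):
                while parent[x] != x:
                    parent[x] = parent[parent[x]]
                    x = parent[x]
                return x
            for a, b, _ in edges:
                parent[find(a)] = find(b)
            groups = {}
            for n in names:
                groups.setdefault(find(n), []).append(n)
            return list(groups.values())

        seqs = {"protA": "MKTAYIAKQR", "protB": "MKTAYIAKQK", "protC": "MIVQWERTYP"}
        edges = build_network(seqs)
        print(edges)                                  # one edge: protA-protB
        print(representative_nodes(list(seqs), edges))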

  4. Analysis and Improvement of Large Payload Bidirectional Quantum Secure Direct Communication Without Information Leakage

    Science.gov (United States)

    Liu, Zhi-Hao; Chen, Han-Wu

    2018-02-01

    As we know, the information leakage problem should be avoided in a secure quantum communication protocol. Unfortunately, it is found that this problem does exist in the large payload bidirectional quantum secure direct communication (BQSDC) protocol (Ye Int. J. Quantum. Inf. 11(5), 1350051 2013) which is based on entanglement swapping between any two Greenberger-Horne-Zeilinger (GHZ) states. To be specific, one half of the information interchanged in this protocol is leaked out unconsciously without any active attack from an eavesdropper. Afterward, this BQSDC protocol is revised to the one without information leakage. It is shown that the improved BQSDC protocol is secure against the general individual attack and has some obvious features compared with the original one.

  5. A comparison of clinicians' access to online knowledge resources using two types of information retrieval applications in an academic hospital setting.

    Science.gov (United States)

    Hunt, Sevgin; Cimino, James J; Koziol, Deloris E

    2013-01-01

    The research studied whether a clinician's preference for online health knowledge resources varied with the use of two applications that were designed for information retrieval in an academic hospital setting. The researchers analyzed a year's worth of computer log files to study differences in the ways that four clinician groups (attending physicians, housestaff physicians, nurse practitioners, and nurses) sought information using two types of information retrieval applications (health resource links or Infobutton icons) across nine resources while they reviewed patients' laboratory results. From a set of 14,979 observations, the authors found statistically significant differences among the 4 clinician groups for accessing resources using the health resources application (P ...). The information-seeking behavior of clinicians may vary in relation to their role and the way in which the information is presented. Studying these behaviors can provide valuable insights to those tasked with maintaining information retrieval systems' links to appropriate online knowledge resources.

  6. Concepts for a global resources information system

    Science.gov (United States)

    Billingsley, F. C.; Urena, J. L.

    1984-01-01

    The objective of the Global Resources Information System (GRIS) is to establish an effective and efficient information management system to meet the data access requirements of NASA and NASA-related scientists conducting large-scale, multi-disciplinary, multi-mission scientific investigations. Using standard interfaces and operating guidelines, diverse data systems can be integrated to provide the capabilities to access and process multiple geographically dispersed data sets and to develop the necessary procedures and algorithms to derive global resource information.

  7. Interests-in-motion in an informal, media-rich learning setting

    Directory of Open Access Journals (Sweden)

    Ty Hollett

    2016-01-01

    Much of the literature related to connected learning approaches youth interests as fixed on specific disciplines or activities (e.g. STEM, music production, or game design). As such, mentors design youth-focused programs to serve those interests. Through a micro-ethnographic analysis of two youths' Minecraft-centered gameplay in a public library, this article makes two primary contributions to research on learning within, and the design of, informal, media-rich settings. First, rather than approach youth interests as fixed on specific disciplines or activities (e.g. STEM, music production, or video games), this article traces youth interests as they spark and emerge among individuals and groups. It then follows those interests as they subsequently spread over time, becoming interests-in-motion. Second, recognition of these interests-in-motion can lead mentors to develop program designs that enable learners to work with artifacts (digital and physical) that they can progressively configure and re-configure over time. Mentors, then, design-in-time as they harness the energy surrounding those emergent interests, creating extended learning opportunities in response.

  8. Rhesus monkeys (Macaca mulatta) show robust primacy and recency in memory for lists from small, but not large, image sets.

    Science.gov (United States)

    Basile, Benjamin M; Hampton, Robert R

    2010-02-01

    The combination of primacy and recency produces a U-shaped serial position curve typical of memory for lists. In humans, primacy is often thought to result from rehearsal, but there is little evidence for rehearsal in nonhumans. To further evaluate the possibility that rehearsal contributes to primacy in monkeys, we compared memory for lists of familiar stimuli (which may be easier to rehearse) to memory for unfamiliar stimuli (which are likely difficult to rehearse). Six rhesus monkeys saw lists of five images drawn from either large, medium, or small image sets. After presentation of each list, memory for one item was assessed using a serial probe recognition test. Across four experiments, we found robust primacy and recency with lists drawn from small and medium, but not large, image sets. This finding is consistent with the idea that familiar items are easier to rehearse and that rehearsal contributes to primacy, warranting further study of the possibility of rehearsal in monkeys. However, alternative interpretations are also viable and are discussed. Copyright 2009 Elsevier B.V. All rights reserved.

  9. Covariance approximation for large multivariate spatial data sets with an application to multiple climate model errors

    KAUST Repository

    Sang, Huiyan

    2011-12-01

    This paper investigates the cross-correlations across multiple climate model errors. We build a Bayesian hierarchical model that accounts for the spatial dependence of individual models as well as cross-covariances across different climate models. Our method allows for a nonseparable and nonstationary cross-covariance structure. We also present a covariance approximation approach to facilitate the computation in the modeling and analysis of very large multivariate spatial data sets. The covariance approximation consists of two parts: a reduced-rank part to capture the large-scale spatial dependence, and a sparse covariance matrix to correct the small-scale dependence error induced by the reduced rank approximation. We pay special attention to the case that the second part of the approximation has a block-diagonal structure. Simulation results of model fitting and prediction show substantial improvement of the proposed approximation over the predictive process approximation and the independent blocks analysis. We then apply our computational approach to the joint statistical modeling of multiple climate model errors. © 2012 Institute of Mathematical Statistics.
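
    The two-part covariance approximation described above (a reduced-rank term for the large-scale dependence plus a block-diagonal sparse correction for the remaining small-scale error) can be sketched numerically. The knot layout, kernel, and block assignment below are illustrative assumptions, not the authors' model:

        import numpy as np

        rng = np.random.default_rng(0)
        n, m = 200, 15                      # locations and knots (m << n)
        locs  = rng.uniform(0, 10, size=(n, 1))
        knots = np.linspace(0, 10, m).reshape(-1, 1)

        def cov(x, y, range_=2.0):
            """Exponential covariance kernel (illustrative choice)."""
            return np.exp(-np.abs(x - y.T) / range_)

        C_full = cov(locs, locs)
        C_nk, C_kk = cov(locs, knots), cov(knots, knots)

        # Reduced-rank (predictive-process style) part: C_nk C_kk^{-1} C_kn
        C_low = C_nk @ np.linalg.solve(C_kk, C_nk.T)

        # Block-diagonal correction of the residual small-scale covariance.
        blocks = (locs[:, 0] // 2).astype(int)          # crude spatial blocking
        resid = C_full - C_low
        C_corr = np.where(blocks[:, None] == blocks[None, :], resid, 0.0)

        C_approx = C_low + C_corr
        err = np.linalg.norm(C_full - C_approx) / np.linalg.norm(C_full)
        print(f"relative Frobenius error of the approximation: {err:.3f}")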

  10. ReMashed – Recommendation Approaches for Mash-Up Personal Learning Environments in Formal and Informal Learning Settings

    NARCIS (Netherlands)

    Drachsler, Hendrik; Pecceu, Dries; Arts, Tanja; Hutten, Edwin; Rutledge, Lloyd; Van Rosmalen, Peter; Hummel, Hans; Koper, Rob

    2009-01-01

    Drachsler, H., Peccau, D., Arts, T., Hutten, E., Rutledge, L., Van Rosmalen, P., Hummel, H. G. K., & Koper, R. (2009). ReMashed – Recommendation Approaches for Mash-Up Personal Learning Environments in Formal and Informal Learning Settings. Presentation at the 2nd Workshop Mash-Up Personal Learning

  11. Integration of Information Literacy Components into a Large First-Year Lecture-Based Chemistry Course

    Science.gov (United States)

    Locknar, Angela; Mitchell, Rudolph; Rankin, Janet; Sadoway, Donald R.

    2012-01-01

    A first-year chemistry course is ideal for introducing students to finding and using scholarly information early in their academic careers. A four-pronged approach (lectures, homework problems, videos, and model solutions) was used to incorporate library research skills into a large lecture-based course. Pre- and post-course surveying demonstrated…

  12. Pre-start timing information is used to set final linear speed in a C-start manoeuvre.

    Science.gov (United States)

    Reinel, Caroline; Schuster, Stefan

    2014-08-15

    In their unique hunting behaviour, archerfish use a complex motor decision to secure their prey: based solely on how dislodged prey initially falls, they select an adapted C-start manoeuvre that turns the fish right towards the point on the water surface where their prey will later land. Furthermore, they take off at a speed that is set so as to arrive in time. We show here that the C-start manoeuvre and not subsequent tail beating is necessary and sufficient for setting this adaptive level of speed. Furthermore, the C-start pattern is adjusted to independently determine both the turning angle and the take-off speed. The selection of both aspects requires no a priori information and is done based on information sampled from the onset of target motion until the C-start is launched. Fin strokes can occur right after the C-start manoeuvre but are not required to fine-tune take-off speed, but rather to maintain it. By probing the way in which the fish set their take-off speed in a wide range of conditions in which distance from the later catching point and time until impact varied widely and unpredictably, we found that the C-start manoeuvre is programmed based on pre-C-start estimates of distance and time until impact. Our study hence provides the first evidence for a C-start that is fine-tuned to produce an adaptive speed level. © 2014. Published by The Company of Biologists Ltd.

  13. Scalable Algorithms for Clustering Large Geospatiotemporal Data Sets on Manycore Architectures

    Science.gov (United States)

    Mills, R. T.; Hoffman, F. M.; Kumar, J.; Sreepathi, S.; Sripathi, V.

    2016-12-01

    The increasing availability of high-resolution geospatiotemporal data sets from sources such as observatory networks, remote sensing platforms, and computational Earth system models has opened new possibilities for knowledge discovery using data sets fused from disparate sources. Traditional algorithms and computing platforms are impractical for the analysis and synthesis of data sets of this size; however, new algorithmic approaches that can effectively utilize the complex memory hierarchies and the extremely high levels of available parallelism in state-of-the-art high-performance computing platforms can enable such analysis. We describe a massively parallel implementation of accelerated k-means clustering and some optimizations to boost computational intensity and utilization of wide SIMD lanes on state-of-the art multi- and manycore processors, including the second-generation Intel Xeon Phi ("Knights Landing") processor based on the Intel Many Integrated Core (MIC) architecture, which includes several new features, including an on-package high-bandwidth memory. We also analyze the code in the context of a few practical applications to the analysis of climatic and remotely-sensed vegetation phenology data sets, and speculate on some of the new applications that such scalable analysis methods may enable.
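
    A compact, vectorized k-means step of the kind such accelerated implementations parallelize (the distance computation recast as dense linear algebra so it maps well onto wide SIMD units) might be sketched as follows; the data size and cluster count are placeholders:

        import numpy as np

        def kmeans(X, k, iters=50, seed=0):
            """Plain vectorized Lloyd's k-means; the inner distance computation is
            the part that accelerated/manycore implementations optimize."""
            rng = np.random.default_rng(seed)
            centers = X[rng.choice(len(X), size=k, replace=False)]
            for _ in range(iters):
                # ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2 (GEMM-friendly form)
                d2 = (X**2).sum(1)[:, None] - 2 * X @ centers.T + (centers**2).sum(1)
                labels = d2.argmin(axis=1)
                new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
                if np.allclose(new, centers):
                    break
                centers = new
            return centers, labels

        # Illustrative records: 10,000 grid cells x 12 monthly climate values.
        X = np.random.default_rng(1).normal(size=(10_000, 12))
        centers, labels = kmeans(X, k=8)
        print(centers.shape, np.bincount(labels))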

  14. Large scale access tests and online interfaces to ATLAS conditions databases

    International Nuclear Information System (INIS)

    Amorim, A; Lopes, L; Pereira, P; Simoes, J; Soloviev, I; Burckhart, D; Schmitt, J V D; Caprini, M; Kolos, S

    2008-01-01

    The access of the ATLAS Trigger and Data Acquisition (TDAQ) system to the ATLAS Conditions Databases sets strong reliability and performance requirements on the database storage and access infrastructures. Several applications were developed to support the integration of Conditions database access with the online services in TDAQ, including the interface to the Information Services (IS) and to the TDAQ Configuration Databases. The information storage requirements were the motivation for the ONline A Synchronous Interface to COOL (ONASIC) from the Information Service (IS) to LCG/COOL databases. ONASIC avoids the possible backpressure from Online Database servers by managing a local cache. In parallel, OKS2COOL was developed to store Configuration Databases into an Offline Database with history record. The DBStressor application was developed to test and stress the access to the Conditions database using the LCG/COOL interface while operating in an integrated way as a TDAQ application. The performance scaling of simultaneous Conditions database read accesses was studied in the context of the ATLAS High Level Trigger large computing farms. A large set of tests were performed involving up to 1000 computing nodes that simultaneously accessed the LCG central database server infrastructure at CERN

  15. Strategic Disclosure of Demand Information by Duopolists

    DEFF Research Database (Denmark)

    Jansen, Jos; Pollak, Andreas

    We study the strategic disclosure of demand information and product-market strategies of duopolists. In a setting where firms may fail to receive information, we show that firms selectively disclose information in equilibrium in order to influence their competitor's product-market strategy....... Subsequently, we analyze the firms' behavior in a laboratory experiment. We find that subjects often use selective disclosure strategies, and this finding appears to be robust to changes in the information structure, the mode of competition, and the degree of product-market conduct that is largely consistent...

  16. Large transverse momentum hadronic processes

    International Nuclear Information System (INIS)

    Darriulat, P.

    1977-01-01

    The possible relations between deep inelastic leptoproduction and large transverse momentum (p_t) processes in hadronic collisions are usually considered in the framework of the quark-parton picture. Experiments observing the structure of the final state in proton-proton collisions producing at least one large transverse momentum particle have led to the following conclusions: a large fraction of the produced particles are unaffected by the large p_t process. The other products are correlated to the large p_t particle. Depending upon the sign of the scalar product they can be separated into two groups of "towards-movers" and "away-movers". The experimental evidence favouring such a picture is reviewed and the properties of each of the three groups (underlying normal event, towards-movers and away-movers) are discussed. Some phenomenological interpretations are presented. The exact nature of away- and towards-movers must be further investigated. Their apparent jet structure has to be confirmed. Angular correlations between leading away- and towards-movers are very informative. Quantum number flow, both within the set of away- and towards-movers, and between it and the underlying normal event, is predicted to behave very differently in different models

  17. LASSIE: the large analogue signal and scaling information environment for FAIR

    International Nuclear Information System (INIS)

    Hoffmann, T.; Braeuning, H.; Haseitl, R.

    2012-01-01

    At FAIR, the Facility for Antiproton and Ion Research, several new accelerators and storage rings such as the SIS-100, HESR, CR, the inter-connecting HEBT beam lines, S-FRS and experiments will be built. All of these installations are equipped with beam diagnostic devices and other components, which deliver time-resolved analogue signals to show status, quality and performance of the accelerators. These signals can originate from particle detectors such as ionization chambers and plastic scintillators, but also from adapted output signals of transformers, collimators, magnet functions, RF cavities and others. To visualize and precisely correlate the time axis of all input signals a dedicated FESA based data acquisition and analysis system named LASSIE, the Large Analogue Signal and Scaling Information Environment, is currently being developed. The main operation mode of LASSIE is currently pulse counting with latching VME scaler boards. Later enhancements for ADC, QDC, or TDC digitization in the future are foreseen. The concept, features and challenges of this large distributed data acquisition system are presented. (authors)

  18. No firewalls or information problem for black holes entangled with large systems

    Science.gov (United States)

    Stoltenberg, Henry; Albrecht, Andreas

    2015-01-01

    We discuss how under certain conditions the black hole information puzzle and the (related) arguments that firewalls are a typical feature of black holes can break down. We first review the arguments of Almheiri, Marolf, Polchinski and Sully favoring firewalls, focusing on entanglements in a simple toy model for a black hole and the Hawking radiation. By introducing a large and inaccessible system entangled with the black hole (representing perhaps a de Sitter stretched horizon or inaccessible part of a landscape), we show complementarity can be restored and firewalls can be avoided throughout the black hole's evolution. Under these conditions black holes do not have an "information problem." We point out flaws in some of our earlier arguments that such entanglement might be generically present in some cosmological scenarios and call out certain ways our picture may still be realized.

  19. Closed sets of nonlocal correlations

    International Nuclear Information System (INIS)

    Allcock, Jonathan; Linden, Noah; Brunner, Nicolas; Popescu, Sandu; Skrzypczyk, Paul; Vertesi, Tamas

    2009-01-01

    We present a fundamental concept - closed sets of correlations - for studying nonlocal correlations. We argue that sets of correlations corresponding to information-theoretic principles, or more generally to consistent physical theories, must be closed under a natural set of operations. Hence, studying the closure of sets of correlations gives insight into which information-theoretic principles are genuinely different, and which are ultimately equivalent. This concept also has implications for understanding why quantum nonlocality is limited, and for finding constraints on physical theories beyond quantum mechanics.

  20. ReMashed – Recommendation Approaches for Mash-Up Personal Learning Environments in Formal and Informal Learning Settings

    NARCIS (Netherlands)

    Drachsler, Hendrik; Pecceu, Dries; Arts, Tanja; Hutten, Edwin; Rutledge, Lloyd; Van Rosmalen, Peter; Hummel, Hans; Koper, Rob

    2009-01-01

    Drachsler, H., Peccau, D., Arts, T., Hutten, E., Rutledge, L., Van Rosmalen, P., Hummel, H. G. K., & Koper, R. (2009). ReMashed – Recommendation Approaches for Mash-Up Personal Learning Environments in Formal and Informal Learning Settings. In F. Wild, M. Kalz, M. Palmér & D. Müller (Eds.),

  1. From Management Information Systems to Business Intelligence: The Development of Management Information Needs

    Directory of Open Access Journals (Sweden)

    Gėlytė Kazakevičienė

    2013-09-01

    Despite the advances in IT, information systems intended for management informing have not uniformly fulfilled the increased expectations of users; this is true mostly of complex information needs. Although some of the technologies for supporting complicated insights, such as management decision support systems and technologies, experienced reduced interest from both researchers and practitioners, this did not diminish the importance of well-supported business informing and decision making. Attributed to the group of intelligent systems and technologies, decision support (DS) technologies have been largely supplemented by business intelligence (BI) technologies. Both types of technologies are supported by respective information technologies, which often appear to be quite closely related. The objective of this paper is to define relations between simple and complex informing intended to satisfy different sets of needs and provided by different sets of support tools. The paper attempts to bring together decision support and business intelligence technologies, based on the common goals of sense-making and the use of advanced analytical tools. A model of two interconnected cycles has been developed to relate the activities of decision support and business intelligence. Empirical data from earlier research are used to direct possible further insights into this area.

  2. Using open-source programs to create a web-based portal for hydrologic information

    Science.gov (United States)

    Kim, H.

    2013-12-01

    Some hydrologic data sets, such as basin climatology, precipitation, and terrestrial water storage, are not easily obtainable and distributable due to their size and complexity. We present a Hydrologic Information Portal (HIP) that has been implemented at the University of California for Hydrologic Modeling (UCCHM) and that has been organized around the large river basins of North America. This portal can be easily accessed through a modern web browser, enabling easy access to and visualization of such hydrologic data sets. The HIP provides a set of data visualization features so that users can search, retrieve, analyze, integrate, organize, and map data within large river basins. Recent information technologies such as Google Maps, Tornado (a Python asynchronous web server), NumPy/SciPy (scientific libraries for Python) and d3.js (a visualization library for JavaScript) were incorporated into the HIP to ease navigation of large data sets. With such open-source libraries, HIP gives public users a way to combine and explore various data sets by generating multiple chart types (line, bar, pie, scatter plot) directly from the Google Maps viewport. Every rendered object on the viewport, such as a basin shape, is clickable, and this is the first step to accessing the visualization of data sets.
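
    As a rough sketch of the kind of server-side endpoint such a portal pairs with its Google Maps/d3.js front end, a minimal Tornado handler that returns a basin time series as JSON might look like this; the route, data, and port are invented for illustration and are not the UCCHM portal's actual interface:

        # Minimal sketch of a portal-style JSON endpoint with Tornado; the basin
        # data and URL pattern are placeholders, not the portal's real interface.
        import tornado.ioloop
        import tornado.web

        FAKE_BASINS = {  # stand-in for precipitation/storage time series per basin
            "mississippi": [78.2, 65.1, 90.4, 71.0],
            "columbia": [110.5, 95.3, 99.9, 102.7],
        }

        class BasinHandler(tornado.web.RequestHandler):
            def get(self, basin_id):
                series = FAKE_BASINS.get(basin_id)
                if series is None:
                    raise tornado.web.HTTPError(404)
                # Writing a dict serializes it to JSON and sets the content type.
                self.write({"basin": basin_id, "monthly_values": series})

        def make_app():
            return tornado.web.Application([(r"/basins/([a-z_]+)", BasinHandler)])

        if __name__ == "__main__":
            make_app().listen(8888)
            tornado.ioloop.IOLoop.current().start()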

  3. Spatial part-set cuing facilitation.

    Science.gov (United States)

    Kelley, Matthew R; Parasiuk, Yuri; Salgado-Benz, Jennifer; Crocco, Megan

    2016-07-01

    Cole, Reysen, and Kelley [2013. Part-set cuing facilitation for spatial information. Journal of Experimental Psychology: Learning, Memory, & Cognition, 39, 1615-1620] reported robust part-set cuing facilitation for spatial information using snap circuits (a colour-coded electronics kit designed for children to create rudimentary circuit boards). In contrast, Drinkwater, Dagnall, and Parker [2006. Effects of part-set cuing on experienced and novice chess players' reconstruction of a typical chess midgame position. Perceptual and Motor Skills, 102(3), 645-653] and Watkins, Schwartz, and Lane [1984. Does part-set cuing test for memory organization? Evidence from reconstructions of chess positions. Canadian Journal of Psychology/Revue Canadienne de Psychologie, 38(3), 498-503] showed no influence of part-set cuing for spatial information when using chess boards. One key difference between the two procedures was that the snap circuit stimuli were explicitly connected to one another, whereas chess pieces were not. Two experiments examined the effects of connection type (connected vs. unconnected) and cue type (cued vs. uncued) on memory for spatial information. Using chess boards (Experiment 1) and snap circuits (Experiment 2), part-set cuing facilitation only occurred when the stimuli were explicitly connected; there was no influence of cuing with unconnected stimuli. These results are potentially consistent with the retrieval strategy disruption hypothesis, as well as the two- and three-mechanism accounts of part-set cuing.

  4. Comparing Panelists' Understanding of Standard Setting across Multiple Levels of an Alternate Science Assessment

    Science.gov (United States)

    Hansen, Mary A.; Lyon, Steven R.; Heh, Peter; Zigmond, Naomi

    2013-01-01

    Large-scale assessment programs, including alternate assessments based on alternate achievement standards (AA-AAS), must provide evidence of technical quality and validity. This study provides information about the technical quality of one AA-AAS by evaluating the standard setting for the science component. The assessment was designed to have…

  5. Leaf transpiration plays a role in phosphorus acquisition among a large set of chickpea genotypes.

    Science.gov (United States)

    Pang, Jiayin; Zhao, Hongxia; Bansal, Ruchi; Bohuon, Emilien; Lambers, Hans; Ryan, Megan H; Siddique, Kadambot H M

    2018-01-09

    Low availability of inorganic phosphorus (P) is considered a major constraint for crop productivity worldwide. A unique set of 266 chickpea (Cicer arietinum L.) genotypes, originating from 29 countries and with diverse genetic background, were used to study P-use efficiency. Plants were grown in pots containing sterilized river sand supplied with P at a rate of 10 μg P g⁻¹ soil as FePO4, a poorly soluble form of P. The results showed large genotypic variation in plant growth, shoot P content, physiological P-use efficiency, and P-utilization efficiency in response to low P supply. Further investigation of a subset of 100 chickpea genotypes with contrasting growth performance showed significant differences in photosynthetic rate and photosynthetic P-use efficiency. A positive correlation was found between leaf P concentration and transpiration rate of the young fully expanded leaves. For the first time, our study has suggested a role of leaf transpiration in P acquisition, consistent with transpiration-driven mass flow in chickpea grown in low-P sandy soils. The identification of 6 genotypes with high plant growth, P-acquisition, and P-utilization efficiency suggests that the chickpea reference set can be used in breeding programmes to improve both P-acquisition and P-utilization efficiency under low-P conditions. © 2018 John Wiley & Sons Ltd.

  6. Hierarchical Cantor set in the large scale structure with torus geometry

    Energy Technology Data Exchange (ETDEWEB)

    Murdzek, R. [Physics Department, ' Al. I. Cuza' University, Blvd. Carol I, Nr. 11, Iassy 700506 (Romania)], E-mail: rmurdzek@yahoo.com

    2008-12-15

    The formation of large scale structures is considered within a model with string on toroidal space-time. Firstly, the space-time geometry is presented. In this geometry, the Universe is represented by a string describing a torus surface. Thereafter, the large scale structure of the Universe is derived from the string oscillations. The results are in agreement with the cellular structure of the large scale distribution and with the theory of a Cantorian space-time.

  7. Integrative analysis of survival-associated gene sets in breast cancer.

    Science.gov (United States)

    Varn, Frederick S; Ung, Matthew H; Lou, Shao Ke; Cheng, Chao

    2015-03-12

    Patient gene expression information has recently become a clinical feature used to evaluate breast cancer prognosis. The emergence of prognostic gene sets that take advantage of these data has led to a rich library of information that can be used to characterize the molecular nature of a patient's cancer. Identifying robust gene sets that are consistently predictive of a patient's clinical outcome has become one of the main challenges in the field. We inputted our previously established BASE algorithm with patient gene expression data and gene sets from MSigDB to develop the gene set activity score (GSAS), a metric that quantitatively assesses a gene set's activity level in a given patient. We utilized this metric, along with patient time-to-event data, to perform survival analyses to identify the gene sets that were significantly correlated with patient survival. We then performed cross-dataset analyses to identify robust prognostic gene sets and to classify patients by metastasis status. Additionally, we created a gene set network based on component gene overlap to explore the relationship between gene sets derived from MSigDB. We developed a novel gene set based on this network's topology and applied the GSAS metric to characterize its role in patient survival. Using the GSAS metric, we identified 120 gene sets that were significantly associated with patient survival in all datasets tested. The gene overlap network analysis yielded a novel gene set enriched in genes shared by the robustly predictive gene sets. This gene set was highly correlated to patient survival when used alone. Most interestingly, removal of the genes in this gene set from the gene pool on MSigDB resulted in a large reduction in the number of predictive gene sets, suggesting a prominent role for these genes in breast cancer progression. The GSAS metric provided a useful medium by which we systematically investigated how gene sets from MSigDB relate to breast cancer patient survival. We used
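
    The general idea of a per-patient gene set activity score, summarizing a patient's expression of a gene set into one number and then relating that number to outcome, can be sketched simply. This is not the BASE/GSAS algorithm itself; the z-score average and the median split below are simplifying assumptions for illustration:

        import numpy as np

        def activity_scores(expr, genes, gene_set):
            """Mean z-scored expression of the gene set's members per patient.
            expr: genes x patients matrix; genes: list of gene names (row order)."""
            idx = [genes.index(g) for g in gene_set if g in genes]
            z = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)
            return z[idx].mean(axis=0)

        rng = np.random.default_rng(42)
        genes = [f"GENE{i}" for i in range(500)]
        expr = rng.normal(size=(500, 120))              # 500 genes x 120 patients
        gene_set = ["GENE3", "GENE17", "GENE101", "GENE250"]

        scores = activity_scores(expr, genes, gene_set)
        high = scores > np.median(scores)               # crude split for a survival comparison
        print(scores[:5].round(2), high.sum(), "patients in the high-activity group")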

  8. 78 FR 54996 - Information Reporting by Applicable Large Employers on Health Insurance Coverage Offered Under...

    Science.gov (United States)

    2013-09-09

    ... Information Reporting by Applicable Large Employers on Health Insurance Coverage Offered Under Employer... credit to help individuals and families afford health insurance coverage purchased through an Affordable... or group health insurance coverage offered by an employer to the employee that is (1) a governmental...

  9. Efficacy of formative evaluation using a focus group for a large classroom setting in an accelerated pharmacy program.

    Science.gov (United States)

    Nolette, Shaun; Nguyen, Alyssa; Kogan, David; Oswald, Catherine; Whittaker, Alana; Chakraborty, Arup

    2017-07-01

    Formative evaluation is a process utilized to improve communication between students and faculty. This evaluation method allows the ability to address pertinent issues in a timely manner; however, implementation of formative evaluation can be a challenge, especially in a large classroom setting. Using mediated formative evaluation, the purpose of this study is to determine if a student based focus group is a viable option to improve efficacy of communication between an instructor and students as well as time management in a large classroom setting. Out of 140 total students, six students were selected to form a focus group - one from each of six total sections of the classroom. Each focus group representative was responsible for collecting all the questions from students of their corresponding sections and submitting them to the instructor two to three times a day. Responses from the instructor were either passed back to pertinent students by the focus group representatives or addressed directly with students by the instructor. This study was conducted using a fifteen-question survey after the focus group model was utilized for one month. A printed copy of the survey was distributed in the class by student investigators. Questions were of varying types, including Likert scale, yes/no, and open-ended response. One hundred forty surveys were administered, and 90 complete responses were collected. Surveys showed that 93.3% of students found that use of the focus group made them more likely to ask questions for understanding. The surveys also showed 95.5% of students found utilizing the focus group for questions allowed for better understanding of difficult concepts. General open-ended answer portions of the survey showed that most students found the focus group allowed them to ask questions more easily since they did not feel intimidated by asking in front of the whole class. No correlation was found between demographic characteristics and survey responses. This may

  10. Novel Visualization of Large Health Related Data Sets

    Science.gov (United States)

    2015-03-01

    lower all-cause mortality. While large cross-sectional studies of populations such as the National Health and Nutrition Examination Survey find a ... due to impaired renal and hepatic metabolism, decreased dietary intake related to anorexia or nausea, and falsely low HbA1c secondary to uremia or ... Renal Nutrition. 2009;19(1):33-37. [2014 Workshop on Visual Analytics in Healthcare]

  11. Application of Digital Object Identifiers to data sets at the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC)

    Science.gov (United States)

    Vollmer, B.; Ostrenga, D.; Johnson, J. E.; Savtchenko, A. K.; Shen, S.; Teng, W. L.; Wei, J. C.

    2013-12-01

    Digital Object Identifiers (DOIs) are applied to selected data sets at the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC). The DOI system provides an Internet resolution service for unique and persistent identifiers of digital objects. Products assigned DOIs include data from the NASA MEaSUREs Program, the Earth Observing System (EOS) Aqua Atmospheric Infrared Sounder (AIRS) and EOS Aura High Resolution Dynamics Limb Sounder (HIRDLS). DOIs are acquired and registered through EZID, California Digital Library and DataCite. GES DISC hosts a data set landing page associated with each DOI containing information on and access to the data including a recommended data citation when using the product in research or applications. This work includes participation with the earth science community (e.g., Earth Science Information Partners (ESIP) Federation) and the NASA Earth Science Data and Information System (ESDIS) Project to identify, establish and implement best practices for assigning DOIs and managing supporting information, including metadata, for earth science data sets. Future work includes (1) coordination with NASA mission Science Teams and other data providers on the assignment of DOIs for other GES DISC data holdings, particularly for future missions such as Orbiting Carbon Observatory -2 and -3 (OCO-2, OCO-3) and projects (MEaSUREs 2012), (2) construction of landing pages that are both human and machine readable, and (3) pursuing the linking of data and publications with tools such as the Thomson Reuters Data Citation Index.

  12. Software Manages Documentation in a Large Test Facility

    Science.gov (United States)

    Gurneck, Joseph M.

    2001-01-01

    The 3MCS computer program assists an instrumentation engineer in performing the three essential functions of design, documentation, and configuration management of measurement and control systems in a large test facility. Services provided by 3MCS are acceptance of input from multiple engineers and technicians working at multiple locations; standardization of drawings; automated cross-referencing; identification of errors; listing of components and resources; downloading of test settings; and provision of information to customers.

  13. Calculations in support of a potential definition of large release

    International Nuclear Information System (INIS)

    Hanson, A.L.; Davis, R.E.; Mubayi, V.

    1994-05-01

    The Nuclear Regulatory Commission has stated a hierarchy of safety goals with the qualitative safety goals as Level I of the hierarchy, backed up by the quantitative health objectives as Level II and the large release guideline as Level III. The large release guideline has been stated in qualitative terms as a magnitude of release of the core inventory whose frequency should not exceed 10⁻⁶ per reactor year. However, the Commission did not provide a quantitative specification of a large release. This report describes various specifications of a large release and focuses, in particular, on an examination of releases which have a potential to lead to one prompt fatality in the mean. The basic information required to set up the calculations was derived from the simplified source terms which were obtained from approximations of the NUREG-1150 source terms. Since the calculation of consequences is affected by a large number of assumptions, a generic site with a (conservatively determined) population density and meteorology was specified. At this site, various emergency responses (including no response) were assumed based on information derived from earlier studies. For each of the emergency response assumptions, a set of calculations were performed with the simplified source terms; these included adjustments to the source terms, such as the timing of the release, the core inventory, and the release fractions of different radionuclides, to arrive at a result of one mean prompt fatality in each case. Each of the source terms, so defined, has the potential to be a candidate for a large release. The calculations show that there are many possible candidate source terms for a large release depending on the characteristics which are felt to be important

  14. Fuzzy-Set Case Studies

    Science.gov (United States)

    Mikkelsen, Kim Sass

    2017-01-01

    Contemporary case studies rely on verbal arguments and set theory to build or evaluate theoretical claims. While existing procedures excel in the use of qualitative information (information about kind), they ignore quantitative information (information about degree) at central points of the analysis. Effectively, contemporary case studies rely on…

  15. Analysis of Usage Patterns in Large Multimedia Websites

    Science.gov (United States)

    Singh, Rahul; Bhattarai, Bibek

    User behavior in a website is a critical indicator of the web site's usability and success. Therefore an understanding of usage patterns is essential to website design optimization. In this context, large multimedia websites pose a significant challenge for comprehension of the complex and diverse user behaviors they sustain. This is due to the complexity of analyzing and understanding user-data interactions in media-rich contexts. In this chapter we present a novel multi-perspective approach for usability analysis of large media rich websites. Our research combines multimedia web content analysis with elements of web-log analysis and visualization/visual mining of web usage metadata. Multimedia content analysis allows direct estimation of the information-cues presented to a user by the web content. Analysis of web logs and usage-metadata, such as location, type, and frequency of interactions provides a complimentary perspective on the site's usage. The entire set of information is leveraged through powerful visualization and interactive querying techniques to provide analysis of usage patterns, measure of design quality, as well as the ability to rapidly identify problems in the web-site design. Experiments on media rich sites including the SkyServer - a large multimedia web-based astronomy information repository demonstrate the efficacy and promise of the proposed approach.

  16. The nuclear emergency information system based on GRRS

    International Nuclear Information System (INIS)

    Wang Bairong; Fu Li; Ma Jie; Zheng Qiyan

    2012-01-01

    By utilizing the high-speed operation characteristics of GPRS and its advantage in transferring large data packets, this paper sets up a wireless communication network and a nuclear emergency information system. The system handles useful data, short messages and pictures, and provides storage and processing functions for the wireless control network platform. (authors)

  17. Cloud-enabled large-scale land surface model simulations with the NASA Land Information System

    Science.gov (United States)

    Duffy, D.; Vaughan, G.; Clark, M. P.; Peters-Lidard, C. D.; Nijssen, B.; Nearing, G. S.; Rheingrover, S.; Kumar, S.; Geiger, J. V.

    2017-12-01

    Developed by the Hydrological Sciences Laboratory at NASA Goddard Space Flight Center (GSFC), the Land Information System (LIS) is a high-performance software framework for terrestrial hydrology modeling and data assimilation. LIS provides the ability to integrate satellite and ground-based observational products and advanced modeling algorithms to extract land surface states and fluxes. Through a partnership with the National Center for Atmospheric Research (NCAR) and the University of Washington, the LIS model is currently being extended to include the Structure for Unifying Multiple Modeling Alternatives (SUMMA). With the addition of SUMMA in LIS, meaningful simulations containing a large multi-model ensemble will be enabled and can provide advanced probabilistic continental-domain modeling capabilities at spatial scales relevant for water managers. The resulting LIS/SUMMA application framework is difficult for non-experts to install due to the large amount of dependencies on specific versions of operating systems, libraries, and compilers. This has created a significant barrier to entry for domain scientists that are interested in using the software on their own systems or in the cloud. In addition, the requirement to support multiple run time environments across the LIS community has created a significant burden on the NASA team. To overcome these challenges, LIS/SUMMA has been deployed using Linux containers, which allows for an entire software package along with all dependences to be installed within a working runtime environment, and Kubernetes, which orchestrates the deployment of a cluster of containers. Within a cloud environment, users can now easily create a cluster of virtual machines and run large-scale LIS/SUMMA simulations. Installations that have taken weeks and months can now be performed in minutes of time. This presentation will discuss the steps required to create a cloud-enabled large-scale simulation, present examples of its use, and

  18. Combining Two Large MRI Data Sets (AddNeuroMed and ADNI) Using Multivariate Data Analysis to Distinguish between Patients with Alzheimer's Disease and Healthy Controls

    DEFF Research Database (Denmark)

    Westman, Eric; Simmons, Andrew; Muehlboeck, J.-Sebastian

    2010-01-01

    Background: The European Union AddNeuroMed project and the US-based Alzheimer Disease Neuroimaging Initiative (ADNI) are two large multi-centre initiatives designed to analyse and validate biomarkers for AD. This study aims to compare and combine magnetic resonance imaging (MRI) data from the two...... study cohorts using an automated image analysis pipeline and multivariate data analysis. Methods: A total of 664 subjects were included in this study (AddNeuroMed: 126 AD, 115 CTL; ADNI: 194 AD, 229 CTL). Data acquisition for the AddNeuroMed project was set up to be compatible with the ADNI study...... used are robust and that large data sets can be combined if MRI imaging protocols are carefully aligned....

  19. Visualization and Integrated Data Mining of Disparate Information

    Energy Technology Data Exchange (ETDEWEB)

    Saffer, Jeffrey D.(OMNIVIZ, INC); Albright, Cory L.(BATTELLE (PACIFIC NW LAB)); Calapristi, Augustin J.(BATTELLE (PACIFIC NW LAB)); Chen, Guang (OMNIVIZ, INC); Crow, Vernon L.(BATTELLE (PACIFIC NW LAB)); Decker, Scott D.(BATTELLE (PACIFIC NW LAB)); Groch, Kevin M.(BATTELLE (PACIFIC NW LAB)); Havre, Susan L.(BATTELLE (PACIFIC NW LAB)); Malard, Joel (BATTELLE (PACIFIC NW LAB)); Martin, Tonya J.(BATTELLE (PACIFIC NW LAB)); Miller, Nancy E.(BATTELLE (PACIFIC NW LAB)); Monroe, Philip J.(OMNIVIZ, INC); Nowell, Lucy T.(BATTELLE (PACIFIC NW LAB)); Payne, Deborah A.(BATTELLE (PACIFIC NW LAB)); Reyes Spindola, Jorge F.(BATTELLE (PACIFIC NW LAB)); Scarberry, Randall E.(OMNIVIZ, INC); Sofia, Heidi J.(BATTELLE (PACIFIC NW LAB)); Stillwell, Lisa C.(OMNIVIZ, INC); Thomas, Gregory S.(BATTELLE (PACIFIC NW LAB)); Thurston, Sarah J.(OMNIVIZ, INC); Williams, Leigh K.(BATTELLE (PACIFIC NW LAB)); Zabriskie, Sean J.(OMNIVIZ, INC); MG Hicks

    2001-05-11

    The volumes and diversity of information in the discovery, development, and business processes within the chemical and life sciences industries require new approaches for analysis. Traditional list- or spreadsheet-based methods are easily overwhelmed by large amounts of data. Furthermore, generating strong hypotheses and, just as importantly, ruling out weak ones, requires integration across different experimental and informational sources. We have developed a framework for this integration, including common conceptual data models for multiple data types and linked visualizations that provide an overview of the entire data set, a measure of how each data record is related to every other record, and an assessment of the associations within the data set.

  20. Rank Order Coding: a Retinal Information Decoding Strategy Revealed by Large-Scale Multielectrode Array Retinal Recordings.

    Science.gov (United States)

    Portelli, Geoffrey; Barrett, John M; Hilgen, Gerrit; Masquelier, Timothée; Maccione, Alessandro; Di Marco, Stefano; Berdondini, Luca; Kornprobst, Pierre; Sernagor, Evelyne

    2016-01-01

    How a population of retinal ganglion cells (RGCs) encodes the visual scene remains an open question. Going beyond individual RGC coding strategies, results in salamander suggest that the relative latencies of an RGC pair encode spatial information. Thus, a population code based on this concerted spiking could be a powerful mechanism to transmit visual information rapidly and efficiently. Here, we tested this hypothesis in mouse by recording simultaneous light-evoked responses from hundreds of RGCs, at pan-retinal level, using a new generation of large-scale, high-density multielectrode array consisting of 4096 electrodes. Interestingly, we did not find any RGCs exhibiting a clear latency tuning to the stimuli, suggesting that in mouse, individual RGC pairs may not provide sufficient information. We show that a significant amount of information is encoded synergistically in the concerted spiking of large RGC populations. Thus, the RGC population response described with relative activities, or ranks, provides more relevant information than classical independent spike-count- or latency-based codes. In particular, we report for the first time that when considering the relative activities across the whole population, the wave of first stimulus-evoked spikes is an accurate indicator of stimulus content. We show that this coding strategy coexists with classical neural codes, and that it is more efficient and faster. Overall, these novel observations suggest that already at the level of the retina, concerted spiking provides a reliable and fast strategy to rapidly transmit new visual scenes.
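
    As a rough illustration of the population rank code described above (not the published analysis pipeline), the sketch below decodes a stimulus by comparing the rank order of first-spike latencies across a hypothetical RGC population against stored templates; the population size, latencies, and jitter are all made up.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(3)
n_cells, n_stimuli = 200, 5

# Hypothetical templates: mean first-spike latency (ms) of each cell for each stimulus.
templates = rng.uniform(20, 120, size=(n_stimuli, n_cells))

# A single trial evoked by stimulus 2, with jittered latencies.
trial = templates[2] + rng.normal(0, 5, n_cells)

# Decode by comparing the *rank order* of latencies, not their absolute values.
scores = [spearmanr(trial, t)[0] for t in templates]
print("decoded stimulus:", int(np.argmax(scores)))    # -> 2
```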

  1. Electronic Health Information Legal Epidemiology Data Set 2014

    Data.gov (United States)

    U.S. Department of Health & Human Services — Authors: Cason Schmit, JD, Gregory Sunshine, JD, Dawn Pepin, JD, MPH, Tara Ramanathan, JD, MPH, Akshara Menon, JD, MPH, Matthew Penn, JD, MLIS This legal data set...

  2. High-Throughput Tabular Data Processor - Platform independent graphical tool for processing large data sets.

    Science.gov (United States)

    Madanecki, Piotr; Bałut, Magdalena; Buckley, Patrick G; Ochocka, J Renata; Bartoszewski, Rafał; Crossman, David K; Messiaen, Ludwine M; Piotrowski, Arkadiusz

    2018-01-01

    High-throughput technologies generate a considerable amount of data, which often requires bioinformatic expertise to analyze. Here we present High-Throughput Tabular Data Processor (HTDP), a platform-independent Java program. HTDP works on any character-delimited column data (e.g. BED, GFF, GTF, PSL, WIG, VCF) from multiple text files and supports merging, filtering and converting of data that is produced in the course of high-throughput experiments. HTDP can also utilize itemized sets of conditions from external files for complex or repetitive filtering/merging tasks. The program is intended to aid global, real-time processing of large data sets using a graphical user interface (GUI). Therefore, no prior expertise in programming, regular expressions, or command-line usage is required of the user. Additionally, no a priori assumptions are imposed on the internal file composition. We demonstrate the flexibility and potential of HTDP in real-life research tasks including microarray and massively parallel sequencing, i.e. identification of disease predisposing variants in next generation sequencing data as well as comprehensive concurrent analysis of microarray and sequencing results. We also show the utility of HTDP in technical tasks including data merge, reduction and filtering with external criteria files. HTDP was developed to address functionality that is missing or rudimentary in other GUI software for processing character-delimited column data from high-throughput technologies. Flexibility, in terms of input file handling, provides long term potential functionality in high-throughput analysis pipelines, as the program is not limited by the currently existing applications and data formats. HTDP is available as Open Source software (https://github.com/pmadanecki/htdp).
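
    The merge-and-filter operations that HTDP performs through its GUI can be approximated in a few lines of scripting. The following pandas sketch only illustrates that kind of workflow and is not part of HTDP itself; the file names and column names are hypothetical.

```python
# Minimal sketch (not part of HTDP): merging and filtering character-delimited
# tables with pandas. File names and column names are hypothetical.
import pandas as pd

# Load two tab-delimited tables produced by high-throughput experiments.
variants = pd.read_csv("variants.tsv", sep="\t")    # e.g. VCF-like columns
coverage = pd.read_csv("coverage.tsv", sep="\t")    # e.g. BED-like columns

# Merge on shared key columns, then filter with an external criteria list.
merged = variants.merge(coverage, on=["chrom", "pos"], how="inner")
genes_of_interest = set(pd.read_csv("criteria.txt", header=None)[0])
filtered = merged[merged["gene"].isin(genes_of_interest) & (merged["depth"] >= 30)]

filtered.to_csv("filtered_variants.tsv", sep="\t", index=False)
```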

  3. Large area synchrotron X-ray fluorescence mapping of biological samples

    International Nuclear Information System (INIS)

    Kempson, I.; Thierry, B.; Smith, E.; Gao, M.; De Jonge, M.

    2014-01-01

    Large area mapping of inorganic material in biological samples has suffered severely from prohibitively long acquisition times. With the advent of new detector technology we can now generate statistically relevant information for studying cell populations, inter-variability and bioinorganic chemistry in large specimens. We have been implementing ultrafast synchrotron-based XRF mapping afforded by the MAIA detector for large area mapping of biological material. For example, a 2.5 million pixel map can be acquired in 3 hours, compared to a typical synchrotron XRF set-up needing over 1 month of uninterrupted beamtime. Of particular focus to us is the fate of metals and nanoparticles in cells, 3D tissue models and animal tissues. The large area scanning has for the first time provided statistically significant information on sufficiently large numbers of cells to provide data on intercellular variability in uptake of nanoparticles. Techniques such as flow cytometry generally require analysis of thousands of cells for statistically meaningful comparison, due to the large degree of variability. Large area XRF now gives comparable information in a quantifiable manner. Furthermore, we can now image localised deposition of nanoparticles in tissues that would be highly improbable to 'find' by typical XRF imaging. In addition, the ultrafast nature also makes it viable to conduct 3D XRF tomography over large dimensions. This technology opens up new opportunities in biomonitoring and understanding metal and nanoparticle fate ex-vivo. Following from this is extension to molecular imaging through specific antibody-targeted nanoparticles to label specific tissues and monitor cellular processes or biological consequences

  4. Hierarchical sets: analyzing pangenome structure through scalable set visualizations

    Science.gov (United States)

    2017-01-01

    Abstract Motivation: The increase in available microbial genome sequences has resulted in an increase in the size of the pangenomes being analyzed. Current pangenome visualizations are not intended for the pangenome sizes possible today and new approaches are necessary in order to convert the increase in available information into an increase in knowledge. As the pangenome data structure is essentially a collection of sets, we explore the potential for scalable set visualization as a tool for pangenome analysis. Results: We present a new hierarchical clustering algorithm based on set arithmetic that optimizes the intersection sizes along the branches. The intersection and union sizes along the hierarchy are visualized using a composite dendrogram and icicle plot, which, in a pangenome context, shows the evolution of pangenome and core size along the evolutionary hierarchy. Outlying elements, i.e. elements whose presence patterns do not correspond with the hierarchy, can be visualized using hierarchical edge bundles. When applied to pangenome data this plot shows putative horizontal gene transfers between the genomes and can highlight relationships between genomes that are not represented by the hierarchy. We illustrate the utility of hierarchical sets by applying it to a pangenome based on 113 Escherichia and Shigella genomes and find it provides a powerful addition to pangenome analysis. Availability and Implementation: The described clustering algorithm and visualizations are implemented in the hierarchicalSets R package available from CRAN (https://cran.r-project.org/web/packages/hierarchicalSets). Contact: thomasp85@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28130242
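
    The clustering idea described above, optimizing intersection sizes along the branches, can be illustrated with a toy greedy version. The sketch below is not the hierarchicalSets R implementation; the genome gene sets are invented and the merge criterion is simplified to "largest pairwise intersection first".

```python
# Sketch of intersection-driven agglomerative clustering of gene sets.
# Greedy: repeatedly merge the two clusters whose member sets share the
# largest intersection; the intersection plays the role of the "core" along
# that branch. Illustrative only; not the hierarchicalSets package.
genomes = {
    "g1": {"a", "b", "c", "d"},
    "g2": {"a", "b", "c", "e"},
    "g3": {"a", "f", "g"},
    "g4": {"a", "f", "h"},
}

clusters = {name: (genes, name) for name, genes in genomes.items()}  # (set, tree)

while len(clusters) > 1:
    pairs = [(len(clusters[i][0] & clusters[j][0]), i, j)
             for i in clusters for j in clusters if i < j]
    _, i, j = max(pairs)                             # largest intersection first
    core = clusters[i][0] & clusters[j][0]           # core genome along this branch
    tree = (clusters[i][1], clusters[j][1])
    del clusters[i], clusters[j]
    clusters[f"({i},{j})"] = (core, tree)

print(next(iter(clusters.values()))[1])  # nested tuples = dendrogram topology
```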

  5. Validation and evaluation of common large-area display set (CLADS) performance specification

    Science.gov (United States)

    Hermann, David J.; Gorenflo, Ronald L.

    1998-09-01

    Battelle is under contract with Warner Robins Air Logistics Center to design a Common Large Area Display Set (CLADS) for use in multiple Command, Control, Communications, Computers, and Intelligence (C4I) applications that currently use 19-inch Cathode Ray Tubes (CRTs). Battelle engineers have built and fully tested pre-production prototypes of the CLADS design for AWACS, and are completing pre-production prototype displays for three other platforms simultaneously. With the CLADS design, any display technology that can be packaged to meet the form, fit, and function requirements defined by the Common Large Area Display Head Assembly (CLADHA) performance specification is a candidate for CLADS applications. This technology-independent feature reduces the risk of CLADS development, permits lifelong technology insertion upgrades without unnecessary redesign, and addresses many of the obsolescence problems associated with COTS technology-based acquisition. Performance and environmental testing were performed on the AWACS CLADS and continue on other platforms as part of the performance specification validation process. A simulator assessment and flight assessment were successfully completed for the AWACS CLADS, and lessons learned from these assessments are being incorporated into the performance specifications. Draft CLADS specifications were released to potential display integrators and manufacturers for review in 1997, and the final version of the performance specifications is scheduled to be released to display integrators and manufacturers in May, 1998. Initial USAF applications include replacements for the E-3 AWACS color monitor assembly, E-8 Joint STARS graphics display unit, and ABCCC airborne color display. Initial U.S. Navy applications include the E-2C ACIS display. For these applications, reliability and maintainability are key objectives. The common design will reduce the cost of operation and maintenance by an estimated 3.3M per year on E-3 AWACS

  6. Basic set theory

    CERN Document Server

    Levy, Azriel

    2002-01-01

    An advanced-level treatment of the basics of set theory, this text offers students a firm foundation, stopping just short of the areas employing model-theoretic methods. Geared toward upper-level undergraduate and graduate students, it consists of two parts: the first covers pure set theory, including the basic notions, order and well-foundedness, cardinal numbers, the ordinals, and the axiom of choice and some of its consequences; the second deals with applications and advanced topics such as point set topology, real spaces, Boolean algebras, and infinite combinatorics and large cardinals. An

  7. How Informed Are Informal Educators?

    Science.gov (United States)

    Lederman, Norman G.; Niess, Margaret L.

    1998-01-01

    Explores current reforms in both mathematics and science education that emphasize the importance of learning in informal settings. Suggests that informal education must include planned and purposeful attempts to facilitate students' understanding of mathematics and science in community settings other than the local school. (Author/CCM)

  8. Ensuring Adequate Health and Safety Information for Decision Makers during Large-Scale Chemical Releases

    Science.gov (United States)

    Petropoulos, Z.; Clavin, C.; Zuckerman, B.

    2015-12-01

    The 2014 4-Methylcyclohexanemethanol (MCHM) spill in the Elk River of West Virginia highlighted existing gaps in emergency planning for, and response to, large-scale chemical releases in the United States. The Emergency Planning and Community Right-to-Know Act requires that facilities with hazardous substances provide Material Safety Data Sheets (MSDSs), which contain health and safety information on the hazardous substances. The MSDS produced by Eastman Chemical Company, the manufacturer of MCHM, listed "no data available" for various human toxicity subcategories, such as reproductive toxicity and carcinogenicity. As a result of incomplete toxicity data, the public and media received conflicting messages on the safety of the contaminated water from government officials, industry, and the public health community. Two days after the governor lifted the ban on water use, the health department partially retracted the ban by warning pregnant women to continue avoiding the contaminated water, which the Centers for Disease Control and Prevention deemed safe three weeks later. The response in West Virginia represents a failure in risk communication and calls into question whether government officials have sufficient information to support evidence-based decisions during future incidents. Research capabilities, like the National Science Foundation RAPID funding, can provide a solution to some of the data gaps, such as information on environmental fate in the case of the MCHM spill. In order to inform policy discussions on this issue, a methodology for assessing the outcomes of RAPID and similar National Institutes of Health grants in the context of emergency response is employed to examine the efficacy of research-based capabilities in enhancing public health decision making capacity. The results of this assessment highlight potential roles rapid scientific research can fill in ensuring adequate health and safety data is readily available for decision makers during large

  9. Practice settings and dentists' job satisfaction.

    Science.gov (United States)

    Lo Sasso, Anthony T; Starkel, Rebecca L; Warren, Matthew N; Guay, Albert H; Vujicic, Marko

    2015-08-01

    The nature and organization of dental practice are changing. The aim of this study was to explore how job satisfaction among dentists is associated with dental practice setting. A survey measured satisfaction with income, benefits, hours worked, clinical autonomy, work-life balance, emotional exhaustion, and overall satisfaction among dentists working in large group, small group, and solo practice settings; 2,171 dentists responded. The authors used logistic regression to measure differences in reported levels of satisfaction across practice settings. Dentists working in small group settings reported the most satisfaction overall. Dentists working in large group settings reported more satisfaction with income and benefits than dentists in solo practice, as well as having the least stress. Findings suggest possible advantages and disadvantages of working in different types of practice settings. Dentists working in different practice settings reported differences in satisfaction. These results may help dentists decide which practice setting is best for them. Copyright © 2015 American Dental Association. Published by Elsevier Inc. All rights reserved.

  10. SARS and hospital priority setting: a qualitative case study and evaluation

    Directory of Open Access Journals (Sweden)

    Upshur Ross EG

    2004-12-01

    Full Text Available Abstract Background Priority setting is one of the most difficult issues facing hospitals because of funding restrictions and changing patient need. A deadly communicable disease outbreak, such as the Severe Acute Respiratory Syndrome (SARS) in Toronto in 2003, amplifies the difficulties of hospital priority setting. The purpose of this study is to describe and evaluate priority setting in a hospital in response to SARS using the ethical framework 'accountability for reasonableness'. Methods This study was conducted at a large tertiary hospital in Toronto, Canada. There were two data sources: 1) over 200 key documents (e.g. emails, bulletins), and 2) 35 interviews with key informants. Analysis used a modified thematic technique in three phases: open coding, axial coding, and evaluation. Results Participants described the types of priority setting decisions, the decision making process and the reasoning used. Although the hospital leadership made an effort to meet the conditions of 'accountability for reasonableness', they acknowledged that the decision making was not ideal. We described good practices and opportunities for improvement. Conclusions 'Accountability for reasonableness' is a framework that can be used to guide fair priority setting in health care organizations, such as hospitals. In the midst of a crisis such as SARS, where guidance is incomplete, consequences uncertain, and information constantly changing, and where hour-by-hour decisions involve life and death, fairness is more important, not less.

  11. SARS and hospital priority setting: a qualitative case study and evaluation.

    Science.gov (United States)

    Bell, Jennifer A H; Hyland, Sylvia; DePellegrin, Tania; Upshur, Ross E G; Bernstein, Mark; Martin, Douglas K

    2004-12-19

    Priority setting is one of the most difficult issues facing hospitals because of funding restrictions and changing patient need. A deadly communicable disease outbreak, such as the Severe Acute Respiratory Syndrome (SARS) in Toronto in 2003, amplifies the difficulties of hospital priority setting. The purpose of this study is to describe and evaluate priority setting in a hospital in response to SARS using the ethical framework 'accountability for reasonableness'. This study was conducted at a large tertiary hospital in Toronto, Canada. There were two data sources: 1) over 200 key documents (e.g. emails, bulletins), and 2) 35 interviews with key informants. Analysis used a modified thematic technique in three phases: open coding, axial coding, and evaluation. Participants described the types of priority setting decisions, the decision making process and the reasoning used. Although the hospital leadership made an effort to meet the conditions of 'accountability for reasonableness', they acknowledged that the decision making was not ideal. We described good practices and opportunities for improvement. 'Accountability for reasonableness' is a framework that can be used to guide fair priority setting in health care organizations, such as hospitals. In the midst of a crisis such as SARS, where guidance is incomplete, consequences uncertain, and information constantly changing, and where hour-by-hour decisions involve life and death, fairness is more important, not less.

  12. AN EFFICIENT DATA MINING METHOD TO FIND FREQUENT ITEM SETS IN LARGE DATABASE USING TR-FCTM

    Directory of Open Access Journals (Sweden)

    Saravanan Suba

    2016-01-01

    Full Text Available Mining association rules in large databases is one of the most popular data mining techniques for business decision makers. Discovering frequent item sets is the core process in association rule mining. Numerous algorithms are available in the literature to find frequent patterns. Apriori and FP-tree are the most common methods for finding frequent items. Apriori finds significant frequent items using candidate generation with a larger number of database scans. FP-tree uses two database scans to find significant frequent items without using candidate generation. The proposed TR-FCTM (Transaction Reduction - Frequency Count Table Method) discovers significant frequent items by generating the full candidates once to form a frequency count table with one database scan. Experimental results of TR-FCTM show that this algorithm outperforms Apriori and FP-tree.
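
    The single-scan, frequency-count-table idea can be illustrated with a simplified sketch. The code below is not the published TR-FCTM algorithm (it omits the transaction-reduction step and caps the candidate size); the transactions and support threshold are made up.

```python
# Illustrative single-scan frequency-count-table sketch (a simplification of
# the TR-FCTM idea described above, not the published algorithm).
from itertools import combinations
from collections import Counter

transactions = [                # hypothetical market-basket data
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
min_support = 2
max_size = 2                    # cap candidate size to keep the sketch small

# One pass over the database: count every candidate itemset each transaction contains.
counts = Counter()
for t in transactions:
    for k in range(1, max_size + 1):
        for itemset in combinations(sorted(t), k):
            counts[itemset] += 1

frequent = {itemset: c for itemset, c in counts.items() if c >= min_support}
print(frequent)   # e.g. ('bread', 'milk'): 2, ('butter', 'milk'): 2, ...
```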

  13. Argentine Population Genetic Structure: Large Variance in Amerindian Contribution

    Science.gov (United States)

    Seldin, Michael F.; Tian, Chao; Shigeta, Russell; Scherbarth, Hugo R.; Silva, Gabriel; Belmont, John W.; Kittles, Rick; Gamron, Susana; Allevi, Alberto; Palatnik, Simon A.; Alvarellos, Alejandro; Paira, Sergio; Caprarulo, Cesar; Guillerón, Carolina; Catoggio, Luis J.; Prigione, Cristina; Berbotto, Guillermo A.; García, Mercedes A.; Perandones, Carlos E.; Pons-Estel, Bernardo A.; Alarcon-Riquelme, Marta E.

    2011-01-01

    Argentine population genetic structure was examined using a set of 78 ancestry informative markers (AIMs) to assess the contributions of European, Amerindian, and African ancestry in 94 individuals from this population. Using the Bayesian clustering algorithm STRUCTURE, the mean European contribution was 78%, the Amerindian contribution was 19.4%, and the African contribution was 2.5%. Similar results were found using the weighted least mean square method: European, 80.2%; Amerindian, 18.1%; and African, 1.7%. Consistent with previous studies, the current results showed very few individuals (four of 94) with greater than 10% African admixture. Notably, when individual admixture was examined, the Amerindian and European admixture showed a very large variance, and the individual Amerindian contribution ranged from 1.5 to 84.5% in the 94 individual Argentine subjects. These results indicate that admixture must be considered when clinical epidemiology or case control genetic analyses are studied in this population. Moreover, the current study provides a set of informative SNPs that can be used to ascertain or control for this potentially hidden stratification. In addition, the large variance in admixture proportions in individual Argentine subjects shown by this study suggests that this population is appropriate for future admixture mapping studies. PMID:17177183
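
    Ancestry proportions of the kind reported above are typically estimated by fitting observed genotypes to ancestral allele frequencies. The sketch below is a generic constrained least-squares illustration, not the STRUCTURE or weighted least mean square analyses used in the study; the allele frequencies and genotypes are invented.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical ancestral allele frequencies (markers x populations: EUR, AMR, AFR).
P = np.array([[0.90, 0.10, 0.20],
              [0.15, 0.80, 0.30],
              [0.50, 0.40, 0.95],
              [0.05, 0.60, 0.70]])
g = np.array([2, 1, 1, 0])       # one individual's allele dosages (0/1/2) at each AIM

def sse(q):                      # squared error between observed and expected dosages
    return np.sum((g - 2.0 * P @ q) ** 2)

# Ancestry proportions must be non-negative and sum to one.
res = minimize(sse, x0=np.ones(3) / 3, method="SLSQP",
               bounds=[(0, 1)] * 3,
               constraints={"type": "eq", "fun": lambda q: q.sum() - 1})
print(dict(zip(["EUR", "AMR", "AFR"], res.x.round(3))))
```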

  14. Cost-effectiveness of providing patients with information on managing mild low-back symptoms in an occupational health setting

    Directory of Open Access Journals (Sweden)

    J. Rantonen

    2016-04-01

    Full Text Available Abstract Background Evidence shows that low back specific patient information is effective in sub-acute low back pain (LBP), but the effectiveness and cost-effectiveness (CE) of information in early phase symptoms is not clear. We assessed the effectiveness and CE of patient information in mild LBP in the occupational health (OH) setting in a quasi-experimental study. Methods A cohort of employees (N = 312, aged <57) with non-specific, mild LBP (Visual Analogue Scale between 10–34 mm) was selected from the respondents of an employee survey (N = 2480; response rate 71 %). A random sample, representing the natural course of LBP (NC, N = 83; no intervention), was extracted as a control group. The remaining employees were invited (181 included, 47 declined, one excluded) into a randomised controlled study with two 1:1 allocated parallel intervention arms (“Booklet”, N = 92; “Combined”, N = 89). All participants received the “Back Book” patient information booklet, and the Combined arm also received an individual verbal review of the booklet. Physical impairment (PHI), LBP, health care (HC) utilisation, and all-cause sickness absence (SA) were assessed at two years. CE of the interventions on SA days was analysed by using direct HC costs in one year, two years from baseline. Multiple imputation was used for missing values. Results Compared to NC, the Booklet reduced HC costs by 196€ and SA by 3.5 days per year. In 81 % of the bootstrapped cases the Booklet was both cost saving and effective on SA. Compared to NC, in the Combined arm, the figures were 107€, 0.4 days, and 54 %, respectively. PHI decreased in both interventions. Conclusions Booklet information alone was cost-effective in comparison to the natural course of mild LBP. Combined information reduced HC costs. Both interventions reduced physical impairment. Mere booklet information is beneficial for employees who report mild LBP in the OH setting, and is also cost saving for the health care

  15. Internet Browser for Ice, Weather and Ocean Information

    DEFF Research Database (Denmark)

    Pedersen, Leif Toudal; Saldo, Roberto

    2005-01-01

    Abstract An Internet-based distribution system for ice, weather and ocean information has been set up. The system provides near-real-time access to a large variety of data about the polar environment in a standard user environment. The system is freely available at: http://www.seaice.dk Specific...

  16. Registering coherent change detection products associated with large image sets and long capture intervals

    Science.gov (United States)

    Perkins, David Nikolaus; Gonzales, Antonio I

    2014-04-08

    A set of co-registered coherent change detection (CCD) products is produced from a set of temporally separated synthetic aperture radar (SAR) images of a target scene. A plurality of transformations are determined, which transformations are respectively for transforming a plurality of the SAR images to a predetermined image coordinate system. The transformations are used to create, from a set of CCD products produced from the set of SAR images, a corresponding set of co-registered CCD products.
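
    A minimal sketch of the processing chain described above might warp each SAR image into the common coordinate system and then form a coherence map between pairs. The code below is illustrative only, with identity-plus-shift transforms and random complex images standing in for real registered SAR data.

```python
import numpy as np
from scipy.ndimage import affine_transform, uniform_filter

def warp(image, matrix, offset):
    """Resample a complex SAR image into the common coordinate system."""
    real = affine_transform(image.real, matrix, offset=offset, order=1)
    imag = affine_transform(image.imag, matrix, offset=offset, order=1)
    return real + 1j * imag

def coherence(a, b, win=5):
    """Sample coherence |<a b*>| / sqrt(<|a|^2><|b|^2>) over a sliding window."""
    cross = a * np.conj(b)
    num = uniform_filter(cross.real, win) + 1j * uniform_filter(cross.imag, win)
    den = np.sqrt(uniform_filter(np.abs(a) ** 2, win) * uniform_filter(np.abs(b) ** 2, win))
    return np.abs(num) / np.maximum(den, 1e-12)

# Two hypothetical complex SAR images and their transforms to the common grid.
rng = np.random.default_rng(0)
img1 = rng.standard_normal((256, 256)) + 1j * rng.standard_normal((256, 256))
img2 = rng.standard_normal((256, 256)) + 1j * rng.standard_normal((256, 256))
reg1 = warp(img1, np.eye(2), offset=(0.0, 0.0))
reg2 = warp(img2, np.eye(2), offset=(1.5, -0.7))   # placeholder shift

ccd = coherence(reg1, reg2)   # low coherence marks change; all maps share one grid
```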

  17. Developing a Minimum Data Set for an Information Management System to Study Traffic Accidents in Iran.

    Science.gov (United States)

    Mohammadi, Ali; Ahmadi, Maryam; Gharagozlu, Alireza

    2016-03-01

    Each year, around 1.2 million people die in road traffic incidents. Reducing traffic accidents requires an exact understanding of the risk factors associated with traffic patterns and behaviors. Properly analyzing these factors calls for a comprehensive system for collecting and processing accident data. The aim of this study was to develop a minimum data set (MDS) for an information management system to study traffic accidents in Iran. This descriptive, cross-sectional study was performed in 2014. Data were collected from the traffic police, trauma centers, medical emergency centers, and via the internet. The investigated resources for this study were forms, databases, and documents retrieved from the internet. Forms and databases were identical, and one sample of each was evaluated. The related internet-sourced data were evaluated in their entirety. Data were collected using three checklists. In order to arrive at a consensus about the data elements, the decision Delphi technique was applied using questionnaires. The content validity and reliability of the questionnaires were assessed by experts' opinions and the test-retest method, respectively. The MDS for a traffic accident information management system was organized into three sections: a minimum data set for the traffic police, with six classes including 118 data elements; for trauma centers, with five data classes including 57 data elements; and for medical emergency centers, with 11 classes including 64 data elements. Planning for the prevention of traffic accidents requires standardized data. As the foundation for crash prevention efforts, existing standard data infrastructures present policymakers and government officials with a great opportunity to strengthen and integrate existing accident information systems to better track road traffic injuries and fatalities.

  18. Minimum Data Set Active Resident Information Report

    Data.gov (United States)

    U.S. Department of Health & Human Services — The MDS Active Resident Report summarizes information for residents currently in nursing homes. The source of these counts is the residents MDS assessment record....

  19. Physics Mining of Multi-Source Data Sets

    Science.gov (United States)

    Helly, John; Karimabadi, Homa; Sipes, Tamara

    2012-01-01

    Powerful new parallel data mining algorithms can produce diagnostic and prognostic numerical models and analyses from observational data. These techniques yield higher-resolution measures of environmental parameters than ever before by fusing synoptic imagery and time-series measurements. These techniques are general and relevant to observational data, including raster, vector, and scalar, and can be applied in all Earth- and environmental science domains. Because they can be highly automated and are parallel, they scale to large spatial domains and are well suited to change and gap detection. This makes it possible to analyze spatial and temporal gaps in information, and facilitates within-mission replanning to optimize the allocation of observational resources. The basis of the innovation is the extension of a recently developed set of algorithms packaged into MineTool to multi-variate time-series data. MineTool is unique in that it automates the various steps of the data mining process, thus making it amenable to autonomous analysis of large data sets. Unlike techniques such as Artificial Neural Nets, which yield a black-box solution, MineTool's outcome is always an analytical model in parametric form that expresses the output in terms of the input variables. This has the advantage that the derived equation can then be used to gain insight into the physical relevance and relative importance of the parameters and coefficients in the model. This is referred to as physics-mining of data. The capabilities of MineTool are extended to include both supervised and unsupervised algorithms, to handle multi-type data sets, and to run in parallel.
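
    The key point above is that the output is an analytical model in parametric form rather than a black box. The sketch below is not MineTool; it simply shows, with ordinary least squares on made-up data, how a parametric fit exposes coefficients that can be inspected for physical relevance and relative importance.

```python
import numpy as np

# Hypothetical observational data: two environmental drivers and one response.
rng = np.random.default_rng(1)
x1, x2 = rng.uniform(0, 1, 500), rng.uniform(0, 1, 500)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + 0.5 * x1 * x2 + rng.normal(0, 0.05, 500)

# Design matrix for a parametric model y = a + b*x1 + c*x2 + d*x1*x2.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# The result is an explicit analytical model, so each coefficient can be
# inspected rather than hidden inside a black box.
print("y ≈ {:.2f} + {:.2f}*x1 + {:.2f}*x2 + {:.2f}*x1*x2".format(*coef))
```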

  20. Understanding Information Anxiety and How Academic Librarians Can Minimize Its Effects

    Science.gov (United States)

    Eklof, Ashley

    2013-01-01

    Information anxiety is a serious issue that has the potential to hinder the success of a large percentage of the population in both education and professional settings. It has become more prevalent as societies begin to focus more on the value of technology, multitasking, and instant information access. The majority of the population has felt, to…

  1. Galaxy Evolution Insights from Spectral Modeling of Large Data Sets from the Sloan Digital Sky Survey

    Energy Technology Data Exchange (ETDEWEB)

    Hoversten, Erik A. [Johns Hopkins Univ., Baltimore, MD (United States)

    2007-10-01

    This thesis centers on the use of spectral modeling techniques on data from the Sloan Digital Sky Survey (SDSS) to gain new insights into current questions in galaxy evolution. The SDSS provides a large, uniform, high quality data set which can be exploited in a number of ways. One avenue pursued here is to use the large sample size to measure precisely the mean properties of galaxies of increasingly narrow parameter ranges. The other route taken is to look for rare objects which open up for exploration new areas in galaxy parameter space. The crux of this thesis is revisiting the classical Kennicutt method for inferring the stellar initial mass function (IMF) from the integrated light properties of galaxies. A large data set (~ 10^5 galaxies) from the SDSS DR4 is combined with more in-depth modeling and quantitative statistical analysis to search for systematic IMF variations as a function of galaxy luminosity. Galaxy Hα equivalent widths are compared to a broadband color index to constrain the IMF. It is found that for the sample as a whole the best fitting IMF power law slope above 0.5 M☉ is Γ = 1.5 ± 0.1 with the error dominated by systematics. Galaxies brighter than around M_r,0.1 = -20 (including galaxies like the Milky Way which has M_r,0.1 ~ -21) are well fit by a universal Γ ~ 1.4 IMF, similar to the classical Salpeter slope, and smooth, exponential star formation histories (SFH). Fainter galaxies prefer steeper IMFs and the quality of the fits reveals that for these galaxies a universal IMF with smooth SFHs is actually a poor assumption. Related projects are also pursued. A targeted photometric search is conducted for strongly lensed Lyman break galaxies (LBG) similar to MS1512-cB58. The evolution of the photometric selection technique is described as are the results of spectroscopic follow-up of the best targets. The serendipitous discovery of two interesting blue compact dwarf galaxies is reported. These

  2. Collective Influence of Multiple Spreaders Evaluated by Tracing Real Information Flow in Large-Scale Social Networks.

    Science.gov (United States)

    Teng, Xian; Pei, Sen; Morone, Flaviano; Makse, Hernán A

    2016-10-26

    Identifying the most influential spreaders that maximize information flow is a central question in network theory. Recently, a scalable method called "Collective Influence (CI)" has been put forward through collective influence maximization. In contrast to heuristic methods evaluating nodes' significance separately, the CI method inspects the collective influence of multiple spreaders. Although CI applies to the influence maximization problem in the percolation model, it is still important to examine its efficacy in realistic information spreading. Here, we examine real-world information flow in various social and scientific platforms including the American Physical Society, Facebook, Twitter and LiveJournal. Since empirical data cannot be directly mapped to ideal multi-source spreading, we leverage the behavioral patterns of users extracted from data to construct "virtual" information spreading processes. Our results demonstrate that the set of spreaders selected by CI can induce a larger scale of information propagation. Moreover, local measures such as the number of connections or citations are not necessarily the deterministic factors of nodes' importance in realistic information spreading. This result has significance for ranking scientists in scientific networks like the APS, where the commonly used number of citations can be a poor indicator of the collective influence of authors in the community.
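
    The Collective Influence score of a node i at level ℓ is commonly written as CI_ℓ(i) = (k_i - 1) Σ_{j ∈ ∂Ball(i, ℓ)} (k_j - 1), i.e. the node's reduced degree times the sum of reduced degrees on the frontier of its ℓ-ball. The sketch below computes this score with networkx on a random, purely illustrative graph; it omits the adaptive removal-and-recomputation step of the full CI algorithm.

```python
import networkx as nx

def collective_influence(G, node, ell=2):
    """CI_l(i) = (k_i - 1) * sum of (k_j - 1) over nodes exactly l steps from i."""
    lengths = nx.single_source_shortest_path_length(G, node, cutoff=ell)
    frontier = [j for j, d in lengths.items() if d == ell]
    return (G.degree(node) - 1) * sum(G.degree(j) - 1 for j in frontier)

G = nx.erdos_renyi_graph(1000, 0.01, seed=42)          # illustrative network
scores = {n: collective_influence(G, n, ell=2) for n in G}
top_spreaders = sorted(scores, key=scores.get, reverse=True)[:10]
print(top_spreaders)
```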

  3. Setting up crowd science projects.

    Science.gov (United States)

    Scheliga, Kaja; Friesike, Sascha; Puschmann, Cornelius; Fecher, Benedikt

    2016-11-29

    Crowd science is scientific research that is conducted with the participation of volunteers who are not professional scientists. Thanks to the Internet and online platforms, project initiators can draw on a potentially large number of volunteers. This crowd can be involved to support data-rich or labour-intensive projects that would otherwise be unfeasible. So far, research on crowd science has mainly focused on analysing individual crowd science projects. In our research, we focus on the perspective of project initiators and explore how crowd science projects are set up. Based on multiple case study research, we discuss the objectives of crowd science projects and the strategies of their initiators for accessing volunteers. We also categorise the tasks allocated to volunteers and reflect on the issue of quality assurance as well as feedback mechanisms. With this article, we contribute to a better understanding of how crowd science projects are set up and how volunteers can contribute to science. We suggest that our findings are of practical relevance for initiators of crowd science projects, for science communication as well as for informed science policy making. © The Author(s) 2016.

  4. Protein complex prediction in large ontology attributed protein-protein interaction networks.

    Science.gov (United States)

    Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian; Li, Yanpeng; Xu, Bo

    2013-01-01

    Protein complexes are important for unraveling the secrets of cellular organization and function. Many computational approaches have been developed to predict protein complexes in protein-protein interaction (PPI) networks. However, most existing approaches focus mainly on the topological structure of PPI networks, and largely ignore the gene ontology (GO) annotation information. In this paper, we constructed ontology attributed PPI networks with PPI data and GO resource. After constructing ontology attributed networks, we proposed a novel approach called CSO (clustering based on network structure and ontology attribute similarity). Structural information and GO attribute information are complementary in ontology attributed networks. CSO can effectively take advantage of the correlation between frequent GO annotation sets and the dense subgraph for protein complex prediction. Our proposed CSO approach was applied to four different yeast PPI data sets and predicted many well-known protein complexes. The experimental results showed that CSO was valuable in predicting protein complexes and achieved state-of-the-art performance.
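
    One simple way to build the kind of ontology-attributed network described above is to weight each PPI edge by a blend of structural similarity and GO-annotation similarity. The sketch below is not the published CSO method; the proteins, annotations, and the weighting parameter alpha are illustrative assumptions.

```python
# Sketch: weight each PPI edge by combining structural similarity (shared
# neighbours) with GO-annotation similarity (Jaccard of GO term sets).
# Not the published CSO algorithm; proteins and annotations are made up.
ppi = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")}
go = {"A": {"GO:1", "GO:2"}, "B": {"GO:1", "GO:2"}, "C": {"GO:2", "GO:3"}, "D": {"GO:4"}}

neighbours = {}
for u, v in ppi:
    neighbours.setdefault(u, set()).add(v)
    neighbours.setdefault(v, set()).add(u)

def jaccard(s, t):
    return len(s & t) / len(s | t) if s | t else 0.0

alpha = 0.5   # relative weight of structure vs. ontology attributes
weights = {}
for u, v in ppi:
    structural = jaccard(neighbours[u] - {v}, neighbours[v] - {u})
    ontological = jaccard(go[u], go[v])
    weights[(u, v)] = alpha * structural + (1 - alpha) * ontological

print(weights)   # dense, consistently annotated regions get high edge weights
```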

  5. Set theory essentials

    CERN Document Server

    Milewski, Emil G

    2012-01-01

    REA's Essentials provide quick and easy access to critical information in a variety of different fields, ranging from the most basic to the most advanced. As its name implies, these concise, comprehensive study guides summarize the essentials of the field covered. Essentials are helpful when preparing for exams, doing homework and will remain a lasting reference source for students, teachers, and professionals. Set Theory includes elementary logic, sets, relations, functions, denumerable and non-denumerable sets, cardinal numbers, Cantor's theorem, axiom of choice, and order relations.

  6. Decree No. 77-1233 of 10 November 1977 setting up a Council for information on nuclear electricity generation

    International Nuclear Information System (INIS)

    1977-01-01

    This Decree sets up a Council for information on nuclear electricity generation directly under the authority of and appointed by the Prime Minister. It has 18 members who include, inter alia, mayors of the communes concerned by nuclear power plant siting, representatives of nature and environmental protection associations, science academicians and economics, energy and communications experts. The Council's purpose is to ensure that the public has access to information on nuclear electricity generation from the technical, health, ecological, economic and financial viewpoints. It advises the Government on the public's conditions of access to information and proposes methods for its dissemination. (NEA) [fr

  7. Registration of an enterprise information system development by formal specifications

    Directory of Open Access Journals (Sweden)

    Milan Mišovič

    2006-01-01

    Full Text Available The economic view that maps enterprise process sets (ERP, SCM, CRM, BI, …) onto the functionality and structure of an Enterprise Information System is, for informaticians, a demonstrable reality. A comprehensive Enterprise Information System solution from large software firms that respects this economic platform has the required attributes of data, process and communication integrity, but it is not financially sustainable for small enterprises. These enterprises are predominantly oriented towards progressive computerization of enterprise processes and tend to buy application packages gradually, for individual process sets. Large and small software firms provide the needed partial solutions; nevertheless, the solutions of small firms are associated with data, process and communication disintegration. Since the compatibility requirement is not generally accepted, finding an EAI solution has become one of the main System Integration tasks. This article provides one specific style for a complex or partial Enterprise Information System solution. This solution is founded on formal and descriptive specifications that can sustain the required data, process and communication integration among packages of applications. As a result, this style provides a new view of the effectiveness of the associated process of information modeling.

  8. Are large-scale flow experiments informing the science and management of freshwater ecosystems?

    Science.gov (United States)

    Olden, Julian D.; Konrad, Christopher P.; Melis, Theodore S.; Kennard, Mark J.; Freeman, Mary C.; Mims, Meryl C.; Bray, Erin N.; Gido, Keith B.; Hemphill, Nina P.; Lytle, David A.; McMullen, Laura E.; Pyron, Mark; Robinson, Christopher T.; Schmidt, John C.; Williams, John G.

    2013-01-01

    Greater scientific knowledge, changing societal values, and legislative mandates have emphasized the importance of implementing large-scale flow experiments (FEs) downstream of dams. We provide the first global assessment of FEs to evaluate their success in advancing science and informing management decisions. Systematic review of 113 FEs across 20 countries revealed that clear articulation of experimental objectives, while not universally practiced, was crucial for achieving management outcomes and changing dam-operating policies. Furthermore, changes to dam operations were three times less likely when FEs were conducted primarily for scientific purposes. Despite the recognized importance of riverine flow regimes, four-fifths of FEs involved only discrete flow events. Over three-quarters of FEs documented both abiotic and biotic outcomes, but only one-third examined multiple taxonomic responses, thus limiting how FE results can inform holistic dam management. Future FEs will present new opportunities to advance scientifically credible water policies.

  9. Future Research in Health Information Technology: A Review.

    Science.gov (United States)

    Hemmat, Morteza; Ayatollahi, Haleh; Maleki, Mohammad Reza; Saghafi, Fatemeh

    2017-01-01

    Currently, information technology is considered an important tool to improve healthcare services. To adopt the right technologies, policy makers should have adequate information about present and future advances. This study aimed to review and compare studies with a focus on the future of health information technology. This review study was completed in 2015. The databases used were Scopus, Web of Science, ProQuest, Ovid Medline, and PubMed. Keyword searches were used to identify papers and materials published between 2000 and 2015. Initially, 407 papers were obtained, and they were reduced to 11 papers at the final stage. The selected papers were described and compared in terms of the country of origin, objective, methodology, and time horizon. The papers were divided into two groups: those forecasting the future of health information technology (seven papers) and those providing health information technology foresight (four papers). The results showed that papers related to forecasting the future of health information technology were mostly a literature review, and the time horizon was up to 10 years in most of these studies. In the health information technology foresight group, most of the studies used a combination of techniques, such as scenario building and Delphi methods, and had long-term objectives. To make the most of an investment and to improve planning and successful implementation of health information technology, a strategic plan for the future needs to be set. To achieve this aim, methods such as forecasting the future of health information technology and offering health information technology foresight can be applied. The forecasting method is used when the objectives are not very large, and the foresight approach is recommended when large-scale objectives are set to be achieved. In the field of health information technology, the results of foresight studies can help to establish realistic long-term expectations of the future of health information

  10. THE IMPORTANCE OF INFORMATION SYSTEMS IN THE MANAGEMENT AND PROCESSING OF LARGE DATA VOLUMES IN PUBLIC INSTITUTIONS

    Directory of Open Access Journals (Sweden)

    CARINA-ELENA STEGĂROIU

    2016-12-01

    Full Text Available In a computerized society, technological resources become a source of identification for any community, institution or country. The globalization of information becomes a reality, with all resources entering into a relationship of subordination with the World Wide Web, the information highways and the Internet. "Information technology - with its most important branch, data management computer science - enters a new era, in which the computer leads to the benefit of a navigable and transparent communication space, focusing on information". Therefore, in an information-based economy, information systems have been established which, built on management systems and algebraic methods with applications in economic engineering, have come to manage and process large volumes of data, especially in public institutions. Consequently, the Ministry of Public Affairs has implemented the “Increasing the public administration’s responsibility by modernising the information systems for generating the reports of the financial situations of public institutions” project (FOREXEBUG), code SMIS 34952, for which it received in 2012 non-refundable financing from the European Social Fund through the Operational Program for Developing the Administrative Capacity 2007-2013; based on this project, the present paper analyses the usefulness of implementing such a program in public institutions. Such a system aims to achieve a new form of reporting of budget execution and financial statements (including information related to legal commitments) submitted monthly by each public institution in an electronic, standardized, secure form, increasing the reliability of the data collected by cross-checking data from the treasury and providing reliable information for use by the Ministry of Finance, public institutions, other relevant institutions and the public, both at the level of detail and the consolidation possibilities at various levels, in parallel with their use for

  11. Pseudo-set framing.

    Science.gov (United States)

    Barasz, Kate; John, Leslie K; Keenan, Elizabeth A; Norton, Michael I

    2017-10-01

    Pseudo-set framing, i.e. arbitrarily grouping items or tasks together as part of an apparent "set", motivates people to reach perceived completion points. Pseudo-set framing changes gambling choices (Study 1), effort (Studies 2 and 3), giving behavior (Field Data and Study 4), and purchase decisions (Study 5). These effects persist in the absence of any reward, when a cost must be incurred, and after participants are explicitly informed of the arbitrariness of the set. Drawing on Gestalt psychology, we develop a conceptual account that predicts what will (and will not) act as a pseudo-set, and defines the psychological process through which these pseudo-sets affect behavior: over and above typical reference points, pseudo-set framing alters perceptions of (in)completeness, making intermediate progress seem less complete. In turn, these feelings of incompleteness motivate people to persist until the pseudo-set has been fulfilled. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  12. Empirical Mining of Large Data Sets Already Helps to Solve Practical Ecological Problems; A Panoply of Working Examples (Invited)

    Science.gov (United States)

    Hargrove, W. W.; Hoffman, F. M.; Kumar, J.; Spruce, J.; Norman, S. P.

    2013-12-01

    Here we present diverse examples where empirical mining and statistical analysis of large data sets have already been shown to be useful for a wide variety of practical decision-making problems within the realm of large-scale ecology. Because a full understanding and appreciation of particular ecological phenomena are possible only after hypothesis-directed research regarding the existence and nature of that process, some ecologists may feel that purely empirical data harvesting may represent a less-than-satisfactory approach. Restricting ourselves exclusively to process-driven approaches, however, may actually slow progress, particularly for more complex or subtle ecological processes. We may not be able to afford the delays caused by such directed approaches. Rather than attempting to formulate and ask every relevant question correctly, empirical methods allow trends, relationships and associations to emerge freely from the data themselves, unencumbered by a priori theories, ideas and prejudices that have been imposed upon them. Although they cannot directly demonstrate causality, empirical methods can be extremely efficient at uncovering strong correlations with intermediate "linking" variables. In practice, these correlative structures and linking variables, once identified, may provide sufficient predictive power to be useful themselves. Such correlation "shadows" of causation can be harnessed by, e.g., Bayesian Belief Nets, which bias ecological management decisions, made with incomplete information, toward favorable outcomes. Empirical data-harvesting also generates a myriad of testable hypotheses regarding processes, some of which may even be correct. Quantitative statistical regionalizations based on quantitative multivariate similarity have lent insights into carbon eddy-flux direction and magnitude, wildfire biophysical conditions, phenological ecoregions useful for vegetation type mapping and monitoring, forest disease risk maps (e.g., sudden oak

  13. The Influence of Company Size on Accounting Information: Evidence in Large Caps and Small Caps Companies Listed on BM&FBovespa

    OpenAIRE

    Karen Yukari Yokoyama; Vitor Gomes Baioco; William Brasil Rodrigues Sobrinho; Alfredo Sarlo Neto

    2015-01-01

    In this study, the relation between accounting information aspects and the capitalization level of companies listed on the São Paulo Stock Exchange, classified as Large Caps or Small Caps (companies with larger and smaller capitalization, respectively), was investigated between 2010 and 2012. Three accounting information measures were addressed: informativeness, conservatism and relevance, through the application of Easton and Harris’ (1991) models of earnings informativeness, Basu’s (1997) mod...

  14. Prototype Vector Machine for Large Scale Semi-Supervised Learning

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Kai; Kwok, James T.; Parvin, Bahram

    2009-04-29

    Practical data mining rarely falls exactly into the supervised learning scenario. Rather, the growing amount of unlabeled data poses a big challenge to large-scale semi-supervised learning (SSL). We note that the computational intensiveness of graph-based SSL arises largely from the manifold or graph regularization, which in turn leads to large models that are difficult to handle. To alleviate this, we proposed the prototype vector machine (PVM), a highly scalable, graph-based algorithm for large-scale SSL. Our key innovation is the use of "prototype vectors" for efficient approximation of both the graph-based regularizer and the model representation. The choice of prototypes is grounded upon two important criteria: they not only perform effective low-rank approximation of the kernel matrix, but also span a model suffering the minimum information loss compared with the complete model. We demonstrate encouraging performance and appealing scaling properties of the PVM on a number of machine learning benchmark data sets.
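
    The prototype idea can be illustrated with a Nyström-style low-rank kernel approximation, which is one of the two roles the prototypes play above (this sketch does not reproduce the PVM's graph-regularizer approximation or its optimization). The data, kernel, and number of prototypes are arbitrary choices for illustration.

```python
import numpy as np

def rbf(a, b, gamma=0.5):
    """RBF kernel matrix between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 10))                       # large (un)labeled data set
prototypes = X[rng.choice(len(X), 50, replace=False)]     # m << n prototype vectors

# Nystrom-style low-rank approximation K ≈ K_nm K_mm^{-1} K_mn built from prototypes.
K_nm = rbf(X, prototypes)
K_mm = rbf(prototypes, prototypes)
K_factor = K_nm @ np.linalg.pinv(K_mm)    # n x m; the full n x n kernel is never formed

# A model can now be represented with m prototype coefficients instead of n.
print(K_nm.shape, K_mm.shape, K_factor.shape)
```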

  15. [Wound information management system: a standardized scheme for acquisition, storage and management of wound information].

    Science.gov (United States)

    Liu, Hu; Su, Rong-jia; Wu, Min-jie; Zhang, Yi; Qiu, Xiang-jun; Feng, Jian-gang; Xie, Ting; Lu, Shu-liang

    2012-06-01

    To form an objective, standardized, and convenient wound information management scheme by means of a wound information management system. A wound information management system was set up with an acquisition terminal, a defined wound description, a data bank, and related software. The efficacy of this system was evaluated in clinical practice. The acquisition terminal was composed of a third-generation mobile phone and the software. It was feasible to get access to the wound information, including description, image, and therapeutic plan, from the data bank by mobile phone. During 4 months, a total of 232 wound treatment records were entered, and standardized data for 38 patients were accordingly formed automatically. This system can provide standardized wound information management by standardized techniques of acquisition, transmission, and storage of wound information. It can be used widely in hospitals, especially primary medical institutions. The system's data resource makes large-sample epidemiological studies possible in the future.

  16. An Educational Theory Model--(SIGGS), an Integration of Set Theory, Information Theory, and Graph Theory with General Systems Theory.

    Science.gov (United States)

    MACCIA, ELIZABETH S.; AND OTHERS

    An annotated bibliography of 20 items and a discussion of its significance were presented to describe current utilization of subject theories in the construction of an educational theory. Also, a theory model was used to demonstrate construction of a scientific educational theory. The theory model incorporated set theory (S), information theory…

  17. Large number discrimination in newborn fish.

    Directory of Open Access Journals (Sweden)

    Laura Piffer

    Full Text Available Quantitative abilities have been reported in a wide range of species, including fish. Recent studies have shown that adult guppies (Poecilia reticulata) can spontaneously select the larger number of conspecifics. In particular, the evidence collected in the literature suggests the existence of two distinct systems of number representation: a precise system up to 4 units, and an approximate system for larger numbers. Spontaneous numerical abilities, however, seem to be limited to 4 units at birth, and it is currently unclear whether or not the large number system is absent during the first days of life. In the present study, we investigated whether newborn guppies can be trained to discriminate between large quantities. Subjects were required to discriminate between groups of dots with a 0.50 ratio (e.g., 7 vs. 14) in order to obtain a food reward. To dissociate the roles of number and continuous quantities that co-vary with numerical information (such as cumulative surface area, space and density), three different experiments were set up: in Exp. 1 number and continuous quantities were simultaneously available. In Exp. 2 we controlled for continuous quantities and only numerical information was available; in Exp. 3 numerical information was made irrelevant and only continuous quantities were available. Subjects successfully solved the tasks in Exp. 1 and 2, providing the first evidence of large number discrimination in newborn fish. No discrimination was found in experiment 3, meaning that number acuity is better than spatial acuity. A comparison with the onset of numerical abilities observed in shoal-choice tests suggests that training procedures can promote the development of numerical abilities in guppies.

  18. 3D WebGIS and Visualization Issues for Architectures and Large Sites

    Science.gov (United States)

    De Amicis, R.; Conti, G.; Girardi, G.; Andreolli, M.

    2011-09-01

    Traditionally, within the field of archaeology and, more generally, within the cultural heritage domain, Geographical Information Systems (GIS) have mostly been used to support cataloguing activities, essentially operating as gateways to large geo-referenced archives of specialised cultural heritage information. Additionally, GIS have proved essential in helping cultural heritage institutions improve the management of their historical information, providing the means to detect otherwise hard-to-discover spatial patterns and supplying the computational tools necessary to perform spatial clustering, proximity and orientation analysis. This paper presents a platform developed to address both of the aforementioned issues by allowing geo-referenced cataloguing of multimedia resources of cultural relevance, as well as user-friendly access through an interactive 3D geobrowser which operates as a single point of access to the available digital repositories. The solution was showcased in the context of "Festival dell'economia" (the Fair of Economics), a major event recently held in Trento, Italy, where it allowed visitors to interactively access an extremely large repository of information, together with its metadata, available across the area of the Autonomous Province of Trento. Within the event, this repository was made accessible over the network, through web services, from a 3D interactive geobrowser developed by the authors. The 3D scene was enriched with a number of Points of Interest (POIs) linking to information available within various databases. The software package was deployed on a complex hardware set-up composed of a large composite panoramic screen covering a horizontal field of view of 240 degrees.

  19. Automatic physical inference with information maximizing neural networks

    Science.gov (United States)

    Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.

    2018-04-01

    Compressing large data sets to a manageable number of summaries that are informative about the underlying parameters vastly simplifies both frequentist and Bayesian inference. When only simulations are available, these summaries are typically chosen heuristically, so they may inadvertently miss important information. We introduce a simulation-based machine learning technique that trains artificial neural networks to find nonlinear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). In test cases where the posterior can be derived exactly, likelihood-free inference based on automatically derived IMNN summaries produces nearly exact posteriors, showing that these summaries are good approximations to sufficient statistics. In a series of numerical examples of increasing complexity and astrophysical relevance we show that IMNNs are robustly capable of automatically finding optimal, nonlinear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima. We anticipate that the automatic physical inference method described in this paper will be essential to obtain both accurate and precise cosmological parameter estimates from complex and large astronomical data sets, including those from LSST and Euclid.
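
    To make the Fisher-information objective concrete, the following is a minimal numpy sketch of the quantity such a network is trained to maximize, assuming summaries have already been computed from simulations at the fiducial parameters and at parameters perturbed by ±Δθ; the helper names are hypothetical and this is not the authors' implementation.

    import numpy as np

    def fisher_from_summaries(s_fid, s_plus, s_minus, dtheta):
        # s_fid   : (n_sims, n_summaries) summaries at the fiducial parameters
        # s_plus  : (n_params, n_sims, n_summaries) summaries at theta_i + dtheta_i
        # s_minus : (n_params, n_sims, n_summaries) summaries at theta_i - dtheta_i
        C = np.atleast_2d(np.cov(s_fid, rowvar=False))          # covariance of the summaries
        Cinv = np.linalg.inv(C)
        # Finite-difference derivative of the mean summary w.r.t. each parameter
        dmu = np.stack([(s_plus[i].mean(axis=0) - s_minus[i].mean(axis=0)) / (2.0 * dtheta[i])
                        for i in range(len(dtheta))])
        return dmu @ Cinv @ dmu.T                                # F_ij = dmu_i^T C^-1 dmu_j

    def imnn_objective(F):
        # The network weights are adjusted to maximize the information content of the
        # summaries, e.g. by maximizing log det F (a covariance regularization term
        # is usually added in practice).
        sign, logdet = np.linalg.slogdet(F)
        return logdet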

  20. Data Mining and Visualization of Large Human Behavior Data Sets

    DEFF Research Database (Denmark)

    Cuttone, Andrea

    and credit card transactions – have provided us new sources for studying our behavior. In particular smartphones have emerged as new tools for collecting data about human activity, thanks to their sensing capabilities and their ubiquity. This thesis investigates the question of what we can learn about human...... behavior from this rich and pervasive mobile sensing data. In the first part, we describe a large-scale data collection deployment collecting high-resolution data for over 800 students at the Technical University of Denmark using smartphones, including location, social proximity, calls and SMS. We provide...... an overview of the technical infrastructure, the experimental design, and the privacy measures. The second part investigates the usage of this mobile sensing data for understanding personal behavior. We describe two large-scale user studies on the deployment of self-tracking apps, in order to understand...

  1. Collective Influence of Multiple Spreaders Evaluated by Tracing Real Information Flow in Large-Scale Social Networks

    Science.gov (United States)

    Teng, Xian; Pei, Sen; Morone, Flaviano; Makse, Hernán A.

    2016-01-01

    Identifying the most influential spreaders that maximize information flow is a central question in network theory. Recently, a scalable method called “Collective Influence (CI)” has been put forward through collective influence maximization. In contrast to heuristic methods that evaluate nodes’ significance separately, the CI method inspects the collective influence of multiple spreaders. Although CI applies to the influence maximization problem in the percolation model, it is still important to examine its efficacy in realistic information spreading. Here, we examine real-world information flow in various social and scientific platforms including the American Physical Society, Facebook, Twitter and LiveJournal. Since empirical data cannot be directly mapped to ideal multi-source spreading, we leverage the behavioral patterns of users extracted from data to construct “virtual” information spreading processes. Our results demonstrate that the set of spreaders selected by CI can induce a larger scale of information propagation. Moreover, local measures such as the number of connections or citations are not necessarily the deterministic factors of nodes’ importance in realistic information spreading. This result has significance for ranking scientists in scientific networks like the APS, where the commonly used number of citations can be a poor indicator of the collective influence of authors in the community. PMID:27782207

  2. A Combined Eulerian-Lagrangian Data Representation for Large-Scale Applications.

    Science.gov (United States)

    Sauer, Franz; Xie, Jinrong; Ma, Kwan-Liu

    2017-10-01

    The Eulerian and Lagrangian reference frames each provide a unique perspective when studying and visualizing results from scientific systems. As a result, many large-scale simulations produce data in both formats, and analysis tasks that simultaneously utilize information from both representations are becoming increasingly popular. However, due to their fundamentally different nature, drawing correlations between these data formats is a computationally difficult task, especially in a large-scale setting. In this work, we present a new data representation which combines both reference frames into a joint Eulerian-Lagrangian format. By reorganizing Lagrangian information according to the Eulerian simulation grid into a "unit cell" based approach, we can provide an efficient out-of-core means of sampling, querying, and operating with both representations simultaneously. We also extend this design to generate multi-resolution subsets of the full data to suit the viewer's needs and provide a fast flow-aware trajectory construction scheme. We demonstrate the effectiveness of our method using three large-scale real world scientific datasets and provide insight into the types of performance gains that can be achieved.
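
    As a rough illustration of the "unit cell" reorganization described above, the sketch below groups Lagrangian particles by the Eulerian grid cell that contains them so that both representations can be queried together; the array names and grid layout are assumptions, not the authors' data format.

    import numpy as np
    from collections import defaultdict

    def bin_particles_to_cells(positions, origin, spacing, dims):
        # positions : (n, 3) Lagrangian particle coordinates
        # origin    : (3,) lower corner of the Eulerian grid
        # spacing   : (3,) cell size along each axis
        # dims      : (3,) number of cells along each axis
        idx = np.floor((positions - origin) / spacing).astype(int)
        idx = np.clip(idx, 0, np.asarray(dims) - 1)        # keep boundary particles in range
        cells = defaultdict(list)
        for p, (i, j, k) in enumerate(idx):
            cells[(i, j, k)].append(p)                     # particle indices per unit cell
        return cells

    # With such an index, a query for the particles inside a given cell (or block of
    # cells) becomes a dictionary lookup instead of a scan over the whole Lagrangian set.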

  3. Large-N in Volcano Settings: VolcanoSRI

    Science.gov (United States)

    Lees, J. M.; Song, W.; Xing, G.; Vick, S.; Phillips, D.

    2014-12-01

    We seek a paradigm shift in the approach we take to volcano monitoring, where the compromise from high fidelity to large numbers of sensors is used to increase coverage and resolution. Accessibility, danger and the risk of equipment loss require that we develop systems that are independent and inexpensive. Furthermore, rather than simply record data on hard disk for later analysis, we desire a system that will work autonomously, capitalizing on wireless technology and in-field network analysis. To this end we are currently producing a low-cost seismic array which will incorporate, at the very basic level, seismological tools for first-cut analysis of a volcano in crisis mode. At the advanced end we expect to perform tomographic inversions in the network in near real time. Geophone (4 Hz) sensors connected to a low-cost recording system will be installed on an active volcano, where triggering, earthquake location and velocity analysis will take place independent of human interaction. Stations are designed to be inexpensive and possibly disposable. In one of the first implementations the seismic nodes consist of an Arduino Due processor board with an attached Seismic Shield. The Arduino Due processor board contains an Atmel SAM3X8E ARM Cortex-M3 CPU. This 32-bit 84 MHz processor can filter and perform coarse seismic event detection on a 1600-sample signal in fewer than 200 milliseconds. The Seismic Shield contains a GPS module, a 900 MHz high-power mesh network radio, an SD card, a seismic amplifier, and a 24-bit ADC. External sensors can be attached either to this 24-bit ADC or to the internal multichannel 12-bit ADC contained on the Arduino Due processor board. This allows the node to support the attachment of multiple sensors. By utilizing a high-speed 32-bit processor, complex signal processing tasks can be performed simultaneously on multiple sensors. Using a 10 W solar panel, a second system being developed can run autonomously and collect data on 3 channels at 100 Hz for 6 months

  4. Information findability: An informal study to explore options for improving information findability for the systems analysis group

    Energy Technology Data Exchange (ETDEWEB)

    Stoecker, Nora Kathleen [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2014-03-01

    A Systems Analysis Group has existed at Sandia National Laboratories since at least the mid-1950s. Much of the group's work output (reports, briefing documents, and other materials) has been retained, along with large numbers of related documents. Over time the collection has grown to hundreds of thousands of unstructured documents in many formats, contained in one or more of several different shared drives or SharePoint sites, with perhaps five percent of the collection still existing in print format. This presents a challenge: how can the group effectively find, manage, and build on information contained somewhere within such a large set of unstructured documents? In response, a project was initiated to identify tools that would be able to meet this challenge. This report documents the results found and recommendations made as of August 2013.

  5. Network connectivity paradigm for the large data produced by weather radar systems

    Science.gov (United States)

    Guenzi, Diego; Bechini, Renzo; Boraso, Rodolfo; Cremonini, Roberto; Fratianni, Simona

    2014-05-01

    Traffic over the Internet is constantly increasing; this is due in particular to social network activity but also to the enormous exchange of data caused especially by the so-called "Internet of Things". With this term we refer to every device that has the capability of exchanging information with other devices on the web. In geoscience (and, in particular, in meteorology and climatology) there is a constantly increasing number of sensors that are used to obtain data from different sources (like weather radars, digital rain gauges, etc.). This information-gathering activity frequently must be followed by a complex data analysis phase, especially when we have large data sets that can be very difficult to analyze (very long historical series of large data sets, for example), the so-called big data. These activities are particularly intensive in resource consumption, and they lead to new computational models (like cloud computing) and new methods for storing data (like object stores, linked open data, NoSQL or NewSQL). A weather radar system can be seen as one of the sensors mentioned above: it transmits a large amount of raw data over the network (up to 40 megabytes every five minutes), around the clock and in any weather condition. Weather radars are often located on peaks and in wild areas where connectivity is poor. For this reason radar measurements are sometimes processed partially on site and reduced in size to adapt them to the limited bandwidth currently available from data transmission systems. With the aim of preserving the maximum flow of information, an innovative network connectivity paradigm for the large data produced by weather radar systems is presented here. The study is focused on the Monte Settepani operational weather radar system, located on a wild peak summit in north-western Italy.

  6. A strategy to improve priority setting in developing countries.

    Science.gov (United States)

    Kapiriri, Lydia; Martin, Douglas K

    2007-09-01

    Because the demand for health services outstrips the available resources, priority setting is one of the most difficult issues faced by health policy makers, particularly those in developing countries. Priority setting in developing countries is fraught with uncertainty due to lack of credible information, weak priority setting institutions, and unclear priority setting processes. Efforts to improve priority setting in these contexts have focused on providing information and tools. In this paper we argue that priority setting is a value laden and political process, and although important, the available information and tools are not sufficient to address the priority setting challenges in developing countries. Additional complementary efforts are required. Hence, a strategy to improve priority setting in developing countries should also include: (i) capturing current priority setting practices, (ii) improving the legitimacy and capacity of institutions that set priorities, and (iii) developing fair priority setting processes.

  7. Timetable-based simulation method for choice set generation in large-scale public transport networks

    DEFF Research Database (Denmark)

    Rasmussen, Thomas Kjær; Anderson, Marie Karen; Nielsen, Otto Anker

    2016-01-01

    The composition and size of the choice sets are key to the correct estimation of, and prediction by, route choice models. While existing literature has paid a great deal of attention to the generation of path choice sets for private transport problems, the same does not apply to public...... transport problems. This study proposes a timetable-based simulation method for generating path choice sets in a multimodal public transport network. Moreover, this study illustrates the feasibility of its implementation by applying the method to reproduce 5131 real-life trips in the Greater Copenhagen Area...... and to assess the choice set quality in a complex multimodal transport network. Results illustrate the applicability of the algorithm and the relevance of the utility specification chosen for the reproduction of real-life path choices. Moreover, results show that the level of stochasticity used in choice set...

  8. Simple multi-party set reconciliation

    DEFF Research Database (Denmark)

    Mitzenmacher, Michael; Pagh, Rasmus

    2017-01-01

     set reconciliation: two parties A1 and A2 each hold a set of keys, named S1 and S2 respectively, and the goal is for both parties to obtain S1 ∪ S2. Typically, set reconciliation is interesting algorithmically when sets are large but the set difference |S1 − S2| + |S2 − S1| is small...
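
    For orientation, a naive two-party version of the goal can be sketched as below; real reconciliation protocols (e.g. those built on invertible Bloom lookup tables) reach the same end state while communicating an amount proportional to the set difference rather than to the full sets. The function name is illustrative only.

    def reconcile(s1, s2):
        # Each side ships only the keys the other is missing; afterwards both parties
        # hold S1 ∪ S2. The shipped volume equals the symmetric difference, but computing
        # it naively requires exchanging the full sets, which is what sketch-based
        # protocols avoid.
        a_to_b = s1 - s2
        b_to_a = s2 - s1
        return s1 | b_to_a, s2 | a_to_b

    # Example: reconcile({1, 2, 3}, {3, 4}) returns ({1, 2, 3, 4}, {1, 2, 3, 4}).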

  9. Nature of the optical information recorded in speckles

    Science.gov (United States)

    Sciammarella, Cesar A.

    1998-09-01

    The process of encoding displacement information in electronic Holographic Interferometry is reviewed. Procedures to extend the applicability of this technique to large deformations are given. The proposed techniques are applied and results from these experiments are compared with results obtained by other means. The similarity between the two sets of results illustrates the validity for the new techniques.

  10. Efficient structure from motion on large scenes using UAV with position and pose information

    Science.gov (United States)

    Teng, Xichao; Yu, Qifeng; Shang, Yang; Luo, Jing; Wang, Gang

    2018-04-01

    In this paper, we exploit prior information from global positioning systems and inertial measurement units to speed up the process of large scene reconstruction from images acquired by Unmanned Aerial Vehicles. We utilize weak pose information and the intrinsic parameters to obtain the projection matrix for each view. Since topographic relief can usually be ignored compared to the unmanned aerial vehicles' flight altitude, we assume that the scene is flat and use a weak perspective camera model to obtain projective transformations between two views. Furthermore, we propose an overlap criterion and select potentially matching view pairs between projectively transformed views. A robust global structure from motion method is used for image-based reconstruction. Our real-world experiments show that the approach is accurate, scalable and computationally efficient. Moreover, projective transformations between views can also be used to eliminate false matching.
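
    A rough sketch of one way such an overlap criterion could be evaluated under the flat-scene assumption is given below: sample points of view A are mapped into view B with the estimated planar transform, and the pair is kept if enough of them land inside B's frame. The transform H, image size and threshold are assumptions for illustration, not the authors' exact formulation.

    import numpy as np

    def overlap_fraction(H, width, height, grid=20):
        # Fraction of sampled pixels of view A that project inside view B,
        # given a 3x3 planar transform H mapping A's pixel coordinates into B.
        xs = np.linspace(0, width - 1, grid)
        ys = np.linspace(0, height - 1, grid)
        gx, gy = np.meshgrid(xs, ys)
        pts = np.stack([gx.ravel(), gy.ravel(), np.ones(gx.size)])   # homogeneous coords
        proj = H @ pts
        proj = proj[:2] / proj[2]                                    # back to pixel coords
        inside = ((proj[0] >= 0) & (proj[0] < width) &
                  (proj[1] >= 0) & (proj[1] < height))
        return inside.mean()

    # View pairs with, say, overlap_fraction(H, w, h) > 0.2 would be retained as
    # potentially matching pairs; all remaining pairs are skipped during matching.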

  11. The system of computer simulation and organizational management of large enterprises activity

    Directory of Open Access Journals (Sweden)

    E. D. Chertov

    2016-01-01

    Full Text Available A study on the construction of integrated technical support is carried out using the example of organizational information systems (administrative and economic management systems) of large organizations. As part of a management information system, comprehensive technical support is related to the other parts of the system, first of all to the information database management system, which covers all types of information required for planning and management, and to the algorithms for processing this information. This means that the control system not only determines the required set of technical means; the technical means in turn have a significant effect on the composition and organization of the management information system database. A feature of integrated logistics is the variety of hardware functions, the large number of device types, the different ways in which operators and equipment interact, and the possibility of different line-ups and aggregations of devices. The complex of technical means of an information management system has all the features of a complex system: versatility, the presence of feedback, multicriteriality, a hierarchical structure, the presence of allocated parts connected to each other by complex interactions, and uncertainty in the behavior of these parts resulting from the limited reliability of technical means and the influence of environmental disturbances. For this reason, the tasks associated with creating integrated logistics for a management information system should be solved with a systems approach. Maximizing the efficiency of the information management system requires the construction of a technological complex with minimal installation and operation effort, which leads to the need to choose the optimal variant of technical means from the number of possible ones. The solution of the main objectives of integrated logistics can be reduced to the construction of a joint set of languages - character sets or alphabets describing the input

  12. Multi-level discriminative dictionary learning with application to large scale image classification.

    Science.gov (United States)

    Shen, Li; Sun, Gang; Huang, Qingming; Wang, Shuhui; Lin, Zhouchen; Wu, Enhua

    2015-10-01

    The sparse coding technique has shown flexibility and capability in image representation and analysis. It is a powerful tool in many visual applications. Some recent work has shown that incorporating the properties of task (such as discrimination for classification task) into dictionary learning is effective for improving the accuracy. However, the traditional supervised dictionary learning methods suffer from high computation complexity when dealing with large number of categories, making them less satisfactory in large scale applications. In this paper, we propose a novel multi-level discriminative dictionary learning method and apply it to large scale image classification. Our method takes advantage of hierarchical category correlation to encode multi-level discriminative information. Each internal node of the category hierarchy is associated with a discriminative dictionary and a classification model. The dictionaries at different layers are learnt to capture the information of different scales. Moreover, each node at lower layers also inherits the dictionary of its parent, so that the categories at lower layers can be described with multi-scale information. The learning of dictionaries and associated classification models is jointly conducted by minimizing an overall tree loss. The experimental results on challenging data sets demonstrate that our approach achieves excellent accuracy and competitive computation cost compared with other sparse coding methods for large scale image classification.

  13. Fundamental energy limits of SET-based Brownian NAND and half-adder circuits. Preliminary findings from a physical-information-theoretic methodology

    Science.gov (United States)

    Ercan, İlke; Suyabatmaz, Enes

    2018-06-01

    The saturation in the efficiency and performance scaling of conventional electronic technologies brings about the development of novel computational paradigms. Brownian circuits are among the promising alternatives that can exploit fluctuations to increase the efficiency of information processing in nanocomputing. A Brownian cellular automaton, where signals propagate randomly and are driven by local transition rules, can be made computationally universal by embedding arbitrary asynchronous circuits on it. One of the potential realizations of such circuits is via single electron tunneling (SET) devices, since SET technology enables simulation of noise and fluctuations in a fashion similar to Brownian search. In this paper, we perform a physical-information-theoretic analysis of the efficiency limitations in Brownian NAND and half-adder circuits implemented using SET technology. The method we employ here establishes a solid ground for studying the computational and physical features of this emerging technology on an equal footing, and yields fundamental lower bounds that provide valuable insights into how far its efficiency can be improved in principle. In order to provide a basis for comparison, we also analyze a NAND gate and half-adder circuit implemented in complementary metal oxide semiconductor technology to show how the fundamental bound of the Brownian circuit compares against a conventional paradigm.

  14. PC-based support programs coupled with the sets code for large fault tree analysis

    International Nuclear Information System (INIS)

    Hioki, K.; Nakai, R.

    1989-01-01

    Power Reactor and Nuclear Fuel Development Corporation (PNC) has developed four PC programs: IEIQ (Initiating Event Identification and Quantification), MODESTY (Modular Event Description for a Variety of Systems), FAUST (Fault Summary Tables Generation Program) and ETAAS (Event Tree Analysis Assistant System). These programs prepare the input data for the SETS (Set Equation Transformation System) code and construct and quantify event trees (E/Ts) using the output of the SETS code. The capability of these programs is described and some examples of the results are presented in this paper. With these PC programs and the SETS code, PSA can now be performed with more consistency and less manpower.

  15. Large parallel volumes of finite and compact sets in d-dimensional Euclidean space

    DEFF Research Database (Denmark)

    Kampf, Jürgen; Kiderlen, Markus

    The r-parallel volume V(Cr) of a compact subset C in d-dimensional Euclidean space is the volume of the set Cr of all points of Euclidean distance at most r > 0 from C. According to Steiner's formula, V(Cr) is a polynomial in r when C is convex. For finite sets C satisfying a certain geometric...
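
    For reference, Steiner's formula mentioned above can be written, for a convex body C in d-dimensional Euclidean space and in the usual intrinsic-volume notation (added here for orientation, not part of the original record), as

        V(C_r) \;=\; \sum_{j=0}^{d} \kappa_j \, V_{d-j}(C) \, r^{j},

    where V_k(C) denotes the k-th intrinsic volume of C and \kappa_j is the volume of the j-dimensional unit ball; for non-convex C the parallel volume is in general no longer a polynomial in r.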

  16. The ESO Diffuse Interstellar Band Large Exploration Survey (EDIBLES)

    Science.gov (United States)

    Cami, J.; Cox, N. L.; Farhang, A.; Smoker, J.; Elyajouri, M.; Lallement, R.; Bacalla, X.; Bhatt, N. H.; Bron, E.; Cordiner, M. A.; de Koter, A..; Ehrenfreund, P.; Evans, C.; Foing, B. H.; Javadi, A.; Joblin, C.; Kaper, L.; Khosroshahi, H. G.; Laverick, M.; Le Petit, F..; Linnartz, H.; Marshall, C. C.; Monreal-Ibero, A.; Mulas, G.; Roueff, E.; Royer, P.; Salama, F.; Sarre, P. J.; Smith, K. T.; Spaans, M.; van Loon, J. T..; Wade, G.

    2018-03-01

    The ESO Diffuse Interstellar Band Large Exploration Survey (EDIBLES) is a Large Programme that is collecting high-signal-to-noise (S/N) spectra with UVES of a large sample of O and B-type stars covering a large spectral range. The goal of the programme is to extract a unique sample of high-quality interstellar spectra from these data, representing different physical and chemical environments, and to characterise these environments in great detail. An important component of interstellar spectra is the diffuse interstellar bands (DIBs), a set of hundreds of unidentified interstellar absorption lines. With the detailed line-of-sight information and the high-quality spectra, EDIBLES will derive strong constraints on the potential DIB carrier molecules. EDIBLES will thus guide the laboratory experiments necessary to identify these interstellar “mystery molecules”, and turn DIBs into powerful diagnostics of their environments in our Milky Way Galaxy and beyond. We present some preliminary results showing the unique capabilities of the EDIBLES programme.

  17. Goal Setting as Teacher Development Practice

    Science.gov (United States)

    Camp, Heather

    2017-01-01

    This article explores goal setting as a teacher development practice in higher education. It reports on a study of college teacher goal setting informed by goal setting theory. Analysis of study participants' goal setting practices and their experiences with goal pursuit offers a framework for thinking about the kinds of goals teachers might set…

  18. ObspyDMT: a Python toolbox for retrieving and processing large seismological data sets

    Directory of Open Access Journals (Sweden)

    K. Hosseini

    2017-10-01

    Full Text Available We present obspyDMT, a free, open-source software toolbox for the query, retrieval, processing and management of seismological data sets, including very large, heterogeneous and/or dynamically growing ones. ObspyDMT simplifies and speeds up user interaction with data centers, in more versatile ways than existing tools. The user is shielded from the complexities of interacting with different data centers and data exchange protocols and is provided with powerful diagnostic and plotting tools to check the retrieved data and metadata. While primarily a productivity tool for research seismologists and observatories, easy-to-use syntax and plotting functionality also make obspyDMT an effective teaching aid. Written in the Python programming language, it can be used as a stand-alone command-line tool (requiring no knowledge of Python) or can be integrated as a module with other Python codes. It facilitates data archiving, preprocessing, instrument correction and quality control – routine but nontrivial tasks that can consume much user time. We describe obspyDMT's functionality, design and technical implementation, accompanied by an overview of its use cases. As an example of a typical problem encountered in seismogram preprocessing, we show how to check for inconsistencies in response files of two example stations. We also demonstrate the fully automated request, remote computation and retrieval of synthetic seismograms from the Synthetics Engine (Syngine) web service of the Data Management Center (DMC) at the Incorporated Research Institutions for Seismology (IRIS).

  19. ObspyDMT: a Python toolbox for retrieving and processing large seismological data sets

    Science.gov (United States)

    Hosseini, Kasra; Sigloch, Karin

    2017-10-01

    We present obspyDMT, a free, open-source software toolbox for the query, retrieval, processing and management of seismological data sets, including very large, heterogeneous and/or dynamically growing ones. ObspyDMT simplifies and speeds up user interaction with data centers, in more versatile ways than existing tools. The user is shielded from the complexities of interacting with different data centers and data exchange protocols and is provided with powerful diagnostic and plotting tools to check the retrieved data and metadata. While primarily a productivity tool for research seismologists and observatories, easy-to-use syntax and plotting functionality also make obspyDMT an effective teaching aid. Written in the Python programming language, it can be used as a stand-alone command-line tool (requiring no knowledge of Python) or can be integrated as a module with other Python codes. It facilitates data archiving, preprocessing, instrument correction and quality control - routine but nontrivial tasks that can consume much user time. We describe obspyDMT's functionality, design and technical implementation, accompanied by an overview of its use cases. As an example of a typical problem encountered in seismogram preprocessing, we show how to check for inconsistencies in response files of two example stations. We also demonstrate the fully automated request, remote computation and retrieval of synthetic seismograms from the Synthetics Engine (Syngine) web service of the Data Management Center (DMC) at the Incorporated Research Institutions for Seismology (IRIS).
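
    As a hedged illustration of the kind of routine retrieval and preprocessing task that obspyDMT automates, the snippet below performs a single request directly with the underlying ObsPy FDSN client; the station, channel and time window are arbitrary examples, and this is not obspyDMT's own command-line interface.

    from obspy import UTCDateTime
    from obspy.clients.fdsn import Client

    client = Client("IRIS")                              # any FDSN-compliant data center
    t0 = UTCDateTime("2017-01-01T00:00:00")
    # One hour of broadband vertical-component data for an example station.
    st = client.get_waveforms(network="IU", station="ANMO", location="00",
                              channel="BHZ", starttime=t0, endtime=t0 + 3600,
                              attach_response=True)
    st.remove_response(output="VEL")                     # instrument correction
    st.filter("bandpass", freqmin=0.01, freqmax=1.0)     # simple preprocessing step
    st.write("IU.ANMO.00.BHZ.mseed", format="MSEED")     # archive the corrected trace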

  20. Electricity procurement for large consumers based on Information Gap Decision Theory

    International Nuclear Information System (INIS)

    Zare, Kazem; Moghaddam, Mohsen Parsa; Sheikh El Eslami, Mohammad Kazem

    2010-01-01

    In the competitive electricity market, consumers seek strategies to meet their electricity needs at minimum cost and risk. This paper provides a technique based on Information Gap Decision Theory (IGDT) to assess different procurement strategies for large consumers. Supply sources include bilateral contracts, a limited self-generating facility, and the pool. It is considered that the pool price is uncertain and its volatility around the estimated value is modeled using an IGDT model. The proposed method does not minimize the procurement cost but assesses the risk aversion or risk-taking nature of some procurement strategies with regard to the minimum cost. Using this method, the robustness of experiencing costs higher than the expected one is optimized and the related strategy is determined. The proposed method deals with optimizing the opportunities to take advantage of low procurement costs or low pool prices. A case study is used to illustrate the proposed technique.
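
    The robustness measure at the heart of such an IGDT formulation can be written in generic notation (not necessarily that used in the paper) as follows: with \tilde{\lambda} the forecast pool price and a fractional-error uncertainty model

        U(\alpha, \tilde{\lambda}) = \{ \lambda : |\lambda - \tilde{\lambda}| \le \alpha \, \tilde{\lambda} \}, \qquad \alpha \ge 0,

    the robustness of a procurement strategy q against a tolerable cost level \Lambda_c is

        \hat{\alpha}(q, \Lambda_c) = \max \{ \alpha : \max_{\lambda \in U(\alpha, \tilde{\lambda})} \mathrm{cost}(q, \lambda) \le \Lambda_c \},

    i.e. the largest price deviation under which the strategy still keeps the cost below \Lambda_c; the opportunity function is defined analogously with a minimum over U and a windfall cost target.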

  1. Simulating the complex output of rainfall and hydrological processes using the information contained in large data sets: the Direct Sampling approach.

    Science.gov (United States)

    Oriani, Fabio

    2017-04-01

    The unpredictable nature of rainfall makes its estimation as difficult as it is essential to hydrological applications. Stochastic simulation is often considered a convenient approach to assess the uncertainty of rainfall processes, but preserving their irregular behavior and variability at multiple scales is a challenge even for the most advanced techniques. In this presentation, an overview of the Direct Sampling technique [1] and its recent application to rainfall and hydrological data simulation [2, 3] is given. The algorithm, having its roots in multiple-point statistics, makes use of a training data set to simulate the outcome of a process without inferring any explicit probability measure: the data are simulated in time or space by sampling the training data set where a sufficiently similar group of neighbor data exists. This approach allows preserving complex statistical dependencies at different scales with a good approximation, while reducing the parameterization to the minimum. The strengths and weaknesses of the Direct Sampling approach are shown through a series of applications to rainfall and hydrological data: from time-series simulation to spatial rainfall fields conditioned by elevation or a climate scenario. In the era of vast databases, is this data-driven approach a valid alternative to parametric simulation techniques? [1] Mariethoz G., Renard P., and Straubhaar J. (2010), The Direct Sampling method to perform multiple-point geostatistical simulations, Water Resour. Res., 46(11), http://dx.doi.org/10.1029/2008WR007621 [2] Oriani F., Straubhaar J., Renard P., and Mariethoz G. (2014), Simulation of rainfall time series from different climatic regions using the direct sampling technique, Hydrol. Earth Syst. Sci., 18, 3015-3031, http://dx.doi.org/10.5194/hess-18-3015-2014 [3] Oriani F., Borghi A., Straubhaar J., Mariethoz G., Renard P. (2016), Missing data simulation inside flow rate time-series using multiple-point statistics, Environ. Model
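
    A minimal sketch of the Direct Sampling idea for a univariate time series is given below, under simplifying assumptions (fixed neighborhood length, normalized Euclidean distance, random scanning of the training series); it illustrates the principle and is not the authors' code.

    import numpy as np

    def direct_sampling_series(train, n_sim, n_neigh=10, threshold=0.05, max_scan=1000, seed=0):
        # Simulate a series by resampling the training data conditionally on the
        # last n_neigh simulated values (simplified Direct Sampling).
        rng = np.random.default_rng(seed)
        train = np.asarray(train, dtype=float)
        scale = train.max() - train.min()
        sim = list(train[:n_neigh])                      # seed with a training fragment
        while len(sim) < n_sim:
            pattern = np.array(sim[-n_neigh:])
            best_pos, best_dist = None, np.inf
            # Scan random positions of the training series for a similar neighborhood.
            for pos in rng.integers(n_neigh, len(train) - 1, size=max_scan):
                d = np.linalg.norm(train[pos - n_neigh:pos] - pattern) / (scale * np.sqrt(n_neigh))
                if d < best_dist:
                    best_pos, best_dist = pos, d
                if d <= threshold:                       # accept the first sufficiently similar match
                    break
            sim.append(train[best_pos])                  # copy the value that follows the match
        return np.array(sim)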

  2. IMNN: Information Maximizing Neural Networks

    Science.gov (United States)

    Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.

    2018-04-01

    This software trains artificial neural networks to find non-linear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). Compressing large data sets to a manageable number of summaries vastly simplifies both frequentist and Bayesian inference, but summaries chosen heuristically may inadvertently miss important information. Likelihood-free inference based on automatically derived IMNN summaries produces summaries that are good approximations to sufficient statistics. IMNNs are robustly capable of automatically finding optimal, non-linear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima.

  3. Assessing the quality of informed consent in a resource-limited setting: A cross-sectional study

    Directory of Open Access Journals (Sweden)

    Kiguba Ronald

    2012-08-01

    Full Text Available Background: The process of obtaining informed consent continues to be a contentious issue in clinical and public health research carried out in resource-limited settings. We sought to evaluate this process among human research participants in randomly selected active research studies approved by the School of Medicine Research and Ethics Committee at the College of Health Sciences, Makerere University. Methods: Data were collected using semi-structured interviewer-administered questionnaires on clinic days after initial or repeat informed consent procedures for the respective clinical studies had been administered to each study participant. Results: Of the 600 participants interviewed, two thirds (64.2%, 385/600) were female. Overall mean age of study participants was 37.6 (SD = 7.7) years. Amongst all participants, less than a tenth (5.9%, 35/598) reported that they were not given enough information before making a decision to participate. A similar proportion (5.7%, 34/597) reported that they had not signed a consent form prior to making a decision to participate in the study. A third (33.7%, 201/596) of the participants were not aware that they could, at any time, voluntarily withdraw participation from these studies. Participants in clinical trials were 50% less likely than those in observational studies [clinical trial vs. observational; (odds ratio, OR = 0.5; 95% CI: 0.35-0.78)] to perceive that refusal to participate in the parent research project would affect their regular medical care. Conclusions: Most of the participants signed informed consent forms and a vast majority felt that they received enough information before deciding to participate. On the contrary, several were not aware that they could voluntarily withdraw their participation. Participants in observational studies were more likely than those in clinical trials to perceive that refusal to participate in the parent study would affect their regular medical care.

  4. Assessing the quality of informed consent in a resource-limited setting: a cross-sectional study.

    Science.gov (United States)

    Kiguba, Ronald; Kutyabami, Paul; Kiwuwa, Stephen; Katabira, Elly; Sewankambo, Nelson K

    2012-08-21

    The process of obtaining informed consent continues to be a contentious issue in clinical and public health research carried out in resource-limited settings. We sought to evaluate this process among human research participants in randomly selected active research studies approved by the School of Medicine Research and Ethics Committee at the College of Health Sciences, Makerere University. Data were collected using semi-structured interviewer-administered questionnaires on clinic days after initial or repeat informed consent procedures for the respective clinical studies had been administered to each study participant. Of the 600 participants interviewed, two thirds (64.2%, 385/600) were female. Overall mean age of study participants was 37.6 (SD = 7.7) years. Amongst all participants, less than a tenth (5.9%, 35/598) reported that they were not given enough information before making a decision to participate. A similar proportion (5.7%, 34/597) reported that they had not signed a consent form prior to making a decision to participate in the study. A third (33.7%, 201/596) of the participants were not aware that they could, at any time, voluntarily withdraw participation from these studies. Participants in clinical trials were 50% less likely than those in observational studies [clinical trial vs. observational; (odds ratio, OR = 0.5; 95% CI: 0.35-0.78)] to perceive that refusal to participate in the parent research project would affect their regular medical care. Most of the participants signed informed consent forms and a vast majority felt that they received enough information before deciding to participate. On the contrary, several were not aware that they could voluntarily withdraw their participation. Participants in observational studies were more likely than those in clinical trials to perceive that refusal to participate in the parent study would affect their regular medical care.

  5. Analyzing large data sets acquired through telemetry from rats exposed to organophosphorous compounds: an EEG study.

    Science.gov (United States)

    de Araujo Furtado, Marcio; Zheng, Andy; Sedigh-Sarvestani, Madineh; Lumley, Lucille; Lichtenstein, Spencer; Yourick, Debra

    2009-10-30

    The organophosphorous compound soman is an acetylcholinesterase inhibitor that causes damage to the brain. Exposure to soman causes neuropathology as a result of prolonged and recurrent seizures. In the present study, long-term recordings of cortical EEG were used to develop an unbiased means to quantify measures of seizure activity in a large data set while excluding other signal types. Rats were implanted with telemetry transmitters and exposed to soman followed by treatment with therapeutics similar to those administered in the field after nerve agent exposure. EEG, activity and temperature were recorded continuously for a minimum of 2 days pre-exposure and 15 days post-exposure. A set of automatic MATLAB algorithms have been developed to remove artifacts and measure the characteristics of long-term EEG recordings. The algorithms use short-time Fourier transforms to compute the power spectrum of the signal for 2-s intervals. The spectrum is then divided into the delta, theta, alpha, and beta frequency bands. A linear fit to the power spectrum is used to distinguish normal EEG activity from artifacts and high amplitude spike wave activity. Changes in time spent in seizure over a prolonged period are a powerful indicator of the effects of novel therapeutics against seizures. A graphical user interface has been created that simultaneously plots the raw EEG in the time domain, the power spectrum, and the wavelet transform. Motor activity and temperature are associated with EEG changes. The accuracy of this algorithm is also verified against visual inspection of video recordings up to 3 days after exposure.
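
    A brief sketch of the spectral steps described above is given below, written in Python rather than the authors' MATLAB; the sampling rate, band edges and fitting range are assumed values for illustration.

    import numpy as np
    from scipy.signal import welch

    BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

    def epoch_features(epoch, fs=250.0):
        # Band powers and a straight-line fit to the power spectrum of one 2-s epoch.
        freqs, psd = welch(epoch, fs=fs, nperseg=len(epoch))
        powers = {name: psd[(freqs >= lo) & (freqs < hi)].sum()
                  for name, (lo, hi) in BANDS.items()}
        keep = (freqs >= 1) & (freqs <= 30)
        slope, intercept = np.polyfit(freqs[keep], psd[keep], 1)   # linear fit to the spectrum
        return powers, slope, intercept

    # Large departures of the fitted line (or of the band powers) from values typical of
    # normal EEG can then be used to flag artifacts or high-amplitude spike-wave activity
    # in consecutive 2-s windows of a long-term recording.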

  6. Productive international collaboration in the large coil task

    International Nuclear Information System (INIS)

    Haubenreich, P.N.; Komarek, P.; Shimamoto, S.; Vecsey, G.

    1987-01-01

    The Large Coil Task (LCT), initiated in 1977, has been very productive of useful technical information about superconducting toroidal field (TF) coil design and manufacture. Moreover, it has demonstrated close international collaboration in fusion technology development, including integration of large components built in four different countries. Each of six 40-t test coils was designed and produced by a major industrial team, with government laboratory guidance, to a common set of specifications. The six were assembled into a toroidal array for testing in the International Fusion Superconducting Magnet Test Facility (IFSMTF) at Oak Ridge. Testing was done by a team of representatives of EURATOM, Japan, Switzerland, and the United States, with each participant having full access to all data. Coils were thoroughly instrumented, enabling penetrating analysis of behavior

  7. Hands-on Activities for Exploring the Solar System in K-14 Formal and Informal Education Settings

    Science.gov (United States)

    Allen, J. S.; Tobola, K. W.

    2004-12-01

    Introduction: Activities developed by NASA scientists and teachers focus on integrating Planetary Science activities with existing Earth science, math, and language arts curriculum. Educators may choose activities that fit a particular concept or theme within their curriculum from activities that highlight missions and research pertaining to exploring the solar system. Most of the activities use simple, inexpensive techniques that help students understand the how and why of what scientists are learning about comets, asteroids, meteorites, moons and planets. The web sites for the activities contain current information so students experience recent mission information such as data from Mars rovers or the status of Stardust sample return. The Johnson Space Center Astromaterials Research and Exploration Science education team has compiled a variety of NASA solar system activities to produce an annotated thematic syllabus useful to classroom educators and informal educators as they teach space science. An important aspect of the syllabus is that it highlights appropriate science content information and key science and math concepts so educators can easily identify activities that will enhance curriculum development. The outline contains URLs for the activities and NASA educator guides as well as links to NASA mission science and technology. In the informal setting, educators can use solar system exploration activities to reinforce learning in association with thematic displays, planetarium programs, youth group gatherings, or community events. In both the informal and the primary education levels the activities are appropriately designed to excite interest, arouse curiosity and easily take the participants from pre-awareness to the awareness stage. Middle school educators will find activities that enhance thematic science and encourage students to think about the scientific process of investigation. Some of the activities offered may easily be adapted for the upper

  8. Large deviations

    CERN Document Server

    Varadhan, S R S

    2016-01-01

    The theory of large deviations deals with rates at which probabilities of certain events decay as a natural parameter in the problem varies. This book, which is based on a graduate course on large deviations at the Courant Institute, focuses on three concrete sets of examples: (i) diffusions with small noise and the exit problem, (ii) large time behavior of Markov processes and their connection to the Feynman-Kac formula and the related large deviation behavior of the number of distinct sites visited by a random walk, and (iii) interacting particle systems, their scaling limits, and large deviations from their expected limits. For the most part the examples are worked out in detail, and in the process the subject of large deviations is developed. The book will give the reader a flavor of how large deviation theory can help in problems that are not posed directly in terms of large deviations. The reader is assumed to have some familiarity with probability, Markov processes, and interacting particle systems.
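
    The decay rates referred to above are usually formalized by a large deviation principle; in standard notation (added here for orientation, not taken from the book), a family of random variables X_n satisfies a large deviation principle with rate function I if, informally,

        P(X_n \in A) \approx \exp\!\Big(-n \inf_{x \in A} I(x)\Big) \qquad \text{as } n \to \infty,

    with the precise statement given by matching lower and upper bounds over the interior and closure of A.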

  9. Using evaluation theory in priority setting and resource allocation.

    Science.gov (United States)

    Smith, Neale; Mitton, Craig; Cornelissen, Evelyn; Gibson, Jennifer; Peacock, Stuart

    2012-01-01

    Public sector interest in methods for priority setting and program or policy evaluation has grown considerably over the last several decades, given increased expectations for accountable and efficient use of resources and emphasis on evidence-based decision making as a component of good management practice. While there has been some occasional effort to conduct evaluation of priority setting projects, the literatures around priority setting and evaluation have largely evolved separately. In this paper, the aim is to bring them together. The contention is that evaluation theory is a means by which evaluators reflect upon what it is they are doing when they do evaluation work. Theories help to organize thinking, sort out relevant from irrelevant information, provide transparent grounds for particular implementation choices, and can help resolve problematic issues which may arise in the conduct of an evaluation project. A detailed review of three major branches of evaluation theory--methods, utilization, and valuing--identifies how such theories can guide the development of efforts to evaluate priority setting and resource allocation initiatives. Evaluation theories differ in terms of their guiding question, anticipated setting or context, evaluation foci, perspective from which benefits are calculated, and typical methods endorsed. Choosing a particular theoretical approach will structure the way in which any priority setting process is evaluated. The paper suggests that explicitly considering evaluation theory makes key aspects of the evaluation process more visible to all stakeholders, and can assist in the design of effective evaluation of priority setting processes; this should iteratively serve to improve the understanding of priority setting practices themselves.

  10. The Outcome and Assessment Information Set (OASIS): A Review of Validity and Reliability

    Science.gov (United States)

    O’CONNOR, MELISSA; DAVITT, JOAN K.

    2015-01-01

    The Outcome and Assessment Information Set (OASIS) is the patient-specific, standardized assessment used in Medicare home health care to plan care, determine reimbursement, and measure quality. Since its inception in 1999, there has been debate over the reliability and validity of the OASIS as a research tool and outcome measure. A systematic literature review of English-language articles identified 12 studies published in the last 10 years examining the validity and reliability of the OASIS. Empirical findings indicate the validity and reliability of the OASIS range from low to moderate but vary depending on the item studied. Limitations in the existing research include: nonrepresentative samples; inconsistencies in methods used, items tested, measurement, and statistical procedures; and the changes to the OASIS itself over time. The inconsistencies suggest that these results are tentative at best; additional research is needed to confirm the value of the OASIS for measuring patient outcomes, research, and quality improvement. PMID:23216513

  11. Using answer set programming to integrate RNA expression with signalling pathway information to infer how mutations affect ageing.

    Science.gov (United States)

    Papatheodorou, Irene; Ziehm, Matthias; Wieser, Daniela; Alic, Nazif; Partridge, Linda; Thornton, Janet M

    2012-01-01

    A challenge of systems biology is to integrate incomplete knowledge on pathways with existing experimental data sets and relate these to measured phenotypes. Research on ageing often generates such incomplete data, creating difficulties in integrating RNA expression with information about biological processes and the phenotypes of ageing, including longevity. Here, we develop a logic-based method that employs Answer Set Programming, and use it to infer signalling effects of genetic perturbations, based on a model of the insulin signalling pathway. We apply our method to RNA expression data from Drosophila mutants in the insulin pathway that alter lifespan in a foxo-dependent fashion. We use this information to deduce how the pathway influences lifespan in the mutant animals. We also develop a method for inferring the largest common sub-paths within each of our signalling predictions. Our comparisons reveal consistent homeostatic mechanisms across both long- and short-lived mutants. The transcriptional changes observed in each mutation usually provide negative feedback to signalling predicted for that mutation. We also identify an S6K-mediated feedback in two long-lived mutants that suggests a crosstalk between these pathways in mutants of the insulin pathway, in vivo. By formulating the problem as a logic-based theory in a qualitative fashion, we are able to use the efficient search facilities of Answer Set Programming, allowing us to explore larger pathways, combine molecular changes with pathways and phenotype, and infer effects on signalling in in vivo, whole-organism mutants, where direct signalling stimulation assays are difficult to perform. Our methods are available in the web-service NetEffects: http://www.ebi.ac.uk/thornton-srv/software/NetEffects.

  12. Wind and solar resource data sets: Wind and solar resource data sets

    Energy Technology Data Exchange (ETDEWEB)

    Clifton, Andrew [National Renewable Energy Laboratory, Golden CO USA; Hodge, Bri-Mathias [National Renewable Energy Laboratory, Golden CO USA; Power Systems Engineering Center, National Renewable Energy Laboratory, Golden CO USA; Draxl, Caroline [National Renewable Energy Laboratory, Golden CO USA; National Wind Technology Center, National Renewable Energy Laboratory, Golden CO USA; Badger, Jake [Department of Wind Energy, Danish Technical University, Copenhagen Denmark; Habte, Aron [National Renewable Energy Laboratory, Golden CO USA; Power Systems Engineering Center, National Renewable Energy Laboratory, Golden CO USA

    2017-12-05

    The range of resource data sets spans from static cartography showing the mean annual wind speed or solar irradiance across a region to high temporal and high spatial resolution products that provide detailed information at a potential wind or solar energy facility. These data sets are used to support continental-scale, national, or regional renewable energy development; facilitate prospecting by developers; and enable grid integration studies. This review first provides an introduction to the wind and solar resource data sets, then provides an overview of the common methods used for their creation and validation. A brief history of wind and solar resource data sets is then presented, followed by areas for future research.

  13. Efficient gate set tomography on a multi-qubit superconducting processor

    Science.gov (United States)

    Nielsen, Erik; Rudinger, Kenneth; Blume-Kohout, Robin; Bestwick, Andrew; Bloom, Benjamin; Block, Maxwell; Caldwell, Shane; Curtis, Michael; Hudson, Alex; Orgiazzi, Jean-Luc; Papageorge, Alexander; Polloreno, Anthony; Reagor, Matt; Rubin, Nicholas; Scheer, Michael; Selvanayagam, Michael; Sete, Eyob; Sinclair, Rodney; Smith, Robert; Vahidpour, Mehrnoosh; Villiers, Marius; Zeng, William; Rigetti, Chad

    Quantum information processors with five or more qubits are becoming common. Complete, predictive characterization of such devices, e.g. via any form of tomography (including gate set tomography), appears impossible because the parameter space is intractably large. Randomized benchmarking scales well, but cannot predict device behavior or diagnose failure modes. We introduce a new type of gate set tomography that uses an efficient ansatz to model physically plausible errors, but scales polynomially with the number of qubits. We will describe the theory behind this multi-qubit tomography and present experimental results from using it to characterize a multi-qubit processor made by Rigetti Quantum Computing. Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the US Department of Energy's NNSA under contract DE-AC04-94AL85000.

  14. Electricity procurement for large consumers based on Information Gap Decision Theory

    Energy Technology Data Exchange (ETDEWEB)

    Zare, Kazem; Moghaddam, Mohsen Parsa; Sheikh El Eslami, Mohammad Kazem [Tarbiat Modares University, P.O. Box 14115-111, Tehran (Iran)

    2010-01-15

    In the competitive electricity market, consumers seek strategies to meet their electricity needs at minimum cost and risk. This paper provides a technique based on Information Gap Decision Theory (IGDT) to assess different procurement strategies for large consumers. Supply sources include bilateral contracts, a limited self-generating facility, and the pool. It is considered that the pool price is uncertain and its volatility around the estimated value is modeled using an IGDT model. The proposed method does not minimize the procurement cost but assesses the risk aversion or risk-taking nature of some procurement strategies with regard to the minimum cost. Using this method, the robustness of experiencing costs higher than the expected one is optimized and the related strategy is determined. The proposed method deals with optimizing the opportunities to take advantage of low procurement costs or low pool prices. A case study is used to illustrate the proposed technique. (author)

  15. The algebras of large N matrix mechanics

    Energy Technology Data Exchange (ETDEWEB)

    Halpern, M.B.; Schwartz, C.

    1999-09-16

    Extending early work, we formulate the large N matrix mechanics of general bosonic, fermionic and supersymmetric matrix models, including Matrix theory: The Hamiltonian framework of large N matrix mechanics provides a natural setting in which to study the algebras of the large N limit, including (reduced) Lie algebras, (reduced) supersymmetry algebras and free algebras. We find in particular a broad array of new free algebras which we call symmetric Cuntz algebras, interacting symmetric Cuntz algebras, symmetric Bose/Fermi/Cuntz algebras and symmetric Cuntz superalgebras, and we discuss the role of these algebras in solving the large N theory. Most important, the interacting Cuntz algebras are associated to a set of new (hidden!) local quantities which are generically conserved only at large N. A number of other new large N phenomena are also observed, including the intrinsic nonlocality of the (reduced) trace class operators of the theory and a closely related large N field identification phenomenon which is associated to another set (this time nonlocal) of new conserved quantities at large N.

  16. Developing a Data-Set for Stereopsis

    Directory of Open Access Journals (Sweden)

    D.W Hunter

    2014-08-01

    Full Text Available Current research on binocular stereopsis in humans and non-human primates has been limited by a lack of available data-sets. Current data-sets fall into two categories: stereo-image sets with vergence but no ranging information (Hibbard, 2008, Vision Research, 48(12), 1427-1439) or combinations of depth information with binocular images and video taken from cameras in fixed fronto-parallel configurations exhibiting neither vergence nor focus effects (Hirschmuller & Scharstein, 2007, IEEE Conf. Computer Vision and Pattern Recognition). The techniques for generating depth information are also imperfect. Depth information is normally inaccurate or simply missing near edges and on partially occluded surfaces. For many areas of vision research these are the most interesting parts of the image (Goutcher, Hunter, Hibbard, 2013, i-Perception, 4(7), 484; Scarfe & Hibbard, 2013, Vision Research). Using state-of-the-art open-source ray-tracing software (PBRT) as a back-end, our intention is to release a set of tools that will allow researchers in this field to generate artificial binocular stereoscopic data-sets. Although not as realistic as photographs, computer-generated images have significant advantages in terms of control over the final output, and ground-truth information about scene depth is easily calculated at all points in the scene, even partially occluded areas. While individual researchers have been developing similar stimuli by hand for many decades, we hope that our software will greatly reduce the time and difficulty of creating naturalistic binocular stimuli. Our intention in making this presentation is to elicit feedback from the vision community about what sort of features would be desirable in such software.

  17. A geometrical correction for the inter- and intra-molecular basis set superposition error in Hartree-Fock and density functional theory calculations for large systems

    Science.gov (United States)

    Kruse, Holger; Grimme, Stefan

    2012-04-01

    chemistry yields MAD=0.68 kcal/mol, which represents a huge improvement over plain B3LYP/6-31G* (MAD=2.3 kcal/mol). Application of gCP-corrected B97-D3 and HF-D3 on a set of large protein-ligand complexes proves the robustness of the method. Analytical gCP gradients make optimizations of large systems feasible with small basis sets, as demonstrated for the inter-ring distances of 9-helicene and most of the complexes in Hobza's S22 test set. The method is implemented in a freely available FORTRAN program obtainable from the author's website.

  18. A geometrical correction for the inter- and intra-molecular basis set superposition error in Hartree-Fock and density functional theory calculations for large systems.

    Science.gov (United States)

    Kruse, Holger; Grimme, Stefan

    2012-04-21

    chemistry yields MAD=0.68 kcal/mol, which represents a huge improvement over plain B3LYP/6-31G* (MAD=2.3 kcal/mol). Application of gCP-corrected B97-D3 and HF-D3 on a set of large protein-ligand complexes proves the robustness of the method. Analytical gCP gradients make optimizations of large systems feasible with small basis sets, as demonstrated for the inter-ring distances of 9-helicene and most of the complexes in Hobza's S22 test set. The method is implemented in a freely available FORTRAN program obtainable from the author's website.

  19. A Model to Identify the Most Effective Business Rule in Information Systems using Rough Set Theory: Study on Loan Business Process

    Directory of Open Access Journals (Sweden)

    Mohammad Aghdasi

    2011-09-01

    In this paper, a practical model is used to identify the most effective rules in information systems. In this model, critical business attributes that fit the strategic expectations are first taken into account; these are the attributes whose changes matter more than others in achieving the strategic expectations. To identify these attributes we utilize rough set theory. Business rules that use critical information attributes in their structure are then identified as the most effective business rules. The proposed model helps information system developers identify the scope of effective business rules, reducing the time and cost of information system maintenance. It also helps business analysts focus on managing critical business attributes in order to achieve a specific goal.
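
    A minimal sketch of the rough-set machinery such a model relies on: the dependency degree gamma(C, D) measures how completely a set of condition attributes C determines the decision attribute D, and the drop in gamma when an attribute is removed can serve as that attribute's significance. The loan records and attribute names below are invented for illustration only.

```python
# Rough-set dependency degree and attribute significance on a toy loan data set.
from itertools import groupby

def partition(records, attrs):
    """Indiscernibility classes (as sets of row indices) induced by `attrs`."""
    key = lambda i: tuple(records[i][a] for a in attrs)
    rows = sorted(range(len(records)), key=key)
    return [set(g) for _, g in groupby(rows, key=key)]

def dependency(records, condition_attrs, decision_attr):
    """gamma(C, D): fraction of records whose decision is fully determined by C."""
    decision_classes = partition(records, [decision_attr])
    positive = 0
    for block in partition(records, condition_attrs):
        if any(block <= d for d in decision_classes):
            positive += len(block)
    return positive / len(records)

records = [
    {"income": "high", "collateral": "yes", "history": "good", "approve": "yes"},
    {"income": "high", "collateral": "no",  "history": "good", "approve": "yes"},
    {"income": "low",  "collateral": "yes", "history": "bad",  "approve": "no"},
    {"income": "low",  "collateral": "no",  "history": "bad",  "approve": "no"},
    {"income": "low",  "collateral": "no",  "history": "good", "approve": "yes"},
]
all_attrs = ["income", "collateral", "history"]
gamma_all = dependency(records, all_attrs, "approve")
for a in all_attrs:   # significance of an attribute = drop in gamma when it is removed
    rest = [x for x in all_attrs if x != a]
    print(a, round(gamma_all - dependency(records, rest, "approve"), 2))
```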

  20. Large datasets: Segmentation, feature extraction, and compression

    Energy Technology Data Exchange (ETDEWEB)

    Downing, D.J.; Fedorov, V.; Lawkins, W.F.; Morris, M.D.; Ostrouchov, G.

    1996-07-01

    Large data sets with more than several million multivariate observations (tens of megabytes or gigabytes of stored information) are difficult or impossible to analyze with traditional software. The sheer amount of output that must be scanned quickly dilutes the investigator's ability to confidently identify all the meaningful patterns and trends that may be present. The purpose of this project is to develop both a theoretical foundation and a collection of tools for automated feature extraction that can be easily customized to specific applications. Cluster analysis techniques are applied as a final step in the feature extraction process, which helps make data surveying simple and effective.
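
    One plausible reading of such a pipeline, sketched with scikit-learn on synthetic data (the particular projection-plus-clustering choices here are illustrative, not the project's actual tools):

```python
# Reduce a large multivariate set to a handful of features, then cluster for surveying.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(50_000, 50))           # placeholder for a large multivariate data set
X[:25_000, :5] += 3.0                       # plant one coarse structure to be recovered

features = PCA(n_components=5).fit_transform(X)        # automated feature extraction
labels = KMeans(n_clusters=2, n_init=10).fit_predict(features)  # final clustering step
print(np.bincount(labels))                  # cluster sizes summarize the survey
```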

  1. Large-scale sequential quadratic programming algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Eldersveld, S.K.

    1992-09-01

    The problem addressed is the general nonlinear programming problem: finding a local minimizer for a nonlinear function subject to a mixture of nonlinear equality and inequality constraints. The methods studied are in the class of sequential quadratic programming (SQP) algorithms, which have previously proved successful for problems of moderate size. Our goal is to devise an SQP algorithm that is applicable to large-scale optimization problems, using sparse data structures and storing less curvature information but maintaining the property of superlinear convergence. The main features are: 1. The use of a quasi-Newton approximation to the reduced Hessian of the Lagrangian function. Only an estimate of the reduced Hessian matrix is required by our algorithm. The impact of not having available the full Hessian approximation is studied and alternative estimates are constructed. 2. The use of a transformation matrix Q. This allows the QP gradient to be computed easily when only the reduced Hessian approximation is maintained. 3. The use of a reduced-gradient form of the basis for the null space of the working set. This choice of basis is more practical than an orthogonal null-space basis for large-scale problems. The continuity condition for this choice is proven. 4. The use of incomplete solutions of quadratic programming subproblems. Certain iterates generated by an active-set method for the QP subproblem are used in place of the QP minimizer to define the search direction for the nonlinear problem. An implementation of the new algorithm has been obtained by modifying the code MINOS. Results and comparisons with MINOS and NPSOL are given for the new algorithm on a set of 92 test problems.
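
    The following is not the MINOS-based implementation described above, just a minimal example of the problem class and of an SQP-type solver (SciPy's SLSQP) applied to a toy nonlinear program with one nonlinear equality and one inequality constraint:

```python
# Small constrained nonlinear program solved with an SQP-style method (SLSQP).
import numpy as np
from scipy.optimize import minimize

objective = lambda x: (x[0] - 1) ** 2 + (x[1] - 2.5) ** 2
constraints = [
    {"type": "eq",   "fun": lambda x: x[0] ** 2 + x[1] ** 2 - 4},  # nonlinear equality (circle)
    {"type": "ineq", "fun": lambda x: x[0] - 0.5},                  # inequality x0 >= 0.5
]
result = minimize(objective, x0=np.array([2.0, 0.0]),
                  method="SLSQP", constraints=constraints)
print(result.x, result.fun)
```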

  2. Decentralized health care priority-setting in Tanzania

    DEFF Research Database (Denmark)

    Maluka, Stephen; Kamuzora, Peter; Sebastián, Miguel San

    2010-01-01

    Priority-setting has become one of the biggest challenges faced by health decision-makers worldwide. Fairness is a key goal of priority-setting and Accountability for Reasonableness has emerged as a guiding framework for fair priority-setting. This paper describes the processes of setting health...... care priorities in Mbarali district, Tanzania, and evaluates the descriptions against Accountability for Reasonableness. Key informant interviews were conducted with district health managers, local government officials and other stakeholders using a semi-structured interview guide. Relevant documents...... no formal mechanisms in place to ensure that this information reached the public. There were neither formal mechanisms for challenging decisions nor an adequate enforcement mechanism to ensure that decisions were made in a fair and equitable manner. Therefore, priority-setting in Mbarali district did...

  3. An advanced joint inversion system for CO2 storage modeling with large data sets for characterization and real-time monitoring - enhancing storage performance and reducing failure risks under uncertainties

    Energy Technology Data Exchange (ETDEWEB)

    Kitanidis, Peter [Stanford Univ., CA (United States)

    2016-04-30

    As large-scale, commercial storage projects become operational, the problem of utilizing information from diverse sources becomes more critically important. In this project, we developed, tested, and applied an advanced joint data inversion system for CO2 storage modeling with large data sets for use in site characterization and real-time monitoring. Emphasis was on the development of advanced and efficient computational algorithms for joint inversion of hydro-geophysical data, coupled with state-of-the-art forward process simulations. The developed system consists of (1) inversion tools using characterization data, such as 3D seismic survey (amplitude images), borehole log and core data, as well as hydraulic, tracer and thermal tests before CO2 injection, (2) joint inversion tools for updating the geologic model with the distribution of rock properties, thus reducing uncertainty, using hydro-geophysical monitoring data, and (3) highly efficient algorithms for directly solving the dense or sparse linear algebra systems derived from the joint inversion. The system combines methods from stochastic analysis, fast linear algebra, and high performance computing. The developed joint inversion tools have been tested through synthetic CO2 storage examples.

  4. Development a minimum data set of the information management system for burns.

    Science.gov (United States)

    Ahmadi, Maryam; Alipour, Jahanpour; Mohammadi, Ali; Khorami, Farid

    2015-08-01

    Burns are among the most common and destructive injuries across the world, especially in developing countries. Nevertheless, a standard tool for collecting burn injury data has not yet been developed. The purpose of this study was to develop a minimum data set (MDS) of the information management system for burns in Iran. This descriptive, cross-sectional study was performed in 2014. Data were collected from hospitals affiliated with Hormozgan and Iran Universities of Medical Sciences and from medical documents centers, emergency centers and legal medicine centers located in Bandar Abbas city, supplemented by internet and library sources. The investigated documents were burn injury records from 2013, documents retrieved from the internet, and printed materials. Records were selected randomly based on the T20-T29 categories of ICD-10. Data were collected using a checklist. To reach consensus on the data elements, the decision Delphi technique was applied using a questionnaire. The content validity and reliability of the questionnaire were assessed by experts' opinions and the test-retest method, respectively. An MDS for burns was developed. This MDS was divided into two categories, administrative and clinical, with six and 17 sections and 161 and 311 data elements, respectively. This study showed that comprehensive and uniform data elements for burns do not exist in Iran; therefore, an MDS was developed for burns in Iran. Development of an MDS will result in standardization and effective management of the data by providing uniform and comprehensive data elements for burns. Thus, the information extracted from different analyses and studies will be comparable at various levels. In addition, establishment of policies and the prevention and control of burns will be possible, resulting in improved quality of care and containment of costs. Copyright © 2014 Elsevier Ltd and ISBI. All rights reserved.

  5. Two decades of satellite observations of AOD over mainland China using ATSR-2, AATSR and MODIS/Terra: data set evaluation and large-scale patterns

    Science.gov (United States)

    de Leeuw, Gerrit; Sogacheva, Larisa; Rodriguez, Edith; Kourtidis, Konstantinos; Georgoulias, Aristeidis K.; Alexandri, Georgia; Amiridis, Vassilis; Proestakis, Emmanouil; Marinou, Eleni; Xue, Yong; van der A, Ronald

    2018-02-01

    The retrieval of aerosol properties from satellite observations provides their spatial distribution over a wide area in cloud-free conditions. As such, they complement ground-based measurements by providing information over sparsely instrumented areas, albeit that significant differences may exist in both the type of information obtained and the temporal information from satellite and ground-based observations. In this paper, information from different types of satellite-based instruments is used to provide a 3-D climatology of aerosol properties over mainland China, i.e., vertical profiles of extinction coefficients from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP), a lidar flying aboard the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) satellite and the column-integrated extinction (aerosol optical depth - AOD) available from three radiometers: the European Space Agency (ESA)'s Along-Track Scanning Radiometer version 2 (ATSR-2), Advanced Along-Track Scanning Radiometer (AATSR) (together referred to as ATSR) and NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) aboard the Terra satellite, together spanning the period 1995-2015. AOD data are retrieved from ATSR using the ATSR dual view (ADV) v2.31 algorithm, while for MODIS Collection 6 (C6) the AOD data set is used that was obtained from merging the AODs obtained from the dark target (DT) and deep blue (DB) algorithms, further referred to as the DTDB merged AOD product. These data sets are validated and differences are compared using Aerosol Robotic Network (AERONET) version 2 L2.0 AOD data as reference. The results show that, over China, ATSR slightly underestimates the AOD and MODIS slightly overestimates the AOD. Consequently, ATSR AOD is overall lower than that from MODIS, and the difference increases with increasing AOD. The comparison also shows that neither of the ATSR and MODIS AOD data sets is better than the other one everywhere. However, ATSR ADV

  6. Inequalities for quantum skew information

    DEFF Research Database (Denmark)

    Audenaert, Koenraad; Cai, Liang; Hansen, Frank

    2008-01-01

    relation on the set of functions representing quantum Fisher information that renders the set into a lattice with an involution. This order structure generates new inequalities for the metric adjusted skew informations. In particular, the Wigner-Yanase skew information is the maximal skew information...... with respect to this order structure in the set of Wigner-Yanase-Dyson skew informations....

  7. Modelling and management of subjective information in a fuzzy setting

    Science.gov (United States)

    Bouchon-Meunier, Bernadette; Lesot, Marie-Jeanne; Marsala, Christophe

    2013-01-01

    Subjective information is very natural for human beings. It is an issue at the crossroad of cognition, semiotics, linguistics, and psycho-physiology. Its management requires dedicated methods, among which we point out the usefulness of fuzzy and possibilistic approaches and related methods, such as evidence theory. We distinguish three aspects of subjectivity: the first deals with perception and sensory information, including the elicitation of quality assessment and the establishment of a link between physical and perceived properties; the second is related to emotions, their fuzzy nature, and their identification; and the last aspect stems from natural language and takes into account information quality and reliability of information.

  8. Determining an Estimate of an Equivalence Relation for Moderate and Large Sized Sets

    Directory of Open Access Journals (Sweden)

    Leszek Klukowski

    2017-01-01

    Full Text Available This paper presents two approaches to determining estimates of an equivalence relation on the basis of pairwise comparisons with random errors. Obtaining such an estimate requires the solution of a discrete programming problem which minimizes the sum of the differences between the form of the relation and the comparisons. The problem is NP-hard and can be solved with the use of exact algorithms for sets of moderate size, i.e. about 50 elements. In the case of larger sets, i.e. at least 200 comparisons for each element, it is necessary to apply heuristic algorithms. The paper presents results (a statistical preprocessing) which enable us to determine the optimal or a near-optimal solution with acceptable computational cost. They include the development of a statistical procedure producing comparisons with low probabilities of errors and a heuristic algorithm based on such comparisons. The proposed approach guarantees the applicability of such estimators for any size of set. (original abstract)
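
    As a generic illustration of the heuristic route (not the authors' statistically preprocessed algorithm), the sketch below estimates an equivalence relation from noisy same/different comparisons by greedily merging the pair of clusters with the strongest average "same" vote.

```python
# Greedy recovery of an equivalence relation from noisy pairwise comparisons.
import numpy as np

rng = np.random.default_rng(0)
n, true = 12, np.repeat([0, 1, 2], 4)                 # hidden partition of 12 elements
same = (true[:, None] == true[None, :]).astype(float)
votes = np.where(rng.random((n, n)) < 0.2, 1 - same, same)   # comparisons with 20% error
votes = np.triu(votes, 1) + np.triu(votes, 1).T               # symmetrise, zero diagonal

clusters = [{i} for i in range(n)]
def link(a, b):                                        # average "same" vote between clusters
    return np.mean([votes[i, j] for i in a for j in b])

while len(clusters) > 1:
    scores = [(link(a, b), ia, ib) for ia, a in enumerate(clusters)
              for ib, b in enumerate(clusters) if ia < ib]
    best, ia, ib = max(scores)
    if best <= 0.5:                                    # stop when no pair mostly agrees
        break
    clusters[ia] |= clusters[ib]
    del clusters[ib]
print(clusters)
```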

  9. Setting healthcare priorities in hospitals: a review of empirical studies.

    Science.gov (United States)

    Barasa, Edwine W; Molyneux, Sassy; English, Mike; Cleary, Susan

    2015-04-01

    Priority setting research has focused on the macro (national) and micro (bedside) level, leaving the meso (institutional, hospital) level relatively neglected. This is surprising given the key role that hospitals play in the delivery of healthcare services and the large proportion of health systems resources that they absorb. To explore the factors that impact upon priority setting at the hospital level, we conducted a thematic review of empirical studies. A systematic search of PubMed, EBSCOHOST, Econlit databases and Google scholar was supplemented by a search of key websites and a manual search of relevant papers' reference lists. A total of 24 papers were identified from developed and developing countries. We applied a policy analysis framework to examine and synthesize the findings of the selected papers. Findings suggest that priority setting practice in hospitals was influenced by (1) contextual factors such as decision space, resource availability, financing arrangements, availability and use of information, organizational culture and leadership, (2) priority setting processes that depend on the type of priority setting activity, (3) content factors such as priority setting criteria and (4) actors, their interests and power relations. We observe that there is need for studies to examine these issues and the interplay between them in greater depth and propose a conceptual framework that might be useful in examining priority setting practices in hospitals. Published by Oxford University Press in association with The London School of Hygiene and Tropical Medicine © The Author 2014; all rights reserved.

  10. Deterministic sensitivity and uncertainty analysis for large-scale computer models

    International Nuclear Information System (INIS)

    Worley, B.A.; Pin, F.G.; Oblow, E.M.; Maerker, R.E.; Horwedel, J.E.; Wright, R.Q.

    1988-01-01

    The fields of sensitivity and uncertainty analysis have traditionally been dominated by statistical techniques when large-scale modeling codes are being analyzed. These methods are able to estimate sensitivities, generate response surfaces, and estimate response probability distributions given the input parameter probability distributions. Because the statistical methods are computationally costly, they are usually applied only to problems with relatively small parameter sets. Deterministic methods, on the other hand, are very efficient and can handle large data sets, but generally require simpler models because of the considerable programming effort required for their implementation. The first part of this paper reports on the development and availability of two systems, GRESS and ADGEN, that make use of computer calculus compilers to automate the implementation of deterministic sensitivity analysis capability into existing computer models. This automation removes the traditional limitation of deterministic sensitivity methods. The second part of the paper describes a deterministic uncertainty analysis method (DUA) that uses derivative information as a basis to propagate parameter probability distributions to obtain result probability distributions. The approach is applicable to low-level radioactive waste disposal system performance assessment
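
    The propagation step of the DUA idea can be sketched as first-order variance propagation through a model using derivative information; in the sketch below the sensitivities are obtained by finite differences on a toy response function rather than by the GRESS/ADGEN computer-calculus compilers.

```python
# First-order propagation of parameter variances through a model response.
import numpy as np

def model(p):                          # placeholder response of a "large-scale" code
    k, q = p
    return q * np.exp(-k * 2.0)

def propagate(model, p0, p_var, h=1e-6):
    """Var(R) ~ sum_i (dR/dp_i)^2 * Var(p_i), assuming independent parameters."""
    grad = np.array([(model(p0 + h * e) - model(p0 - h * e)) / (2 * h)
                     for e in np.eye(len(p0))])
    return grad, grad ** 2 @ p_var

p0 = np.array([0.3, 10.0])                      # nominal parameter values
p_var = np.array([0.05 ** 2, 1.0 ** 2])         # parameter variances
grad, var_R = propagate(model, p0, p_var)
print("sensitivities:", grad, "result std:", np.sqrt(var_R))
```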

  11. Evidence-informed capacity building for setting health priorities in low- and middle-income countries: A framework and recommendations for further research.

    Science.gov (United States)

    Li, Ryan; Ruiz, Francis; Culyer, Anthony J; Chalkidou, Kalipso; Hofman, Karen J

    2017-01-01

    Priority-setting in health is risky and challenging, particularly in resource-constrained settings. It is not simply a narrow technical exercise, and involves the mobilisation of a wide range of capacities among stakeholders - not only the technical capacity to "do" research in economic evaluations. Using the Individuals, Nodes, Networks and Environment (INNE) framework, we identify those stakeholders, whose capacity needs will vary along the evidence-to-policy continuum. Policymakers and healthcare managers require the capacity to commission and use relevant evidence (including evidence of clinical and cost-effectiveness, and of social values); academics need to understand and respond to decision-makers' needs to produce relevant research. The health system at all levels will need institutional capacity building to incentivise routine generation and use of evidence. Knowledge brokers, including priority-setting agencies (such as England's National Institute for Health and Care Excellence, and Health Interventions and Technology Assessment Program, Thailand) and the media can play an important role in facilitating engagement and knowledge transfer between the various actors. Especially at the outset but at every step, it is critical that patients and the public understand that trade-offs are inherent in priority-setting, and careful efforts should be made to engage them, and to hear their views throughout the process. There is thus no single approach to capacity building; rather a spectrum of activities that recognises the roles and skills of all stakeholders. A range of methods, including formal and informal training, networking and engagement, and support through collaboration on projects, should be flexibly employed (and tailored to specific needs of each country) to support institutionalisation of evidence-informed priority-setting. Finally, capacity building should be a two-way process; those who build capacity should also attend to their own capacity

  12. Agenda-setting for Canadian caregivers: using media analysis of the maternity leave benefit to inform the compassionate care benefit.

    Science.gov (United States)

    Dykeman, Sarah; Williams, Allison M

    2014-04-24

    The Compassionate Care Benefit was implemented in Canada in 2004 to support employed informal caregivers, the majority of which we know are women given the gendered nature of caregiving. In order to examine how this policy might evolve over time, we examine the evolution of a similar employment insurance program, Canada's Maternity Leave Benefit. National media articles were reviewed (n = 2,698) and, based on explicit criteria, were analyzed using content analysis. Through the application of Kingdon's policy agenda-setting framework, the results define key recommendations for the Compassionate Care Benefit, as informed by the developmental trajectory of the Maternity Leave Benefit. Recommendations for revising the Compassionate Care Benefit are made.

  13. Getting the Most out of Macroeconomic Information for Predicting Stock Returns and Volatility

    NARCIS (Netherlands)

    C. Cakmakli (Cem); D.J.C. van Dijk (Dick)

    2010-01-01

    textabstractThis paper documents that factors extracted from a large set of macroeconomic variables bear useful information for predicting monthly US excess stock returns and volatility over the period 1980-2005. Factor-augmented predictive regression models improve upon both benchmark models that

  14. Getting the most out of macroeconomic information for predicting stock returns and volatility

    NARCIS (Netherlands)

    Cakmakli, C.; van Dijk, D.

    2011-01-01

    This paper documents that factors extracted from a large set of macroeconomic variables bear useful information for predicting monthly US excess stock returns and volatility over the period 1980-2005. Factor-augmented predictive regression models improve upon both benchmark models that only include
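
    A schematic version of a factor-augmented predictive regression, on synthetic data rather than the authors' 1980-2005 macro panel: principal-component factors are extracted from a large panel of predictors, and their lagged values are used to forecast next-month excess returns.

```python
# Factor-augmented predictive regression on synthetic data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
T, N, K = 300, 120, 3                          # months, macro series, factors kept
common = rng.normal(size=(T, K))               # latent common factors
X = common @ rng.normal(size=(K, N)) + rng.normal(size=(T, N))   # macro panel
returns = 0.5 * common[:, 0] + rng.normal(scale=2.0, size=T)     # excess returns

factors = PCA(n_components=K).fit_transform((X - X.mean(0)) / X.std(0))
model = LinearRegression().fit(factors[:-1], returns[1:])   # predict month t+1 from factors at t
print("in-sample R^2:", model.score(factors[:-1], returns[1:]))
```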

  15. Diverse Data Sets Can Yield Reliable Information through Mechanistic Modeling: Salicylic Acid Clearance.

    Science.gov (United States)

    Raymond, G M; Bassingthwaighte, J B

    This is a practical example of a powerful research strategy: putting together data from studies covering a diversity of conditions can yield a scientifically sound grasp of the phenomenon when the individual observations failed to provide definitive understanding. The rationale is that defining a realistic, quantitative, explanatory hypothesis for the whole set of studies brings about a "consilience" of the often competing hypotheses considered for individual data sets. An internally consistent conjecture linking multiple data sets simultaneously provides stronger evidence on the characteristics of a system than does analysis of individual data sets limited to narrow ranges of conditions. Our example examines three very different data sets on the clearance of salicylic acid from humans: a high-concentration set from aspirin overdoses; a set with medium concentrations from a research study on the influences of the route of administration and of sex on the clearance kinetics; and a set on low-dose aspirin for cardiovascular health. Three models were tested: (1) a first-order reaction, (2) a Michaelis-Menten (M-M) approach, and (3) an enzyme kinetic model with forward and backward reactions. The reaction rates found from model 1 were distinctly different for the three data sets, having no commonality. The M-M model 2 fitted each of the three data sets but gave a reliable estimate of the Michaelis constant only for the medium-level data (Km = 24±5.4 mg/L); analyzing the three data sets together with model 2 gave Km = 18±2.6 mg/L. (Estimating parameters using larger numbers of data points in an optimization increases the degrees of freedom, constraining the range of the estimates.) Using the enzyme kinetic model (3) increased the number of free parameters but nevertheless improved the goodness of fit to the combined data sets, giving tighter constraints and a lower estimated Km = 14.6±2.9 mg/L, demonstrating that fitting diverse data sets with a single model
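
    The pooling idea can be sketched by fitting a single Michaelis-Menten relation to rate-versus-concentration points combined from several data sets; the numbers below are invented placeholders, not the salicylate data analysed in the paper.

```python
# Fit one Michaelis-Menten elimination model, v = Vmax*C/(Km+C), to pooled data.
import numpy as np
from scipy.optimize import curve_fit

def michaelis_menten(c, vmax, km):
    return vmax * c / (km + c)

# (concentration mg/L, elimination rate mg/L/h) for three illustrative data sets.
low    = (np.array([2.0, 5.0, 8.0]),       np.array([1.0, 2.0, 2.7]))
medium = (np.array([20.0, 40.0, 60.0]),    np.array([4.6, 6.0, 6.7]))
high   = (np.array([150.0, 300.0, 450.0]), np.array([8.0, 8.6, 8.9]))

conc = np.concatenate([c for c, _ in (low, medium, high)])
rate = np.concatenate([v for _, v in (low, medium, high)])

params, cov = curve_fit(michaelis_menten, conc, rate, p0=[10.0, 20.0])
vmax, km = params
print(f"pooled fit: Vmax = {vmax:.1f} mg/L/h, Km = {km:.1f} mg/L")
```

    Fitting the three concentration ranges together constrains Km far more tightly than any one range could on its own, which is the point the abstract makes about analyzing the data sets jointly.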

  16. Legume information system (LegumeInfo.org): a key component of a set of federated data resources for the legume family.

    Science.gov (United States)

    Dash, Sudhansu; Campbell, Jacqueline D; Cannon, Ethalinda K S; Cleary, Alan M; Huang, Wei; Kalberer, Scott R; Karingula, Vijay; Rice, Alex G; Singh, Jugpreet; Umale, Pooja E; Weeks, Nathan T; Wilkey, Andrew P; Farmer, Andrew D; Cannon, Steven B

    2016-01-04

    Legume Information System (LIS), at http://legumeinfo.org, is a genomic data portal (GDP) for the legume family. LIS provides access to genetic and genomic information for major crop and model legumes. With more than two-dozen domesticated legume species, there are numerous specialists working on particular species, and also numerous GDPs for these species. LIS has been redesigned in the last three years both to better integrate data sets across the crop and model legumes, and to better accommodate specialized GDPs that serve particular legume species. To integrate data sets, LIS provides genome and map viewers, holds synteny mappings among all sequenced legume species and provides a set of gene families to allow traversal among orthologous and paralogous sequences across the legumes. To better accommodate other specialized GDPs, LIS uses open-source GMOD components where possible, and advocates use of common data templates, formats, schemas and interfaces so that data collected by one legume research community are accessible across all legume GDPs, through similar interfaces and using common APIs. This federated model for the legumes is managed as part of the 'Legume Federation' project (accessible via http://legumefederation.org), which can be thought of as an umbrella project encompassing LIS and other legume GDPs. Published by Oxford University Press on behalf of Nucleic Acids Research 2015. This work is written by (a) US Government employee(s) and is in the public domain in the US.

  17. Spatial fingerprints of community structure in human interaction network for an extensive set of large-scale regions.

    Science.gov (United States)

    Kallus, Zsófia; Barankai, Norbert; Szüle, János; Vattay, Gábor

    2015-01-01

    Human interaction networks inferred from country-wide telephone activity recordings were recently used to redraw political maps by projecting their topological partitions into geographical space. The results showed remarkable spatial cohesiveness of the network communities and a significant overlap between the redrawn and the administrative borders. Here we present a similar analysis based on one of the most popular online social networks represented by the ties between more than 5.8 million of its geo-located users. The worldwide coverage of their measured activity allowed us to analyze the large-scale regional subgraphs of entire continents and an extensive set of examples for single countries. We present results for North and South America, Europe and Asia. In our analysis we used the well-established method of modularity clustering after an aggregation of the individual links into a weighted graph connecting equal-area geographical pixels. Our results show fingerprints of both of the opposing forces of dividing local conflicts and of uniting cross-cultural trends of globalization.
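
    A compact sketch of the clustering step on a toy weighted graph of geographic pixels; the edge weights stand in for aggregated tie counts, and greedy modularity optimization is used here as a stand-in for the study's specific implementation.

```python
# Modularity communities on a small weighted graph of geographic pixels.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

G = nx.Graph()
# (pixel_a, pixel_b, number_of_ties): two dense blocks joined by one weak link.
edges = [("p1", "p2", 40), ("p2", "p3", 35), ("p1", "p3", 30),
         ("p4", "p5", 38), ("p5", "p6", 33), ("p4", "p6", 29),
         ("p3", "p4", 2)]
G.add_weighted_edges_from(edges)

communities = greedy_modularity_communities(G, weight="weight")
for i, c in enumerate(communities):
    print(f"region {i}: {sorted(c)}")
```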

  18. A quantitative perspective on ethics in large team science.

    Science.gov (United States)

    Petersen, Alexander M; Pavlidis, Ioannis; Semendeferi, Ioanna

    2014-12-01

    The gradual crowding out of singleton and small team science by large team endeavors is challenging key features of research culture. It is therefore important for the future of scientific practice to reflect upon the individual scientist's ethical responsibilities within teams. To facilitate this reflection we show labor force trends in the US revealing a skewed growth in academic ranks and increased levels of competition for promotion within the system; we analyze teaming trends across disciplines and national borders demonstrating why it is becoming difficult to distribute credit and to avoid conflicts of interest; and we use more than a century of Nobel prize data to show how science is outgrowing its old institutions of singleton awards. Of particular concern within the large team environment is the weakening of the mentor-mentee relation, which undermines the cultivation of virtue ethics across scientific generations. These trends and emerging organizational complexities call for a universal set of behavioral norms that transcend team heterogeneity and hierarchy. To this end, our expository analysis provides a survey of ethical issues in team settings to inform science ethics education and science policy.

  19. Usability-driven pruning of large ontologies: the case of SNOMED CT.

    Science.gov (United States)

    López-García, Pablo; Boeker, Martin; Illarramendi, Arantza; Schulz, Stefan

    2012-06-01

    To study ontology modularization techniques when applied to SNOMED CT in a scenario in which no previous corpus of information exists and to examine if frequency-based filtering using MEDLINE can reduce subset size without discarding relevant concepts. Subsets were first extracted using four graph-traversal heuristics and one logic-based technique, and were subsequently filtered with frequency information from MEDLINE. Twenty manually coded discharge summaries from cardiology patients were used as signatures and test sets. The coverage, size, and precision of extracted subsets were measured. Graph-traversal heuristics provided high coverage (71-96% of terms in the test sets of discharge summaries) at the expense of subset size (17-51% of the size of SNOMED CT). Pre-computed subsets and logic-based techniques extracted small subsets (1%), but coverage was limited (24-55%). Filtering reduced the size of large subsets to 10% while still providing 80% coverage. Extracting subsets to annotate discharge summaries is challenging when no previous corpus exists. Ontology modularization provides valuable techniques, but the resulting modules grow as signatures spread across subhierarchies, yielding a very low precision. Graph-traversal strategies and frequency data from an authoritative source can prune large biomedical ontologies and produce useful subsets that still exhibit acceptable coverage. However, a clinical corpus closer to the specific use case is preferred when available.
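
    The two-step recipe (graph-traversal extraction followed by frequency filtering) can be illustrated on a toy is-a hierarchy with a made-up frequency table; the concept names and counts below are not from SNOMED CT or MEDLINE.

```python
# Upward graph-traversal subset extraction plus a corpus-frequency cut-off.
from collections import deque

parents = {                      # child -> parents in a small is-a hierarchy
    "myocardial infarction": ["heart disease"],
    "heart disease": ["disorder of cardiovascular system"],
    "disorder of cardiovascular system": ["clinical finding"],
    "chest pain": ["pain", "clinical finding"],
    "pain": ["clinical finding"],
    "clinical finding": [],
}
frequency = {"myocardial infarction": 120000, "heart disease": 950000,
             "chest pain": 400000, "pain": 2000000,
             "disorder of cardiovascular system": 800, "clinical finding": 50}

def extract_subset(signature):
    """Signature concepts plus all of their ancestors (graph traversal)."""
    subset, queue = set(signature), deque(signature)
    while queue:
        for p in parents.get(queue.popleft(), []):
            if p not in subset:
                subset.add(p)
                queue.append(p)
    return subset

signature = {"myocardial infarction", "chest pain"}
subset = extract_subset(signature)
filtered = {c for c in subset if frequency.get(c, 0) >= 1000}   # frequency-based pruning
print(sorted(subset))
print(sorted(filtered))
```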

  20. Response to a Large Polio Outbreak in a Setting of Conflict - Middle East, 2013-2015.

    Science.gov (United States)

    Mbaeyi, Chukwuma; Ryan, Michael J; Smith, Philip; Mahamud, Abdirahman; Farag, Noha; Haithami, Salah; Sharaf, Magdi; Jorba, Jaume C; Ehrhardt, Derek

    2017-03-03

    As the world advances toward the eradication of polio, outbreaks of wild poliovirus (WPV) in polio-free regions pose a substantial risk to the timeline for global eradication. Countries and regions experiencing active conflict, chronic insecurity, and large-scale displacement of persons are particularly vulnerable to outbreaks because of the disruption of health care and immunization services (1). A polio outbreak occurred in the Middle East, beginning in Syria in 2013 with subsequent spread to Iraq (2). The outbreak occurred 2 years after the onset of the Syrian civil war, resulted in 38 cases, and was the first time WPV was detected in Syria in approximately a decade (3,4). The national governments of eight countries designated the outbreak a public health emergency and collaborated with partners in the Global Polio Eradication Initiative (GPEI) to develop a multiphase outbreak response plan focused on improving the quality of acute flaccid paralysis (AFP) surveillance* and administering polio vaccines to >27 million children during multiple rounds of supplementary immunization activities (SIAs). † Successful implementation of the response plan led to containment and interruption of the outbreak within 6 months of its identification. The concerted approach adopted in response to this outbreak could serve as a model for responding to polio outbreaks in settings of conflict and political instability.

  1. Contextual control over task-set retrieval.

    Science.gov (United States)

    Crump, Matthew J C; Logan, Gordon D

    2010-11-01

    Contextual cues signaling task likelihood or the likelihood of task repetition are known to modulate the size of switch costs. We follow up on the finding by Leboe, Wong, Crump, and Stobbe (2008) that location cues predictive of the proportion of switch or repeat trials modulate switch costs. Their design employed one cue per task, whereas our experiment employed two cues per task, which allowed separate assessment of modulations to the cue-repetition benefit, a measure of lower level cue-encoding processes, and to the task-alternation cost, a measure of higher level processes representing task-set information. We demonstrate that location information predictive of switch proportion modulates performance at the level of task-set representations. Furthermore, we demonstrate that contextual control occurs even when subjects are unaware of the associations between context and switch likelihood. We discuss the notion that contextual information provides rapid, unconscious control over the extent to which prior task-set representations are retrieved in the service of guiding online performance.

  2. A behavior setting assessment for community programs and residences.

    Science.gov (United States)

    Perkins, D V; Baker, F

    1991-10-01

    Using the concept of person-environment fit to determine the effectiveness of residential and program placements for chronic psychiatric clients requires systematic and concrete information about these community environments in addition to information about the clients themselves. The conceptual and empirical development of the Behavior Setting Assessment (BSA), a measure based on Barker's behavior setting theory, is described. Use of the BSA with 28 residences (117 settings) and 11 programs (176 settings) from two community support systems demonstrated that all 293 settings assessed could be described and analyzed in terms of differences in their demands for self-care skills, food preparation and consumption, verbal/cognitive responses, and solitary or group activities. The BSA is an efficient measure for obtaining specific, concrete information about the behavioral demands of important community environments.

  3. Viewers Extract Mean and Individual Identity from Sets of Famous Faces

    Science.gov (United States)

    Neumann, Markus F.; Schweinberger, Stefan R.; Burton, A. Mike

    2013-01-01

    When viewers are shown sets of similar objects (for example circles), they may extract summary information (e.g., average size) while retaining almost no information about the individual items. A similar observation can be made when using sets of unfamiliar faces: Viewers tend to merge identity or expression information from the set exemplars into…

  4. Hierarchical models for informing general biomass equations with felled tree data

    Science.gov (United States)

    Brian J. Clough; Matthew B. Russell; Christopher W. Woodall; Grant M. Domke; Philip J. Radtke

    2015-01-01

    We present a hierarchical framework that uses a large multispecies felled tree database to inform a set of general models for predicting tree foliage biomass, with accompanying uncertainty, within the FIA database. Results suggest significant prediction uncertainty for individual trees and reveal higher errors when predicting foliage biomass for larger trees and for...

  5. Efficiency of Choice Set Generation Methods for Bicycle Routes

    DEFF Research Database (Denmark)

    Halldórsdóttir, Katrín; Rieser-Schüssler, Nadine; W. Axhausen, Kay

    behaviour, observed choices and alternatives composing the choice set of each cyclist are necessary. However, generating the alternative choice sets can prove challenging. This paper analyses the efficiency of various choice set generation methods for bicycle routes in order to contribute to our...... travelling information with GPS loggers, compared to self-reported RP data, is more accurate geographic locations and routes. Also, the GPS traces give more reliable information on times and prevent trip underreporting, and it is possible to collect information on many trips by the same person without...

  6. Coordinated SLNR based Precoding in Large-Scale Heterogeneous Networks

    KAUST Repository

    Boukhedimi, Ikram

    2017-03-06

    This work focuses on the downlink of large-scale two-tier heterogeneous networks composed of a macro-cell overlaid by micro-cell networks. Our interest is on the design of coordinated beamforming techniques that allow to mitigate the inter-cell interference. Particularly, we consider the case in which the coordinating base stations (BSs) have imperfect knowledge of the channel state information. Under this setting, we propose a regularized SLNR based precoding design in which the regularization factor is used to allow better resilience with respect to the channel estimation errors. Based on tools from random matrix theory, we provide an analytical analysis of the SINR and SLNR performances. These results are then exploited to propose a proper setting of the regularization factor. Simulation results are finally provided in order to validate our findings and to confirm the performance of the proposed precoding scheme.

  7. Coordinated SLNR based Precoding in Large-Scale Heterogeneous Networks

    KAUST Repository

    Boukhedimi, Ikram; Kammoun, Abla; Alouini, Mohamed-Slim

    2017-01-01

    This work focuses on the downlink of large-scale two-tier heterogeneous networks composed of a macro-cell overlaid by micro-cell networks. Our interest is on the design of coordinated beamforming techniques that allow to mitigate the inter-cell interference. Particularly, we consider the case in which the coordinating base stations (BSs) have imperfect knowledge of the channel state information. Under this setting, we propose a regularized SLNR based precoding design in which the regularization factor is used to allow better resilience with respect to the channel estimation errors. Based on tools from random matrix theory, we provide an analytical analysis of the SINR and SLNR performances. These results are then exploited to propose a proper setting of the regularization factor. Simulation results are finally provided in order to validate our findings and to confirm the performance of the proposed precoding scheme.
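
    A rough sketch of a regularized SLNR-type beamformer for single-antenna users: the channel matrix below is a random placeholder for the estimated channels, and the regularization factor alpha is simply a tunable parameter in this sketch, not the value the paper derives from random matrix theory.

```python
# Regularized SLNR-style downlink beamforming for K single-antenna users.
import numpy as np

rng = np.random.default_rng(0)
M, K, alpha = 8, 4, 0.1                       # BS antennas, users, regularization factor
H = (rng.normal(size=(K, M)) + 1j * rng.normal(size=(K, M))) / np.sqrt(2)  # estimated channels

W = np.zeros((M, K), dtype=complex)
for k in range(K):
    others = np.delete(H, k, axis=0)
    leakage = others.conj().T @ others + alpha * np.eye(M)   # leakage + regularization term
    w = np.linalg.solve(leakage, H[k].conj())                # closed-form SLNR direction
    W[:, k] = w / np.linalg.norm(w)                          # unit-power beam per user

received = np.abs(H @ W) ** 2                  # power at user j from beam k
signal = np.diag(received)
leakage_per_beam = received.sum(axis=0) - signal
print("SLNR estimate per user:", (signal / (alpha + leakage_per_beam)).round(2))
```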

  8. Towards a Set Theoretical Approach to Big Data Analytics

    DEFF Research Database (Denmark)

    Mukkamala, Raghava Rao; Hussain, Abid; Vatrapu, Ravi

    2014-01-01

    Formal methods, models and tools for social big data analytics are largely limited to graph theoretical approaches such as social network analysis (SNA) informed by relational sociology. There are no other unified modeling approaches to social big data that integrate the conceptual, formal...... and software realms. In this paper, we first present and discuss a theory and conceptual model of social data. Second, we outline a formal model based on set theory and discuss the semantics of the formal model with a real-world social data example from Facebook. Third, we briefly present and discuss...... this technique to the data analysis of big social data collected from the Facebook page of the fast fashion company, H&M.

  9. HEDIS Limited Data Set

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Healthcare Effectiveness Data and Information Set (HEDIS) is a tool used by more than 90 percent of Americas health plans to measure performance on important...

  10. Automated syndrome detection in a set of clinical facial photographs.

    Science.gov (United States)

    Boehringer, Stefan; Guenther, Manuel; Sinigerova, Stella; Wurtz, Rolf P; Horsthemke, Bernhard; Wieczorek, Dagmar

    2011-09-01

    Computer systems play an important role in clinical genetics and are a routine part of finding clinical diagnoses but make it difficult to fully exploit information derived from facial appearance. So far, automated syndrome diagnosis based on digital facial photographs has been demonstrated under study conditions but has not been applied in clinical practice. We have therefore investigated how well statistical classifiers trained on study data comprising 202 individuals affected by one of 14 syndromes could classify a set of 91 patients for whom pictures were taken under regular, less controlled conditions in clinical practice. We found a classification accuracy of 21% in the clinical sample, representing a ratio of 3.0 over a random choice. This contrasts with a 60% accuracy, or a ratio of 8.5, in the training data. Producing average images in both groups from sets of pictures for each syndrome demonstrates that the groups exhibit large phenotypic differences, explaining the discrepancies in accuracy. A broadening of the data set is suggested in order to improve accuracy in clinical practice. To further this goal, a software package is made available that allows application of the procedures and contributions toward an improved data set. Copyright © 2011 Wiley-Liss, Inc.

  11. Promoting Shifts in Preservice Science Teachers' Thinking through Teaching and Action Research in Informal Science Settings

    Science.gov (United States)

    Wallace, Carolyn S.

    2013-08-01

    The purpose of this study was to investigate the influence of an integrated experiential learning and action research project on preservice science teachers' developing ideas about science teaching, learning, and action research itself. The qualitative, interpretive study examined the action research of 10 master's degree students who were involved in service learning with children in informal education settings. Results indicated that all of the participants enhanced their knowledge of children as diverse learners and the importance of prior knowledge in science learning. In-depth case studies for three of the participants indicated that two developed deeper understandings of science learners and learning. However, one participant was resistant to learning and gained more limited understandings.

  12. Design of Availability-Dependent Distributed Services in Large-Scale Uncooperative Settings

    Science.gov (United States)

    Morales, Ramses Victor

    2009-01-01

    Thesis Statement: "Availability-dependent global predicates can be efficiently and scalably realized for a class of distributed services, in spite of specific selfish and colluding behaviors, using local and decentralized protocols". Several types of large-scale distributed systems spanning the Internet have to deal with availability variations…

  13. Machine Learning Algorithms for Statistical Patterns in Large Data Sets

    Science.gov (United States)

    2018-02-01

    SUBJECT TERMS: Text Analysis, Text Exploitation, Situation Awareness of Text, Document Processing, Document Ingestion, Full Text Search, Information...

  14. Superfund TIO videos: Set B. Community relations, communicating with the media and presenting technical information. Part 9. Audio-Visual

    International Nuclear Information System (INIS)

    1990-01-01

    The videotape is divided into three sections. Section 1 discusses the Superfund Community Relations (CR) Program and its history and objectives. Community Relations requirements as defined by CERCLA for Superfund actions are outlined. Community Relations requirements, the nature of community involvement in CR plans, effective CR techniques, and the roles of the OSC, RPM, and EPA Community Relations Coordinator (CRC) are discussed. Section 2 (1) describes the media's perspective on seeking information; (2) identifies five settings and mechanisms for interacting with the media; (3) offers good media-relations techniques; and (4) lists tips for conducting media interviews. Section 3 outlines techniques for presenting technical information, describes how to be prepared to address typical issues of community concern, and identifies the four key elements in handling tough questions

  15. How to derive biological information from the value of the normalization constant in allometric equations.

    Science.gov (United States)

    Kaitaniemi, Pekka

    2008-04-09

    Allometric equations are widely used in many branches of biological science. The potential information content of the normalization constant b in allometric equations of the form Y = bX^a has, however, remained largely neglected. To demonstrate the potential for utilizing this information, I generated a large number of artificial datasets that resembled those frequently encountered in biological studies, i.e., relatively small samples including measurement error or uncontrolled variation. The value of X was allowed to vary randomly within the limits describing different data ranges, and a was set to a fixed theoretical value. The constant b was set to a range of values describing the effect of a continuous environmental variable. In addition, a normally distributed random error was added to the values of both X and Y. Two different approaches were then used to model the data. The traditional approach estimated both a and b using a regression model, whereas an alternative approach set the exponent a at its theoretical value and only estimated the value of b. Both approaches produced virtually the same model fit, with less than 0.3% difference in the coefficient of determination. Only the alternative approach was able to precisely reproduce the effect of the environmental variable, which was largely lost among noise variation when using the traditional approach. The results show how the value of b can be used as a source of valuable biological information if an appropriate regression model is selected.
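
    The two fitting approaches compared in the study can be reproduced schematically on synthetic data: estimating both a and b, versus fixing a at an assumed theoretical value and estimating only the normalization constant b.

```python
# Allometric fit Y = b * X**a: free exponent versus exponent fixed at a theoretical value.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(42)
a_theory = 0.75                                     # assumed theoretical exponent
X = rng.uniform(1.0, 20.0, size=40)
env = rng.uniform(0.8, 1.2, size=40)                # environmental effect carried by b
Y = (2.0 * env) * X ** a_theory * rng.normal(1.0, 0.05, size=40)   # noisy observations

both = curve_fit(lambda x, b, a: b * x ** a, X, Y, p0=[1.0, 1.0])[0]
b_only = curve_fit(lambda x, b: b * x ** a_theory, X, Y, p0=[1.0])[0]
print("free exponent:  b = %.2f, a = %.2f" % tuple(both))
print("fixed exponent: b = %.2f (a fixed at %.2f)" % (b_only[0], a_theory))
```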

  16. Distributed Large Independent Sets in One Round On Bounded-independence Graphs

    OpenAIRE

    Halldorsson , Magnus M.; Konrad , Christian

    2015-01-01

    We present a randomized one-round, single-bit-messages, distributed algorithm for the maximum independent set problem in polynomially bounded-independence graphs with a poly-logarithmic approximation factor. Bounded-independence graphs capture various models of wireless networks such as the unit disc graphs model and the quasi unit disc graphs model. For instance, on unit disc graphs, our achieved approximation ratio is O((log(n)/log(log(n)))^2). A starting point of our w

  17. How large a training set is needed to develop a classifier for microarray data?

    Science.gov (United States)

    Dobbin, Kevin K; Zhao, Yingdong; Simon, Richard M

    2008-01-01

    A common goal of gene expression microarray studies is the development of a classifier that can be used to divide patients into groups with different prognoses, or with different expected responses to a therapy. These types of classifiers are developed on a training set, which is the set of samples used to train a classifier. The question of how many samples are needed in the training set to produce a good classifier from high-dimensional microarray data is challenging. We present a model-based approach to determining the sample size required to adequately train a classifier. It is shown that sample size can be determined from three quantities: standardized fold change, class prevalence, and number of genes or features on the arrays. Numerous examples and important experimental design issues are discussed. The method is adapted to address ex post facto determination of whether the size of a training set used to develop a classifier was adequate. An interactive web site for performing the sample size calculations is provided. We showed that sample size calculations for classifier development from high-dimensional microarray data are feasible, discussed numerous important considerations, and presented examples.

  18. Extreme Simplification and Rendering of Point Sets using Algebraic Multigrid

    NARCIS (Netherlands)

    Reniers, Dennie; Telea, Alexandru

    2005-01-01

    We present a novel approach for extreme simplification of point set models in the context of real-time rendering. Point sets are often rendered using simple point primitives, such as oriented discs. However efficient, simple primitives are less effective in approximating large surface areas. A large

  19. Modelling large scale human activity in San Francisco

    Science.gov (United States)

    Gonzalez, Marta

    2010-03-01

    Diverse groups of people with a wide variety of schedules, activities and travel needs compose our cities nowadays. This represents a big challenge for modeling travel behavior in urban environments; such models are of crucial interest for a wide variety of applications such as traffic forecasting, the spreading of viruses, or measuring human exposure to air pollutants. The traditional means of obtaining knowledge about travel behavior is limited to surveys on travel journeys. The information obtained is based on questionnaires that are usually costly to implement, have intrinsic limitations in covering large numbers of individuals, and suffer from some problems of reliability. Using mobile phone data, we explore the basic characteristics of a model of human travel: the distribution of agents is proportional to the population density of a given region, and each agent has a characteristic trajectory size that contains information on the frequency of visits to different locations. Additionally we use a complementary data set given by smart subway fare cards, offering us information about the exact time each passenger enters or exits a subway station and the coordinates of that station. This allows us to uncover the temporal aspects of mobility. Since we have the actual time and place of each individual's origin and destination, we can understand the temporal patterns of each visited location in further detail. Integrating the two described data sets, we provide a dynamical model of human travel that incorporates different aspects observed empirically.

  20. Setting analyst: A practical harvest planning technique

    Science.gov (United States)

    Olivier R.M. Halleux; W. Dale Greene

    2001-01-01

    Setting Analyst is an ArcView extension that facilitates practical harvest planning for ground-based systems. By modeling the travel patterns of ground-based machines, it compares different harvesting settings based on projected average skidding distance, logging costs, and site disturbance levels. Setting Analyst uses information commonly available to consulting...

  1. Lebesgue Sets Immeasurable Existence

    Directory of Open Access Journals (Sweden)

    Diana Marginean Petrovai

    2012-12-01

    Full Text Available It is well known that the notions of measure and integral arose early on, in close connection with practical problems of measuring geometric figures. The notion of measure was outlined in the early 20th century through the research of H. Lebesgue, founder of the modern theory of measure and integral. A technique for the integration of functions was developed concurrently. Gradually a specific area was formed, today called the theory of measure and integral. Essential contributions to building this theory were made by a large number of mathematicians: C. Carathéodory, J. Radon, O. Nikodym, S. Bochner, J. Pettis, P. Halmos and many others. In the following we present several abstract sets and classes of sets. There exist sets which are not Lebesgue measurable and sets which are Lebesgue measurable but not Borel measurable. Hence B ⊂ L ⊂ P(X).

  2. Delve: A Data Set Retrieval and Document Analysis System

    KAUST Repository

    Akujuobi, Uchenna Thankgod

    2017-12-29

    Academic search engines (e.g., Google scholar or Microsoft academic) provide a medium for retrieving various information on scholarly documents. However, most of these popular scholarly search engines overlook the area of data set retrieval, which should provide information on relevant data sets used for academic research. Due to the increasing volume of publications, it has become a challenging task to locate suitable data sets on a particular research area for benchmarking or evaluations. We propose Delve, a web-based system for data set retrieval and document analysis. This system is different from other scholarly search engines as it provides a medium for both data set retrieval and real time visual exploration and analysis of data sets and documents.

  3. Setting Learning Analytics in Context: Overcoming the Barriers to Large-Scale Adoption

    Science.gov (United States)

    Ferguson, Rebecca; Macfadyen, Leah P.; Clow, Doug; Tynan, Belinda; Alexander, Shirley; Dawson, Shane

    2014-01-01

    A core goal for most learning analytic projects is to move from small-scale research towards broader institutional implementation, but this introduces a new set of challenges because institutions are stable systems, resistant to change. To avoid failure and maximize success, implementation of learning analytics at scale requires explicit and…

  4. Priority Setting for Universal Health Coverage: We Need to Focus Both on Substance and on Process; Comment on “Priority Setting for Universal Health Coverage: We Need Evidence-Informed Deliberative Processes, not Just More Evidence on Cost-Effectiveness”

    Directory of Open Access Journals (Sweden)

    Jeremy A. Lauer

    2017-10-01

    Full Text Available In an editorial published in this journal, Baltussen et al argue that information on cost-effectiveness is not sufficient for priority setting for universal health coverage (UHC), a claim which is correct as far as it goes. However, their focus on the procedural legitimacy of ‘micro’ priority setting processes (e.g., decisions concerning the reimbursement of specific interventions), and their related assumption that values for priority setting are determined only at this level, leads them to ignore the relevance of higher level, ‘macro’ priority setting processes, for example, consultations held by World Health Organization (WHO) Member States and other global stakeholders that have resulted in widespread consensus on the principles of UHC. Priority setting is not merely about discrete choices, nor should the focus be exclusively (or even mainly) on improving the procedural elements of micro priority setting processes. Systemic activities that shape the health system environment, such as strategic planning, as well as the substantive content of global policy instruments, are critical elements for priority setting for UHC.

  5. Neighborhood Discriminant Hashing for Large-Scale Image Retrieval.

    Science.gov (United States)

    Tang, Jinhui; Li, Zechao; Wang, Meng; Zhao, Ruizhen

    2015-09-01

    With the proliferation of large-scale community-contributed images, hashing-based approximate nearest neighbor search in huge databases has aroused considerable interest from the fields of computer vision and multimedia in recent years because of its computational and memory efficiency. In this paper, we propose a novel hashing method named neighborhood discriminant hashing (NDH) to implement approximate similarity search. Different from previous work, we propose to learn a discriminant hashing function by exploiting local discriminative information, i.e., the labels of a sample can be inherited from the neighbor samples it selects. The hashing function is expected to be orthogonal to avoid redundancy in the learned hashing bits as much as possible, while an information-theoretic regularization is jointly exploited using the maximum entropy principle. As a consequence, the learned hashing function is compact and nonredundant among bits, while each bit is highly informative. Extensive experiments are carried out on four publicly available data sets, and the comparison results demonstrate that the proposed NDH method outperforms state-of-the-art hashing techniques.
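
    The retrieval side of any such hashing scheme reduces to comparing compact binary codes by Hamming distance. The sketch below illustrates that step only; the projection matrix W is random here and merely stands in for a learned hashing function such as NDH's (the training objective itself is not reproduced).

    ```python
    # Minimal sketch of hashing-based approximate nearest neighbor search, assuming a
    # learned projection matrix W (here random for illustration, not the NDH solution).
    import numpy as np

    rng = np.random.default_rng(0)
    n_db, n_features, n_bits = 1000, 64, 32

    database = rng.normal(size=(n_db, n_features))
    query = rng.normal(size=(n_features,))

    W = rng.normal(size=(n_features, n_bits))      # stand-in for the learned hashing function

    def encode(x, W):
        """Binarize projected features into 0/1 hash bits."""
        return (x @ W > 0).astype(np.uint8)

    db_codes = encode(database, W)                 # shape (n_db, n_bits)
    q_code = encode(query, W)

    # Hamming distance = number of differing bits; smaller means more similar.
    hamming = np.count_nonzero(db_codes != q_code, axis=1)
    top10 = np.argsort(hamming)[:10]
    print("indices of 10 approximate nearest neighbors:", top10)
    ```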

  6. Combining qualitative and quantitative operational research methods to inform quality improvement in pathways that span multiple settings

    Science.gov (United States)

    Crowe, Sonya; Brown, Katherine; Tregay, Jenifer; Wray, Jo; Knowles, Rachel; Ridout, Deborah A; Bull, Catherine; Utley, Martin

    2017-01-01

    Background: Improving integration and continuity of care across sectors within resource constraints is a priority in many health systems. Qualitative operational research methods of problem structuring have been used to address quality improvement in services involving multiple sectors, but not in combination with quantitative operational research methods that enable targeting of interventions according to patient risk. We aimed to combine these methods to augment and inform an improvement initiative concerning infants with congenital heart disease (CHD) whose complex care pathway spans multiple sectors. Methods: Soft systems methodology was used to consider systematically changes to services from the perspectives of community, primary, secondary and tertiary care professionals and a patient group, incorporating relevant evidence. Classification and regression tree (CART) analysis of national audit datasets was conducted along with data visualisation designed to inform service improvement within the context of limited resources. Results: A ‘Rich Picture’ was developed capturing the main features of services for infants with CHD pertinent to service improvement. This was used, along with a graphical summary of the CART analysis, to guide discussions about targeting interventions at specific patient risk groups. Agreement was reached across representatives of relevant health professions and patients on a coherent set of targeted recommendations for quality improvement. These fed into national decisions about service provision and commissioning. Conclusions: When tackling complex problems in service provision across multiple settings, it is important to acknowledge and work with multiple perspectives systematically and to consider targeting service improvements in response to confined resources. Our research demonstrates that applying a combination of qualitative and quantitative operational research methods is one approach to doing so that warrants further consideration.
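
    As a rough illustration of the quantitative half of this approach, the sketch below fits a CART-style classification tree to synthetic data and prints its decision rules, which is the kind of output that can define candidate patient risk groups for discussion alongside a 'Rich Picture'. The feature names, outcome definition and data are hypothetical, not drawn from the national audit datasets.

    ```python
    # Illustrative CART-style risk stratification on synthetic data; feature names and
    # the outcome definition are hypothetical, not taken from the national audit datasets.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(1)
    n = 500
    X = np.column_stack([
        rng.integers(0, 2, n),        # e.g. premature birth (yes/no)
        rng.normal(3.3, 0.6, n),      # e.g. weight at discharge (kg)
        rng.integers(0, 2, n),        # e.g. complex comorbidity (yes/no)
    ])
    # Synthetic adverse-outcome label loosely driven by the features.
    y = ((X[:, 0] + X[:, 2] + (X[:, 1] < 2.8)) >= 2).astype(int)

    tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=25, random_state=0)
    tree.fit(X, y)

    # The printed rules define candidate risk groups that could be targeted by interventions.
    print(export_text(tree, feature_names=["premature", "discharge_weight", "comorbidity"]))
    ```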

  7. Which components of health information technology will drive financial value?

    Science.gov (United States)

    Kern, Lisa M; Wilcox, Adam; Shapiro, Jason; Dhopeshwarkar, Rina V; Kaushal, Rainu

    2012-08-01

    The financial effects of electronic health records (EHRs) and health information exchange (HIE) are largely unknown, despite unprecedented federal incentives for their use. We sought to understand which components of EHRs and HIE are most likely to drive financial savings in the ambulatory, inpatient, and emergency department settings. Study design: framework development and a national expert panel. We searched the literature to identify functionalities enabled by EHRs and HIE across the 3 healthcare settings. We rated each of 233 functionality-setting combinations on their likelihood of having a positive financial effect. We validated the top-scoring functionalities with a panel of 28 national experts, and we compared the high-scoring functionalities with Stage 1 meaningful use criteria. We identified 54 high-scoring functionality-setting combinations, 27 for EHRs and 27 for HIE. Examples of high-scoring functionalities included providing alerts for expensive medications, providing alerts for redundant lab orders, sending and receiving imaging reports, and enabling structured medication reconciliation. Of the 54 high-scoring functionalities, 25 (46%) are represented in Stage 1 meaningful use. Many of the functionalities not yet represented in meaningful use correspond with functionalities that focus directly on healthcare utilization and costs rather than on healthcare quality per se. This work can inform the development and selection of future meaningful use measures; inform implementation efforts, as clinicians and hospitals choose from among a "menu" of measures for meaningful use; and inform evaluation efforts, as investigators seek to measure the actual financial impact of EHRs and HIE.

  8. An introduction to random sets

    CERN Document Server

    Nguyen, Hung T

    2006-01-01

    The study of random sets is a large and rapidly growing area with connections to many areas of mathematics and applications in widely varying disciplines, from economics and decision theory to biostatistics and image analysis. The drawback to such diversity is that the research reports are scattered throughout the literature, with the result that in science and engineering, and even in the statistics community, the topic is not well known and much of the enormous potential of random sets remains untapped. An Introduction to Random Sets provides a friendly but solid initiation into the theory of random sets. It builds the foundation for studying random set data, which, viewed as imprecise or incomplete observations, are ubiquitous in today's technological society. The author, widely known for his best-selling A First Course in Fuzzy Logic text as well as his pioneering work in random sets, explores motivations, such as coarse data analysis and uncertainty analysis in intelligent systems, for studying random s...

  9. Enhancing Seismic Calibration Research Through Software Automation and Scientific Information Management

    Energy Technology Data Exchange (ETDEWEB)

    Ruppert, S D; Dodge, D A; Ganzberger, M D; Harris, D B; Hauk, T F

    2009-07-07

    The National Nuclear Security Administration (NNSA) Ground-Based Nuclear Explosion Monitoring Research and Development (GNEMRD) Program at LLNL continues to make significant progress enhancing the process of deriving seismic calibrations and performing scientific integration, analysis, and information management with software automation tools. Our tool efforts address the problematic issues of very large datasets and varied formats encountered during seismic calibration research. New information management and analysis tools have resulted in demonstrated gains in efficiency of producing scientific data products and improved accuracy of derived seismic calibrations. In contrast to previous years, software development work this past year has emphasized development of automation at the data ingestion level. This change reflects a gradually-changing emphasis in our program from processing a few large data sets that result in a single integrated delivery, to processing many different data sets from a variety of sources. The increase in the number of sources had resulted in a large increase in the amount of metadata relative to the final volume of research products. Software developed this year addresses the problems of: (1) Efficient metadata ingestion and conflict resolution; (2) Automated ingestion of bulletin information; (3) Automated ingestion of waveform information from global data centers; and (4) Site Metadata and Response transformation required for certain products. This year, we also made a significant step forward in meeting a long-standing goal of developing and using a waveform correlation framework. Our objective for such a framework is to extract additional calibration data (e.g. mining blasts) and to study the extent to which correlated seismicity can be found in global and regional scale environments.

  10. Use of fuzzy sets in modeling of GIS objects

    Science.gov (United States)

    Mironova, Yu N.

    2018-05-01

    The paper discusses modeling and methods of data visualization in geographic information systems. Information processing in geoinformatics is based on the use of models; geoinformation modeling is therefore a key link in the chain of geodata processing. Solving problems with geographic information systems often requires submitting approximate or insufficiently reliable information about map features to the GIS database. Heterogeneous data of different origin and accuracy have some degree of uncertainty. In addition, not all information is accurate: already during the initial measurements, poorly defined terms and attributes (e.g., "soil, well-drained") are used. Therefore, methods are needed for working with uncertain requirements, classes and boundaries. The author proposes representing spatial information using fuzzy sets. In terms of its characteristic function, a fuzzy set is a natural generalization of an ordinary set, obtained by rejecting the binary nature of this function and allowing it to take any value in the interval [0, 1].
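
    A minimal sketch of the proposed representation: a fuzzy set over a map attribute, with membership varying continuously in [0, 1]. The trapezoidal membership function and the "drainage index" attribute are illustrative assumptions, not taken from the paper.

    ```python
    # Sketch of a fuzzy set over a map attribute: membership in "well-drained soil" as a
    # function of a hypothetical drainage index, using a simple trapezoidal function.
    import numpy as np

    def trapezoidal(x, a, b, c, d):
        """Membership rises from 0 at a to 1 at b, stays 1 until c, falls to 0 at d."""
        x = np.asarray(x, dtype=float)
        rise = np.clip((x - a) / (b - a), 0.0, 1.0)
        fall = np.clip((d - x) / (d - c), 0.0, 1.0)
        return np.minimum(rise, fall)

    drainage_index = np.array([0.1, 0.35, 0.5, 0.7, 0.95])
    membership = trapezoidal(drainage_index, a=0.2, b=0.4, c=0.8, d=0.9)
    print(dict(zip(drainage_index.tolist(), membership.round(2).tolist())))
    ```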

  11. AAEC INIS - a large, new, on-line information source

    International Nuclear Information System (INIS)

    Rugg, T.J.; Wong, S.C.

    1984-01-01

    The Australian Atomic Energy Commission's INIS database is available for on-line searching by non-AAEC personnel from all parts of Australia. An introduction to the International Nuclear Information System is followed by information on searching AAEC INIS, AAEC INIS retrieval software and accessing AAEC INIS

  12. Novel gene sets improve set-level classification of prokaryotic gene expression data.

    Science.gov (United States)

    Holec, Matěj; Kuželka, Ondřej; Železný, Filip

    2015-10-28

    Set-level classification of gene expression data has received significant attention recently. In this setting, high-dimensional vectors of features corresponding to genes are converted into lower-dimensional vectors of features corresponding to biologically interpretable gene sets. The dimensionality reduction brings the promise of a decreased risk of overfitting, potentially resulting in improved accuracy of the learned classifiers. However, recent empirical research has not confirmed this expectation. Here we hypothesize that the reported unfavorable classification results in the set-level framework were due to the adoption of unsuitable gene sets defined typically on the basis of the Gene Ontology and the KEGG database of metabolic networks. We explore an alternative approach to defining gene sets, based on regulatory interactions, which we expect to collect genes with more correlated expression. We hypothesize that such more correlated gene sets will enable learning more accurate classifiers. We define two families of gene sets using information on regulatory interactions, and evaluate them on phenotype-classification tasks using public prokaryotic gene expression data sets. From each of the two gene-set families, we first select the best-performing subtype. The two selected subtypes are then evaluated on independent (testing) data sets against state-of-the-art gene sets and against the conventional gene-level approach. The novel gene sets are indeed more correlated than the conventional ones, and lead to significantly more accurate classifiers. Novel gene sets defined on the basis of regulatory interactions thus improve set-level classification of gene expression data. The experimental scripts and other material needed to reproduce the experiments are available at http://ida.felk.cvut.cz/novelgenesets.tar.gz.
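
    The core transformation described here, collapsing gene-level expression into one feature per gene set before training a classifier, can be sketched as follows. The gene sets, data and classifier below are arbitrary placeholders rather than the regulatory-interaction sets evaluated in the study.

    ```python
    # Sketch of set-level classification: gene-level expression is collapsed to one feature
    # per gene set (here the mean over member genes), then a standard classifier is trained.
    # The gene sets below are arbitrary placeholders, not regulon-derived sets.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(2)
    n_samples, n_genes = 60, 200
    expression = rng.normal(size=(n_samples, n_genes))
    phenotype = rng.integers(0, 2, n_samples)

    gene_sets = {                      # gene set -> indices of member genes
        "set_A": list(range(0, 20)),
        "set_B": list(range(20, 55)),
        "set_C": list(range(55, 80)),
    }

    set_features = np.column_stack(
        [expression[:, idx].mean(axis=1) for idx in gene_sets.values()]
    )

    clf = LogisticRegression(max_iter=1000)
    print("set-level CV accuracy:", cross_val_score(clf, set_features, phenotype, cv=5).mean())
    ```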

  13. BACHSCORE. A tool for evaluating efficiently and reliably the quality of large sets of protein structures

    Science.gov (United States)

    Sarti, E.; Zamuner, S.; Cossio, P.; Laio, A.; Seno, F.; Trovato, A.

    2013-12-01

    In protein structure prediction it is of crucial importance, especially at the refinement stage, to score efficiently large sets of models by selecting the ones that are closest to the native state. We here present a new computational tool, BACHSCORE, that allows its users to rank different structural models of the same protein according to their quality, evaluated by using the BACH++ (Bayesian Analysis Conformation Hunt) scoring function. The original BACH statistical potential was already shown to discriminate with very good reliability the protein native state in large sets of misfolded models of the same protein. BACH++ features a novel upgrade in the solvation potential of the scoring function, now computed by adapting the LCPO (Linear Combination of Pairwise Orbitals) algorithm. This change further enhances the already good performance of the scoring function. BACHSCORE can be accessed directly through the web server: bachserver.pd.infn.it. Catalogue identifier: AEQD_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEQD_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: GNU General Public License version 3 No. of lines in distributed program, including test data, etc.: 130159 No. of bytes in distributed program, including test data, etc.: 24 687 455 Distribution format: tar.gz Programming language: C++. Computer: Any computer capable of running an executable produced by a g++ compiler (4.6.3 version). Operating system: Linux, Unix OS-es. RAM: 1 073 741 824 bytes Classification: 3. Nature of problem: Evaluate the quality of a protein structural model, taking into account the possible “a priori” knowledge of a reference primary sequence that may be different from the amino-acid sequence of the model; the native protein structure should be recognized as the best model. Solution method: The contact potential scores the occurrence of any given type of residue pair in 5 possible

  14. Spatial fingerprints of community structure in human interaction network for an extensive set of large-scale regions.

    Directory of Open Access Journals (Sweden)

    Zsófia Kallus

    Full Text Available Human interaction networks inferred from country-wide telephone activity recordings were recently used to redraw political maps by projecting their topological partitions into geographical space. The results showed remarkable spatial cohesiveness of the network communities and a significant overlap between the redrawn and the administrative borders. Here we present a similar analysis based on one of the most popular online social networks, represented by the ties between more than 5.8 million of its geo-located users. The worldwide coverage of their measured activity allowed us to analyze the large-scale regional subgraphs of entire continents and an extensive set of examples for single countries. We present results for North and South America, Europe and Asia. In our analysis we used the well-established method of modularity clustering after an aggregation of the individual links into a weighted graph connecting equal-area geographical pixels. Our results show fingerprints of both opposing forces: the divisive effect of local conflicts and the unifying cross-cultural trends of globalization.
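
    The analysis pipeline described, aggregating individual ties into a weighted graph over equal-area geographic pixels and then applying modularity clustering, can be sketched as below. The pixel assignment and ties are synthetic, and networkx's greedy modularity routine stands in for whatever modularity optimizer the authors actually used.

    ```python
    # Sketch of the pipeline: aggregate user-to-user ties into a weighted graph whose nodes
    # are geographic pixels, then partition it with modularity clustering. Data are synthetic.
    import numpy as np
    import networkx as nx
    from networkx.algorithms import community

    rng = np.random.default_rng(3)
    n_users, n_pixels = 2000, 30
    pixel_of_user = rng.integers(0, n_pixels, n_users)     # which pixel each user lives in

    G = nx.Graph()
    for _ in range(10000):                                  # random user-user ties
        u, v = rng.integers(0, n_users, 2)
        pu, pv = int(pixel_of_user[u]), int(pixel_of_user[v])
        if pu != pv:
            w = G.get_edge_data(pu, pv, {"weight": 0})["weight"]
            G.add_edge(pu, pv, weight=w + 1)                # accumulate tie counts as weights

    partition = community.greedy_modularity_communities(G, weight="weight")
    print(f"{len(partition)} communities over {G.number_of_nodes()} pixels")
    ```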

  15. Solving large sets of coupled equations iteratively by vector processing on the CYBER 205 computer

    International Nuclear Information System (INIS)

    Tolsma, L.D.

    1985-01-01

    The set of coupled linear second-order differential equations which has to be solved for the quantum-mechanical description of inelastic scattering of atomic and nuclear particles can be rewritten as an equivalent set of coupled integral equations. When a suitable type of function is used as piecewise analytic reference solutions, the integrals that arise in this set can be evaluated analytically. The set of integral equations can be solved iteratively; for the results mentioned, an inward-outward iteration scheme has been applied. A concept of vectorization of coupled-channel Fortran programs, based on this integral method, is presented for use on the Cyber 205 computer. It turns out that, for two heavy-ion nuclear scattering test cases, this vector algorithm gives an overall speed-up of about a factor of 2 to 3 compared to a highly optimized scalar algorithm for a single-vector-pipeline computer

  16. Patient data and patient rights: Swiss healthcare stakeholders' ethical awareness regarding large patient data sets - a qualitative study.

    Science.gov (United States)

    Mouton Dorey, Corine; Baumann, Holger; Biller-Andorno, Nikola

    2018-03-07

    There is a growing interest in aggregating more biomedical and patient data into large health data sets for research and public benefits. However, collecting and processing patient data raises new ethical issues regarding patient's rights, social justice and trust in public institutions. The aim of this empirical study is to gain an in-depth understanding of the awareness of possible ethical risks and corresponding obligations among those who are involved in projects using patient data, i.e. healthcare professionals, regulators and policy makers. We used a qualitative design to examine Swiss healthcare stakeholders' experiences and perceptions of ethical challenges with regard to patient data in real-life settings where clinical registries are sponsored, created and/or used. A semi-structured interview was carried out with 22 participants (11 physicians, 7 policy-makers, 4 ethical committee members) between July 2014 and January 2015. The interviews were audio-recorded, transcribed, coded and analysed using a thematic method derived from Grounded Theory. All interviewees were concerned as a matter of priority with the needs of legal and operating norms for the collection and use of data, whereas less interest was shown in issues regarding patient agency, the need for reciprocity, and shared governance in the management and use of clinical registries' patient data. This observed asymmetry highlights a possible tension between public and research interests on the one hand, and the recognition of patients' rights and citizens' involvement on the other. The advocation of further health-related data sharing on the grounds of research and public interest, without due regard for the perspective of patients and donors, could run the risk of fostering distrust towards healthcare data collections. Ultimately, this could diminish the expected social benefits. However, rather than setting patient rights against public interest, new ethical approaches could strengthen both

  17. Information Ecology

    DEFF Research Database (Denmark)

    Christiansen, Ellen Tove

    2006-01-01

    in the 1960s, and chosen here because it integrates cultural and psychological trajectories in a theory of living settings. The pedagogical-didactical paradigm comprises three distinct information ecologies, named after their intended outcome: the problem-setting, the exploration-setting, and the fit...

  18. SU-E-I-58: Experiences in Setting Up An Online Fluoroscopy Tracking System in a Large Healthcare System

    Energy Technology Data Exchange (ETDEWEB)

    Fisher, R; Wunderle, K; Lingenfelter, M [The Cleveland Clinic, Cleveland, OH (United States)

    2015-06-15

    Purpose: Transitioning from a paper based to an online system for tracking fluoroscopic case information required by state regulation and to conform to NCRP patient dose tracking suggestions. Methods: State regulations require documentation of operator, equipment, and some metric of tube output for fluoroscopy exams. This information was previously collected in paper logs, which was cumbersome and inefficient for the large number of fluoroscopic units across multiple locations within the system. The “tech notes” feature within Siemens’ Syngo workflow RIS was utilized to create an entry form for technologists to input case information, which was sent to a third party vendor for archiving and display though an online web based portal. Results: Over 55k cases were logged in the first year of implementation, with approximately 6,500 cases per month once fully online. A system was built for area managers to oversee and correct data, which has increased the accuracy of inputted values. A high-dose report was built to automatically send notifications when patients exceed trigger levels. In addition to meeting regulatory requirements, the new system allows for larger scale QC in fluoroscopic cases by allowing comparison of data from specific procedures, locations, equipment, and operators so that instances that fall outside of reference levels can be identified for further evaluation. The system has also drastically improved identification of operators without documented equipment specific training. Conclusion: The transition to online fluoroscopy logs has improved efficiency in meeting state regulatory requirements as well as allowed for identification of particular procedures, equipment, and operators in need of additional attention in order to optimize patient and personnel doses, while high dose alerts improve patient care and follow up. Future efforts are focused on incorporating case information from outside of radiology, as well as on automating processes for

  19. SU-E-I-58: Experiences in Setting Up An Online Fluoroscopy Tracking System in a Large Healthcare System

    International Nuclear Information System (INIS)

    Fisher, R; Wunderle, K; Lingenfelter, M

    2015-01-01

    Purpose: Transitioning from a paper based to an online system for tracking fluoroscopic case information required by state regulation and to conform to NCRP patient dose tracking suggestions. Methods: State regulations require documentation of operator, equipment, and some metric of tube output for fluoroscopy exams. This information was previously collected in paper logs, which was cumbersome and inefficient for the large number of fluoroscopic units across multiple locations within the system. The “tech notes” feature within Siemens’ Syngo workflow RIS was utilized to create an entry form for technologists to input case information, which was sent to a third party vendor for archiving and display though an online web based portal. Results: Over 55k cases were logged in the first year of implementation, with approximately 6,500 cases per month once fully online. A system was built for area managers to oversee and correct data, which has increased the accuracy of inputted values. A high-dose report was built to automatically send notifications when patients exceed trigger levels. In addition to meeting regulatory requirements, the new system allows for larger scale QC in fluoroscopic cases by allowing comparison of data from specific procedures, locations, equipment, and operators so that instances that fall outside of reference levels can be identified for further evaluation. The system has also drastically improved identification of operators without documented equipment specific training. Conclusion: The transition to online fluoroscopy logs has improved efficiency in meeting state regulatory requirements as well as allowed for identification of particular procedures, equipment, and operators in need of additional attention in order to optimize patient and personnel doses, while high dose alerts improve patient care and follow up. Future efforts are focused on incorporating case information from outside of radiology, as well as on automating processes for

  20. A Fast Logdet Divergence Based Metric Learning Algorithm for Large Data Sets Classification

    Directory of Open Access Journals (Sweden)

    Jiangyuan Mei

    2014-01-01

    the basis of classifiers, for example, the k-nearest neighbors classifier. Experiments on benchmark data sets demonstrate that the proposed algorithm compares favorably with the state-of-the-art methods.
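
    For orientation, since the abstract is truncated: methods of this family typically learn a positive definite matrix A that stays close to a prior A0 in the sense of the LogDet (Burg) matrix divergence while satisfying pairwise similarity constraints, and the learned A then induces a Mahalanobis distance used by classifiers such as k-nearest neighbors. The sketch below only computes these two quantities on random matrices; it does not reproduce the paper's algorithm.

    ```python
    # Sketch of the quantities involved in LogDet-divergence-based metric learning:
    # the Burg/LogDet divergence to a prior metric, and the Mahalanobis distance under A.
    import numpy as np

    def logdet_divergence(A, A0):
        """D_ld(A, A0) = tr(A A0^-1) - log det(A A0^-1) - n, for positive definite A, A0."""
        n = A.shape[0]
        M = A @ np.linalg.inv(A0)
        _, logdet = np.linalg.slogdet(M)
        return float(np.trace(M) - logdet - n)

    def mahalanobis(x, y, A):
        """Distance between x and y under the learned metric A."""
        d = x - y
        return float(np.sqrt(d @ A @ d))

    rng = np.random.default_rng(4)
    A0 = np.eye(5)                                  # prior metric (identity)
    B = rng.normal(size=(5, 5))
    A = B @ B.T + 5 * np.eye(5)                     # a positive definite "learned" metric

    print("LogDet divergence from the prior:", round(logdet_divergence(A, A0), 3))
    print("distance under A:", mahalanobis(rng.normal(size=5), rng.normal(size=5), A))
    ```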

  1. Behavior Identification Based on Geotagged Photo Data Set

    Directory of Open Access Journals (Sweden)

    Guo-qi Liu

    2014-01-01

    Full Text Available The popularity of mobile devices has produced a set of image data with geographic information, time information, and text description information, which is called a geotagged photo data set. Dividing this kind of data by behavior and location not only identifies the user’s important locations and daily behaviors, but also helps users to sort the huge volume of image data. This paper proposes a method to build an index based on multiple classification results, which divides the data set multiple times and assigns labels to the data to build the index according to the estimated probability of the classification results, in order to accomplish the identification of users’ important locations and daily behaviors. This paper collects 1400 discrete sets of data as experimental data to verify the proposed method. The result of the experiment shows that the index and the actual tagging results agree closely.

  2. Report from the Passive Microwave Data Set Management Workshop

    Science.gov (United States)

    Armstrong, Ed; Conover, Helen; Goodman, Michael; Krupp, Brian; Liu, Zhong; Moses, John; Ramapriyan, H. K.; Scott, Donna; Smith, Deborah; Weaver, Ronald

    2011-01-01

    Passive microwave data sets are some of the most important data sets in the Earth Observing System Data and Information System (EOSDIS), providing data as far back as the early 1970s. The widespread use of passive microwave (PM) radiometer data has led to their collection and distribution over the years at several different Earth science data centers. The user community is often confused by this proliferation and the uneven spread of information about the data sets. In response to this situation, a Passive Microwave Data Set Management Workshop was held 17–19 May 2011 at the Global Hydrology Resource Center, sponsored by the NASA Earth Science Data and Information System (ESDIS) Project. The workshop attendees reviewed all primary (Level 1–3) PM data sets from NASA and non-NASA sensors held by NASA Distributed Active Archive Centers (DAACs), as well as high-value data sets from other NASA-funded organizations. This report provides the key findings and recommendations from the workshop as well as detailed tabulations of the data sets considered.

  3. EEA core set of indicators. Guide

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2005-07-01

    This guide provides information on the quality of the 37 indicators in the EEA core set. Its primary role is to support improved implementation of the core set in the EEA, European topic centres and the European environment information and observation network (Eionet). In parallel, it is aimed at helping users outside the EEA/Eionet system make best use of the indicators in their own work. It is hoped that the guide will promote cooperation on improving indicator methodologies and data quality as part of the wider process to streamline and improve environmental reporting in the European Union and beyond. (au)

  4. Set-oriented data mining in relational databases

    NARCIS (Netherlands)

    Houtsma, M.A.W.; Swami, Arun

    1995-01-01

    Data mining is an important real-life application for businesses. It is critical to find efficient ways of mining large data sets. In order to benefit from the experience with relational databases, a set-oriented approach to mining data is needed. In such an approach, the data mining operations are

  5. Vitali systems in R^n with irregular sets

    DEFF Research Database (Denmark)

    Mejlbro, Leif; Topsøe, Flemming

    1996-01-01

    Vitali type theorems are results stating that out of a given family of sets one can select pairwise disjoint sets which fill out a "large" region. Usually one works with "regular" sets such as balls. We shall establish results with sets of a more complicated geometrical structure, e.g., Cantor-like sets are allowed. The results are related to a generalisation of the classical notion of a differentiation basis. They concern real n-space R^n and Lebesgue measure.

  6. Intersection of triadic Cantor sets with their translates. II. Hausdorff measure spectrum function and its introduction for the classification of Cantor sets

    Energy Technology Data Exchange (ETDEWEB)

    Li Jun; Nekka, Fahima E-mail: fahima.nekka@umontreal.ca

    2004-01-01

    Initiated by the purpose of classifying sets having the same fractal dimension, we continue, in this second paper of a series of two, our investigation of intersections of triadic Cantor sets with their translates and their use in the classification of fractal sets. We exploit the infinite tree structure of translation elements to give exact expressions for these elements. We generalize this result to a family of uniform Cantor sets for which we also give the Hausdorff measure spectrum function (HMSF). We develop three algorithms for the construction of the HMSF of triadic Cantor sets. Then, we introduce a new method based on the HMSF as a way of tracing the geometrical organization of a fractal set. The HMSF carries a huge amount of information about the set, which can be explored in a chosen way. To extract this information, we develop a one-by-one step method and apply it to typical fractal sets. This results in a complete identification of the fractals.
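
    For context on why a finer invariant such as the HMSF is useful, recall the standard dimension computation for the sets involved (a well-known fact, not a result of the paper):

    ```latex
    % The triadic Cantor set C is self-similar under two contractions of ratio 1/3,
    % so its Hausdorff dimension s solves 2\,(1/3)^s = 1:
    \dim_H C \;=\; \frac{\log 2}{\log 3} \;\approx\; 0.6309 .
    % More generally, a uniform Cantor set built from m pieces of ratio r has
    % \dim_H = \log m / \log(1/r), so many geometrically different sets share the same
    % dimension, which is why a richer invariant such as the HMSF is introduced.
    ```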

  7. The use of qualitative methods to inform Delphi surveys in core outcome set development.

    Science.gov (United States)

    Keeley, T; Williamson, P; Callery, P; Jones, L L; Mathers, J; Jones, J; Young, B; Calvert, M

    2016-05-04

    Core outcome sets (COS) help to minimise bias in trials and facilitate evidence synthesis. Delphi surveys are increasingly being used as part of a wider process to reach consensus about what outcomes should be included in a COS. Qualitative research can be used to inform the development of Delphi surveys. This is an advance in the field of COS development and one which is potentially valuable; however, little guidance exists for COS developers on how best to use qualitative methods and what the challenges are. This paper aims to provide early guidance on the potential role and contribution of qualitative research in this area. We hope the ideas we present will be challenged, critiqued and built upon by others exploring the role of qualitative research in COS development. This paper draws upon the experiences of using qualitative methods in the pre-Delphi stage of the development of three different COS. Using these studies as examples, we identify some of the ways that qualitative research might contribute to COS development, the challenges in using such methods and areas where future research is required. Qualitative research can help to identify what outcomes are important to stakeholders; facilitate understanding of why some outcomes may be more important than others, determine the scope of outcomes; identify appropriate language for use in the Delphi survey and inform comparisons between stakeholder data and other sources, such as systematic reviews. Developers need to consider a number of methodological points when using qualitative research: specifically, which stakeholders to involve, how to sample participants, which data collection methods are most appropriate, how to consider outcomes with stakeholders and how to analyse these data. A number of areas for future research are identified. Qualitative research has the potential to increase the research community's confidence in COS, although this will be dependent upon using rigorous and appropriate

  8. Improving the Understanding of Progressing and Emerging Health Informatics Roles and Skill Sets among Health Information Management Professionals: An Action Research Study

    Science.gov (United States)

    Palkie, Brooke N.

    2013-01-01

    The Health Information Management (HIM) profession is evolving to meet the technology demands of the current healthcare landscape. The 2009 enactment of the HITECH Act has placed unprecedented emphasis on utilizing technology to improve the quality of care and to decrease healthcare costs. Expectations of deep analytical skills have set the stage…

  9. Using multiobjective tradeoff sets and Multivariate Regression Trees to identify critical and robust decisions for long term water utility planning

    Science.gov (United States)

    Smith, R.; Kasprzyk, J. R.; Balaji, R.

    2017-12-01

    In light of deeply uncertain factors like future climate change and population shifts, responsible resource management will require new types of information and strategies. For water utilities, this entails potential expansion and efficient management of water supply infrastructure systems for changes in overall supply; changes in frequency and severity of climate extremes such as droughts and floods; and variable demands, all while accounting for conflicting long and short term performance objectives. Multiobjective Evolutionary Algorithms (MOEAs) are emerging decision support tools that have been used by researchers and, more recently, water utilities to efficiently generate and evaluate thousands of planning portfolios. The tradeoffs between conflicting objectives are explored in an automated way to produce (often large) suites of portfolios that strike different balances of performance. Once generated, the sets of optimized portfolios are used to support relatively subjective assertions of priorities and human reasoning, leading to adoption of a plan. These large tradeoff sets contain information about complex relationships between decisions and between groups of decisions and performance that, until now, has not been quantitatively described. We present a novel use of Multivariate Regression Trees (MRTs) to analyze tradeoff sets to reveal these relationships and critical decisions. Additionally, when MRTs are applied to tradeoff sets developed for different realizations of an uncertain future, they can identify decisions that are robust across a wide range of conditions and produce fundamental insights about the system being optimized.
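
    The role of the Multivariate Regression Trees can be approximated with a multi-output regression tree: decision variables are the predictors, the vector of objective values is the response, and the resulting splits and importances indicate which decisions most strongly separate performance. The sketch below uses synthetic portfolios and scikit-learn's DecisionTreeRegressor as a stand-in for a dedicated MRT implementation; all variable names and relationships are invented for illustration.

    ```python
    # Sketch of mining a tradeoff set with a multi-output regression tree: decisions are the
    # predictors, the objective values are the (multivariate) response. Data are synthetic.
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(5)
    n_portfolios = 400
    decisions = rng.uniform(0, 1, size=(n_portfolios, 4))   # e.g. storage, transfer, reuse, demand mgmt

    # Two synthetic, conflicting objectives (cost vs. reliability) driven by the decisions.
    cost = 2.0 * decisions[:, 0] + 0.5 * decisions[:, 1] + rng.normal(0, 0.05, n_portfolios)
    reliability = 1.0 - np.exp(-3.0 * decisions[:, 0]) + 0.3 * decisions[:, 2]
    objectives = np.column_stack([cost, reliability])

    mrt = DecisionTreeRegressor(max_depth=3, min_samples_leaf=20, random_state=0)
    mrt.fit(decisions, objectives)

    # High-importance decisions are the "critical" ones that most strongly shape performance.
    names = ["d0_storage", "d1_transfer", "d2_reuse", "d3_demand_mgmt"]
    for name, importance in zip(names, mrt.feature_importances_):
        print(f"{name}: {importance:.3f}")
    ```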

  10. Agenda Setting and Mass Communication Theory.

    Science.gov (United States)

    Shaw, Eugene F.

    The agenda-setting concept in mass communication asserts that the news media determine what people will include or exclude in their cognition of public events. Findings in uses and gratification research provide the foundation for this concept: an initial focus on people's needs, particularly the need for information. The agenda-setting concept…

  11. Working with Negative Emotions in Sets

    Science.gov (United States)

    Hillman, Alison

    2012-01-01

    This account draws upon learning from an incident in an action learning set where an individual challenged a mandatory organisational requirement. As a facilitator I reflect upon my initial defensive reaction to this challenge. The use of critical action learning to inform ourselves as facilitators of the underlying tensions between set members…

  12. Priority setting for risk assessment-The benefit of human experience

    International Nuclear Information System (INIS)

    Alonzo, Cristina; Laborde, Amalia

    2005-01-01

    The chemical risk assessment process plays an essential role in the potential human health risk evaluation. Setting priorities for this purpose is critical for better use of the available human and material resources. It has been generally accepted that all new chemicals require safety evaluation before manufacture and sale. This is a difficult task due to the large number of chemicals directly consumed by man, as well as those that are widely used. At present, more than 50% of chemicals do not have the minimum data requirements for risk assessment. Production and release volumes are well-established prioritization criteria, although volume itself does not directly reflect the likelihood of human exposure. This quantitative approach applied in setting priorities may be influenced by human experience. Human data provided by epidemiological investigations have been accepted as the most credible evidence for human toxicity although analytical studies are expensive and require long-term follow up. Unfortunately, some epidemiological studies continue to have difficulties with exposure documentation, controlling bias and confounding, and are not able to provide predictions of risk until humans are exposed. Clinical toxicology services and Poison Centres around the world accumulate a great amount of toxicological-related information that may contribute to the evidence-based medicine and research and so collaborate with all the risk assessment disciplines. The information obtained from these services and centers has the potential to prioritize existing chemical assessment processes or to influence scheduling of classes of chemicals. Prioritization process may be improved by evaluating Poisons Centres statistics about frequency of cases, severity of effects, detection of unusual circumstances of exposure, as well as vulnerable sub-populations. International efforts for the harmonization of these data offer a useful tool to take advantage of this global information. Case

  13. Information about the new 8-group delayed neutron set preparation

    International Nuclear Information System (INIS)

    Svarny, J.

    1998-01-01

    Some comments on the present state of delayed neutron data preparation are given, and a preliminary analysis of the new 8-group delayed neutron data (relative abundances) is presented. Comparisons of the 8-group set to the 6-group set are given for a rod drop experiment (Unit 1, Cycle 14, NPP Dukovany).(Author)

  14. Prevention and Control of Methicillin-Resistant Staphylococcus aureus in Acute Care Settings.

    Science.gov (United States)

    Lee, Andie S; Huttner, Benedikt; Harbarth, Stephan

    2016-12-01

    Methicillin-resistant Staphylococcus aureus (MRSA) is a leading cause of health care-associated infections worldwide. Controversies with regard to the effectiveness of various MRSA control strategies have contributed to varying approaches to the control of this pathogen in different settings. However, new evidence from large-scale studies has emerged, particularly with regards to MRSA screening and decolonization strategies, which will inform future control practices. The implementation as well as outcomes of control measures in the real world is not only influenced by scientific evidence but also depends on economic, administrative, governmental, and political influences. Copyright © 2016 Elsevier Inc. All rights reserved.

  15. Book review: Large igneous provinces

    Science.gov (United States)

    du Bray, Edward A.

    2015-01-01

    This book presents a comprehensive compilation of all aspects of large igneous provinces (LIPs). Published in 2014, the book is now the definitive source of information on the petrogenesis of this type of globally important, voluminous magmatic activity. In the first few pages, LIPs are characterized as magmatic provinces with areal extents >0.1 Mkm2 that are dominated by mafic magmas emplaced or erupted in intraplate settings during relatively short (1–5 m.y.) time intervals. Given these parameters, particularly areal extent, LIPs clearly represent significant contributions to global geologic evolution through time. This point is underscored, also in the introductory chapter, by a series of figures that aptly characterize the global time-space distribution of LIPs; an accompanying, particularly useful table identifies individual LIPs, quantifies their basic characteristics, and enumerates pertinent references. Accordingly, this compilation is a welcome addition to the geologic literature.

  16. Setting up fuel supply strategies for large-scale bio-energy projects using agricultural and forest residues. A methodology for developing countries

    International Nuclear Information System (INIS)

    Junginger, M.

    2000-08-01

    The objective of this paper is to develop a coherent methodology to set up fuel supply strategies for large-scale biomass-conversion units. This method will explicitly take risks and uncertainties regarding availability and costs in relation to time into account. This paper aims at providing general guidelines, which are not country-specific. These guidelines cannot provide 'perfect fit'-solutions, but aim to give general help to overcome barriers and to set up supply strategies. It will mainly focus on residues from the agricultural and forestry sector. This study focuses on electricity or both electricity and heat production (CHP) with plant scales between 10 and 40 MWe. This range is chosen due to rules of economies of scale. In large-scale plants the benefits of increased efficiency outweigh increased transportation costs, allowing a lower price per kWh which in turn may allow higher biomass costs. However, fuel-supply risks tend to get higher with increasing plant size, which makes it more important to assess them for large(r) conversion plants. Although the methodology does not focus on a specific conversion technology, it should be stressed that the technology must be able to handle a wide variety of biomass fuels with different characteristics because many biomass residues are not available the year round and various fuels are needed for a constant supply. The methodology allows for comparing different technologies (with known investment and operational and maintenance costs from literature) and evaluation for different fuel supply scenarios. In order to demonstrate the methodology, a case study was carried out for the north-eastern part of Thailand (Isaan), an agricultural region. The research was conducted in collaboration with the Regional Wood Energy Development Programme in Asia (RWEDP), a project of the UN Food and Agricultural Organization (FAO) in Bangkok, Thailand. In Section 2 of this paper the methodology will be presented. In Section 3 the economic

  17. Hesitant fuzzy sets theory

    CERN Document Server

    Xu, Zeshui

    2014-01-01

    This book provides the readers with a thorough and systematic introduction to hesitant fuzzy theory. It presents the most recent research results and advanced methods in the field. These includes: hesitant fuzzy aggregation techniques, hesitant fuzzy preference relations, hesitant fuzzy measures, hesitant fuzzy clustering algorithms and hesitant fuzzy multi-attribute decision making methods. Since its introduction by Torra and Narukawa in 2009, hesitant fuzzy sets have become more and more popular and have been used for a wide range of applications, from decision-making problems to cluster analysis, from medical diagnosis to personnel appraisal and information retrieval. This book offers a comprehensive report on the state-of-the-art in hesitant fuzzy sets theory and applications, aiming at becoming a reference guide for both researchers and practitioners in the area of fuzzy mathematics and other applied research fields (e.g. operations research, information science, management science and engineering) chara...

  18. Information Society Needs of Managers in a Large Governmental Organisation

    Science.gov (United States)

    Broos, Elizabeth; Cronje, Johannes C.

    2009-01-01

    Dealing effectively with information and communication technology in the information society is a complex task and the human dimension is often under-estimated. This paper tries to give a voice to some managers about their experiences with information, communication and technology in their working environment, which involves participating in a…

  19. Logical analysis of diffuse large B-cell lymphomas.

    Science.gov (United States)

    Alexe, G; Alexe, S; Axelrod, D E; Hammer, P L; Weissmann, D

    2005-07-01

    The goal of this study is to re-examine the oligonucleotide microarray dataset of Shipp et al., which contains the intensity levels of 6817 genes of 58 patients with diffuse large B-cell lymphoma (DLBCL) and 19 with follicular lymphoma (FL), by means of the combinatorics, optimisation, and logic-based methodology of logical analysis of data (LAD). The motivations for this new analysis included the previously demonstrated capabilities of LAD and its expected potential (1) to identify different informative genes than those discovered by conventional statistical methods, (2) to identify combinations of gene expression levels capable of characterizing different types of lymphoma, and (3) to assemble collections of such combinations that if considered jointly are capable of accurately distinguishing different types of lymphoma. The central concept of LAD is a pattern or combinatorial biomarker, a concept that resembles a rule as used in decision tree methods. LAD is able to exhaustively generate the collection of all those patterns which satisfy certain quality constraints, through a systematic combinatorial process guided by clear optimization criteria. Then, based on a set covering approach, LAD aggregates the collection of patterns into classification models. In addition, LAD is able to use the information provided by large collections of patterns in order to extract subsets of variables, which collectively are able to distinguish between different types of disease. For the differential diagnosis of DLBCL versus FL, a model based on eight significant genes is constructed and shown to have a sensitivity of 94.7% and a specificity of 100% on the test set. For the prognosis of good versus poor outcome among the DLBCL patients, a model is constructed on another set consisting also of eight significant genes, and shown to have a sensitivity of 87.5% and a specificity of 90% on the test set. The genes selected by LAD also work well as a basis for other kinds of statistical

  20. Does information form matter when giving tailored risk information to patients in clinical settings? A review of patients’ preferences and responses

    Directory of Open Access Journals (Sweden)

    Harris R

    2017-03-01

    Full Text Available Rebecca Harris, Claire Noble, Victoria Lowers Institute of Psychology, Health and Society, University of Liverpool, Liverpool, UK Abstract: Neoliberal emphasis on “responsibility” has colonized many aspects of public life, including how health care is provided. Clinical risk assessment of patients based on a range of data concerned with lifestyle, behavior, and health status has assumed a growing importance in many health systems. It is a mechanism whereby responsibility for self (preventive) care can be shifted to patients, provided that risk assessment data is communicated to patients in a way which is engaging and motivates change. This study aimed to look at whether the form in which tailored risk information was presented in a clinical setting (for example, using photographs, online data, diagrams etc.) was associated with differences in patients’ responses and preferences to the material presented. We undertook a systematic review using electronic searching of nine databases, along with handsearching specialist journals and backward and forward citation searching. We identified eleven studies (eight with a randomized controlled trial design). Seven studies involved the use of computerized health risk assessments in primary care. Beneficial effects were relatively modest, even in studies merely aiming to enhance patient–clinician communication or to modify patients’ risk perceptions. In our paper, we discuss the apparent importance of the accompanying discourse between patient and clinician, which appears to be necessary in order to impart meaning to information on “risk,” irrespective of whether the material is personalized, or even presented in a vivid way. Thus, while expanding computer technologies might be able to generate a highly personalized account of patients’ risk in a time efficient way, the need for face-to-face interactions to impart meaning to the data means that these new technologies cannot fully address the

  1. Spatial compression algorithm for the analysis of very large multivariate images

    Science.gov (United States)

    Keenan, Michael R [Albuquerque, NM

    2008-07-15

    A method for spatially compressing data sets enables the efficient analysis of very large multivariate images. The spatial compression algorithms use a wavelet transformation to map an image into a compressed image containing a smaller number of pixels that retain the original image's information content. Image analysis can then be performed on a compressed data matrix consisting of a reduced number of significant wavelet coefficients. Furthermore, a block algorithm can be used for performing common operations more efficiently. The spatial compression algorithms can be combined with spectral compression algorithms to provide further computational efficiencies.
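
    The general idea, transforming an image with a wavelet, keeping only the most significant coefficients, and working on the reduced representation, can be sketched with PyWavelets as below. This is only an illustration of wavelet-based compression on a synthetic single-channel image, not the authors' specific spatial/block compression algorithm.

    ```python
    # Sketch of wavelet-based spatial compression: transform an image, keep only the largest
    # 5% of coefficients, and reconstruct. Illustrative only; data and parameters are invented.
    import numpy as np
    import pywt

    rng = np.random.default_rng(6)
    x = np.linspace(0, 4 * np.pi, 256)
    image = np.outer(np.sin(x), np.cos(x)) + 0.05 * rng.normal(size=(256, 256))

    coeffs = pywt.wavedec2(image, wavelet="db4", level=4)
    arr, slices = pywt.coeffs_to_array(coeffs)

    keep = 0.05                                          # fraction of coefficients retained
    threshold = np.quantile(np.abs(arr), 1 - keep)
    arr_compressed = np.where(np.abs(arr) >= threshold, arr, 0.0)

    reconstructed = pywt.waverec2(
        pywt.array_to_coeffs(arr_compressed, slices, output_format="wavedec2"), wavelet="db4"
    )[: image.shape[0], : image.shape[1]]

    print("retained coefficients:", np.count_nonzero(arr_compressed), "of", arr.size)
    print("reconstruction RMSE:", float(np.sqrt(np.mean((reconstructed - image) ** 2))))
    ```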

  2. Combining qualitative and quantitative operational research methods to inform quality improvement in pathways that span multiple settings.

    Science.gov (United States)

    Crowe, Sonya; Brown, Katherine; Tregay, Jenifer; Wray, Jo; Knowles, Rachel; Ridout, Deborah A; Bull, Catherine; Utley, Martin

    2017-08-01

    Improving integration and continuity of care across sectors within resource constraints is a priority in many health systems. Qualitative operational research methods of problem structuring have been used to address quality improvement in services involving multiple sectors but not in combination with quantitative operational research methods that enable targeting of interventions according to patient risk. We aimed to combine these methods to augment and inform an improvement initiative concerning infants with congenital heart disease (CHD) whose complex care pathway spans multiple sectors. Soft systems methodology was used to consider systematically changes to services from the perspectives of community, primary, secondary and tertiary care professionals and a patient group, incorporating relevant evidence. Classification and regression tree (CART) analysis of national audit datasets was conducted along with data visualisation designed to inform service improvement within the context of limited resources. A 'Rich Picture' was developed capturing the main features of services for infants with CHD pertinent to service improvement. This was used, along with a graphical summary of the CART analysis, to guide discussions about targeting interventions at specific patient risk groups. Agreement was reached across representatives of relevant health professions and patients on a coherent set of targeted recommendations for quality improvement. These fed into national decisions about service provision and commissioning. When tackling complex problems in service provision across multiple settings, it is important to acknowledge and work with multiple perspectives systematically and to consider targeting service improvements in response to confined resources. Our research demonstrates that applying a combination of qualitative and quantitative operational research methods is one approach to doing so that warrants further consideration. Published by the BMJ Publishing Group

  3. The Amateurs' Love Affair with Large Datasets

    Science.gov (United States)

    Price, Aaron; Jacoby, S. H.; Henden, A.

    2006-12-01

    Amateur astronomers are professionals in other areas. They bring expertise from such varied and technical careers as computer science, mathematics, engineering, and marketing. These skills, coupled with an enthusiasm for astronomy, can be used to help manage the large data sets coming online in the next decade. We will show specific examples where teams of amateurs have been involved in mining large, online data sets and have authored and published their own papers in peer-reviewed astronomical journals. Using the proposed LSST database as an example, we will outline a framework for involving amateurs in data analysis and education with large astronomical surveys.

  4. The ambient dose equivalent at flight altitudes: a fit to a large set of data using a Bayesian approach

    International Nuclear Information System (INIS)

    Wissmann, F; Reginatto, M; Moeller, T

    2010-01-01

    The problem of finding a simple, generally applicable description of worldwide measured ambient dose equivalent rates at aviation altitudes between 8 and 12 km is difficult to solve due to the large variety of functional forms and parametrisations that are possible. We present an approach that uses Bayesian statistics and Monte Carlo methods to fit mathematical models to a large set of data and to compare the different models. About 2500 data points measured in the periods 1997-1999 and 2003-2006 were used. Since the data cover wide ranges of barometric altitude, vertical cut-off rigidity and phases in the solar cycle 23, we developed functions which depend on these three variables. Whereas the dependence on the vertical cut-off rigidity is described by an exponential, the dependences on barometric altitude and solar activity may be approximated by linear functions in the ranges under consideration. Therefore, a simple Taylor expansion was used to define different models and to investigate the relevance of the different expansion coefficients. With the method presented here, it is possible to obtain probability distributions for each expansion coefficient and thus to extract reliable uncertainties even for the dose rate evaluated. The resulting function agrees well with new measurements made at fixed geographic positions and during long haul flights covering a wide range of latitudes.
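
    A minimal sketch of fitting the stated functional form, linear in barometric altitude and solar activity and exponential in vertical cut-off rigidity, is given below using synthetic data and ordinary least squares (scipy's curve_fit) in place of the paper's Bayesian Monte Carlo treatment; the parameter values and units are invented for illustration.

    ```python
    # Minimal sketch: dose rate modeled as (a0 + a1*h + a2*s) * exp(-b*Rc), fitted by
    # least squares to synthetic data. Stands in for the Bayesian/MCMC analysis of the paper.
    import numpy as np
    from scipy.optimize import curve_fit

    def dose_rate(X, a0, a1, a2, b):
        h, s, rc = X                       # altitude, solar-activity index, cut-off rigidity
        return (a0 + a1 * h + a2 * s) * np.exp(-b * rc)

    rng = np.random.default_rng(7)
    n = 300
    h = rng.uniform(8, 12, n)              # barometric altitude, km
    s = rng.uniform(0, 1, n)               # normalized solar-activity index
    rc = rng.uniform(0, 17, n)             # vertical cut-off rigidity, GV

    true_params = (1.0, 0.4, -0.3, 0.08)   # invented "true" values for the synthetic data
    H10 = dose_rate((h, s, rc), *true_params) * (1 + 0.05 * rng.normal(size=n))

    popt, pcov = curve_fit(dose_rate, (h, s, rc), H10, p0=(1, 0.1, -0.1, 0.05))
    print("fitted parameters:", np.round(popt, 3))
    print("1-sigma uncertainties:", np.round(np.sqrt(np.diag(pcov)), 3))
    ```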

  5. Schools and Informal Science Settings: Collaborate, Co-Exist, or Assimilate?

    Science.gov (United States)

    Adams, Jennifer D.; Gupta, Preeti; DeFelice, Amy

    2012-01-01

    In this metalogue we build on the arguments presented by Puvirajah, Verma and Webb to discuss the nature of authentic science learning experiences in context of collaborations between schools and out-of-school time settings. We discuss the role of stakeholders in creating collaborative science learning practices and affordances of out of school…

  6. Cancer education and effective dissemination: information access is not enough.

    Science.gov (United States)

    Ousley, Anita L; Swarz, Jeffrey A; Milliken, Erin L; Ellis, Steven

    2010-06-01

    Education is the main avenue for disseminating new research findings into clinical practice. Understanding factors that affect translation of research into practice may help cancer educators design programs that shorten the time it takes for research-indicated practices to become standard care. To understand various factors, the National Cancer Institute (NCI) Office of Education and Special Initiatives (OESI), with individual cooperation from the Oncology Nursing Society (ONS), American Society of Clinical Oncology (ASCO), and Association of Oncology Social Work (AOSW), administered a Practitioner Information Needs survey to five different types of practitioners involved in cancer care. While most of the 2,864 practitioners (83%) agreed they had access to current practice information, practitioners in large practice settings were more likely to report having access to research than those in small practice settings. However, only 33% indicated that they had adequate time to access the information. Colleagues or experts within the organization were cited as the most frequently relied-on information resource (60%), and peer-reviewed journals were cited second (57%). Overall, 66% strongly or somewhat agreed that their organizations exhibit effective change management practices. A majority (69%) agreed that implementation of new practices is hindered by the lack of available staff time. Financial factors and the characteristics of the information presented were also believed to be factors contributing to research implementation. Group differences were observed among practitioner groups and practice settings for some factors.

  7. Frontiers of higher order fuzzy sets

    CERN Document Server

    Tahayori, Hooman

    2015-01-01

    Frontiers of Higher Order Fuzzy Sets strives to improve the theoretical aspects of general and Interval Type-2 fuzzy sets and provides a unified representation theorem for higher order fuzzy sets. Moreover, the book elaborates on the concept of gradual elements and their integration with higher order fuzzy sets. This book also introduces new frameworks for information granulation based on general T2FSs, IT2FSs, gradual elements, shadowed sets and rough sets. In particular, the properties and characteristics of the newly proposed frameworks are studied. Such new frameworks are shown to be better suited to exploitation in real applications. Higher order fuzzy sets that result from the integration of general T2FSs, IT2FSs, gradual elements, shadowed sets and rough sets are shown to be suitable for application in the fields of bioinformatics, business, management, ambient intelligence, medicine, cloud computing and smart grids. Presents new variations of fuzzy set frameworks and new areas of applicabili...

  8. An Aerial-Ground Robotic System for Navigation and Obstacle Mapping in Large Outdoor Areas

    Directory of Open Access Journals (Sweden)

    David Zapata

    2013-01-01

    Full Text Available There are many outdoor robotic applications where a robot must reach a goal position or explore an area without previous knowledge of the environment around it. Additionally, other applications (like path planning) require the use of known maps or previous information about the environment. This work presents a system composed of a terrestrial and an aerial robot that cooperate and share sensor information in order to address those requirements. The ground robot is able to navigate in an unknown large environment aided by visual feedback from a camera on board the aerial robot. At the same time, the obstacles are mapped in real time by combining the information from the camera and the positioning system of the ground robot. A set of experiments was carried out to verify the system's applicability. The experiments were performed in a simulation environment and outdoors with a medium-sized ground robot and a mini quad-rotor. The proposed robotic system shows outstanding results in simultaneous navigation and mapping applications in large outdoor environments.

  9. Setting Priorities For Large Research Facility Projects Supported By the National Science Foundation

    National Research Council Canada - National Science Library

    2005-01-01

    ...) level has stalled in the face of a backlog of approved but unfunded projects. Second, the rationale and criteria used to select projects and set priorities among projects for MREFC funding have not been clearly and publicly articulated...

  10. [A medical consumable material management information system].

    Science.gov (United States)

    Tang, Guoping; Hu, Liang

    2014-05-01

    Medical consumables are essential supplies for carrying out medical work; they span a wide range of varieties and are used in large quantities. How to manage them feasibly and efficiently has been a topic of wide concern. This article discusses how to design a medical consumable material management information system with a set of standardized processes that brings together medical supplies administrators, suppliers and clinical departments. An advanced management mode, enterprise resource planning (ERP), was applied throughout the system design process.

  11. Interactive Visualization of Large-Scale Hydrological Data using Emerging Technologies in Web Systems and Parallel Programming

    Science.gov (United States)

    Demir, I.; Krajewski, W. F.

    2013-12-01

    As geoscientists are confronted with increasingly massive datasets from environmental observations to simulations, one of the biggest challenges is having the right tools to gain scientific insight from the data and communicate the understanding to stakeholders. Recent developments in web technologies make it easy to manage, visualize and share large data sets with the general public. Novel visualization techniques and dynamic user interfaces allow users to interact with data and modify parameters to create custom views, gaining insight from simulations and environmental observations. This requires developing new data models and intelligent knowledge discovery techniques to explore and extract information from complex computational simulations or large data repositories. Scientific visualization will be an increasingly important component in building comprehensive environmental information platforms. This presentation provides an overview of the trends and challenges in the field of scientific visualization, and demonstrates information visualization and communication tools developed in light of these challenges.

  12. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species

    Directory of Open Access Journals (Sweden)

    Kristopher J. L. Irizarry

    2016-01-01

    Full Text Available Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost-effective genomic sequencing and genotyping technologies provide unparalleled opportunity for incorporating genomics knowledge in management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provides value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enables an informed approach to endangered species management.

  13. Integrating Genomic Data Sets for Knowledge Discovery: An Informed Approach to Management of Captive Endangered Species.

    Science.gov (United States)

    Irizarry, Kristopher J L; Bryant, Doug; Kalish, Jordan; Eng, Curtis; Schmidt, Peggy L; Barrett, Gini; Barr, Margaret C

    2016-01-01

    Many endangered captive populations exhibit reduced genetic diversity resulting in health issues that impact reproductive fitness and quality of life. Numerous cost-effective genomic sequencing and genotyping technologies provide unparalleled opportunity for incorporating genomics knowledge in management of endangered species. Genomic data, such as sequence data, transcriptome data, and genotyping data, provide critical information about a captive population that, when leveraged correctly, can be utilized to maximize population genetic variation while simultaneously reducing unintended introduction or propagation of undesirable phenotypes. Current approaches aimed at managing endangered captive populations utilize species survival plans (SSPs) that rely upon mean kinship estimates to maximize genetic diversity while simultaneously avoiding artificial selection in the breeding program. However, as genomic resources increase for each endangered species, the potential knowledge available for management also increases. Unlike model organisms in which considerable scientific resources are used to experimentally validate genotype-phenotype relationships, endangered species typically lack the necessary sample sizes and economic resources required for such studies. Even so, in the absence of experimentally verified genetic discoveries, genomics data still provides value. In fact, bioinformatics and comparative genomics approaches offer mechanisms for translating these raw genomics data sets into integrated knowledge that enables an informed approach to endangered species management.

  14. Large Scale Metric Learning for Distance-Based Image Classification on Open Ended Data Sets

    NARCIS (Netherlands)

    Mensink, T.; Verbeek, J.; Perronnin, F.; Csurka, G.; Farinella, G.M.; Battiato, S.; Cipolla, R,

    2013-01-01

    Many real-life large-scale datasets are open-ended and dynamic: new images are continuously added to existing classes, new classes appear over time, and the semantics of existing classes might evolve too. Therefore, we study large-scale image classification methods that can incorporate new classes
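
    The record above is truncated, but the setting it describes, distance-based classification in which classes are represented so that new ones can be added cheaply, can be illustrated with a nearest-class-mean sketch. The class names, feature dimensions and the optional linear projection (standing in for a learned metric) below are hypothetical, and the metric-learning step itself is not shown.

    ```python
    import numpy as np

    class NearestClassMean:
        """Distance-based classifier: each class is represented by the mean of its
        (optionally metric-projected) features, so new classes can be added at
        near-zero cost without retraining."""
        def __init__(self, projection=None):
            self.projection = projection      # learned metric W; identity if None
            self.means, self.labels = [], []

        def _embed(self, x):
            return x if self.projection is None else x @ self.projection.T

        def add_class(self, label, features):
            self.labels.append(label)
            self.means.append(self._embed(features).mean(axis=0))

        def predict(self, x):
            d = np.linalg.norm(self._embed(x)[:, None, :] - np.array(self.means)[None], axis=-1)
            return [self.labels[i] for i in d.argmin(axis=1)]

    rng = np.random.default_rng(0)
    clf = NearestClassMean()
    clf.add_class("cat", rng.normal(0, 1, (50, 16)))
    clf.add_class("dog", rng.normal(3, 1, (50, 16)))
    print(clf.predict(rng.normal(3, 1, (5, 16))))     # assigned to the nearest class mean
    clf.add_class("fox", rng.normal(-3, 1, (50, 16))) # a new class appears: just add its mean
    ```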

  15. Motivated Reasoning, Political Information, and Information Literacy Education

    Science.gov (United States)

    Lenker, Mark

    2016-01-01

    Research in psychology and political science has identified motivated reasoning as a set of biases that inhibit a person's ability to process political information objectively. This research has important implications for the information literacy movement's aims of fostering lifelong learning and informed citizenship. This essay argues that…

  16. Generalized rough sets

    International Nuclear Information System (INIS)

    Rady, E.A.; Kozae, A.M.; Abd El-Monsef, M.M.E.

    2004-01-01

    The process of analyzing data under uncertainty is a main goal for many real-life problems. Statistical analysis of such data is an area of active research interest. The aim of this paper is to introduce a new method concerning the generalization and modification of the rough set theory introduced earlier by Pawlak [Int. J. Comput. Inform. Sci. 11 (1982) 314
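
    For readers unfamiliar with the theory being generalised here, the sketch below computes the classical Pawlak lower and upper approximations of a target set from an indiscernibility partition. The toy attribute table and object names are invented for illustration; the paper's own generalisation is not reproduced.

    ```python
    from collections import defaultdict

    def partition(objects, attrs, table):
        """Group objects into indiscernibility classes w.r.t. the chosen attributes."""
        classes = defaultdict(set)
        for obj in objects:
            key = tuple(table[obj][a] for a in attrs)
            classes[key].add(obj)
        return list(classes.values())

    def approximations(target, objects, attrs, table):
        """Pawlak lower/upper approximations of a target set."""
        lower, upper = set(), set()
        for block in partition(objects, attrs, table):
            if block <= target:
                lower |= block          # block lies entirely inside the target set
            if block & target:
                upper |= block          # block overlaps the target set
        return lower, upper

    # Toy information table: objects described by two attributes
    table = {
        "x1": {"colour": "red", "size": "small"},
        "x2": {"colour": "red", "size": "small"},
        "x3": {"colour": "red", "size": "large"},
        "x4": {"colour": "blue", "size": "large"},
    }
    target = {"x1", "x3"}
    low, up = approximations(target, table.keys(), ["colour", "size"], table)
    print("lower:", low)      # certainly in the target, given the attributes
    print("upper:", up)       # possibly in the target
    print("boundary:", up - low)
    ```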

  17. Nursing Minimum Data Set Based on EHR Archetypes Approach.

    Science.gov (United States)

    Spigolon, Dandara N; Moro, Cláudia M C

    2012-01-01

    The establishment of a Nursing Minimum Data Set (NMDS) can facilitate the use of health information systems. The adoption of such sets, and their representation based on archetypes, is a way of developing and supporting health systems. The objective of this paper is to describe the definition of a minimum data set for nursing in endometriosis, represented with archetypes. The study was divided into two steps: defining the Nursing Minimum Data Set for endometriosis, and developing archetypes related to the NMDS. The nursing data set for endometriosis was represented in the form of an archetype, using the whole perception of the evaluation item, organs and senses. This form of representation is an important tool for semantic interoperability and knowledge representation in health information systems.

  18. Shipping Information Pipeline

    DEFF Research Database (Denmark)

    Jensen, Thomas

    This thesis applies theoretical perspectives from the Information Systems (IS) research field to propose how Information Technology (IT) can improve containerized shipping. This question is addressed by developing a set of design principles for an information infrastructure for sharing shipping information named the Shipping Information Pipeline (SIP). Review of the literature revealed that IS research prescribed a set of meta-design principles, including digitalization and digital collaboration by implementation of Inter-Organizational Systems based on Electronic Data Interchange (EDI) messages. A number of critical issues related to creating a more efficient shipping industry are identified. These include that shipments depend on shipping information, that shipments often are delayed due to issues with documentation, and that EDI messages account for only a minor part of the needed information…

  19. Priority setting: what constitutes success? A conceptual framework for successful priority setting.

    Science.gov (United States)

    Sibbald, Shannon L; Singer, Peter A; Upshur, Ross; Martin, Douglas K

    2009-03-05

    The sustainability of healthcare systems worldwide is threatened by a growing demand for services and expensive innovative technologies. Decision makers struggle in this environment to set priorities appropriately, particularly because they lack consensus about which values should guide their decisions. One way to approach this problem is to determine what all relevant stakeholders understand successful priority setting to mean. The goal of this research was to develop a conceptual framework for successful priority setting. Three separate empirical studies were completed using qualitative data collection methods (one-on-one interviews with healthcare decision makers from across Canada; focus groups with representation of patients, caregivers and policy makers; and Delphi study including scholars and decision makers from five countries). This paper synthesizes the findings from three studies into a framework of ten separate but interconnected elements germane to successful priority setting: stakeholder understanding, shifted priorities/reallocation of resources, decision making quality, stakeholder acceptance and satisfaction, positive externalities, stakeholder engagement, use of explicit process, information management, consideration of values and context, and revision or appeals mechanism. The ten elements specify both quantitative and qualitative dimensions of priority setting and relate to both process and outcome components. To our knowledge, this is the first framework that describes successful priority setting. The ten elements identified in this research provide guidance for decision makers and a common language to discuss priority setting success and work toward improving priority setting efforts.

  20. MEDIA-INFORMATION ASPECT OF THE WORK OF THE NATIONAL POLICE

    OpenAIRE

    Трофименко, Володимир Анатолійович

    2018-01-01

    Problem setting. Modernity dictates new conditions for the existence and effective functioning of state institutions, in particular the law-enforcement system. The public demands not only coordinated, effective work of the national police towards its intended purpose, but also comprehensive reporting and informing of citizens about law enforcement activities. Today, a large number of mass media of different types have been created and exist in Ukraine: from classical...

  1. Stabilizing model predictive control : on the enlargement of the terminal set

    NARCIS (Netherlands)

    Brunner, F.D.; Lazar, M.; Allgöwer, F.

    2015-01-01

    It is well known that a large terminal set leads to a large region where the model predictive control problem is feasible without the need for a long prediction horizon. This paper proposes a new method for the enlargement of the terminal set. Different from existing approaches, the method uses the

  2. Large-scale climatic anomalies affect marine predator foraging behaviour and demography

    Science.gov (United States)

    Bost, Charles A.; Cotté, Cedric; Terray, Pascal; Barbraud, Christophe; Bon, Cécile; Delord, Karine; Gimenez, Olivier; Handrich, Yves; Naito, Yasuhiko; Guinet, Christophe; Weimerskirch, Henri

    2015-10-01

    Determining the links between the behavioural and population responses of wild species to environmental variations is critical for understanding the impact of climate variability on ecosystems. Using long-term data sets, we show how large-scale climatic anomalies in the Southern Hemisphere affect the foraging behaviour and population dynamics of a key marine predator, the king penguin. When large-scale subtropical dipole events occur simultaneously in both subtropical Southern Indian and Atlantic Oceans, they generate tropical anomalies that shift the foraging zone southward. Consequently the distances that penguins foraged from the colony and their feeding depths increased and the population size decreased. This represents an example of a robust and fast impact of large-scale climatic anomalies affecting a marine predator through changes in its at-sea behaviour and demography, despite lack of information on prey availability. Our results highlight a possible behavioural mechanism through which climate variability may affect population processes.

  3. Social Set Visualizer (SoSeVi) II

    DEFF Research Database (Denmark)

    Flesch, Benjamin; Vatrapu, Ravi

    2016-01-01

    This paper reports the second iteration of the Social Set Visualizer (SoSeVi), a set-theoretical visual analytics dashboard of big social data. In order to further demonstrate its usefulness in large-scale visual analytics tasks of individual and collective behavior of actors in social networks, the current iteration of the Social Set Visualizer (SoSeVi), version II, builds on recent advancements in visualizing set intersections. The development of the SoSeVi dashboard involved cutting-edge open source visual analytics libraries (D3.js) and creation of new visualizations such as actor mobility...

  4. Computing Convex Coverage Sets for Faster Multi-Objective Coordination

    NARCIS (Netherlands)

    Roijers, D.M.; Whiteson, S.; Oliehoek, F.A.

    2015-01-01

    In this article, we propose new algorithms for multi-objective coordination graphs (MO-CoGs). Key to the efficiency of these algorithms is that they compute a convex coverage set (CCS) instead of a Pareto coverage set (PCS). Not only is a CCS a sufficient solution set for a large class of problems,

  5. Information resource preferences by general pediatricians in office settings: a qualitative study

    Directory of Open Access Journals (Sweden)

    Lehmann Harold P

    2005-10-01

    Full Text Available Abstract Background Information needs and resource preferences of office-based general pediatricians have not been well characterized. Methods Data collected from a sample of twenty office-based urban/suburban general pediatricians consisted of: (a) a demographic survey about participants' practice and computer use, (b) semi-structured interviews on their use of different types of information resources and (c) semi-structured interviews on perceptions of information needs and resource preferences in response to clinical vignettes representing cases in Genetics and Infectious Diseases. Content analysis of interviews provided participants' perceived use of resources and their perceived questions and preferred resources in response to vignettes. Results Participants' average time in practice was 15.4 years (2–28 years). All had in-office online access. Participants identified specialist/generalist colleagues, general/specialty pediatric texts, drug formularies, federal government/professional organization Websites and medical portals (when available) as preferred information sources. They did not identify decision-making texts, evidence-based reviews, journal abstracts, medical librarians or consumer health information for routine office use. In response to clinical vignettes in Genetics and Infectious Diseases, participants identified Question Types about patient-specific (diagnosis, history and findings) and general medical (diagnostic, therapeutic and referral guidelines) information. They identified specialists and specialty textbooks, history and physical examination, colleagues and general pediatric textbooks, and federal and professional organizational Websites as information sources. Participants with access to portals identified them as information resources in lieu of texts. For Genetics vignettes, participants identified questions about prenatal history, disease etiology and treatment guidelines. For Genetics vignettes, they identified

  6. Identifying and applying psychological theory to setting and achieving rehabilitation goals.

    Science.gov (United States)

    Scobbie, Lesley; Wyke, Sally; Dixon, Diane

    2009-04-01

    Goal setting is considered to be a fundamental part of rehabilitation; however, theories of behaviour change relevant to goal-setting practice have not been comprehensively reviewed. (i) To identify and discuss specific theories of behaviour change relevant to goal-setting practice in the rehabilitation setting. (ii) To identify 'candidate' theories that offer the most potential to inform clinical practice. The rehabilitation and self-management literature was systematically searched to identify review papers or empirical studies that proposed a specific theory of behaviour change relevant to setting and/or achieving goals in a clinical context. Data from included papers were extracted under the headings of: key constructs, clinical application and empirical support. Twenty-four papers were included in the review which proposed a total of five theories: (i) social cognitive theory, (ii) goal setting theory, (iii) health action process approach, (iv) proactive coping theory, and (v) the self-regulatory model of illness behaviour. The first three of these theories demonstrated most potential to inform clinical practice, on the basis of their capacity to inform interventions that resulted in improved patient outcomes. Social cognitive theory, goal setting theory and the health action process approach are theories of behaviour change that can inform clinicians in the process of setting and achieving goals in the rehabilitation setting. Overlapping constructs within these theories have been identified, and can be applied in clinical practice through the development and evaluation of a goal-setting practice framework.

  7. Algorithms for detecting and analysing autocatalytic sets.

    Science.gov (United States)

    Hordijk, Wim; Smith, Joshua I; Steel, Mike

    2015-01-01

    Autocatalytic sets are considered to be fundamental to the origin of life. Prior theoretical and computational work on the existence and properties of these sets has relied on a fast algorithm for detecting self-sustaining autocatalytic sets in chemical reaction systems. Here, we introduce and apply a modified version and several extensions of the basic algorithm: (i) a modification aimed at reducing the number of calls to the computationally most expensive part of the algorithm, (ii) the application of a previously introduced extension of the basic algorithm to sample the smallest possible autocatalytic sets within a reaction network, and the application of a statistical test which provides a probable lower bound on the number of such smallest sets, (iii) the introduction and application of another extension of the basic algorithm to detect autocatalytic sets in a reaction system where molecules can also inhibit (as well as catalyse) reactions, (iv) a further, more abstract, extension of the theory behind searching for autocatalytic sets. (i) The modified algorithm outperforms the original one in the number of calls to the computationally most expensive procedure, which, in some cases, also leads to a significant improvement in overall running time, (ii) our statistical test provides strong support for the existence of very large numbers (even millions) of minimal autocatalytic sets in a well-studied polymer model, where these minimal sets share about half of their reactions on average, (iii) "uninhibited" autocatalytic sets can be found in reaction systems that allow inhibition, but their number and sizes depend on the level of inhibition relative to the level of catalysis. (i) Improvements in the overall running time when searching for autocatalytic sets can potentially be obtained by using a modified version of the algorithm, (ii) the existence of large numbers of minimal autocatalytic sets can have important consequences for the possible evolvability of
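
    A minimal sketch of the basic detection idea referred to above (without the modifications, sampling extensions or inhibition handling) is given below: reactions are pruned until every remaining reaction has all its reactants and at least one catalyst inside the closure of the food set. The toy reaction system is invented for illustration.

    ```python
    def closure(food, reactions):
        """Molecules reachable from the food set using the given reactions."""
        mols = set(food)
        changed = True
        while changed:
            changed = False
            for reactants, products, _ in reactions:
                if reactants <= mols and not products <= mols:
                    mols |= products
                    changed = True
        return mols

    def max_raf(food, reactions):
        """Iteratively prune reactions that are unsupported or uncatalysed within
        the closure; the fixed point is the maximal autocatalytic (RAF) subset,
        possibly empty."""
        current = list(reactions)
        while True:
            mols = closure(food, current)
            kept = [r for r in current
                    if r[0] <= mols and r[2] & mols]   # reactants available, some catalyst present
            if len(kept) == len(current):
                return kept
            current = kept

    # Toy reaction system: (reactants, products, catalysts)
    reactions = [
        ({"a", "b"}, {"ab"}, {"abc"}),
        ({"ab", "c"}, {"abc"}, {"ab"}),
        ({"x", "y"}, {"xy"}, {"zzz"}),   # catalyst never producible -> pruned
    ]
    food = {"a", "b", "c"}
    print(max_raf(food, reactions))
    ```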

  8. Standard setting in medical education: fundamental concepts and emerging challenges.

    Science.gov (United States)

    Mortaz Hejri, Sara; Jalili, Mohammad

    2014-01-01

    The process of determining the minimum pass level to separate the competent students from those who do not perform well enough is called standard setting. A large number of methods are widely used to set cut-scores for both written and clinical examinations. There are some challenging issues pertaining to any standard setting procedure. Ignoring these concerns would result in a large dispute regarding the credibility and defensibility of the method. The goal of this review is to provide a basic understanding of the key concepts and challenges in standard setting and to suggest some recommendations to overcome the challenging issues for educators and policymakers who are dealing with decision-making in this field.

  9. Piracy prevention and the pricing of information goods

    OpenAIRE

    Cremer, Helmuth; Pestieau, Pierre

    2006-01-01

    This paper develops a simple model of piracy to analyze its effects on prices and welfare and to study the optimal enforcement policy. A monopolist produces an information good (involving a 'large' development cost and a 'small' reproduction cost) that is sold to two groups of consumers differing in their valuation of the good. We distinguish two settings: one in which the monopoly is regulated and one in which it maximizes profits and is not regulated, except that the public authority may be...

  10. Institutionalization of evidence-informed practices in healthcare settings.

    Science.gov (United States)

    Novotná, Gabriela; Dobbins, Maureen; Henderson, Joanna

    2012-11-21

    The effective and timely integration of the best available research evidence into healthcare practice has considerable potential to improve the quality of provided care. Knowledge translation (KT) approaches aim to develop, implement, and evaluate strategies to address the research-practice gap. However, most KT research has been directed toward implementation strategies that apply cognitive, behavioral, and, to a lesser extent, organizational theories. In this paper, we discuss the potential of institutional theory to inform KT-related research. Despite significant research, there is still much to learn about how to achieve KT within healthcare systems and practices. Institutional theory, focusing on the processes by which new ideas and concepts become accepted within their institutional environments, holds promise for advancing KT efforts and research. To propose new directions for future KT research, we present some of the main concepts of institutional theory and discuss their application to KT research by outlining how institutionalization of new practices can lead to their ongoing use in organizations. In addition, we discuss the circumstances under which institutionalized practices dissipate and give way to new insights and ideas that can lead to new, more effective practices. KT research informed by institutional theory can provide important insights into how knowledge becomes implemented, routinized, and accepted as institutionalized practices. Future KT research should employ both quantitative and qualitative research designs to examine the specifics of sustainability, institutionalization, and deinstitutionalization of practices to enhance our understanding of these complex constructs.

  11. A minimum set of ancestry informative markers for determining admixture proportions in a mixed American population: the Brazilian set.

    Science.gov (United States)

    Santos, Hadassa C; Horimoto, Andréa V R; Tarazona-Santos, Eduardo; Rodrigues-Soares, Fernanda; Barreto, Mauricio L; Horta, Bernardo L; Lima-Costa, Maria F; Gouveia, Mateus H; Machado, Moara; Silva, Thiago M; Sanches, José M; Esteban, Nubia; Magalhaes, Wagner C S; Rodrigues, Maíra R; Kehdy, Fernanda S G; Pereira, Alexandre C

    2016-05-01

    The Brazilian population is considered to be highly admixed. The main contributing ancestral populations were European and African, with Amerindians contributing to a lesser extent. The aims of this study were to provide a resource for determining and quantifying individual continental ancestry using the smallest number of SNPs possible, thus allowing for a cost- and time-efficient strategy for genomic ancestry determination. We identified and validated a minimum set of 192 ancestry informative markers (AIMs) for the genetic ancestry determination of Brazilian populations. These markers were selected on the basis of their distribution throughout the human genome, and their capacity of being genotyped on widely available commercial platforms. We analyzed genotyping data from 6487 individuals belonging to three Brazilian cohorts. Estimates of individual admixture using this 192 AIM panels were highly correlated with estimates using ~370 000 genome-wide SNPs: 91%, 92%, and 74% of, respectively, African, European, and Native American ancestry components. Besides that, 192 AIMs are well distributed among populations from these ancestral continents, allowing greater freedom in future studies with this panel regarding the choice of reference populations. We also observed that genetic ancestry inferred by AIMs provides similar association results to the one obtained using ancestry inferred by genomic data (370 K SNPs) in a simple regression model with rs1426654, related to skin pigmentation, genotypes as dependent variable. In conclusion, these markers can be used to identify and accurately quantify ancestry of Latin Americans or US Hispanics/Latino individuals, in particular in the context of fine-mapping strategies that require the quantification of continental ancestry in thousands of individuals.
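
    The kind of comparison reported above, correlating admixture estimates from a small AIM panel with genome-wide estimates and regressing a genotype on inferred ancestry, can be sketched as follows. All arrays are synthetic stand-ins generated on the fly; no real genotype or ancestry data are used, and the numbers carry no scientific meaning.

    ```python
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n = 500

    # Hypothetical ancestry proportions (African, European, Native American) per individual
    genomewide = rng.dirichlet([4, 5, 1], size=n)            # stand-in for ~370k-SNP estimates
    panel = np.clip(genomewide + rng.normal(0, 0.04, genomewide.shape), 0, 1)
    panel /= panel.sum(axis=1, keepdims=True)                # stand-in for 192-AIM estimates

    for k, label in enumerate(["African", "European", "Native American"]):
        r, _ = stats.pearsonr(panel[:, k], genomewide[:, k])
        print(f"{label:16s} correlation: {r:.2f}")

    # Simple association check: regress a pigmentation-related genotype (0/1/2)
    # on the ancestry inferred from each source and compare the results.
    genotype = rng.binomial(2, 0.2 + 0.6 * genomewide[:, 1])  # synthetic dependence on ancestry
    for name, anc in [("192 AIMs", panel[:, 1]), ("genome-wide", genomewide[:, 1])]:
        slope, intercept, r, p, se = stats.linregress(anc, genotype)
        print(f"{name:12s} slope={slope:.2f} p={p:.1e}")
    ```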

  12. Construct Maps as a Foundation for Standard Setting

    Science.gov (United States)

    Wyse, Adam E.

    2013-01-01

    Construct maps are tools that display how the underlying achievement construct upon which one is trying to set cut-scores is related to other information used in the process of standard setting. This article reviews what construct maps are, uses construct maps to provide a conceptual framework to view commonly used standard-setting procedures (the…

  13. Performance assessment and optimisation of a large information system by combined customer relationship management and resilience engineering: a mathematical programming approach

    Science.gov (United States)

    Azadeh, A.; Foroozan, H.; Ashjari, B.; Motevali Haghighi, S.; Yazdanparast, R.; Saberi, M.; Torki Nejad, M.

    2017-10-01

    ISs and ITs play a critical role in large complex gas corporations. Many factors such as human, organisational and environmental factors affect IS in an organisation. Therefore, investigating ISs success is considered to be a complex problem. Also, because of the competitive business environment and the high amount of information flow in organisations, new issues like resilient ISs and successful customer relationship management (CRM) have emerged. A resilient IS will provide sustainable delivery of information to internal and external customers. This paper presents an integrated approach to enhance and optimise the performance of each component of a large IS based on CRM and resilience engineering (RE) in a gas company. The enhancement of the performance can help ISs to perform business tasks efficiently. The data are collected from standard questionnaires and then analysed by data envelopment analysis, selecting the optimal mathematical programming approach. The selected model is validated and verified by the principal component analysis method. Finally, CRM and RE factors are identified as influential factors through sensitivity analysis for this particular case study. To the best of our knowledge, this is the first study for performance assessment and optimisation of a large IS by combined RE and CRM.
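
    As an illustration of the data envelopment analysis step mentioned above, the sketch below solves the input-oriented CCR envelopment model with scipy's linear-programming routine. The choice of the CCR formulation, the toy inputs/outputs and the unit count are assumptions made here; the study selected its own mathematical programming model among alternatives.

    ```python
    import numpy as np
    from scipy.optimize import linprog

    def dea_ccr_efficiency(X, Y):
        """Input-oriented CCR DEA efficiency scores.
        X: (n_units, n_inputs) inputs, Y: (n_units, n_outputs) outputs."""
        n, m = X.shape
        _, s = Y.shape
        scores = []
        for o in range(n):
            # decision variables: [theta, lambda_1 .. lambda_n]
            c = np.r_[1.0, np.zeros(n)]
            # inputs:  sum_j lambda_j * x_ij - theta * x_io <= 0
            a_in = np.hstack([-X[o].reshape(m, 1), X.T])
            # outputs: -sum_j lambda_j * y_rj <= -y_ro
            a_out = np.hstack([np.zeros((s, 1)), -Y.T])
            res = linprog(c,
                          A_ub=np.vstack([a_in, a_out]),
                          b_ub=np.r_[np.zeros(m), -Y[o]],
                          bounds=[(None, None)] + [(0, None)] * n,
                          method="highs")
            scores.append(res.x[0])
        return np.array(scores)

    # Toy example: 5 organisational units, 2 inputs (e.g. staff, budget), 1 output (service level)
    X = np.array([[4, 3], [7, 3], [8, 1], [4, 2], [2, 4]], dtype=float)
    Y = np.array([[1], [1], [1], [1], [1]], dtype=float)
    print(dea_ccr_efficiency(X, Y).round(3))    # 1.0 marks units on the efficient frontier
    ```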

  14. High-throughput film-densitometry: An efficient approach to generate large data sets

    Energy Technology Data Exchange (ETDEWEB)

    Typke, Dieter; Nordmeyer, Robert A.; Jones, Arthur; Lee, Juyoung; Avila-Sakar, Agustin; Downing, Kenneth H.; Glaeser, Robert M.

    2004-07-14

    A film-handling machine (robot) has been built which can, in conjunction with a commercially available film densitometer, exchange and digitize over 300 electron micrographs per day. Implementation of robotic film handling effectively eliminates the delay and tedium associated with digitizing images when data are initially recorded on photographic film. The modulation transfer function (MTF) of the commercially available densitometer is significantly worse than that of a high-end, scientific microdensitometer. Nevertheless, its signal-to-noise ratio (S/N) is quite excellent, allowing substantial restoration of the output to "near-to-perfect" performance. Due to the large area of the standard electron microscope film that can be digitized by the commercial densitometer (up to 10,000 x 13,680 pixels with an appropriately coded holder), automated film digitization offers a fast and inexpensive alternative to high-end CCD cameras as a means of acquiring large amounts of image data in electron microscopy.
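
    The restoration alluded to above, compensating a soft scanner MTF when the signal-to-noise ratio is high, is commonly done by Wiener-style inverse filtering in Fourier space; a hedged sketch follows. The Gaussian MTF model, regularisation constant and test image below are assumptions for illustration, not the measured characteristics of the densitometer or the authors' actual correction.

    ```python
    import numpy as np

    def restore_mtf(image, mtf_radial, reg=1e-2):
        """Sharpen a digitised micrograph by dividing its Fourier transform by the
        scanner MTF, with a small regularisation term to limit noise amplification."""
        ny, nx = image.shape
        fy = np.fft.fftfreq(ny)
        fx = np.fft.fftfreq(nx)
        fr = np.sqrt(fx[None, :] ** 2 + fy[:, None] ** 2)   # radial frequency (cycles/pixel)
        mtf = mtf_radial(fr)
        filt = mtf / (mtf ** 2 + reg)                        # Wiener-like inverse filter
        return np.real(np.fft.ifft2(np.fft.fft2(image) * filt))

    # Example: assume the scanner MTF falls off roughly as a Gaussian with frequency
    mtf_model = lambda f: np.exp(-(f / 0.35) ** 2)
    scanned = np.random.default_rng(0).normal(1.0, 0.05, (256, 256))
    restored = restore_mtf(scanned, mtf_model)
    print(restored.shape, round(restored.mean(), 3))
    ```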

  15. Generating mock data sets for large-scale Lyman-α forest correlation measurements

    Energy Technology Data Exchange (ETDEWEB)

    Font-Ribera, Andreu [Institut de Ciències de l' Espai (CSIC-IEEC), Campus UAB, Fac. Ciències, torre C5 parell 2, Bellaterra, Catalonia (Spain); McDonald, Patrick [Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720 (United States); Miralda-Escudé, Jordi, E-mail: font@ieec.uab.es, E-mail: pvmcdonald@lbl.gov, E-mail: miralda@icc.ub.edu [Institució Catalana de Recerca i Estudis Avançats, Barcelona, Catalonia (Spain)

    2012-01-01

    Massive spectroscopic surveys of high-redshift quasars yield large numbers of correlated Lyα absorption spectra that can be used to measure large-scale structure. Simulations of these surveys are required to accurately interpret the measurements of correlations and correct for systematic errors. An efficient method to generate mock realizations of Lyα forest surveys is presented which generates a field over the lines of sight to the survey sources only, instead of having to generate it over the entire three-dimensional volume of the survey. The method can be calibrated to reproduce the power spectrum and one-point distribution function of the transmitted flux fraction, as well as the redshift evolution of these quantities, and is easily used for modeling any survey systematic effects. We present an example of how these mock surveys are applied to predict the measurement errors in a survey with similar parameters as the BOSS quasar survey in SDSS-III.

  16. Generating mock data sets for large-scale Lyman-α forest correlation measurements

    International Nuclear Information System (INIS)

    Font-Ribera, Andreu; McDonald, Patrick; Miralda-Escudé, Jordi

    2012-01-01

    Massive spectroscopic surveys of high-redshift quasars yield large numbers of correlated Lyα absorption spectra that can be used to measure large-scale structure. Simulations of these surveys are required to accurately interpret the measurements of correlations and correct for systematic errors. An efficient method to generate mock realizations of Lyα forest surveys is presented which generates a field over the lines of sight to the survey sources only, instead of having to generate it over the entire three-dimensional volume of the survey. The method can be calibrated to reproduce the power spectrum and one-point distribution function of the transmitted flux fraction, as well as the redshift evolution of these quantities, and is easily used for modeling any survey systematic effects. We present an example of how these mock surveys are applied to predict the measurement errors in a survey with similar parameters as the BOSS quasar survey in SDSS-III
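
    A toy version of the skewer-generation idea described in these two records is sketched below: white noise is coloured to a target one-dimensional power spectrum along each line of sight and then mapped to a transmitted flux fraction with a lognormal-style transformation. Unlike the actual method, this sketch does not impose correlations between different lines of sight, and the power-spectrum shape and calibration constants are purely illustrative.

    ```python
    import numpy as np

    rng = np.random.default_rng(42)
    n_pix, n_skewers, dv = 1024, 200, 69.0        # pixels per skewer, lines of sight, km/s per pixel

    # Illustrative 1-D power spectrum of the underlying Gaussian field
    k = np.fft.rfftfreq(n_pix, d=dv) * 2 * np.pi  # wavenumber in s/km
    pk = np.zeros_like(k)
    pk[1:] = 0.05 * (k[1:] / 0.01) ** -0.6

    # Colour white noise in Fourier space to obtain correlated Gaussian skewers
    white = rng.normal(size=(n_skewers, n_pix))
    field = np.fft.irfft(np.fft.rfft(white, axis=1) * np.sqrt(pk * n_pix / dv),
                         n=n_pix, axis=1)

    # Lognormal-style mapping from the Gaussian field to transmitted flux fraction;
    # the constants a, b would be calibrated to the observed mean flux and flux PDF.
    a, b = 0.8, 1.5
    flux = np.exp(-a * np.exp(b * field))
    print("mean flux:", round(flux.mean(), 3), "flux range:", round(flux.min(), 3), round(flux.max(), 3))
    ```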

  17. Climate change education in informal settings: Using boundary objects to frame network dissemination

    Science.gov (United States)

    Steiner, Mary Ann

    This study of climate change education dissemination takes place in the context of a larger project where institutions in four cities worked together to develop a linked set of informal learning experiences about climate change. Each city developed an organizational network to explore new ways to connect urban audiences with climate change education. The four city-specific networks shared tools, resources, and knowledge with each other. The networks were related in mission and goals, but were structured and functioned differently depending on the city context. This study illustrates how the tools, resources, and knowledge developed in one network were shared with networks in two additional cities. Boundary crossing theory frames the study to describe the role of objects and processes in sharing between networks. Findings suggest that the goals, capacity and composition of networks resulted in a different emphasis in dissemination efforts, in one case to push the approach out to partners for their own work and in the other to pull partners into a more collaborative stance. Learning experiences developed in each city as a result of the dissemination reflected these differences in the city-specific emphasis with the push city diving into messy examples of the approach to make their own examples, and the pull city offering polished experiences to partners in order to build confidence in the climate change messaging. The networks themselves underwent different kinds of growth and change as a result of dissemination. The emphasis on push and use of messy examples resulted in active use of the principles of the approach and the pull emphasis with polished examples resulted in the cultivation of partnerships with the hub and the potential to engage in the educational approach. These findings have implications for boundary object theory as a useful grounding for dissemination designs in the context of networks of informal learning organizations to support a shift in

  18. Large scale deep learning for computer aided detection of mammographic lesions.

    Science.gov (United States)

    Kooi, Thijs; Litjens, Geert; van Ginneken, Bram; Gubern-Mérida, Albert; Sánchez, Clara I; Mann, Ritse; den Heeten, Ard; Karssemeijer, Nico

    2017-01-01

    Recent advances in machine learning yielded new techniques to train deep neural networks, which resulted in highly successful applications in many pattern recognition tasks such as object detection and speech recognition. In this paper we provide a head-to-head comparison between a state-of-the-art mammography CAD system, relying on a manually designed feature set, and a Convolutional Neural Network (CNN), aiming for a system that can ultimately read mammograms independently. Both systems are trained on a large data set of around 45,000 images and results show the CNN outperforms the traditional CAD system at low sensitivity and performs comparably at high sensitivity. We subsequently investigate to what extent features such as location and patient information and commonly used manual features can still complement the network, and see improvements at high specificity over the CNN, especially with location and context features, which contain information not available to the CNN. Additionally, a reader study was performed, where the network was compared to certified screening radiologists on a patch level, and we found no significant difference between the network and the readers. Copyright © 2016 Elsevier B.V. All rights reserved.
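
    The feature-combination experiment described above (complementing a CNN with location and context information) can be sketched as a simple late fusion: concatenate the descriptors and train a linear classifier on top. The feature arrays, labels and dimensionalities below are synthetic placeholders, not the study's data or its actual fusion scheme.

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    n = 2000

    # Hypothetical per-candidate descriptors: a CNN embedding plus location/context features
    cnn_features = rng.normal(size=(n, 128))          # stand-in for penultimate-layer CNN activations
    location = rng.uniform(0, 1, size=(n, 2))         # normalised (x, y) position of the candidate
    context = rng.normal(size=(n, 4))                 # e.g. neighbourhood / contralateral statistics
    labels = rng.binomial(1, 0.3, size=n)             # synthetic malignant / benign labels

    X = np.hstack([cnn_features, location, context])  # late fusion by simple concatenation
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("AUC with combined features:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))
    ```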

  19. Application of a statistical software package for analysis of large patient dose data sets obtained from RIS

    International Nuclear Information System (INIS)

    Fazakerley, J.; Charnock, P.; Wilde, R.; Jones, R.; Ward, M.

    2010-01-01

    For the purposes of patient dose audit, clinical audit and radiology workload analysis, data from Radiology Information Systems (RIS) at many hospitals are collected into a database, and the analysis is automated using a statistical package and Visual Basic coding. The database is a Structured Query Language database, which can be queried using an off-the-shelf statistical package, Statistica. Macros were created to automatically convert data from different hospitals into a consistent format ready for analysis. These macros can also be used to automate further analysis, such as detailing mean kV, mAs and entrance surface dose per room and per gender. Standard deviation and standard error of the mean are also generated. Graphs can also be generated to illustrate trends in doses between different variables such as room and gender. Collectively, this information can be used to generate a report. A process that once could take up to 1 d to complete now takes around 1 h. A major benefit in providing the service to hospital trusts is that less resource is now required to report on RIS data, making continuous dose audit more feasible. Time that was spent on sorting through data can now be spent on improving the analysis to provide benefit to the customer. Using data sets from RIS is a good way to perform dose audits, as the huge volume of available data provides the basis for very accurate analysis. Using macros written in Statistica Visual Basic has helped sort and consistently analyse these data. Being able to analyse by exposure factors has provided a more detailed report to the customer. (authors)
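
    The original analysis was automated with Statistica macros and Visual Basic; purely as an illustration of the same per-room and per-gender summaries (mean, standard deviation and standard error of the mean), an equivalent aggregation in pandas might look as follows, with invented example records.

    ```python
    import pandas as pd

    # Hypothetical extract of RIS exposure records after the formatting step
    df = pd.DataFrame({
        "room":    ["R1", "R1", "R2", "R2", "R2", "R1"],
        "gender":  ["F",  "M",  "F",  "M",  "F",  "M"],
        "kV":      [70,   75,   81,   77,   79,   73],
        "mAs":     [12.0, 10.5, 8.0,  9.2,  8.5,  11.0],
        "ESD_mGy": [0.21, 0.18, 0.15, 0.17, 0.16, 0.20],
    })

    # Mean, standard deviation, standard error and count per room and per gender,
    # mirroring the per-room / per-gender summaries described in the record
    summary = (df.groupby(["room", "gender"])[["kV", "mAs", "ESD_mGy"]]
                 .agg(["mean", "std", "sem", "count"]))
    print(summary)
    ```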

  20. Dimensionality Reduction and Information-Theoretic Divergence Between Sets of Ladar Images

    National Research Council Canada - National Science Library

    Gray, David M; Principe, Jose C

    2008-01-01

    ... can be exploited while circumventing many of the problems associated with the so-called "curse of dimensionality." In this study, PCA techniques are used to find a low-dimensional sub-space representation of LADAR image sets...
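
    The record is truncated, but the PCA step it describes, finding a low-dimensional sub-space representation of LADAR image sets, can be sketched with a thin SVD as below; the image sizes and the two synthetic "sets" are placeholders. The resulting low-dimensional coordinates are what a subsequent information-theoretic divergence between sets would be estimated from.

    ```python
    import numpy as np

    def pca_subspace(images, n_components=10):
        """Project a set of vectorised LADAR-style images onto a low-dimensional
        PCA sub-space. `images` has shape (n_images, n_pixels)."""
        mean = images.mean(axis=0)
        centred = images - mean
        # Thin SVD: rows of vt are the principal directions (eigen-images)
        _, _, vt = np.linalg.svd(centred, full_matrices=False)
        basis = vt[:n_components]
        coords = centred @ basis.T          # low-dimensional coordinates of each image
        return mean, basis, coords

    # Synthetic stand-in for two sets of range images (e.g. two target classes)
    rng = np.random.default_rng(3)
    set_a = rng.normal(0.0, 1.0, size=(100, 64 * 64))
    set_b = rng.normal(0.5, 1.0, size=(100, 64 * 64))
    mean, basis, coords = pca_subspace(np.vstack([set_a, set_b]), n_components=5)
    print(coords.shape)   # (200, 5): each image summarised by 5 PCA coordinates
    ```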