WorldWideScience

Sample records for performance data-driven approach

  1. Data-Driven Controller Design: The H2 Approach

    CERN Document Server

    Sanfelice Bazanella, Alexandre; Eckhard, Diego

    2012-01-01

    Data-driven methodologies have recently emerged as an important alternative paradigm to model-based controller design, and several such methodologies are formulated as an H2 performance optimization. This book presents a comprehensive theoretical treatment of the H2 approach to data-driven control design. The fundamental properties implied by the H2 problem formulation are analyzed in detail, so that features common to all solutions are identified. Direct methods (VRFT) and iterative methods (IFT, DFT, CbT) are put under a common theoretical framework. The choice of the reference model, the experimental conditions, the optimization method to be used, and several other designer's choices are crucial to the quality of the final outcome, and firm guidelines for all these choices are derived from the theoretical analysis presented. The practical application of the concepts in the book is illustrated with a large number of practical designs performed for different classes of processes: thermal, fluid processing a...
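
    The one-shot character of a direct method such as VRFT can be illustrated with a small numerical sketch. The example below is hypothetical (not taken from the book): one batch of input-output data from an unknown plant is combined with a chosen first-order reference model, the virtual reference is recovered by inverting the reference model, and a linearly parameterized PI controller is fitted by ordinary least squares.

    ```python
    # Minimal VRFT sketch (hypothetical example, not from the book).
    # Data (u, y) come from one open-loop experiment on an unknown plant;
    # the reference model M(z) = b z^-1 / (1 - a z^-1) is a designer's choice.
    import numpy as np

    rng = np.random.default_rng(0)

    # --- one batch of plant data (simulated here; measured in practice) ---
    N = 500
    u = rng.standard_normal(N)            # excitation input
    y = np.zeros(N)
    for k in range(1, N):                 # "unknown" plant: y(k) = 0.9 y(k-1) + 0.5 u(k-1)
        y[k] = 0.9 * y[k - 1] + 0.5 * u[k - 1]

    # --- reference model M(z) = 0.4 z^-1 / (1 - 0.6 z^-1) ---
    a, b = 0.6, 0.4

    # Virtual reference: y = M r_v  =>  r_v(k-1) = (y(k) - a y(k-1)) / b
    r_v = np.zeros(N)
    r_v[:-1] = (y[1:] - a * y[:-1]) / b
    e_v = r_v - y                         # virtual tracking error

    # --- fit PI controller u(k) = Kp e(k) + Ki sum(e) by least squares ---
    Phi = np.column_stack([e_v, np.cumsum(e_v)])
    theta, *_ = np.linalg.lstsq(Phi[:-1], u[:-1], rcond=None)
    print("estimated Kp, Ki:", theta)     # for this pair the ideal gains are Kp = 0.72, Ki = 0.08
    ```

    Because the ideal controller for this plant/reference pair happens to lie in the PI class, the least-squares fit recovers it exactly from a single data batch; with noisy data a prefilter would normally be added.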

  2. Using Two Different Approaches to Assess Dietary Patterns: Hypothesis-Driven and Data-Driven Analysis

    Directory of Open Access Journals (Sweden)

    Ágatha Nogueira Previdelli

    2016-09-01

    The use of dietary patterns to assess dietary intake has become increasingly common in nutritional epidemiology studies due to the complexity and multidimensionality of the diet. Currently, two main approaches are widely used to assess dietary patterns: data-driven and hypothesis-driven analysis. Since the methods explore different angles of dietary intake, using both approaches simultaneously might yield complementary and useful information; thus, we aimed to use both approaches to gain knowledge of adolescents' dietary patterns. Food intake from a cross-sectional survey of 295 adolescents was assessed by 24 h dietary recall (24HR). In the hypothesis-driven analysis, based on the American National Cancer Institute method, the usual intake of Brazilian Healthy Eating Index Revised components was estimated. In the data-driven approach, the usual intake of foods/food groups was estimated by the Multiple Source Method. The hypothesis-driven analysis showed low scores for Whole grains, Total vegetables, Total fruit and Whole fruits, whereas in the data-driven analysis fruits and whole grains did not appear in any pattern. Consistently, high intakes of sodium, fats and sugars were observed in the hypothesis-driven analysis, with low total scores for the Sodium, Saturated fat and SoFAA (calories from solid fat, alcohol and added sugar) components, while the data-driven approach showed intake of several foods/food groups rich in these nutrients, such as butter/margarine, cookies, chocolate powder, whole milk, cheese, processed meat/cold cuts and candies. Using both approaches at the same time thus provided consistent and complementary information on overall dietary habits, which will be important for driving public health programs and improving their efficiency in monitoring and evaluating the dietary patterns of populations.

  3. Statistical Data Processing with R – Metadata Driven Approach

    Directory of Open Access Journals (Sweden)

    Rudi SELJAK

    2016-06-01

    In recent years the Statistical Office of the Republic of Slovenia has put a lot of effort into re-designing its statistical process. We replaced the classical stove-pipe oriented production system with general software solutions based on the metadata-driven approach. This means that one general program code, parametrized with process metadata, is used for data processing for a particular survey. Currently, the general program code is entirely based on SAS macros, but in the future we would like to explore how successfully the statistical software R can be used for this approach. The paper describes the metadata-driven principle for data validation, the generic software solution, and the main issues connected with the use of the statistical software R for this approach.

  4. A data-driven multiplicative fault diagnosis approach for automation processes.

    Science.gov (United States)

    Hao, Haiyang; Zhang, Kai; Ding, Steven X; Chen, Zhiwen; Lei, Yaguo

    2014-09-01

    This paper presents a new data-driven method for diagnosing multiplicative key performance degradation in automation processes. Different from the well-established additive fault diagnosis approaches, the proposed method aims at identifying those low-level components which increase the variability of process variables and cause performance degradation. Based on process data, features of the multiplicative fault are extracted. To identify the root cause, the impact of the fault on each process variable is evaluated in the sense of its contribution to performance degradation. Then, a numerical example is used to illustrate the functionalities of the method and a Monte Carlo simulation is performed to demonstrate its effectiveness from the statistical viewpoint. Finally, to show the practical applicability, a case study on the Tennessee Eastman process is presented. Copyright © 2013. Published by Elsevier Ltd.

  5. Data and Dynamics Driven Approaches for Modelling and Forecasting the Red Sea Chlorophyll

    KAUST Repository

    Dreano, Denis

    2017-01-01

    Models are very useful to identify the mechanisms driving the variations in chlorophyll concentration and have practical applications for fisheries operation and harmful algal bloom monitoring. Modelling approaches can be divided between physics-driven (dynamical) approaches and data-driven (statistical) approaches. Dynamical models are based on a set of differential equations representing the transfer of energy and matter between different subsets of the biota.

  6. Data-Driven Anomaly Detection Performance for the Ares I-X Ground Diagnostic Prototype

    Science.gov (United States)

    Martin, Rodney A.; Schwabacher, Mark A.; Matthews, Bryan L.

    2010-01-01

    In this paper, we will assess the performance of a data-driven anomaly detection algorithm, the Inductive Monitoring System (IMS), which can be used to detect simulated Thrust Vector Control (TVC) system failures. However, the ability of IMS to detect these failures in a true operational setting may be related to the realistic nature of how they are simulated. As such, we will investigate both a low fidelity and high fidelity approach to simulating such failures, with the latter based upon the underlying physics. Furthermore, the ability of IMS to detect anomalies that were previously unknown and not previously simulated will be studied in earnest, as well as apparent deficiencies or misapplications that result from using the data-driven paradigm. Our conclusions indicate that robust detection performance of simulated failures using IMS is not appreciably affected by the use of a high fidelity simulation. However, we have found that the inclusion of a data-driven algorithm such as IMS into a suite of deployable health management technologies does add significant value.
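
    As a rough illustration of the data-driven paradigm IMS belongs to (a hypothetical sketch, not NASA's implementation), nominal training vectors can be grouped into clusters represented by min-max bounding boxes; at monitoring time, the distance from a query vector to the nearest box serves as the anomaly score.

    ```python
    # Sketch of an IMS-style monitor (illustrative only, not NASA's code).
    # Training: greedily grow min-max "boxes" around nominal data vectors.
    # Monitoring: distance to the nearest box is the anomaly score (0 = nominal).
    import numpy as np

    def train_boxes(X, tol=0.5):
        boxes = []                        # each box: (lo, hi) arrays
        for x in X:
            for i, (lo, hi) in enumerate(boxes):
                if np.all(x >= lo - tol) and np.all(x <= hi + tol):
                    boxes[i] = (np.minimum(lo, x), np.maximum(hi, x))
                    break
            else:
                boxes.append((x.copy(), x.copy()))
        return boxes

    def anomaly_score(x, boxes):
        dists = []
        for lo, hi in boxes:
            d = np.maximum(0.0, np.maximum(lo - x, x - hi))  # per-dimension gap
            dists.append(np.linalg.norm(d))
        return min(dists)

    rng = np.random.default_rng(1)
    nominal = rng.normal(0.0, 1.0, size=(1000, 4))           # nominal sensor vectors
    boxes = train_boxes(nominal)
    print(anomaly_score(np.zeros(4), boxes))                 # ~0: inside envelope
    print(anomaly_score(np.full(4, 8.0), boxes))             # large: anomalous
    ```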

  7. Combining engineering and data-driven approaches

    DEFF Research Database (Denmark)

    Fischer, Katharina; De Sanctis, Gianluca; Kohler, Jochen

    2015-01-01

    Two general approaches may be followed for the development of a fire risk model: statistical models based on observed fire losses can support simple cost-benefit studies but are usually not detailed enough for engineering decision-making. Engineering models, on the other hand, require many assumptions that may result in a biased risk assessment. In two related papers we show how engineering and data-driven modelling can be combined by developing generic risk models that are calibrated to statistical data on observed fire events. The focus of the present paper is on the calibration procedure, applied to the calibration of a generic fire risk model for single family houses to Swiss insurance data. The example demonstrates that the bias in the risk estimation can be strongly reduced by model calibration.

  8. Articulatory Distinctiveness of Vowels and Consonants: A Data-Driven Approach

    Science.gov (United States)

    Wang, Jun; Green, Jordan R.; Samal, Ashok; Yunusova, Yana

    2013-01-01

    Purpose: To quantify the articulatory distinctiveness of 8 major English vowels and 11 English consonants based on tongue and lip movement time series data using a data-driven approach. Method: Tongue and lip movements of 8 vowels and 11 consonants from 10 healthy talkers were...

  9. Controller synthesis for negative imaginary systems: a data driven approach

    KAUST Repository

    Mabrok, Mohamed; Petersen, Ian R.

    2016-01-01

    In this study, a data-driven controller synthesis methodology for NI systems is presented. In this approach, measured frequency response data of the plant is used to construct the controller frequency response at every frequency by minimising a cost function. Then, this controller response is used to identify the controller transfer function using system identification methods.

  10. A data-driven approach for modeling post-fire debris-flow volumes and their uncertainty

    Science.gov (United States)

    Friedel, Michael J.

    2011-01-01

    This study demonstrates the novel application of genetic programming to evolve nonlinear post-fire debris-flow volume equations from variables associated with a data-driven conceptual model of the western United States. The search space is constrained using a multi-component objective function that simultaneously minimizes root-mean-squared and unit errors for the evolution of the fittest equations. An optimization technique is then used to estimate the limits of nonlinear prediction uncertainty associated with the debris-flow equations. In contrast to a published multiple linear regression three-variable equation, linking basin area with slopes greater than or equal to 30 percent, burn severity characterized as area burned moderate plus high, and total storm rainfall, the data-driven approach discovers many nonlinear and several dimensionally consistent equations that are unbiased and have less prediction uncertainty. Of the nonlinear equations, the best performance (lowest prediction uncertainty) is achieved when using three variables: average basin slope, total burned area, and total storm rainfall. Further reduction in uncertainty is possible for the nonlinear equations when dimensional consistency is not a priority and by subsequently applying a gradient solver to the fittest solutions. The data-driven modeling approach can be applied to nonlinear multivariate problems in all fields of study.
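
    This kind of equation discovery can be reproduced in outline with an off-the-shelf genetic-programming library. The sketch below uses gplearn with made-up variable names and synthetic data (not the study's dataset) to evolve a volume equation from basin slope, burned area, and storm rainfall.

    ```python
    # Genetic-programming sketch for evolving a debris-flow volume equation
    # (illustrative; variables and data are invented, not the study's).
    import numpy as np
    from gplearn.genetic import SymbolicRegressor

    rng = np.random.default_rng(2)
    n = 200
    slope = rng.uniform(5, 45, n)        # average basin slope (percent)
    burned = rng.uniform(0.1, 50, n)     # total burned area (km^2)
    rain = rng.uniform(1, 80, n)         # total storm rainfall (mm)
    X = np.column_stack([slope, burned, rain])
    # synthetic "true" volumes with noise, for demonstration only
    y = 10 * np.sqrt(burned) * np.log1p(rain) + 0.5 * slope + rng.normal(0, 5, n)

    gp = SymbolicRegressor(
        population_size=2000,
        generations=20,
        function_set=("add", "sub", "mul", "div", "sqrt", "log"),
        metric="rmse",                   # fitness: root-mean-squared error
        parsimony_coefficient=0.01,      # penalize bloated equations
        random_state=0,
    )
    gp.fit(X, y)
    print(gp._program)                   # the fittest evolved equation
    ```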

  11. Data and Dynamics Driven Approaches for Modelling and Forecasting the Red Sea Chlorophyll

    KAUST Repository

    Dreano, Denis

    2017-05-31

    Phytoplankton is at the basis of the marine food chain and therefore plays a fundamental role in the ocean ecosystem. However, the large-scale phytoplankton dynamics of the Red Sea are not yet well understood, mainly due to the lack of historical in situ measurements. As a result, our knowledge in this area relies mostly on remotely sensed observations and large-scale numerical marine ecosystem models. Models are very useful to identify the mechanisms driving the variations in chlorophyll concentration and have practical applications for fisheries operation and harmful algal bloom monitoring. Modelling approaches can be divided between physics-driven (dynamical) approaches and data-driven (statistical) approaches. Dynamical models are based on a set of differential equations representing the transfer of energy and matter between different subsets of the biota, whereas statistical models identify relationships between variables based on statistical relations within the available data. The goal of this thesis is to develop, implement and test novel dynamical and statistical modelling approaches for studying and forecasting the variability of chlorophyll concentration in the Red Sea. These new models are evaluated in terms of their ability to efficiently forecast and explain the regional chlorophyll variability. We also propose innovative synergistic strategies to combine data-driven and physics-driven approaches to further enhance chlorophyll forecasting capabilities and efficiency.

  12. Data-driven HR how to use analytics and metrics to drive performance

    CERN Document Server

    Marr, Bernard

    2018-01-01

    Traditionally seen as a purely people function unconcerned with numbers, HR is now uniquely placed to use company data to drive performance, both of the people in the organization and the organization as a whole. Data-driven HR is a practical guide which enables HR practitioners to leverage the value of the vast amount of data available at their fingertips. Covering how to identify the most useful sources of data, how to collect information in a transparent way that is in line with data protection requirements and how to turn this data into tangible insights, this book marks a turning point for the HR profession. Covering all the key elements of HR including recruitment, employee engagement, performance management, wellbeing and training, Data-driven HR examines the ways data can contribute to organizational success by, among other things, optimizing processes, driving performance and improving HR decision making. Packed with case studies and real-life examples, this is essential reading for all HR profession...

  13. A data-driven approach to identify controls on global fire activity from satellite and climate observations (SOFIA V1)

    Directory of Open Access Journals (Sweden)

    M. Forkel

    2017-12-01

    Vegetation fires affect human infrastructures, ecosystems, global vegetation distribution, and atmospheric composition. However, the climatic, environmental, and socioeconomic factors that control global fire activity in vegetation are only poorly understood, and they are represented in global process-oriented vegetation-fire models in various complexities and formulations. Data-driven model approaches such as machine learning algorithms have successfully been used to identify and better understand the factors controlling fire activity. However, such machine learning models cannot be easily adapted or even implemented within process-oriented global vegetation-fire models. To overcome this gap between machine learning-based approaches and process-oriented global fire models, we introduce here a new flexible data-driven fire modelling approach (Satellite Observations to predict FIre Activity, SOFIA approach version 1). SOFIA models use several predictor variables and functional relationships to estimate burned area and can be easily adapted to more complex process-oriented vegetation-fire models. We created an ensemble of SOFIA models to test the importance of several predictor variables. SOFIA models achieve the highest performance in predicting burned area if they account for a direct restriction of fire activity under wet conditions and if they include a land cover-dependent restriction or allowance of fire activity by vegetation density and biomass. The use of vegetation optical depth data from microwave satellite observations, a proxy for vegetation biomass and water content, yields higher model performance than commonly used vegetation variables from optical sensors. We further analyse spatial patterns of the sensitivity between anthropogenic, climate, and vegetation predictor variables and burned area. We finally discuss how multiple observational datasets on climate, hydrological, vegetation, and socioeconomic variables together with...
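
    The general model structure described here, burned area estimated through functional relationships (restrictions and allowances) on predictor variables, can be sketched as a product of logistic dependency functions. The snippet below follows that general form; all parameter values and variable choices are invented for illustration and are not the paper's fitted model.

    ```python
    # Sketch of a SOFIA-like model structure: fractional burned area modelled
    # as a product of logistic dependency functions of predictor variables.
    # The functional form follows the paper's description; all parameter
    # values here are invented for illustration.
    import numpy as np

    def logistic(x, slope, x0):
        return 1.0 / (1.0 + np.exp(-slope * (x - x0)))

    def burned_area_fraction(wetness, biomass, pop_density, b_max=0.3):
        f_wet = 1.0 - logistic(wetness, slope=8.0, x0=0.5)      # wet conditions restrict fire
        f_veg = logistic(biomass, slope=2.0, x0=1.0)            # enough fuel must be present
        f_hum = 1.0 - logistic(pop_density, slope=0.05, x0=50)  # people suppress large fires
        return b_max * f_wet * f_veg * f_hum

    print(burned_area_fraction(wetness=0.2, biomass=2.5, pop_density=10.0))
    ```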

  14. A data-driven approach to identify controls on global fire activity from satellite and climate observations (SOFIA V1)

    Science.gov (United States)

    Forkel, Matthias; Dorigo, Wouter; Lasslop, Gitta; Teubner, Irene; Chuvieco, Emilio; Thonicke, Kirsten

    2017-12-01

    Vegetation fires affect human infrastructures, ecosystems, global vegetation distribution, and atmospheric composition. However, the climatic, environmental, and socioeconomic factors that control global fire activity in vegetation are only poorly understood, and they are represented in global process-oriented vegetation-fire models in various complexities and formulations. Data-driven model approaches such as machine learning algorithms have successfully been used to identify and better understand the factors controlling fire activity. However, such machine learning models cannot be easily adapted or even implemented within process-oriented global vegetation-fire models. To overcome this gap between machine learning-based approaches and process-oriented global fire models, we introduce here a new flexible data-driven fire modelling approach (Satellite Observations to predict FIre Activity, SOFIA approach version 1). SOFIA models use several predictor variables and functional relationships to estimate burned area and can be easily adapted to more complex process-oriented vegetation-fire models. We created an ensemble of SOFIA models to test the importance of several predictor variables. SOFIA models achieve the highest performance in predicting burned area if they account for a direct restriction of fire activity under wet conditions and if they include a land cover-dependent restriction or allowance of fire activity by vegetation density and biomass. The use of vegetation optical depth data from microwave satellite observations, a proxy for vegetation biomass and water content, yields higher model performance than commonly used vegetation variables from optical sensors. We further analyse spatial patterns of the sensitivity between anthropogenic, climate, and vegetation predictor variables and burned area. We finally discuss how multiple observational datasets on climate, hydrological, vegetation, and socioeconomic variables together with data-driven...

  15. Designing Data-Driven Battery Prognostic Approaches for Variable Loading Profiles: Some Lessons Learned

    Data.gov (United States)

    National Aeronautics and Space Administration — Among various approaches for implementing prognostic algorithms, data-driven algorithms are popular in the industry due to their intuitive nature and relatively fast...

  16. A data-driven approach to quality risk management.

    Science.gov (United States)

    Alemayehu, Demissie; Alvir, Jose; Levenstein, Marcia; Nickerson, David

    2013-10-01

    An effective clinical trial strategy to ensure patient safety as well as trial quality and efficiency involves an integrated approach, including prospective identification of risk factors, mitigation of the risks through proper study design and execution, and assessment of quality metrics in real time. Such an integrated quality management plan may also be enhanced by using data-driven techniques to identify the risk factors that are most relevant in predicting quality issues associated with a trial. In this paper, we illustrate such an approach using data collected from actual clinical trials. Several statistical methods were employed, including the Wilcoxon rank-sum test and logistic regression, to identify associations between risk factors and the occurrence of quality issues, applied to data on the quality of clinical trials sponsored by Pfizer. Only a subset of the risk factors had a significant association with quality issues, including: whether the study used placebo, whether an agent was a biologic, unusual packaging label, complex dosing, and over 25 planned procedures. Proper implementation of the strategy can help to optimize resource utilization without compromising trial integrity and patient safety.
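
    The two statistical steps named in the abstract are generic enough to reproduce in a few lines. The sketch below uses synthetic data and hypothetical risk factors (not Pfizer's dataset): a Wilcoxon rank-sum screen for one factor, then a logistic regression of quality issues on the factors.

    ```python
    # Generic sketch of the two methods named above, on synthetic trial data
    # (hypothetical risk factors, not the paper's dataset).
    import numpy as np
    from scipy.stats import ranksums
    import statsmodels.api as sm

    rng = np.random.default_rng(3)
    n = 300
    placebo = rng.integers(0, 2, n)            # risk factor: placebo-controlled?
    procedures = rng.poisson(20, n)            # risk factor: planned procedures
    logit_p = -2.0 + 0.8 * placebo + 0.05 * (procedures - 20)
    issues = rng.random(n) < 1 / (1 + np.exp(-logit_p))   # quality issue occurred?

    # Wilcoxon rank-sum: do trials with/without issues differ in procedures?
    stat, p = ranksums(procedures[issues], procedures[~issues])
    print(f"rank-sum p-value: {p:.4f}")

    # Logistic regression of quality issues on the risk factors
    X = sm.add_constant(np.column_stack([placebo, procedures]))
    fit = sm.Logit(issues.astype(float), X).fit(disp=0)
    print(fit.params)                          # const, placebo, procedures
    ```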

  17. A data-driven approach to quality risk management

    Directory of Open Access Journals (Sweden)

    Demissie Alemayehu

    2013-01-01

    Aim: An effective clinical trial strategy to ensure patient safety as well as trial quality and efficiency involves an integrated approach, including prospective identification of risk factors, mitigation of the risks through proper study design and execution, and assessment of quality metrics in real time. Such an integrated quality management plan may also be enhanced by using data-driven techniques to identify risk factors that are most relevant in predicting quality issues associated with a trial. In this paper, we illustrate such an approach using data collected from actual clinical trials. Materials and Methods: Several statistical methods were employed, including the Wilcoxon rank-sum test and logistic regression, to identify the presence of association between risk factors and the occurrence of quality issues, applied to data on quality of clinical trials sponsored by Pfizer. Results: Only a subset of the risk factors had a significant association with quality issues, including: whether the study used placebo, whether an agent was a biologic, unusual packaging label, complex dosing, and over 25 planned procedures. Conclusion: Proper implementation of the strategy can help to optimize resource utilization without compromising trial integrity and patient safety.

  18. Controller synthesis for negative imaginary systems: a data driven approach

    KAUST Repository

    Mabrok, Mohamed

    2016-02-17

    The negative imaginary (NI) property occurs in many important applications. For instance, flexible structure systems with collocated force actuators and position sensors can be modelled as negative imaginary systems. In this study, a data-driven controller synthesis methodology for NI systems is presented. In this approach, measured frequency response data of the plant is used to construct the controller frequency response at every frequency by minimising a cost function. Then, this controller response is used to identify the controller transfer function using system identification methods. © The Institution of Engineering and Technology 2016.

  19. Data-Driven Security-Constrained OPF

    DEFF Research Database (Denmark)

    Thams, Florian; Halilbasic, Lejla; Pinson, Pierre

    2017-01-01

    In this paper we unify electricity market operations with power system security considerations. Using data-driven techniques, we address both small signal stability and steady-state security, derive tractable decision rules in the form of line flow limits, and incorporate the resulting constraints in market clearing algorithms. Our goal is to minimize redispatching actions, and instead allow the market to determine the most cost-efficient dispatch while considering all security constraints. To maintain tractability of our approach we perform our security assessment offline, examining large datasets. Our approach, while being less conservative than current approaches, can be scalable for large systems, accounts explicitly for power system security, and enables the electricity market to identify a cost-efficient dispatch avoiding redispatching actions. We demonstrate the performance of our approach...

  20. A Data-Driven Approach to Realistic Shape Morphing

    KAUST Repository

    Gao, Lin; Lai, Yu-Kun; Huang, Qi-Xing; Hu, Shi-Min

    2013-01-01

    Morphing between 3D objects is a fundamental technique in computer graphics. Traditional methods of shape morphing focus on establishing meaningful correspondences and finding smooth interpolation between shapes. Such methods, however, only take geometric information as input and thus cannot in general avoid producing unnatural interpolation, in particular for large-scale deformations. This paper proposes a novel data-driven approach for shape morphing. Given a database with various models belonging to the same category, we treat them as data samples in the plausible deformation space. These models are then clustered to form local shape spaces of plausible deformations. We use a simple metric to reasonably represent the closeness between pairs of models. Given source and target models, the morphing problem is cast as a global optimization problem of finding a minimal distance path within the local shape spaces connecting these models. Under the guidance of intermediate models in the path, an extended as-rigid-as-possible interpolation is used to produce the final morphing. By exploiting the knowledge of plausible models, our approach produces realistic morphing for challenging cases as demonstrated by various examples in the paper. © 2013 The Eurographics Association and Blackwell Publishing Ltd.

  1. A Data-Driven Approach to Realistic Shape Morphing

    KAUST Repository

    Gao, Lin

    2013-05-01

    Morphing between 3D objects is a fundamental technique in computer graphics. Traditional methods of shape morphing focus on establishing meaningful correspondences and finding smooth interpolation between shapes. Such methods, however, only take geometric information as input and thus cannot in general avoid producing unnatural interpolation, in particular for large-scale deformations. This paper proposes a novel data-driven approach for shape morphing. Given a database with various models belonging to the same category, we treat them as data samples in the plausible deformation space. These models are then clustered to form local shape spaces of plausible deformations. We use a simple metric to reasonably represent the closeness between pairs of models. Given source and target models, the morphing problem is cast as a global optimization problem of finding a minimal distance path within the local shape spaces connecting these models. Under the guidance of intermediate models in the path, an extended as-rigid-as-possible interpolation is used to produce the final morphing. By exploiting the knowledge of plausible models, our approach produces realistic morphing for challenging cases as demonstrated by various examples in the paper. © 2013 The Eurographics Association and Blackwell Publishing Ltd.

  2. Model-driven approach to data collection and reporting for quality improvement.

    Science.gov (United States)

    Curcin, Vasa; Woodcock, Thomas; Poots, Alan J; Majeed, Azeem; Bell, Derek

    2014-12-01

    Continuous data collection and analysis have been shown to be essential to achieving improvement in healthcare. However, the data required for local improvement initiatives are often not readily available from hospital Electronic Health Record (EHR) systems or not routinely collected. Furthermore, improvement teams are often restricted in time and funding, thus requiring inexpensive and rapid tools to support their work. Hence, the informatics challenge in healthcare local improvement initiatives consists of providing a mechanism for rapid modelling of the local domain by non-informatics experts, including performance metric definitions, grounded in established improvement techniques. We investigate the feasibility of a model-driven software approach to address this challenge, whereby an improvement model designed by a team is used to automatically generate the required electronic data collection instruments and reporting tools. To that end, we have designed a generic Improvement Data Model (IDM) to capture the data items and quality measures relevant to the project, and constructed Web Improvement Support in Healthcare (WISH), a prototype tool that takes user-generated IDM models and creates a data schema, data collection web interfaces, and a set of live reports, based on Statistical Process Control (SPC), for use by improvement teams. The software has been successfully used in over 50 improvement projects, with more than 700 users. We present in detail the experiences of one of those initiatives, the Chronic Obstructive Pulmonary Disease project in Northwest London hospitals. The specific challenges of improvement in healthcare are analysed and the benefits and limitations of the approach are discussed. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.

  3. A Data-Driven Reliability Estimation Approach for Phased-Mission Systems

    Directory of Open Access Journals (Sweden)

    Hua-Feng He

    2014-01-01

    We address the issues associated with reliability estimation for phased-mission systems (PMS) and present a novel data-driven approach that achieves reliability estimation for PMS using condition monitoring information and degradation data of such systems under dynamic operating scenarios. In this sense, this paper differs from existing methods, which consider only the static scenario without using real-time information and aim to estimate the reliability for a population rather than for an individual. In the presented approach, to establish a linkage between the historical data and real-time information of an individual PMS, we adopt a stochastic filtering model to model the phase duration and obtain an updated estimate of the mission time by Bayesian law at each phase. Meanwhile, the lifetime of the PMS is estimated from degradation data, which are modeled by an adaptive Brownian motion. As such, the mission reliability can be obtained in real time through the estimated distribution of the mission time in conjunction with the estimated lifetime distribution. We demonstrate the usefulness of the developed approach via a numerical example.
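
    For a Brownian-motion degradation model, the first-passage lifetime has a closed form (an inverse Gaussian distribution), so the mission reliability described above can be sketched by Monte Carlo: sample the remaining mission time from its estimated distribution and average the probability that the lifetime exceeds it. The sketch below assumes fixed drift and diffusion parameters and a normal mission-time estimate; all numbers are invented, and the paper's Bayesian updating of these quantities is not reproduced.

    ```python
    # Sketch of mission reliability under a Brownian-motion degradation model
    # (illustrative parameters; the paper updates these in real time from
    # condition monitoring data via Bayesian filtering).
    import numpy as np
    from scipy.stats import invgauss, norm

    mu_deg, sigma_deg = 0.05, 0.12      # drift / diffusion of degradation (per hour)
    threshold, x_now = 10.0, 4.0        # failure threshold and current degradation
    d = threshold - x_now               # remaining degradation margin

    # First-passage lifetime T ~ InverseGaussian(mean=d/mu, shape=d^2/sigma^2)
    m, lam = d / mu_deg, d**2 / sigma_deg**2
    life = invgauss(mu=m / lam, scale=lam)

    # Remaining mission time estimated (e.g., by phase-duration filtering);
    # here approximated as a normal distribution for illustration.
    mission = norm(loc=90.0, scale=10.0)

    # Mission reliability: P(lifetime > mission time), by Monte Carlo
    t = mission.rvs(size=100_000, random_state=0)
    print("mission reliability ~", life.sf(np.clip(t, 0, None)).mean())
    ```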

  4. Autonomous Soil Assessment System: A Data-Driven Approach to Planetary Mobility Hazard Detection

    Science.gov (United States)

    Raimalwala, K.; Faragalli, M.; Reid, E.

    2018-04-01

    The Autonomous Soil Assessment System predicts mobility hazards for rovers. Its development and performance are presented, with focus on its data-driven models, machine learning algorithms, and real-time sensor data fusion for predictive analytics.

  5. A data-driven approach for retrieving temperatures and abundances in brown dwarf atmospheres

    OpenAIRE

    Line, MR; Fortney, JJ; Marley, MS; Sorahana, S

    2014-01-01

    © 2014. The American Astronomical Society. All rights reserved. Brown dwarf spectra contain a wealth of information about their molecular abundances, temperature structure, and gravity. We present a new data-driven retrieval approach, previously used in planetary atmosphere studies, to extract the molecular abundances and temperature structure from brown dwarf spectra. The approach makes few a priori physical assumptions about the state of the atmosphere. The feasibility of the approach is fi...

  6. Data-driven modeling of nano-nose gas sensor arrays

    DEFF Research Database (Denmark)

    Alstrøm, Tommy Sonne; Larsen, Jan; Nielsen, Claus Højgård

    2010-01-01

    We present a data-driven approach to classification of Quartz Crystal Microbalance (QCM) sensor data. The sensor is a nano-nose gas sensor that detects concentrations of analytes down to ppm levels using plasma polymerized coatings. Each sensor experiment takes approximately one hour, hence the number of available training data is limited. We suggest a data-driven classification model which works from few examples. The paper compares a number of data-driven classification and quantification schemes able to detect the gas and the concentration level. The data-driven approaches are based on state...

  7. Data-driven batch scheduling

    Energy Technology Data Exchange (ETDEWEB)

    Bent, John [Los Alamos National Laboratory]; Denehy, Tim [Google]; Arpaci-Dusseau, Remzi [Univ. of Wisconsin]; Livny, Miron [Univ. of Wisconsin]; Arpaci-Dusseau, Andrea C. [Non-LANL]

    2009-01-01

    In this paper, we develop data-driven strategies for batch computing schedulers. Current CPU-centric batch schedulers ignore the data needs within workloads and execute them by linking them transparently and directly to their needed data. When scheduled on remote computational resources, this elegant solution of direct data access can incur an order of magnitude performance penalty for data-intensive workloads. Adding data-awareness to batch schedulers allows a careful coordination of data and CPU allocation thereby reducing the cost of remote execution. We offer here new techniques by which batch schedulers can become data-driven. Such systems can use our analytical predictive models to select one of the four data-driven scheduling policies that we have created. Through simulation, we demonstrate the accuracy of our predictive models and show how they can reduce time to completion for some workloads by as much as 80%.

  8. Data driven approaches for diagnostics and optimization of NPP operation

    International Nuclear Information System (INIS)

    Pliska, J.; Machat, Z.

    2014-01-01

    Efficiency and heat rate are important indicators of both the health of power plant equipment and the quality of power plant operation. To address these challenges, a powerful tool is the statistical processing of the large data sets stored in data historians. These large data sets contain useful information about process quality and equipment and sensor health. The paper discusses data-driven approaches for model building of main power plant equipment such as the condenser, the cooling tower and the overall thermal cycle, using multivariate regression techniques based on a so-called regression triplet: data, model and method. Regression models form a basis for diagnostics and optimization tasks. These tasks are demonstrated on practical cases: diagnostics of main power plant equipment to identify equipment faults early, and optimization of the cooling circuit by cooling water flow control to achieve, for given boundary conditions, the highest power output. (authors)

  9. Practical aspects of data-driven motion correction approach for brain SPECT

    International Nuclear Information System (INIS)

    Kyme, A.Z.; Hutton, B.F.; Hatton, R.L.; Skerrett, D.; Barnden, L.

    2002-01-01

    Patient motion can cause image artifacts in SPECT despite restraining measures. Data-driven detection and correction of motion can be achieved by comparing acquired data with the forward-projections. By optimising the orientation of a partial reconstruction, parameters can be obtained for each misaligned projection and applied to update this volume using a 3D reconstruction algorithm. Phantom validation was performed to explore practical aspects of this approach. Noisy projection datasets simulating a patient undergoing at least one fully 3D movement during acquisition were compiled from various projections of the digital Hoffman brain phantom. Motion correction was then applied to the reconstructed studies. Correction success was assessed visually and quantitatively. Resilience with respect to subset order and missing data in the reconstruction and updating stages, detector geometry considerations, and the need for implementing an iterated correction were assessed in the process. Effective correction of the corrupted studies was achieved. Visually, artifactual regions in the reconstructed slices were suppressed and/or removed. Typically the ratio of mean square difference between the corrected and reference studies compared to that between the corrupted and reference studies was > 2. Although components of the motions are missed using a single-head implementation, improvement was still evident in the correction. The need for multiple iterations in the approach was small, since the bulk of misalignment errors were corrected in the first pass. Dispersion of subsets for reconstructing and updating the partial reconstruction appears to give optimal correction. Further validation is underway using triple-head physical phantom data. Copyright (2002) The Australian and New Zealand Society of Nuclear Medicine Inc

  10. A data-driven decomposition approach to model aerodynamic forces on flapping airfoils

    Science.gov (United States)

    Raiola, Marco; Discetti, Stefano; Ianiro, Andrea

    2017-11-01

    In this work, we exploit a data-driven decomposition of experimental data from a flapping airfoil experiment with the aim of isolating the main contributions to the aerodynamic force and obtaining a phenomenological model. Experiments are carried out on a NACA 0012 airfoil in forward flight with both heaving and pitching motion. Velocity measurements of the near field are carried out with planar PIV, while force measurements are performed with a load cell. The phase-averaged velocity fields are transformed into the wing-fixed reference frame, allowing for a description of the field in a domain with fixed boundaries. The decomposition of the flow field is performed by means of POD applied to the velocity fluctuations and then extended to the phase-averaged force data by means of the extended POD approach. This choice is justified by the simple consideration that aerodynamic forces determine the largest contributions to the energetic balance in the flow field. Only the first six modes have a relevant contribution to the force. A clear relationship can be drawn between the force and the flow field modes. Moreover, the force modes are closely related (yet slightly different) to the contributions of the classic potential models in the literature, allowing for their correction. This work has been supported by the Spanish MINECO under Grant TRA2013-41103-P.
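
    The POD/extended-POD machinery described here reduces to a few linear-algebra steps. The sketch below uses synthetic arrays standing in for the PIV snapshots and load-cell data (not the experiment's data): POD of the velocity fluctuations via SVD, then projection of the force signal onto the temporal coefficients to obtain the force contribution of each mode.

    ```python
    # Extended POD sketch: POD of velocity fluctuations via SVD, then
    # projection of the force onto the temporal coefficients
    # (synthetic stand-in data, not the experiment's).
    import numpy as np

    rng = np.random.default_rng(4)
    n_points, n_snaps = 2000, 128                  # grid points x phase-averaged snapshots
    U = rng.standard_normal((n_points, n_snaps))   # velocity fluctuation snapshots
    F = rng.standard_normal(n_snaps)               # phase-averaged force signal

    # --- POD: columns of Phi are spatial modes, rows of A temporal coefficients ---
    Uc = U - U.mean(axis=1, keepdims=True)
    Phi, s, Vt = np.linalg.svd(Uc, full_matrices=False)
    A = np.diag(s) @ Vt                            # temporal coefficients, mode x snapshot

    # --- Extended POD: force contribution associated with each velocity mode ---
    Fc = F - F.mean()
    a_norm = A / np.linalg.norm(A, axis=1, keepdims=True)
    force_modes = a_norm @ Fc                      # one scalar "force mode" per POD mode
    print("leading force contributions:", force_modes[:6])
    ```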

  11. Fork-join and data-driven execution models on multi-core architectures: Case study of the FMM

    KAUST Repository

    Amer, Abdelhalim

    2013-01-01

    Extracting maximum performance of multi-core architectures is a difficult task primarily due to bandwidth limitations of the memory subsystem and its complex hierarchy. In this work, we study the implications of fork-join and data-driven execution models on this type of architecture at the level of task parallelism. For this purpose, we use a highly optimized fork-join based implementation of the FMM and extend it to a data-driven implementation using a distributed task scheduling approach. This study exposes some limitations of the conventional fork-join implementation in terms of synchronization overheads. We find that these are not negligible and their elimination by the data-driven method, with a careful data locality strategy, was beneficial. Experimental evaluation of both methods on state-of-the-art multi-socket multi-core architectures showed up to 22% speed-ups of the data-driven approach compared to the original method. We demonstrate that a data-driven execution of FMM not only improves performance by avoiding global synchronization overheads but also reduces the memory-bandwidth pressure caused by memory-intensive computations. © 2013 Springer-Verlag.

  12. A Data-Driven Control Design Approach for Freeway Traffic Ramp Metering with Virtual Reference Feedback Tuning

    Directory of Open Access Journals (Sweden)

    Shangtai Jin

    2014-01-01

    ALINEA is a simple, efficient, and easily implemented ramp metering strategy. Virtual reference feedback tuning (VRFT) is well suited to many practical systems since it is a "one-shot" data-driven control design methodology. This paper presents an application of VRFT to the ramp metering problem of a freeway traffic system. When there is not enough prior knowledge of the controlled system to select a proper parameter of ALINEA, the VRFT approach is used to optimize ALINEA's parameter by using only a batch of input and output data collected from the freeway traffic system. Extensive simulations are built on both the macroscopic MATLAB platform and the microscopic PARAMICS platform to show the effectiveness and applicability of the proposed data-driven controller tuning approach.
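
    ALINEA itself is a one-line integral feedback law on downstream occupancy, which is why tuning its single gain is the crux of the problem. A minimal sketch with invented numbers (the paper's VRFT step, not reproduced here, would tune K_R from logged data):

    ```python
    # ALINEA ramp-metering law: an integral controller on occupancy
    # (illustrative values; the gain K_R is what VRFT would tune from data).
    def alinea(r_prev, occ_target, occ_measured, K_R=70.0, r_min=200, r_max=1800):
        """Return the next metering rate in vehicles/hour."""
        r = r_prev + K_R * (occ_target - occ_measured)   # occupancy in percent
        return min(max(r, r_min), r_max)                 # respect ramp limits

    rate = 900.0
    for occ in [22.0, 24.5, 26.0, 23.0]:                 # measured occupancies (%)
        rate = alinea(rate, occ_target=24.0, occ_measured=occ)
        print(round(rate))
    ```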

  13. Defining datasets and creating data dictionaries for quality improvement and research in chronic disease using routinely collected data: an ontology-driven approach

    Directory of Open Access Journals (Sweden)

    Simon de Lusignan

    2011-06-01

    Conclusion: Adopting an ontology-driven approach to case finding could improve the quality of disease registers and of research based on routine data. It would offer considerable advantages over using limited datasets to define cases. This approach should be considered by those involved in research and quality improvement projects which utilise routine data.

  14. Dynamic Data-Driven UAV Network for Plume Characterization

    Science.gov (United States)

    2016-05-23

    AFRL-AFOSR-VA-TR-2016-0203. Final Report, 05/23/2016. Kamran Mohseni, University of Florida. Grant FA9550-13-1-0090. The project studied a dynamic data-driven (DDD) approach to the operation of a heterogeneous team of unmanned aerial vehicles (UAVs) or micro/miniature aerial...

  15. Ensemble of data-driven prognostic algorithms for robust prediction of remaining useful life

    International Nuclear Information System (INIS)

    Hu, Chao; Youn, Byeng D.; Wang, Pingfeng; Yoon, Joung Taek

    2012-01-01

    Prognostics aims at determining whether a failure of an engineered system (e.g., a nuclear power plant) is impending and estimating the remaining useful life (RUL) before the failure occurs. The traditional data-driven prognostic approach is to construct multiple candidate algorithms using a training data set, evaluate their respective performance using a testing data set, and select the one with the best performance while discarding all the others. This approach has three shortcomings: (i) the selected standalone algorithm may not be robust; (ii) it wastes the resources spent constructing the algorithms that are discarded; (iii) it requires testing data in addition to the training data. To overcome these drawbacks, this paper proposes an ensemble data-driven prognostic approach which combines multiple member algorithms with a weighted-sum formulation. Three weighting schemes, namely accuracy-based weighting, diversity-based weighting and optimization-based weighting, are proposed to determine the weights of the member algorithms. k-fold cross validation (CV) is employed to estimate the prediction error required by the weighting schemes. The results obtained from three case studies suggest that the ensemble approach with any weighting scheme gives more accurate RUL predictions than any sole algorithm when member algorithms producing diverse RUL predictions have comparable prediction accuracy, and that the optimization-based weighting scheme gives the best overall performance among the three.
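
    The accuracy-based weighting scheme can be sketched generically: estimate each member algorithm's error by k-fold cross-validation and weight it in inverse proportion. The snippet below uses scikit-learn regressors as stand-in member algorithms on synthetic data; the paper's actual members, data, and other two weighting schemes are not reproduced.

    ```python
    # Accuracy-based ensemble weighting sketch (stand-in members and data).
    # Each member's k-fold CV error sets its weight; the ensemble RUL
    # prediction is the weighted sum of member predictions.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.linear_model import Ridge
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(5)
    X = rng.standard_normal((200, 6))                    # degradation features
    y = 100 - 10 * X[:, 0] + 5 * X[:, 1] + rng.normal(0, 3, 200)  # synthetic RUL

    members = [Ridge(), KNeighborsRegressor(5),
               RandomForestRegressor(200, random_state=0)]

    # k-fold CV error of each member -> inverse-error weights summing to one
    errors = [-cross_val_score(m, X, y, cv=5,
                               scoring="neg_root_mean_squared_error").mean()
              for m in members]
    w = 1 / np.array(errors)
    w /= w.sum()
    print("weights:", np.round(w, 3))

    # weighted-sum ensemble prediction for a new unit
    x_new = rng.standard_normal((1, 6))
    pred = sum(wi * m.fit(X, y).predict(x_new)[0] for wi, m in zip(w, members))
    print("ensemble RUL estimate:", round(pred, 1))
    ```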

  16. Data-driven methods towards learning the highly nonlinear inverse kinematics of tendon-driven surgical manipulators.

    Science.gov (United States)

    Xu, Wenjun; Chen, Jie; Lau, Henry Y K; Ren, Hongliang

    2017-09-01

    Accurate motion control of flexible surgical manipulators is crucial in tissue manipulation tasks. The tendon-driven serpentine manipulator (TSM) is one of the most widely adopted flexible mechanisms in minimally invasive surgery because of its enhanced maneuverability in tortuous environments. The TSM, however, exhibits high nonlinearities, and the conventional analytical kinematics model is insufficient to achieve high accuracy. To account for the system nonlinearities, we applied a data-driven approach to encode the system inverse kinematics. Three regression methods, extreme learning machine (ELM), Gaussian mixture regression (GMR) and K-nearest neighbors regression (KNNR), were implemented to learn a nonlinear mapping from the robot 3D position states to the control inputs. The performance of the three algorithms was evaluated both in simulation and in physical trajectory tracking experiments. KNNR performed best in the tracking experiments, with the lowest RMSE of 2.1275 mm. The proposed inverse kinematics learning methods provide an alternative and efficient way to accurately model the tendon-driven flexible manipulator. Copyright © 2016 John Wiley & Sons, Ltd.
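
    The best-performing method here, KNNR, amounts to a weighted nearest-neighbour lookup in recorded state-action pairs. A generic scikit-learn sketch with invented dimensions and a synthetic stand-in for the forward kinematics (not the TSM's actual kinematics or data):

    ```python
    # KNN regression sketch of the inverse kinematics mapping: 3D tip position
    # -> tendon actuation inputs (synthetic data; dimensions are illustrative).
    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor

    rng = np.random.default_rng(6)
    q = rng.uniform(-1.0, 1.0, size=(5000, 4))      # tendon displacements (mm)
    # stand-in forward kinematics producing 3D tip positions from tendon inputs
    p = np.column_stack([np.sin(q[:, 0]) + 0.3 * q[:, 2],
                         np.cos(q[:, 1]) - 0.3 * q[:, 3],
                         0.5 * (q[:, 0] - q[:, 1])])

    ik = KNeighborsRegressor(n_neighbors=5, weights="distance")
    ik.fit(p, q)                                    # learn position -> actuation

    target = np.array([[0.4, 0.6, 0.1]])            # desired tip position
    print("tendon command:", ik.predict(target)[0])
    ```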

  17. Data-Driven and Expectation-Driven Discovery of Empirical Laws.

    Science.gov (United States)

    1982-10-10

    Interim Report 2/82-10/82, by Patrick W. Langley, Gary L. Bradshaw and Herbert A. Simon, The Robotics Institute, Carnegie-Mellon University, Pittsburgh, Pennsylvania. Surviving text fragment: "... occurred in small integer proportions to each other. In 1809, Joseph Gay-Lussac found evidence for his law of combining volumes, which stated that a ..."

  18. Data-driven non-linear elasticity: constitutive manifold construction and problem discretization

    Science.gov (United States)

    Ibañez, Ruben; Borzacchiello, Domenico; Aguado, Jose Vicente; Abisset-Chavanne, Emmanuelle; Cueto, Elias; Ladeveze, Pierre; Chinesta, Francisco

    2017-11-01

    The use of constitutive equations calibrated from data has been implemented into standard numerical solvers for successfully addressing a variety of problems encountered in simulation-based engineering sciences (SBES). However, the complexity keeps increasing due to the need for increasingly detailed models as well as the use of engineered materials. Data-driven simulation constitutes a potential change of paradigm in SBES. Standard simulation in computational mechanics is based on the use of two very different types of equations. The first, of axiomatic character, is related to balance laws (momentum, mass, energy, ...), whereas the second consists of models that scientists have extracted from collected data, either natural or synthetic. Data-driven (or data-intensive) simulation consists of directly linking experimental data to computers in order to perform numerical simulations. These simulations will employ laws, universally recognized as epistemic, while minimizing the need for explicit, often phenomenological, models. The main drawback of such an approach is the large amount of required data, some of them inaccessible with today's testing facilities. Such a difficulty can be circumvented in many cases, and in any case alleviated, by considering complex tests, collecting as many data as possible, and then using a data-driven inverse approach in order to generate the whole constitutive manifold from few complex experimental tests, as discussed in the present work.
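
    In its simplest form, data-driven simulation of this kind replaces the fitted constitutive law with a search over the measured point cloud. For a single 1D bar the solve reduces to picking the (strain, stress) sample closest, in an energy-like metric, to the state allowed by compatibility and equilibrium. The toy below is a loose illustration of that idea with synthetic data, not the authors' algorithm.

    ```python
    # Toy "data-driven" constitutive solve (illustrative, not the paper's
    # method): the constitutive law is replaced by the measured point cloud,
    # and the solver picks the (strain, stress) sample nearest to the
    # mechanically admissible state.
    import numpy as np

    rng = np.random.default_rng(7)
    eps = np.linspace(0.0, 0.05, 400)
    sig = 200.0 * eps * (1.0 - 4.0 * eps) + rng.normal(0.0, 0.05, 400)
    data = np.column_stack([eps, sig])        # measured constitutive manifold

    # admissible state from the mechanical problem (one bar, known BCs):
    eps_mech, sig_mech = 0.030, 7.5
    C = 200.0                                 # metric weight, units of stiffness

    # energy-like distance from each data point to the admissible state
    dist = C * (data[:, 0] - eps_mech) ** 2 + (data[:, 1] - sig_mech) ** 2 / C
    print("selected (strain, stress):", data[np.argmin(dist)])
    ```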

  19. Development of a Stochastically-driven, Forward Predictive Performance Model for PEMFCs

    Science.gov (United States)

    Harvey, David Benjamin Paul

    A one-dimensional, multi-scale, coupled, transient, and mechanistic performance model for a PEMFC membrane electrode assembly has been developed. The model explicitly includes each of the 5 layers within a membrane electrode assembly and solves for the transport of charge, heat, mass, species, dissolved water, and liquid water. Key features of the model include the use of a multi-step implementation of the HOR reaction on the anode, agglomerate catalyst sub-models for both the anode and cathode catalyst layers, a unique approach that links the composition of the catalyst layer to key properties within the agglomerate model, and the implementation of a stochastic input-based approach for component material properties. The model employs a new methodology for validation using statistically varying input parameters and statistically based experimental performance data; it represents the first stochastic-input-driven unit cell performance model. The stochastic-input-driven performance model was used to identify optimal ionomer content within the cathode catalyst layer, demonstrate the role of material variation in potentially low-performing MEA materials, provide an explanation for the performance of low-Pt-loaded MEAs, and investigate the validity of transient-sweep experimental diagnostic methods.

  20. Probing the dynamics of identified neurons with a data-driven modeling approach.

    Directory of Open Access Journals (Sweden)

    Thomas Nowotny

    2008-07-01

    In controlling animal behavior the nervous system has to perform within the operational limits set by the requirements of each specific behavior. The implications for the corresponding range of suitable network, single neuron, and ion channel properties have remained elusive. In this article we approach the question of how well-constrained the properties of neuronal systems may be at the neuronal level. We used large data sets of the activity of isolated invertebrate identified cells and built an accurate conductance-based model for this cell type using customized automated parameter estimation techniques. By direct inspection of the data we found that the variability of the neurons is larger when they are isolated from the circuit than when in the intact system. Furthermore, the responses of the neurons to perturbations appear to be more consistent than their autonomous behavior under stationary conditions. In the developed model, the constraints on different parameters that enforce appropriate model dynamics vary widely, from some very tightly controlled parameters to others that are almost arbitrary. The model also allows predictions for the effect of blocking selected ionic currents, and proves that the origin of irregular dynamics in the neuron model is proper chaoticity and that this chaoticity is typical in an appropriate sense. Our results indicate that data-driven models are useful tools for the in-depth analysis of neuronal dynamics. The better consistency of responses to perturbations, in the real neurons as well as in the model, suggests a paradigm shift away from measuring autonomous dynamics alone towards protocols of controlled perturbations. Our predictions for the impact of channel blockers on the neuronal dynamics and the proof of chaoticity underscore the wide scope of our approach.

  1. Flood probability quantification for road infrastructure: Data-driven spatial-statistical approach and case study applications.

    Science.gov (United States)

    Kalantari, Zahra; Cavalli, Marco; Cantone, Carolina; Crema, Stefano; Destouni, Georgia

    2017-03-01

    Climate-driven increase in the frequency of extreme hydrological events is expected to impose greater strain on the built environment and major transport infrastructure, such as roads and railways. This study develops a data-driven spatial-statistical approach to quantifying and mapping the probability of flooding at critical road-stream intersection locations, where water flow and sediment transport may accumulate and cause serious road damage. The approach is based on novel integration of key watershed and road characteristics, including also measures of sediment connectivity. The approach is concretely applied to and quantified for two specific study case examples in southwest Sweden, with documented road flooding effects of recorded extreme rainfall. The novel contributions of this study in combining a sediment connectivity account with that of soil type, land use, spatial precipitation-runoff variability and road drainage in catchments, and in extending the connectivity measure use for different types of catchments, improve the accuracy of model results for road flood probability. Copyright © 2016 Elsevier B.V. All rights reserved.

  2. Data driven propulsion system weight prediction model

    Science.gov (United States)

    Gerth, Richard J.

    1994-10-01

    The objective of the research was to develop a method to predict the weight of paper engines, i.e., engines that are in the early stages of development. The impetus for the project was the Single Stage To Orbit (SSTO) project, where engineers need to evaluate alternative engine designs. Since the SSTO is a performance-driven project, the performance models for alternative designs were well understood. The next tradeoff is weight. Since it is known that engine weight varies with thrust level, a model is required that allows discrimination between engines that produce the same thrust. Above all, the model had to be rooted in data, with assumptions that could be justified based on the data. The general approach was to collect data on as many existing engines as possible and build a statistical model of engine weight as a function of various component performance parameters. This was considered a reasonable level at which to begin the project because the data would be readily available, and it would be at the level of most paper engines, prior to detailed component design.

  3. Geoscience Meets Social Science: A Flexible Data Driven Approach for Developing High Resolution Population Datasets at Global Scale

    Science.gov (United States)

    Rose, A.; McKee, J.; Weber, E.; Bhaduri, B. L.

    2017-12-01

    Leveraging decades of expertise in population modeling, and in response to growing demand for higher resolution population data, Oak Ridge National Laboratory is now generating LandScan HD at global scale. LandScan HD is conceived as a 90m resolution population distribution where modeling is tailored to the unique geography and data conditions of individual countries or regions by combining social, cultural, physiographic, and other information with novel geocomputation methods. Similarities among these areas are exploited in order to leverage existing training data and machine learning algorithms to rapidly scale development. Drawing on ORNL's unique set of capabilities, LandScan HD adapts highly mature population modeling methods developed for LandScan Global and LandScan USA, settlement mapping research and production in high-performance computing (HPC) environments, land use and neighborhood mapping through image segmentation, and facility-specific population density models. Adopting a flexible methodology to accommodate different geographic areas, LandScan HD accounts for the availability, completeness, and level of detail of relevant ancillary data. Beyond core population and mapped settlement inputs, these factors determine the model complexity for an area, requiring that for any given area, a data-driven model could support either a simple top-down approach, a more detailed bottom-up approach, or a hybrid approach.

  4. Meta-control of combustion performance with a data mining approach

    Science.gov (United States)

    Song, Zhe

    Large-scale combustion processes are complex and pose challenges for optimizing their performance. Traditional approaches based on thermal dynamics have limitations in finding optimal operational regions due to the time-shift nature of the process. Recent advances in information technology enable people to collect large volumes of process data easily and continuously. The collected process data contain rich information about the process and, to some extent, represent a digital copy of the process over time. Although large volumes of data exist in industrial combustion processes, they are not fully utilized to the level where the process can be optimized. Data mining is an emerging science which finds patterns or models in large data sets. It has found many successful applications in business marketing, medical, and manufacturing domains. The focus of this dissertation is on applying data mining to industrial combustion processes, and ultimately optimizing combustion performance. The philosophy, methods, and frameworks discussed in this research can, however, also be applied to other industrial processes. Optimizing an industrial combustion process has two major challenges. One is that the underlying process model changes over time, and obtaining an accurate process model is nontrivial. The other is that a process model with high fidelity is usually highly nonlinear, so solving the optimization problem needs efficient heuristics. This dissertation sets out to solve these two major challenges. The major contribution of this 4-year research is a data-driven solution for optimizing the combustion process, in which a process model or knowledge is identified from the process data, and optimization is then executed by evolutionary algorithms to search for optimal operating regions.

  5. qPortal: A platform for data-driven biomedical research.

    Science.gov (United States)

    Mohr, Christopher; Friedrich, Andreas; Wojnar, David; Kenar, Erhan; Polatkan, Aydin Can; Codrea, Marius Cosmin; Czemmel, Stefan; Kohlbacher, Oliver; Nahnsen, Sven

    2018-01-01

    Modern biomedical research aims at drawing biological conclusions from large, highly complex biological datasets. It has become common practice to make extensive use of high-throughput technologies that produce large amounts of heterogeneous data. In addition to the ever-improving accuracy, methods are getting faster and cheaper, resulting in a steadily increasing need for scalable data management and easily accessible means of analysis. We present qPortal, a platform providing users with an intuitive way to manage and analyze quantitative biological data. The backend leverages a variety of concepts and technologies, such as relational databases, data stores, data models and means of data transfer, as well as front-end solutions to give users access to data management and easy-to-use analysis options. Users are empowered to conduct their experiments from the experimental design to the visualization of their results through the platform. Here, we illustrate the feature-rich portal by simulating a biomedical study based on publicly available data. We demonstrate the software's strength in supporting the entire project life cycle. The software supports the project design and registration, empowers users to do all-digital project management and finally provides means to perform analysis. We compare our approach to Galaxy, one of the most widely used scientific workflow and analysis platforms in computational biology. Application of both systems to a small case study shows the differences between a data-driven approach (qPortal) and a workflow-driven approach (Galaxy). qPortal, a one-stop-shop solution for biomedical projects, offers up-to-date analysis pipelines, quality control workflows, and visualization tools. Through intensive user interactions, appropriate data models have been developed. These models build the foundation of our biological data management system and provide possibilities to annotate data, query metadata for statistics and future re-analysis on...

  6. Data-Driven Predictive Direct Load Control of Refrigeration Systems

    DEFF Research Database (Denmark)

    Shafiei, Seyed Ehsan; Knudsen, Torben; Wisniewski, Rafal

    2015-01-01

    A predictive control using subspace identification is applied for the smart grid integration of refrigeration systems under a direct load control scheme. A realistic demand response scenario based on regulation of the electrical power consumption is considered. A receding horizon optimal control...... is proposed to fulfil two important objectives: to secure a high coefficient of performance and to participate in power consumption management. Moreover, a new method for the design of input signals for system identification is put forward. The control method is fully data driven without an explicit use of model...... against real data. The performance improvement results in a 22% reduction in the energy consumption. A comparative simulation is accomplished showing the superiority of the method over the existing approaches in terms of the load-following performance....

  7. Data-Driven Methods to Diversify Knowledge of Human Psychology

    OpenAIRE

    Jack, Rachael E.; Crivelli, Carlos; Wheatley, Thalia

    2017-01-01

    Psychology aims to understand real human behavior. However, cultural biases in the scientific process can constrain knowledge. We describe here how data-driven methods can relax these constraints to reveal new insights that theories can overlook. To advance knowledge we advocate a symbiotic approach that better combines data-driven methods with theory.

  8. Data-driven workflows for microservices

    DEFF Research Database (Denmark)

    Safina, Larisa; Mazzara, Manuel; Montesi, Fabrizio

    2016-01-01

    Microservices is an architectural style inspired by service-oriented computing that has recently started gaining popularity. Jolie is a programming language based on the microservices paradigm: the main building blocks of Jolie systems are services, in contrast to, e.g., functions or objects....... The primitives offered by the Jolie language elicit many of the recurring patterns found in microservices, like load balancers and structured processes. However, Jolie still lacks some useful constructs for dealing with message types and data manipulation that are present in service-oriented computing......). We show the impact of our implementation on some of the typical scenarios found in microservice systems. This shows how computation can move from a process-driven to a data-driven approach, and leads to the preliminary identification of recurring communication patterns that can be shaped as design...

  9. External radioactive markers for PET data-driven respiratory gating in positron emission tomography.

    Science.gov (United States)

    Büther, Florian; Ernst, Iris; Hamill, James; Eich, Hans T; Schober, Otmar; Schäfers, Michael; Schäfers, Klaus P

    2013-04-01

    Respiratory gating is an established approach to overcoming respiration-induced image artefacts in PET. Of special interest in this respect are raw PET data-driven gating methods which do not require additional hardware to acquire respiratory signals during the scan. However, these methods rely heavily on the quality of the acquired PET data (statistical properties, data contrast, etc.). We therefore combined external radioactive markers with data-driven respiratory gating in PET/CT. The feasibility and accuracy of this approach was studied for [(18)F]FDG PET/CT imaging in patients with malignant liver and lung lesions. PET data from 30 patients with abdominal or thoracic [(18)F]FDG-positive lesions (primary tumours or metastases) were included in this prospective study. The patients underwent a 10-min list-mode PET scan with a single bed position following a standard clinical whole-body [(18)F]FDG PET/CT scan. During this scan, one to three radioactive point sources (either (22)Na or (18)F, 50-100 kBq) in a dedicated holder were attached to the patient's abdomen. The list-mode data acquired were retrospectively analysed for respiratory signals using established data-driven gating approaches and additionally by tracking the motion of the point sources in sinogram space. Gated reconstructions were examined qualitatively, in terms of the amount of respiratory displacement and in respect of changes in local image intensity in the gated images. The presence of the external markers did not affect whole-body PET/CT image quality. Tracking of the markers led to characteristic respiratory curves in all patients. Applying these curves for gated reconstructions resulted in images in which motion was well resolved. Quantitatively, the performance of the external marker-based approach was similar to that of the best intrinsic data-driven methods. Overall, the gain in measured tumour uptake from the nongated to the gated images indicating successful removal of respiratory motion
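
    As a purely illustrative sketch of the gating step described above (not the authors' implementation), the snippet below derives a respiratory amplitude signal from a tracked marker position and bins PET list-mode event times into amplitude gates. The function names, the synthetic breathing trace, and the choice of 8 gates are assumptions.

        import numpy as np

        def gate_events(event_times, marker_t, marker_pos, n_gates=8):
            """Assign each list-mode event to a respiratory amplitude gate."""
            # Respiratory signal at each event time, from the tracked marker motion
            amp = np.interp(event_times, marker_t, marker_pos)
            edges = np.quantile(amp, np.linspace(0, 1, n_gates + 1))
            gates = np.searchsorted(edges, amp, side="right") - 1
            return np.clip(gates, 0, n_gates - 1)   # one reconstruction per gate

        # Example with a synthetic ~0.25 Hz breathing trace over a 10-min scan
        t = np.linspace(0, 600, 6001)
        breathing = np.sin(2 * np.pi * 0.25 * t)
        events = np.sort(np.random.uniform(0, 600, 100000))
        gate_index = gate_events(events, t, breathing)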

  10. Data-driven execution of fast multipole methods

    KAUST Repository

    Ltaief, Hatem

    2013-09-17

    Fast multipole methods (FMMs) have O(N) complexity, are compute bound, and require very little synchronization, which makes them a favorable algorithm on next-generation supercomputers. Their most common application is to accelerate N-body problems, but they can also be used to solve boundary integral equations. When the particle distribution is irregular and the tree structure is adaptive, load balancing becomes a non-trivial question. A common strategy for load balancing FMMs is to use the work load from the previous step as weights to statically repartition the next step. The authors discuss in the paper another approach based on data-driven execution to efficiently tackle this challenging load balancing problem. The core idea consists of breaking the most time-consuming stages of the FMMs into smaller tasks. The algorithm can then be represented as a directed acyclic graph where nodes represent tasks and edges represent dependencies among them. The execution of the algorithm is performed by asynchronously scheduling the tasks using the QUARK (QUeueing And Runtime for Kernels) runtime environment, in such a way that data dependencies are not violated for numerical correctness purposes. This asynchronous scheduling results in an out-of-order execution. The performance results of the data-driven FMM execution outperform the previous strategy and show linear speedup on a quad-socket quad-core Intel Xeon system. Copyright © 2013 John Wiley & Sons, Ltd.
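
    The core idea of the record above, executing tasks as soon as their dependencies are satisfied instead of in a fixed stage order, can be illustrated with Python's standard library; the toy task names stand in for FMM stages and are not from the paper.

        from graphlib import TopologicalSorter
        from concurrent.futures import ThreadPoolExecutor

        deps = {                 # task -> set of prerequisite tasks (a DAG)
            "P2M_a": set(), "P2M_b": set(),
            "M2M": {"P2M_a", "P2M_b"},
            "M2L": {"M2M"},
            "L2P": {"M2L"},
        }

        ts = TopologicalSorter(deps)
        ts.prepare()
        with ThreadPoolExecutor() as pool:
            while ts.is_active():
                ready = list(ts.get_ready())   # every task whose inputs are complete
                # independent tasks run asynchronously, i.e., out of order
                list(pool.map(lambda task: print("running", task), ready))
                for task in ready:
                    ts.done(task)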

  11. Design and Data in Balance: Using Design-Driven Decision Making to Enable Student Success

    Science.gov (United States)

    Fairchild, Susan; Farrell, Timothy; Gunton, Brad; Mackinnon, Anne; McNamara, Christina; Trachtman, Roberta

    2014-01-01

    Data-driven approaches to school decision making have come into widespread use in the past decade, nationally and in New York City. New Visions has been at the forefront of those developments: in New Visions schools, teacher teams and school teams regularly examine student performance data to understand patterns and drive classroom- and…

  12. Estimating the Probability of Wind Ramping Events: A Data-driven Approach

    OpenAIRE

    Wang, Cheng; Wei, Wei; Wang, Jianhui; Qiu, Feng

    2016-01-01

    This letter proposes a data-driven method for estimating the probability of wind ramping events without exploiting the exact probability distribution function (PDF) of wind power. Actual wind data validates the proposed method.

  13. Data-driven approach for assessing utility of medical tests using electronic medical records.

    Science.gov (United States)

    Skrøvseth, Stein Olav; Augestad, Knut Magne; Ebadollahi, Shahram

    2015-02-01

    To precisely define the utility of tests in a clinical pathway through data-driven analysis of the electronic medical record (EMR). The information content was defined in terms of the entropy of the expected value of the test related to a given outcome. A kernel density classifier was used to estimate the necessary distributions. To validate the method, we used data from the EMR of the gastrointestinal department at a university hospital. Blood tests from patients undergoing gastrointestinal surgery were analyzed with respect to a second surgery within 30 days of the index surgery. The information content is clearly reflected in the patient pathway for certain combinations of tests and outcomes. C-reactive protein tests coupled to anastomosis leakage, a severe complication, show a clear pattern of information gain through the patient trajectory, where the greatest gain from the test is 3-4 days post index surgery. We have defined the information content in a data-driven and information theoretic way such that the utility of a test can be precisely defined. The results reflect clinical knowledge. In the cases we studied, the tests carry little negative impact. The general approach can be expanded to cases that carry a substantial negative impact, such as in certain radiological techniques. Copyright © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
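
    A hedged sketch of the entropy-based utility measure described above: class-conditional kernel density estimates yield a posterior for a binary outcome given a test value, and the information gain is the prior entropy minus the expected posterior entropy. Variable names and the numerical-integration grid are assumptions, not the authors' implementation.

        import numpy as np
        from scipy.stats import gaussian_kde

        def outcome_entropy(p):
            """Shannon entropy (bits) of a Bernoulli outcome with probability p."""
            p = np.clip(p, 1e-12, 1 - 1e-12)
            return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

        def expected_information_gain(x, y, n_grid=200):
            """Prior entropy of y minus the expected entropy of y given test value x."""
            x, y = np.asarray(x, float), np.asarray(y, int)
            prior = y.mean()
            kde1, kde0 = gaussian_kde(x[y == 1]), gaussian_kde(x[y == 0])
            grid = np.linspace(x.min(), x.max(), n_grid)
            f1, f0 = kde1(grid), kde0(grid)
            fx = prior * f1 + (1 - prior) * f0          # mixture density of x
            post = prior * f1 / np.maximum(fx, 1e-12)   # p(y=1 | x) on the grid
            w = fx / fx.sum()                           # weights approximating E_x[...]
            return outcome_entropy(prior) - np.sum(w * outcome_entropy(post))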

  14. Data-driven technology for engineering systems health management design approach, feature construction, fault diagnosis, prognosis, fusion and decisions

    CERN Document Server

    Niu, Gang

    2017-01-01

    This book introduces condition-based maintenance (CBM)/data-driven prognostics and health management (PHM) in detail, first explaining the PHM design approach from a systems engineering perspective, then summarizing and elaborating on the data-driven methodology for feature construction, as well as feature-based fault diagnosis and prognosis. The book includes a wealth of illustrations and tables to help explain the algorithms, as well as practical examples showing how to use this tool to solve situations for which analytic solutions are poorly suited. It equips readers to apply the concepts discussed in order to analyze and solve a variety of problems in PHM system design, feature construction, fault diagnosis and prognosis.

  15. Combining engineering and data-driven approaches: Development of a generic fire risk model facilitating calibration

    DEFF Research Database (Denmark)

    De Sanctis, G.; Fischer, K.; Kohler, J.

    2014-01-01

    Fire risk models support decision making for engineering problems under the consistent consideration of the associated uncertainties. Empirical approaches can be used for cost-benefit studies when enough data about the decision problem are available. But often the empirical approaches...... a generic risk model that is calibrated to observed fire loss data. Generic risk models assess the risk of buildings based on specific risk indicators and support risk assessment at a portfolio level. After an introduction to the principles of generic risk assessment, the focus of the present paper...... are not detailed enough. Engineering risk models, on the other hand, may be detailed but typically involve assumptions that may result in a biased risk assessment and make a cost-benefit study problematic. In two related papers it is shown how engineering and data-driven modeling can be combined by developing...

  16. On Mixed Data and Event Driven Design for Adaptive-Critic-Based Nonlinear $H_{\\infty}$ Control.

    Science.gov (United States)

    Wang, Ding; Mu, Chaoxu; Liu, Derong; Ma, Hongwen

    2018-04-01

    In this paper, based on the adaptive critic learning technique, the control of a class of unknown nonlinear dynamic systems is investigated by adopting a mixed data- and event-driven design approach. The nonlinear control problem is formulated as a two-player zero-sum differential game and the adaptive critic method is employed to cope with the data-based optimization. The novelty lies in combining the data-driven learning identifier with the event-driven design formulation in order to develop the adaptive critic controller, thereby accomplishing the nonlinear control. The event-driven optimal control law and the time-driven worst-case disturbance law are approximated by constructing and tuning a critic neural network. Applying the event-driven feedback control, the closed-loop system is built with stability analysis. Simulation studies are conducted to verify the theoretical results and illustrate the control performance. It is significant to observe that the present research provides a new avenue for integrating data-based control and event-triggering mechanisms into establishing advanced adaptive critic systems.

  17. Using a data-centric event-driven architecture approach in the integration of real-time systems at DTP2

    International Nuclear Information System (INIS)

    Tuominen, Janne; Viinikainen, Mikko; Alho, Pekka; Mattila, Jouni

    2014-01-01

    Integration of heterogeneous and distributed systems is a challenging task, because they might be running on different platforms and written with different implementation languages by multiple organizations. Data-centricity and event-driven architecture (EDA) are concepts that help to implement versatile and well-scaling distributed systems. This paper focuses on the implementation of inter-subsystem communication in a prototype distributed remote handling control system developed at Divertor Test Platform 2 (DTP2). The control system consists of a variety of heterogeneous subsystems, including a client–server web application and hard real-time controllers. A standardized middleware solution (Data Distribution Services (DDS)) that supports a data-centric EDA approach is used to integrate the system. One of the greatest challenges in integrating a system with a data-centric EDA approach is in defining the global data space model. The selected middleware is currently only used for non-deterministic communication. For future application, we evaluated the performance of point-to-point communication with and without the presence of additional network load to ensure applicability to real-time systems. We found that, under certain limitations, the middleware can be used for soft real-time communication. Hard real-time use will require more validation with a more suitable environment.

  18. Using a data-centric event-driven architecture approach in the integration of real-time systems at DTP2

    Energy Technology Data Exchange (ETDEWEB)

    Tuominen, Janne, E-mail: janne.m.tuominen@tut.fi; Viinikainen, Mikko; Alho, Pekka; Mattila, Jouni

    2014-10-15

    Integration of heterogeneous and distributed systems is a challenging task, because they might be running on different platforms and written with different implementation languages by multiple organizations. Data-centricity and event-driven architecture (EDA) are concepts that help to implement versatile and well-scaling distributed systems. This paper focuses on the implementation of inter-subsystem communication in a prototype distributed remote handling control system developed at Divertor Test Platform 2 (DTP2). The control system consists of a variety of heterogeneous subsystems, including a client–server web application and hard real-time controllers. A standardized middleware solution (Data Distribution Services (DDS)) that supports a data-centric EDA approach is used to integrate the system. One of the greatest challenges in integrating a system with a data-centric EDA approach is in defining the global data space model. The selected middleware is currently only used for non-deterministic communication. For future application, we evaluated the performance of point-to-point communication with and without the presence of additional network load to ensure applicability to real-time systems. We found that, under certain limitations, the middleware can be used for soft real-time communication. Hard real-time use will require more validation with a more suitable environment.

  19. Scenario driven data modelling: a method for integrating diverse sources of data and data streams

    Science.gov (United States)

    2011-01-01

    Background Biology is rapidly becoming a data intensive, data-driven science. It is essential that data is represented and connected in ways that best represent its full conceptual content and allows both automated integration and data driven decision-making. Recent advancements in distributed multi-relational directed graphs, implemented in the form of the Semantic Web make it possible to deal with complicated heterogeneous data in new and interesting ways. Results This paper presents a new approach, scenario driven data modelling (SDDM), that integrates multi-relational directed graphs with data streams. SDDM can be applied to virtually any data integration challenge with widely divergent types of data and data streams. In this work, we explored integrating genetics data with reports from traditional media. SDDM was applied to the New Delhi metallo-beta-lactamase gene (NDM-1), an emerging global health threat. The SDDM process constructed a scenario, created a RDF multi-relational directed graph that linked diverse types of data to the Semantic Web, implemented RDF conversion tools (RDFizers) to bring content into the Sematic Web, identified data streams and analytical routines to analyse those streams, and identified user requirements and graph traversals to meet end-user requirements. Conclusions We provided an example where SDDM was applied to a complex data integration challenge. The process created a model of the emerging NDM-1 health threat, identified and filled gaps in that model, and constructed reliable software that monitored data streams based on the scenario derived multi-relational directed graph. The SDDM process significantly reduced the software requirements phase by letting the scenario and resulting multi-relational directed graph define what is possible and then set the scope of the user requirements. Approaches like SDDM will be critical to the future of data intensive, data-driven science because they automate the process of converting

  20. Data-Driven Learning of Q-Matrix

    Science.gov (United States)

    Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

    2012-01-01

    The recent surge of interests in cognitive assessment has led to developments of novel statistical models for diagnostic classification. Central to many such models is the well-known "Q"-matrix, which specifies the item-attribute relationships. This article proposes a data-driven approach to identification of the "Q"-matrix and estimation of…

  1. Using Shape Memory Alloys: A Dynamic Data Driven Approach

    KAUST Repository

    Douglas, Craig C.

    2013-06-01

    Shape Memory Alloys (SMAs) are capable of changing their crystallographic structure due to changes of either stress or temperature. SMAs are used in a number of aerospace devices and are required in some devices in exotic environments. We are developing dynamic data driven application system (DDDAS) tools to monitor and change SMAs in real time for delivering payloads by aerospace vehicles. We must be able to turn on and off the sensors and heating units, change the stress on the SMA, monitor on-line data streams, change scales based on incoming data, and control what type of data is generated. The application must have the capability to be run and steered remotely as an unmanned feedback control loop.

  2. DOE High Performance Computing Operational Review (HPCOR): Enabling Data-Driven Scientific Discovery at HPC Facilities

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard; Allcock, William; Beggio, Chris; Campbell, Stuart; Cherry, Andrew; Cholia, Shreyas; Dart, Eli; England, Clay; Fahey, Tim; Foertter, Fernanda; Goldstone, Robin; Hick, Jason; Karelitz, David; Kelly, Kaki; Monroe, Laura; Prabhat,; Skinner, David; White, Julia

    2014-10-17

    U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014, representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Computing Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at the DOE national laboratories. The report contains findings from that review.

  3. Enhanced dynamic data-driven fault detection approach: Application to a two-tank heater system

    KAUST Repository

    Harrou, Fouzi

    2018-02-12

    Principal components analysis (PCA) has been intensively studied and used in monitoring industrial systems. However, data generated from chemical processes are usually correlated in time due to process dynamics, which makes PCA-based fault detection a challenging task. Accounting for the dynamic nature of the data can also improve the performance of the designed fault detection approaches. In PCA-based methods, this dynamic characteristic of the data can be accounted for by using dynamic PCA (DPCA), in which lagged variables are used in the PCA model to capture the time evolution of the process. This paper presents a new approach that combines DPCA, to account for autocorrelation in the data, with a generalized likelihood ratio (GLR) test to detect faults. A DPCA model is applied to perform dimension reduction while appropriately considering the temporal relationships in the data. Specifically, the proposed approach uses the DPCA to generate residuals, and then applies a GLR test to reveal any abnormality. The performance of the proposed method is evaluated on a continuous stirred tank heater system.
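
    The pipeline sketched below is an illustrative reading of the record (not the authors' code): lagged variables make the PCA dynamic, the model's reconstruction residuals are computed on new data, and a GLR-type statistic tests the residuals for a mean shift. The lag count, component count, and window length are assumptions.

        import numpy as np
        from sklearn.decomposition import PCA

        def lagged_matrix(X, lags=2):
            """Augment each sample with its previous `lags` samples (DPCA input)."""
            n = len(X)
            return np.hstack([X[lags - k : n - k] for k in range(lags + 1)])

        def dpca_residuals(X_train, X_test, lags=2, n_components=3):
            Xtr, Xte = lagged_matrix(X_train, lags), lagged_matrix(X_test, lags)
            mu, sd = Xtr.mean(0), Xtr.std(0) + 1e-12
            pca = PCA(n_components=n_components).fit((Xtr - mu) / sd)
            Z = (Xte - mu) / sd
            return Z - pca.inverse_transform(pca.transform(Z))  # residual part

        def glr_statistic(resid, window=10):
            """GLR-type statistic for a mean shift in the residual magnitude."""
            r = np.linalg.norm(resid, axis=1)
            r = r - r.mean()                       # center so H0 has zero mean
            var = r.var() + 1e-12
            stats = np.zeros(len(r))
            for t in range(window, len(r)):
                shift = r[t - window : t].mean()   # MLE of the mean under H1
                stats[t] = window * shift**2 / (2 * var)
            return stats                           # alarm when a threshold is crossed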

  4. Development of a Data Warehouse Using a Data-Driven Approach to Support Human Resource Management

    Directory of Open Access Journals (Sweden)

    Mujiono Mujiono

    2016-01-01

    Full Text Available The basis of bureaucratic reform is the reform of human resource management. One supporting factor is the development of an employee database. Supporting human resource management requires, among other things, a data warehouse and business intelligence tools. A data warehouse is an integrated concept for reliable data storage that supports all data analysis needs. In this study, a data warehouse was developed using the data-driven approach, with source data coming from SIMPEG, SAPK and electronic presence records. The data warehouse is designed using the nine-step methodology and Unified Modeling Language (UML) notation. Extract-transform-load (ETL) is performed with Pentaho Data Integration by applying transformation maps. Furthermore, to support human resource management, the system performs online analytical processing (OLAP) to deliver web-based information. The study produced a BI application development framework with a Model-View-Controller (MVC) architecture, and the OLAP operations are built using dynamic query generation, PivotTable, and HighChart to present information about PNS, CPNS, Retirement, Kenpa and Presence

  5. NextGEOSS project: A user-driven approach to build a Earth Observations Data Hub

    Science.gov (United States)

    Percivall, G.; Voidrot, M. F.; Bye, B. L.; De Lathouwer, B.; Catarino, N.; Concalves, P.; Kraft, C.; Grosso, N.; Meyer-Arnek, J.; Mueller, A.; Goor, E.

    2017-12-01

    Several initiatives and projects contribute to supporting the Group on Earth Observations' (GEO) global priorities, including support to the UN 2030 Agenda for Sustainable Development, the Paris Agreement on climate change, and the Sendai Framework for Disaster Risk Reduction. Running until 2020, the NextGEOSS project evolves the European vision of user-driven GEOSS data exploitation for innovation and business, relying on three main pillars: engaging communities of practice, delivering technological advancements, and advocating the use of GEOSS. These three pillars support the creation and deployment of Earth observation based innovative research activities and commercial services. In this presentation we will emphasise how the NextGEOSS project uses a pilot-driven approach to ramp up and consolidate the system in a pragmatic way, integrating the complexity of the existing global ecosystem, leveraging previous investments, adding new cloud technologies and resources, and engaging the diverse communities to address all types of Sustainable Development Goals (SDGs). A set of 10 initial pilots has been defined by the project partners to address the main challenges and include, as soon as possible, contributions to SDGs associated with Food Sustainability, Bio Diversity, Space and Security, Cold Regions, Air Pollution, Disaster Risk Reduction, Territorial Planning, and Energy. In 2018 and 2019 the project team will work on two new series of Architecture Implementation Pilots (AIP-10 and AIP-11), open worldwide, to increase the discoverability, accessibility and usability of data with a strong user-centric approach for innovative GEOSS-powered applications for multiple societal areas. All initiatives with an interest in and need of Earth observations (data, processes, models, ...) are welcome to participate in these pilot initiatives. NextGEOSS is a H2020 Research and Development Project from the European Community under grant agreement 730329.

  6. Challenges of Data-driven Healthcare Management

    DEFF Research Database (Denmark)

    Bossen, Claus; Danholt, Peter; Ubbesen, Morten Bonde

    This paper describes the new kind of data-work involved in developing data-driven healthcare based on two cases from Denmark: The first case concerns a governance infrastructure based on Diagnosis-Related Groups (DRG), which was introduced in Denmark in the 1990s. The DRG-system links healthcare...... activity and financing and relies on extensive data entry, reporting and calculations. This has required the development of new skills, work and work roles. The second case concerns a New Governance project aimed at developing new performance indicators for healthcare delivery as an alternative to DRG.... Here, a core challenge is selecting indicators and actually being able to acquire data on them. The two cases point out that data-driven healthcare requires more and new kinds of work for which new skills, functions and work roles have to be developed....

  7. Data-driven approach for creating synthetic electronic medical records

    Directory of Open Access Journals (Sweden)

    Moniz Linda

    2010-10-01

    Full Text Available Abstract Background New algorithms for disease outbreak detection are being developed to take advantage of full electronic medical records (EMRs) that contain a wealth of patient information. However, due to privacy concerns, even anonymized EMRs cannot be shared among researchers, resulting in great difficulty in comparing the effectiveness of these algorithms. To bridge the gap between novel bio-surveillance algorithms operating on full EMRs and the lack of non-identifiable EMR data, a method for generating complete and synthetic EMRs was developed. Methods This paper describes a novel methodology for generating complete synthetic EMRs both for an outbreak illness of interest (tularemia) and for background records. The method developed has three major steps: 1) synthetic patient identity and basic information generation; 2) identification of care patterns that the synthetic patients would receive based on the information present in real EMR data for similar health problems; 3) adaptation of these care patterns to the synthetic patient population. Results We generated EMRs, including visit records, clinical activity, laboratory orders/results and radiology orders/results for 203 synthetic tularemia outbreak patients. Validation of the records by a medical expert revealed problems in 19% of the records; these were subsequently corrected. We also generated background EMRs for over 3000 patients in the 4-11 yr age group. Validation of those records by a medical expert revealed problems in fewer than 3% of these background patient EMRs and the errors were subsequently rectified. Conclusions A data-driven method was developed for generating fully synthetic EMRs. The method is general and can be applied to any data set that has similar data elements (such as laboratory and radiology orders and results, clinical activity, prescription orders). The pilot synthetic outbreak records were for tularemia but our approach may be adapted to other infectious

  8. Data-driven approach for creating synthetic electronic medical records.

    Science.gov (United States)

    Buczak, Anna L; Babin, Steven; Moniz, Linda

    2010-10-14

    New algorithms for disease outbreak detection are being developed to take advantage of full electronic medical records (EMRs) that contain a wealth of patient information. However, due to privacy concerns, even anonymized EMRs cannot be shared among researchers, resulting in great difficulty in comparing the effectiveness of these algorithms. To bridge the gap between novel bio-surveillance algorithms operating on full EMRs and the lack of non-identifiable EMR data, a method for generating complete and synthetic EMRs was developed. This paper describes a novel methodology for generating complete synthetic EMRs both for an outbreak illness of interest (tularemia) and for background records. The method developed has three major steps: 1) synthetic patient identity and basic information generation; 2) identification of care patterns that the synthetic patients would receive based on the information present in real EMR data for similar health problems; 3) adaptation of these care patterns to the synthetic patient population. We generated EMRs, including visit records, clinical activity, laboratory orders/results and radiology orders/results for 203 synthetic tularemia outbreak patients. Validation of the records by a medical expert revealed problems in 19% of the records; these were subsequently corrected. We also generated background EMRs for over 3000 patients in the 4-11 yr age group. Validation of those records by a medical expert revealed problems in fewer than 3% of these background patient EMRs and the errors were subsequently rectified. A data-driven method was developed for generating fully synthetic EMRs. The method is general and can be applied to any data set that has similar data elements (such as laboratory and radiology orders and results, clinical activity, prescription orders). The pilot synthetic outbreak records were for tularemia but our approach may be adapted to other infectious diseases. The pilot synthetic background records were in the 4

  9. Analyzing the Discourse of Chais Conferences for the Study of Innovation and Learning Technologies via a Data-Driven Approach

    Science.gov (United States)

    Silber-Varod, Vered; Eshet-Alkalai, Yoram; Geri, Nitza

    2016-01-01

    The current rapid technological changes confront researchers of learning technologies with the challenge of evaluating them, predicting trends, and improving their adoption and diffusion. This study utilizes a data-driven discourse analysis approach, namely culturomics, to investigate changes over time in the research of learning technologies. The…

  10. Prognostic and health management for engineering systems: a review of the data-driven approach and algorithms

    Directory of Open Access Journals (Sweden)

    Thamo Sutharssan

    2015-07-01

    Full Text Available Prognostics and health management (PHM) has become an important component of many engineering systems and products, where algorithms are used to detect anomalies, diagnose faults and predict remaining useful lifetime (RUL). PHM can provide many advantages to users and maintainers. Although the primary goals are to ensure safety, provide the state of health and estimate the RUL of components and systems, there are also financial benefits such as operational and maintenance cost reductions and extended lifetime. This study aims at reviewing the current status of algorithms and methods used to underpin different existing PHM approaches. The focus is on providing a structured and comprehensive classification of the existing state-of-the-art PHM approaches, data-driven approaches and algorithms.

  11. Enhanced dynamic data-driven fault detection approach: Application to a two-tank heater system

    KAUST Repository

    Harrou, Fouzi; Madakyaru, Muddu; Sun, Ying; Kammammettu, Sanjula

    2018-01-01

    on PCA approach a challenging task. Accounting for the dynamic nature of data can also reflect the performance of the designed fault detection approaches. In PCA-based methods, this dynamic characteristic of the data can be accounted for by using dynamic

  12. Least squares approach for initial data recovery in dynamic data-driven applications simulations

    KAUST Repository

    Douglas, C.

    2010-12-01

    In this paper, we consider the initial data recovery and the solution update based on the local measured data that are acquired during simulations. Each time new data is obtained, the initial condition, which is a representation of the solution at a previous time step, is updated. The update is performed using the least squares approach. The objective function is set up based on both a measurement error as well as a penalization term that depends on the prior knowledge about the solution at previous time steps (or initial data). Various numerical examples are considered, where the penalization term is varied during the simulations. Numerical examples demonstrate that the predictions are more accurate if the initial data are updated during the simulations. © Springer-Verlag 2011.
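
    As a hedged illustration of the objective described above, with u(t_k; u_0) the simulated solution at measurement time t_k for initial data u_0, C the observation operator, d_k the local measurements, and \tilde{u}_0 the prior estimate (the symbols and the quadratic penalty are assumptions consistent with the abstract, not the paper's exact notation):

        \min_{u_0} \; J(u_0) \;=\; \sum_{k} \bigl\| C\,u(t_k; u_0) - d_k \bigr\|^2
        \;+\; \beta \,\bigl\| u_0 - \tilde{u}_0 \bigr\|^2

    A larger penalization weight \beta pulls the recovered initial data toward the prior knowledge, while a smaller one favors fitting the measurements acquired during the simulation, which matches the abstract's remark that the penalization term is varied during the simulations.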

  13. Data-Driven H∞ Control for Nonlinear Distributed Parameter Systems.

    Science.gov (United States)

    Luo, Biao; Huang, Tingwen; Wu, Huai-Ning; Yang, Xiong

    2015-11-01

    The data-driven H∞ control problem of nonlinear distributed parameter systems is considered in this paper. An off-policy learning method is developed to learn the H∞ control policy from real system data rather than from a mathematical model. First, Karhunen-Loève decomposition is used to compute the empirical eigenfunctions, which are then employed to derive a reduced-order model (ROM) of the slow subsystem based on singular perturbation theory. The H∞ control problem is reformulated based on the ROM, which can, theoretically, be transformed into solving the Hamilton-Jacobi-Isaacs (HJI) equation. To learn the solution of the HJI equation from real system data, a data-driven off-policy learning approach is proposed based on the simultaneous policy update algorithm, and its convergence is proved. For implementation purposes, a neural network (NN)-based action-critic structure is developed, where a critic NN and two action NNs are employed to approximate the value function, control, and disturbance policies, respectively. Subsequently, a least-square NN weight-tuning rule is derived with the method of weighted residuals. Finally, the developed data-driven off-policy learning approach is applied to a nonlinear diffusion-reaction process, and the obtained results demonstrate its effectiveness.

  14. Observer and data-driven model based fault detection in Power Plant Coal Mills

    DEFF Research Database (Denmark)

    Fogh Odgaard, Peter; Lin, Bao; Jørgensen, Sten Bay

    2008-01-01

    model with motor power as the controlled variable, data-driven methods for fault detection are also investigated. Regression models that represent normal operating conditions (NOCs) are developed with both static and dynamic principal component analysis and partial least squares methods. The residual...... between process measurement and the NOC model prediction is used for fault detection. A hybrid approach, where a data-driven model is employed to derive an optimal unknown input observer, is also implemented. The three methods are evaluated with case studies on coal mill data, which includes a fault......This paper presents and compares model-based and data-driven fault detection approaches for coal mill systems. The first approach detects faults with an optimal unknown input observer developed from a simplified energy balance model. Due to the time-consuming effort in developing a first principles...

  15. Data-Driven Problems in Elasticity

    Science.gov (United States)

    Conti, S.; Müller, S.; Ortiz, M.

    2018-01-01

    We consider a new class of problems in elasticity, referred to as Data-Driven problems, defined on the space of strain-stress field pairs, or phase space. The problem consists of minimizing the distance between a given material data set and the subspace of compatible strain fields and stress fields in equilibrium. We find that the classical solutions are recovered in the case of linear elasticity. We identify conditions for convergence of Data-Driven solutions corresponding to sequences of approximating material data sets. Specialization to constant material data set sequences in turn establishes an appropriate notion of relaxation. We find that relaxation within this Data-Driven framework is fundamentally different from the classical relaxation of energy functions. For instance, we show that in the Data-Driven framework the relaxation of a bistable material leads to material data sets that are not graphs.
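
    The formulation above admits a compact statement. Writing D for the material data set of strain-stress pairs and E for the set of compatible strain fields and stress fields in equilibrium, the Data-Driven problem minimizes the distance between the two sets (the notation below is a standard rendering of this formulation rather than a verbatim quote):

        \min_{(\varepsilon,\sigma)\,\in\,E} \;\; \min_{(\varepsilon^{*},\sigma^{*})\,\in\,D} \;
        d\bigl((\varepsilon,\sigma),\,(\varepsilon^{*},\sigma^{*})\bigr)

    When D is the graph of a linear elastic law, the minimizer coincides with the classical solution, which is the recovery result mentioned in the abstract.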

  16. A data-driven fault-tolerant control design of linear multivariable systems with performance optimization.

    Science.gov (United States)

    Li, Zhe; Yang, Guang-Hong

    2017-09-01

    In this paper, an integrated data-driven fault-tolerant control (FTC) design scheme is proposed under the configuration of the Youla parameterization for multiple-input multiple-output (MIMO) systems. With unknown system model parameters, the canonical form identification technique is first applied to design the residual observer in fault-free case. In faulty case, with online tuning of the Youla parameters based on the system data via the gradient-based algorithm, the fault influence is attenuated with system performance optimization. In addition, to improve the robustness of the residual generator to a class of system deviations, a novel adaptive scheme is proposed for the residual generator to prevent its over-activation. Simulation results of a two-tank flow system demonstrate the optimized performance and effect of the proposed FTC scheme. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.

  17. A Data-Driven Frequency-Domain Approach for Robust Controller Design via Convex Optimization

    CERN Document Server

    AUTHOR|(CDS)2092751; Martino, Michele

    The objective of this dissertation is to develop data-driven frequency-domain methods for designing robust controllers through the use of convex optimization algorithms. Many of today's industrial processes are becoming more complex, and deriving accurate physical models for these plants from first principles may be impossible. Even if a model is available, it may be too complex to use for an appropriate controller design. With the increased developments in the computing world, large amounts of measured data can be easily collected and stored for processing purposes. Data can also be collected and used in an on-line fashion. Thus it is sensible to make full use of these data for controller design, performance evaluation, and stability analysis. The design methods proposed in this work ensure that the dynamics of a system are captured in an experiment and avoid the problem of unmodeled dynamics associated with parametric models. The devised methods consider robust designs...

  18. Data-driven approach for auditory profiling

    DEFF Research Database (Denmark)

    Sanchez Lopez, Raul; Bianchi, Federica; Fereczkowski, Michal

    2017-01-01

    Nowadays, the pure-tone audiogram is the main tool used to characterize hearing loss and to fit hearing aids. However, the perceptual consequences of hearing loss are typically not only associated with a loss of sensitivity, but also with a clarity loss that is not captured by the audiogram. A detai......-in-noise perception. The current approach is promising for analyzing other existing data sets in order to select the most relevant tests for auditory profiling.

  19. The Structural Consequences of Big Data-Driven Education.

    Science.gov (United States)

    Zeide, Elana

    2017-06-01

    Educators and commenters who evaluate big data-driven learning environments focus on specific questions: whether automated education platforms improve learning outcomes, invade student privacy, and promote equality. This article puts aside separate unresolved, and perhaps unresolvable, issues regarding the concrete effects of specific technologies. It instead examines how big data-driven tools alter the structure of schools' pedagogical decision-making, and, in doing so, change fundamental aspects of America's education enterprise. Technological mediation and data-driven decision-making have a particularly significant impact in learning environments because the education process primarily consists of dynamic information exchange. In this overview, I highlight three significant structural shifts that accompany school reliance on data-driven instructional platforms that perform core school functions: teaching, assessment, and credentialing. First, virtual learning environments create information technology infrastructures featuring constant data collection, continuous algorithmic assessment, and possibly infinite record retention. This undermines the traditional intellectual privacy and safety of classrooms. Second, these systems displace pedagogical decision-making from educators serving public interests to private, often for-profit, technology providers. They constrain teachers' academic autonomy, obscure student evaluation, and reduce parents' and students' ability to participate in or challenge education decision-making. Third, big data-driven tools define what "counts" as education by mapping the concepts, creating the content, determining the metrics, and setting desired learning outcomes of instruction. These shifts cede important decision-making to private entities without public scrutiny or pedagogical examination. In contrast to the public and heated debates that accompany textbook choices, schools often adopt education technologies ad hoc. Given education

  20. Data-Driven Baseline Estimation of Residential Buildings for Demand Response

    Directory of Open Access Journals (Sweden)

    Saehong Park

    2015-09-01

    Full Text Available The advent of advanced metering infrastructure (AMI) generates a large volume of data related with energy service. This paper exploits data mining approach for customer baseline load (CBL) estimation in demand response (DR) management. CBL plays a significant role in measurement and verification process, which quantifies the amount of demand reduction and authenticates the performance. The proposed data-driven baseline modeling is based on the unsupervised learning technique. Specifically we leverage both the self organizing map (SOM) and K-means clustering for accurate estimation. This two-level approach efficiently reduces the large data set into representative weight vectors in SOM, and then these weight vectors are clustered by K-means clustering to find the load pattern that would be similar to the potential load pattern of the DR event day. To verify the proposed method, we conduct nationwide scale experiments where three major cities’ residential consumption is monitored by smart meters. Our evaluation compares the proposed solution with the various types of day matching techniques, showing that our approach outperforms the existing methods by up to a 68.5% lower error rate.
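
    The two-level scheme described above can be sketched as follows; the hand-rolled SOM, the grid size, and the synthetic 24-hour load profiles are illustrative assumptions rather than the paper's implementation.

        import numpy as np
        from sklearn.cluster import KMeans

        def train_som(X, grid=(6, 6), iters=5000, lr0=0.5, sigma0=2.0, seed=0):
            """Minimal self-organizing map; returns one weight vector per node."""
            rng = np.random.default_rng(seed)
            rows, cols = grid
            W = rng.normal(size=(rows * cols, X.shape[1]))
            coords = np.array([(i, j) for i in range(rows) for j in range(cols)], float)
            for step in range(iters):
                x = X[rng.integers(len(X))]
                bmu = np.argmin(((W - x) ** 2).sum(1))        # best-matching unit
                lr = lr0 * np.exp(-step / iters)
                sigma = sigma0 * np.exp(-step / iters)
                h = np.exp(-((coords - coords[bmu]) ** 2).sum(1) / (2 * sigma**2))
                W += lr * h[:, None] * (x - W)                # pull neighbours toward x
            return W

        X = np.random.rand(365, 24)           # placeholder: daily 24-h load profiles
        weights = train_som(X)                # level 1: compress days to weight vectors
        labels = KMeans(n_clusters=4, n_init=10).fit_predict(weights)  # level 2: cluster
        # A DR-day baseline would then be built from the cluster whose profile best
        # matches the customer's recent non-event days.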

  1. Data Driven Economic Model Predictive Control

    Directory of Open Access Journals (Sweden)

    Masoud Kheradmandi

    2018-04-01

    Full Text Available This manuscript addresses the problem of data-driven model-based economic model predictive control (MPC) design. To this end, first, a data-driven Lyapunov-based MPC is designed, and shown to be capable of stabilizing a system at an unstable equilibrium point. The data-driven Lyapunov-based MPC utilizes a linear time-invariant (LTI) model, cognizant of the fact that the training data, owing to the unstable nature of the equilibrium point, has to be obtained from closed-loop operation or experiments. Simulation results are first presented demonstrating closed-loop stability under the proposed data-driven Lyapunov-based MPC. The underlying data-driven model is then utilized as the basis to design an economic MPC. The economic improvements yielded by the proposed method are illustrated through simulations on a nonlinear chemical process system example.

  2. A data-driven approach to reverse engineering customer engagement models: towards functional constructs.

    Science.gov (United States)

    de Vries, Natalie Jane; Carlson, Jamie; Moscato, Pablo

    2014-01-01

    Online consumer behavior in general, and online customer engagement with brands in particular, has become a major focus of research activity fuelled by the exponential increase of interactive functions of the internet and social media platforms and applications. Current research in this area is mostly hypothesis-driven and much debate about the concept of Customer Engagement and its related constructs persists in the literature. In this paper, we aim to propose a novel methodology for reverse engineering a consumer behavior model for online customer engagement, based on a computational and data-driven perspective. This methodology could be generalized and prove useful for future research in the fields of consumer behaviors using questionnaire data or studies investigating other types of human behaviors. The method we propose contains five main stages: symbolic regression analysis, graph building, community detection, evaluation of results and, finally, investigation of directed cycles and common feedback loops. The 'communities' of questionnaire items that emerge from our community detection method form possible 'functional constructs' inferred from data rather than assumed from literature and theory. Our results show consistent partitioning of questionnaire items into such 'functional constructs', suggesting the method proposed here could be adopted as a new data-driven way of human behavior modeling.
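
    The graph-building and community-detection stages can be sketched as below, with one substitution stated plainly: the paper derives item relationships via symbolic regression, whereas this sketch uses a simple correlation threshold to define edges. Item names and the threshold are assumptions.

        import numpy as np
        import networkx as nx
        from networkx.algorithms.community import greedy_modularity_communities

        def item_communities(responses, item_names, threshold=0.5):
            """responses: (n_respondents, n_items) matrix of questionnaire answers."""
            corr = np.corrcoef(responses, rowvar=False)
            G = nx.Graph()
            G.add_nodes_from(item_names)
            n = len(item_names)
            for i in range(n):
                for j in range(i + 1, n):
                    if abs(corr[i, j]) >= threshold:
                        G.add_edge(item_names[i], item_names[j], weight=abs(corr[i, j]))
            # Each community is a candidate 'functional construct' inferred from data
            return [set(c) for c in greedy_modularity_communities(G, weight="weight")]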

  3. A data-driven approach to reverse engineering customer engagement models: towards functional constructs.

    Directory of Open Access Journals (Sweden)

    Natalie Jane de Vries

    Full Text Available Online consumer behavior in general, and online customer engagement with brands in particular, has become a major focus of research activity fuelled by the exponential increase of interactive functions of the internet and social media platforms and applications. Current research in this area is mostly hypothesis-driven and much debate about the concept of Customer Engagement and its related constructs persists in the literature. In this paper, we aim to propose a novel methodology for reverse engineering a consumer behavior model for online customer engagement, based on a computational and data-driven perspective. This methodology could be generalized and prove useful for future research in the fields of consumer behaviors using questionnaire data or studies investigating other types of human behaviors. The method we propose contains five main stages: symbolic regression analysis, graph building, community detection, evaluation of results and, finally, investigation of directed cycles and common feedback loops. The 'communities' of questionnaire items that emerge from our community detection method form possible 'functional constructs' inferred from data rather than assumed from literature and theory. Our results show consistent partitioning of questionnaire items into such 'functional constructs', suggesting the method proposed here could be adopted as a new data-driven way of human behavior modeling.

  4. Data-Driven Learning: Reasonable Fears and Rational Reassurance

    Science.gov (United States)

    Boulton, Alex

    2009-01-01

    Computer corpora have many potential applications in teaching and learning languages, the most direct of which--when the learners explore a corpus themselves--has become known as data-driven learning (DDL). Despite considerable enthusiasm in the research community and interest in higher education, the approach has not made major inroads to…

  5. Authoring Data-Driven Videos with DataClips.

    Science.gov (United States)

    Amini, Fereshteh; Riche, Nathalie Henry; Lee, Bongshin; Monroy-Hernandez, Andres; Irani, Pourang

    2017-01-01

    Data videos, or short data-driven motion graphics, are an increasingly popular medium for storytelling. However, creating data videos is difficult as it involves pulling together a unique combination of skills. We introduce DataClips, an authoring tool aimed at lowering the barriers to crafting data videos. DataClips allows non-experts to assemble data-driven "clips" together to form longer sequences. We constructed the library of data clips by analyzing the composition of over 70 data videos produced by reputable sources such as The New York Times and The Guardian. We demonstrate that DataClips can reproduce over 90% of our data videos corpus. We also report on a qualitative study comparing the authoring process and outcome achieved by (1) non-experts using DataClips, and (2) experts using Adobe Illustrator and After Effects to create data-driven clips. Results indicated that non-experts are able to learn and use DataClips with a short training period. In the span of one hour, they were able to produce more videos than experts using a professional editing tool, and their clips were rated similarly by an independent audience.

  6. Data-driven storytelling

    CERN Document Server

    Hurter, Christophe; Diakopoulos, Nicholas ed.; Carpendale, Sheelagh

    2018-01-01

    This book is an accessible introduction to data-driven storytelling, resulting from discussions between data visualization researchers and data journalists. This book will be the first to define the topic, present compelling examples and existing resources, as well as identify challenges and new opportunities for research.

  7. Analyzing the Discourse of Chais Conferences for the Study of Innovation and Learning Technologies via a Data-Driven Approach

    Directory of Open Access Journals (Sweden)

    Vered Silber-Varod

    2016-12-01

    Full Text Available The current rapid technological changes confront researchers of learning technologies with the challenge of evaluating them, predicting trends, and improving their adoption and diffusion. This study utilizes a data-driven discourse analysis approach, namely culturomics, to investigate changes over time in the research of learning technologies. The patterns and changes were examined on a corpus of articles published over the past decade (2006-2014) in the proceedings of the Chais Conference for the Study of Innovation and Learning Technologies – the leading research conference on learning technologies in Israel. The interesting findings of the exhaustive process of analyzing all the words in the corpus were that the most commonly used terms (e.g., pupil, teacher, student) and the most commonly used phrases (e.g., face-to-face) in the field of learning technologies reflect a pedagogical rather than a technological aspect of learning technologies. The study also demonstrates two cases of change over time in prominent themes, such as “Facebook” and “the National Information and Communication Technology (ICT) program”. Methodologically, this research demonstrates the effectiveness of a data-driven approach for identifying discourse trends over time.

  8. Temporal Data-Driven Sleep Scheduling and Spatial Data-Driven Anomaly Detection for Clustered Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Gang Li

    2016-09-01

    Full Text Available The spatial–temporal correlation is an important feature of sensor data in wireless sensor networks (WSNs). Most of the existing works based on the spatial–temporal correlation can be divided into two parts: redundancy reduction and anomaly detection. These two parts are pursued separately in existing works. In this work, the combination of temporal data-driven sleep scheduling (TDSS) and spatial data-driven anomaly detection is proposed, where TDSS can reduce data redundancy. The TDSS model is inspired by transmission control protocol (TCP) congestion control. Based on the long and linear cluster structure in the tunnel monitoring system, cooperative TDSS and spatial data-driven anomaly detection are then proposed. To realize synchronous acquisition in the same ring for analyzing the situation of every ring, TDSS is implemented in a cooperative way in the cluster. To keep the precision of sensor data, spatial data-driven anomaly detection based on the spatial correlation and the Kriging method is realized to generate an anomaly indicator. The experiment results show that cooperative TDSS can realize non-uniform sensing effectively to reduce the energy consumption. In addition, spatial data-driven anomaly detection is quite significant for maintaining and improving the precision of sensor data.
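
    A toy rendering of the TCP-congestion-control-inspired TDSS idea: the sleep interval grows additively while readings are stable and shrinks multiplicatively when a significant change is sensed. The thresholds and factors are invented for illustration.

        def next_interval(interval, change, min_iv=1.0, max_iv=60.0, tol=0.1):
            """AIMD-style update of a sensor's sleep interval (seconds)."""
            if abs(change) > tol:
                interval *= 0.5        # multiplicative decrease: sample faster
            else:
                interval += 1.0        # additive increase: sleep longer
            return max(min_iv, min(interval, max_iv))

        iv, last = 5.0, 20.0           # a stable stretch followed by a sudden change
        for reading in [20.1, 20.0, 20.2, 25.0, 25.1]:
            iv = next_interval(iv, reading - last)
            last = reading
            print(f"reading={reading:5.1f}  next sleep interval={iv:4.1f}s")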

  9. Perspectives of data-driven LPV modeling of high-purity distillation columns

    NARCIS (Netherlands)

    Bachnas, A.A.; Toth, R.; Mesbah, A.; Ludlage, J.H.A.

    2013-01-01

    This paper investigates data-driven, Linear-Parameter-Varying (LPV) modeling of a high-purity distillation column. Two LPV modeling approaches are studied: a local approach, corresponding to the interpolation of Linear Time-Invariant (LTI) models identified at steady-state purity levels,

  10. Architectural Strategies for Enabling Data-Driven Science at Scale

    Science.gov (United States)

    Crichton, D. J.; Law, E. S.; Doyle, R. J.; Little, M. M.

    2017-12-01

    The analysis of large data collections from NASA or other agencies is often executed through traditional computational and data analysis approaches, which require users to bring data to their desktops and perform local data analysis. Alternatively, data are hauled to large computational environments that provide centralized data analysis via traditional High Performance Computing (HPC). Scientific data archives, however, are not only growing massive, but are also becoming highly distributed. Neither traditional approach provides a good solution for optimizing analysis into the future. Assumptions across the NASA mission and science data lifecycle, which historically assume that all data can be collected, transmitted, processed, and archived, will not scale as more capable instruments stress legacy-based systems. New paradigms are needed to increase the productivity and effectiveness of scientific data analysis. This paradigm must recognize that architectural and analytical choices are interrelated, and must be carefully coordinated in any system that aims to allow efficient, interactive scientific exploration and discovery to exploit massive data collections, from point of collection (e.g., onboard) to analysis and decision support. The most effective approach to analyzing a distributed set of massive data may involve some exploration and iteration, putting a premium on the flexibility afforded by the architectural framework. The framework should enable scientist users to assemble workflows efficiently, manage the uncertainties related to data analysis and inference, and optimize deep-dive analytics to enhance scalability. In many cases, this "data ecosystem" needs to be able to integrate multiple observing assets, ground environments, archives, and analytics, evolving from stewardship of measurements of data to using computational methodologies to better derive insight from the data that may be fused with other sets of data. This presentation will discuss

  11. Minimization of energy consumption in HVAC systems with data-driven models and an interior-point method

    International Nuclear Information System (INIS)

    Kusiak, Andrew; Xu, Guanglin; Zhang, Zijun

    2014-01-01

    Highlights: • We study the energy saving of HVAC systems with a data-driven approach. • We conduct an in-depth analysis of the topology of developed Neural Network based HVAC model. • We apply interior-point method to solving a Neural Network based HVAC optimization model. • The uncertain building occupancy is incorporated in the minimization of HVAC energy consumption. • A significant potential of saving HVAC energy is discovered. - Abstract: In this paper, a data-driven approach is applied to minimize energy consumption of a heating, ventilating, and air conditioning (HVAC) system while maintaining the thermal comfort of a building with uncertain occupancy level. The uncertainty of arrival and departure rate of occupants is modeled by the Poisson and uniform distributions, respectively. The internal heating gain is calculated from the stochastic process of the building occupancy. Based on the observed and simulated data, a multilayer perceptron algorithm is employed to model and simulate the HVAC system. The data-driven models accurately predict future performance of the HVAC system based on the control settings and the observed historical information. An optimization model is formulated and solved with the interior-point method. The optimization results are compared with the results produced by the simulation models
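
    A hedged sketch of the loop described above: a neural network trained on process data serves as the energy model, and a constrained solver searches the controllable settings. The feature layout, bounds, synthetic data, and the use of SciPy's trust-region constrained solver as a stand-in for the paper's interior-point method are all assumptions.

        import numpy as np
        from sklearn.neural_network import MLPRegressor
        from scipy.optimize import minimize, Bounds

        rng = np.random.default_rng(1)
        # Features: [supply air temperature setpoint, static pressure setpoint, occupancy]
        X = rng.uniform(0, 1, size=(2000, 3))
        y = (X[:, 0] - 0.4) ** 2 + 0.5 * X[:, 1] + 0.1 * rng.normal(size=2000)  # synthetic energy

        model = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=2000).fit(X, y)

        occupancy = 0.6                # uncontrollable input, fixed for this period

        def energy(u):
            """Predicted energy consumption for controllable settings u."""
            return float(model.predict([[u[0], u[1], occupancy]])[0])

        res = minimize(energy, x0=[0.5, 0.5], method="trust-constr",
                       bounds=Bounds([0.0, 0.2], [1.0, 1.0]))  # comfort-driven limits
        print("optimal settings:", res.x, "predicted energy:", res.fun)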

  12. Technology-driven online marketing performance measurement: lessons from affiliate marketing

    OpenAIRE

    Bowie, David; Paraskevas, Alexandros; Mariussen, Anastasia

    2014-01-01

    Although the measurement of offline and online marketing is extensively researched, the literature on online performance measurement still has a number of limitations such as slow theory advancement and predominance of technology- and practitioner-driven measurement approaches. By focusing on the widely employed but under-researched affiliate marketing channel, this study addresses these limitations and evaluates the effectiveness of practitioner-led online performance assessment. The paper o...

  13. Enabling Data-Driven Methodologies Across the Data Lifecycle and Ecosystem

    Science.gov (United States)

    Doyle, R. J.; Crichton, D.

    2017-12-01

    NASA has unlocked unprecedented scientific knowledge through exploration of the Earth, our solar system, and the larger universe. NASA is generating enormous amounts of data that are challenging traditional approaches to capturing, managing, analyzing and ultimately gaining scientific understanding from science data. New architectures, capabilities and methodologies are needed to span the entire observing system, from spacecraft to archive, while integrating data-driven discovery and analytic capabilities. NASA data have a definable lifecycle, from remote collection point to validated accessibility in multiple archives. Data challenges must be addressed across this lifecycle, to capture opportunities and avoid decisions that may limit or compromise what is achievable once data arrives at the archive. Data triage may be necessary when the collection capacity of the sensor or instrument overwhelms data transport or storage capacity. By migrating computational and analytic capability to the point of data collection, informed decisions can be made about which data to keep; in some cases, to close observational decision loops onboard, to enable attending to unexpected or transient phenomena. Along a different dimension than the data lifecycle, scientists and other end-users must work across an increasingly complex data ecosystem, where the range of relevant data is rarely owned by a single institution. To operate effectively, scalable data architectures and community-owned information models become essential. NASA's Planetary Data System is having success with this approach. Finally, there is the difficult challenge of reproducibility and trust. While data provenance techniques will be part of the solution, future interactive analytics environments must support an ability to provide a basis for a result: relevant data source and algorithms, uncertainty tracking, etc., to assure scientific integrity and to enable confident decision making. Advances in data science offer

  14. Consistent data-driven computational mechanics

    Science.gov (United States)

    González, D.; Chinesta, F.; Cueto, E.

    2018-05-01

    We present a novel method, within the realm of data-driven computational mechanics, to obtain reliable and thermodynamically sound simulations from experimental data. We thus avoid the need to fit any phenomenological model in the construction of the simulation model. Techniques of this kind open unprecedented possibilities in the framework of data-driven application systems and, particularly, in the paradigm of Industry 4.0.

  15. Personalized mortality prediction driven by electronic medical data and a patient similarity metric.

    Science.gov (United States)

    Lee, Joon; Maslove, David M; Dubin, Joel A

    2015-01-01

    Clinical outcome prediction normally employs static, one-size-fits-all models that perform well for the average patient but are sub-optimal for individual patients with unique characteristics. In the era of digital healthcare, it is feasible to dynamically personalize decision support by identifying and analyzing similar past patients, in a way that is analogous to personalized product recommendation in e-commerce. Our objectives were: 1) to prove that analyzing only similar patients leads to better outcome prediction performance than analyzing all available patients, and 2) to characterize the trade-off between training data size and the degree of similarity between the training data and the index patient for whom prediction is to be made. We deployed a cosine-similarity-based patient similarity metric (PSM) to an intensive care unit (ICU) database to identify patients that are most similar to each patient and subsequently to custom-build 30-day mortality prediction models. Rich clinical and administrative data from the first day in the ICU from 17,152 adult ICU admissions were analyzed. The results confirmed that using data from only a small subset of most similar patients for training improves predictive performance in comparison with using data from all available patients. The results also showed that when too few similar patients are used for training, predictive performance degrades due to the effects of small sample sizes. Our PSM-based approach outperformed well-known ICU severity of illness scores. Although the improved prediction performance is achieved at the cost of increased computational burden, Big Data technologies can help realize personalized data-driven decision support at the point of care. The present study provides crucial empirical evidence for the promising potential of personalized data-driven decision support systems. With the increasing adoption of electronic medical record (EMR) systems, our novel medical data analytics contributes to
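
    As a rough illustration of the described workflow (not the authors' code), the sketch below custom-builds a mortality model from the k most similar patients under a cosine-similarity PSM; the feature matrix, labels, and value of k are synthetic stand-ins, and a real application would exclude the index patient from training.

      import numpy as np
      from sklearn.linear_model import LogisticRegression
      from sklearn.metrics.pairwise import cosine_similarity

      def personalized_risk(X, y, x_new, k=500):
          """Train a mortality model on the k patients most similar to x_new."""
          sims = cosine_similarity(X, x_new.reshape(1, -1)).ravel()
          top_k = np.argsort(sims)[-k:]  # indices of the most similar patients
          model = LogisticRegression(max_iter=1000).fit(X[top_k], y[top_k])
          return model.predict_proba(x_new.reshape(1, -1))[0, 1]

      # Toy stand-ins for first-day ICU features and 30-day mortality labels.
      rng = np.random.default_rng(1)
      X = rng.normal(size=(5000, 20))
      y = (X[:, 0] + rng.normal(size=5000) > 1.0).astype(int)
      print("predicted 30-day mortality risk:", personalized_risk(X, y, X[42]))

    Sweeping k in such a setup exposes the trade-off the record describes: small k gives highly similar but scarce training data, while large k dilutes similarity.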

  16. Personalized mortality prediction driven by electronic medical data and a patient similarity metric.

    Directory of Open Access Journals (Sweden)

    Joon Lee

    Full Text Available Clinical outcome prediction normally employs static, one-size-fits-all models that perform well for the average patient but are sub-optimal for individual patients with unique characteristics. In the era of digital healthcare, it is feasible to dynamically personalize decision support by identifying and analyzing similar past patients, in a way that is analogous to personalized product recommendation in e-commerce. Our objectives were: 1 to prove that analyzing only similar patients leads to better outcome prediction performance than analyzing all available patients, and 2 to characterize the trade-off between training data size and the degree of similarity between the training data and the index patient for whom prediction is to be made.We deployed a cosine-similarity-based patient similarity metric (PSM to an intensive care unit (ICU database to identify patients that are most similar to each patient and subsequently to custom-build 30-day mortality prediction models. Rich clinical and administrative data from the first day in the ICU from 17,152 adult ICU admissions were analyzed. The results confirmed that using data from only a small subset of most similar patients for training improves predictive performance in comparison with using data from all available patients. The results also showed that when too few similar patients are used for training, predictive performance degrades due to the effects of small sample sizes. Our PSM-based approach outperformed well-known ICU severity of illness scores. Although the improved prediction performance is achieved at the cost of increased computational burden, Big Data technologies can help realize personalized data-driven decision support at the point of care.The present study provides crucial empirical evidence for the promising potential of personalized data-driven decision support systems. With the increasing adoption of electronic medical record (EMR systems, our novel medical data analytics

  17. Data-driven modeling, control and tools for cyber-physical energy systems

    Science.gov (United States)

    Behl, Madhur

    Energy systems are undergoing a gradual but substantial shift away from non-interactive, manually controlled operation toward tight integration of cyber (computation, communications, and control) and physical representations guided by first-principles models, at all scales and levels. Furthermore, peak power reduction programs like demand response (DR) are becoming increasingly important as volatility on the grid continues to increase due to regulation, integration of renewables, and extreme weather conditions. In order to shield themselves from the risk of price volatility, end-user electricity consumers must monitor electricity prices and be flexible in the ways they choose to use electricity. This requires the use of control-oriented predictive models of an energy system's dynamics and energy consumption. Such models are needed for understanding and improving the overall energy efficiency and operating costs. However, learning dynamical models using grey/white box approaches is very cost and time prohibitive, since it often requires significant financial investments in retrofitting the system with several sensors and hiring domain experts to build the model. We present the use of data-driven methods for making model capture easy and efficient for cyber-physical energy systems. We develop Model-IQ, a methodology for analysis of uncertainty propagation for building inverse modeling and controls. Given a grey-box model structure and real input data from a temporary set of sensors, Model-IQ evaluates the effect of the uncertainty propagation from sensor data to model accuracy and to closed-loop control performance. We also developed a statistical method to quantify the bias in the sensor measurement and to determine near-optimal sensor placement and density for accurate data collection for model training and control. Using a real building test-bed, we show how performing an uncertainty analysis can reveal trends about
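
    Model-IQ itself is not published with this record; the following toy Monte Carlo sketch conveys only the underlying idea of propagating sensor uncertainty into inverse-model accuracy. The linear zone model, noise levels, and coefficients are invented for illustration.

      import numpy as np

      rng = np.random.default_rng(0)
      a, b, c = 0.9, 0.05, 0.1  # "true" grey-box coefficients (assumed)

      # Simulate clean building data: next zone temperature from state and inputs.
      n = 500
      T_out = rng.uniform(-5.0, 15.0, n)        # outdoor temperature
      u = rng.uniform(0.0, 10.0, n)             # heating input
      T = 20.0 + rng.normal(0.0, 1.0, n)        # zone temperature
      X_clean = np.column_stack([T, u, T_out])
      y_clean = a * T + b * u + c * T_out

      for sigma in [0.0, 0.2, 0.5, 1.0]:        # candidate sensor noise levels
          errs = []
          for _ in range(200):                  # Monte Carlo replicates
              X_noisy = X_clean + rng.normal(0.0, sigma, X_clean.shape)
              coef, *_ = np.linalg.lstsq(X_noisy, y_clean, rcond=None)
              errs.append(np.sqrt(np.mean((X_clean @ coef - y_clean) ** 2)))
          print(f"sensor noise {sigma:.1f} -> model RMSE {np.mean(errs):.3f} +/- {np.std(errs):.3f}")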

  18. A data-driven approach for evaluating multi-modal therapy in traumatic brain injury.

    Science.gov (United States)

    Haefeli, Jenny; Ferguson, Adam R; Bingham, Deborah; Orr, Adrienne; Won, Seok Joon; Lam, Tina I; Shi, Jian; Hawley, Sarah; Liu, Jialing; Swanson, Raymond A; Massa, Stephen M

    2017-02-16

    Combination therapies targeting multiple recovery mechanisms have the potential for additive or synergistic effects, but experimental design and analyses of multimodal therapeutic trials are challenging. To address this problem, we developed a data-driven approach to integrate and analyze raw source data from separate pre-clinical studies and evaluated interactions between four treatments following traumatic brain injury. Histologic and behavioral outcomes were measured in 202 rats treated with combinations of an anti-inflammatory agent (minocycline), a neurotrophic agent (LM11A-31), and physical therapy consisting of assisted exercise with or without botulinum toxin-induced limb constraint. Data was curated and analyzed in a linked workflow involving non-linear principal component analysis followed by hypothesis testing with a linear mixed model. Results revealed significant benefits of the neurotrophic agent LM11A-31 on learning and memory outcomes after traumatic brain injury. In addition, modulations of LM11A-31 effects by co-administration of minocycline and by the type of physical therapy applied reached statistical significance. These results suggest a combinatorial effect of drug and physical therapy interventions that was not evident by univariate analysis. The study designs and analytic techniques applied here form a structured, unbiased, internally validated workflow that may be applied to other combinatorial studies, both in animals and humans.
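
    The record's linked workflow (non-linear principal component analysis followed by a linear mixed model) can be sketched as below, with scikit-learn's KernelPCA and statsmodels' MixedLM standing in for the authors' exact tools; the synthetic endpoints, treatment coding, and cohort grouping are assumptions.

      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf
      from sklearn.decomposition import KernelPCA

      rng = np.random.default_rng(0)
      n = 202                                       # animals, as in the study
      treat = rng.integers(0, 2, n)                 # hypothetical: drug on/off
      outcomes = rng.normal(size=(n, 6)) + treat[:, None] * 0.5  # 6 raw endpoints

      # Step 1: compress correlated endpoints into a single recovery score.
      score = KernelPCA(n_components=1, kernel="rbf").fit_transform(outcomes).ravel()

      df = pd.DataFrame({"score": score, "treat": treat,
                         "cohort": rng.integers(0, 5, n)})  # source study as grouping

      # Step 2: mixed model with a random intercept per source cohort.
      fit = smf.mixedlm("score ~ treat", df, groups=df["cohort"]).fit()
      print(fit.summary())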

  19. A Data-driven Concept Schema for Defining Clinical Research Data Needs

    Science.gov (United States)

    Hruby, Gregory W.; Hoxha, Julia; Ravichandran, Praveen Chandar; Mendonça, Eneida A.; Hanauer, David A; Weng, Chunhua

    2016-01-01

    OBJECTIVES The Patient, Intervention, Control/Comparison, and Outcome (PICO) framework is an effective technique for framing a clinical question. We aim to develop the counterpart of PICO to structure clinical research data needs. METHODS We use a data-driven approach to abstracting key concepts representing clinical research data needs by adapting and extending an expert-derived framework originally developed for defining cancer research data needs. We annotated clinical trial eligibility criteria, EHR data request logs, and data queries to electronic health records (EHR), to extract and harmonize concept classes representing clinical research data needs. We evaluated the class coverage, class preservation from the original framework, schema generalizability, schema understandability, and schema structural correctness through a semi-structured interview with eight multidisciplinary domain experts. We iteratively refined the schema based on the evaluations. RESULTS Our data-driven schema preserved 68% of the 63 classes from the original framework and covered 88% (73/82) of the classes proposed by evaluators. Class coverage for participants of different backgrounds ranged from 60% to 100% with a median value of 95% agreement among the individual evaluators. The schema was found understandable and structurally sound. CONCLUSIONS Our proposed schema may serve as the counterpart to PICO for improving the research data needs communication between researchers and informaticians. PMID:27185504

  20. Data-driven motion correction in brain SPECT

    International Nuclear Information System (INIS)

    Kyme, A.Z.; Hutton, B.F.; Hatton, R.L.; Skerrett, D.W.

    2002-01-01

    Patient motion can cause image artifacts in SPECT despite restraining measures. Data-driven detection and correction of motion can be achieved by comparing the acquired data with forward-projections. By optimising the orientation of the reconstructed volume, motion parameters can be obtained for each misaligned projection and applied to update the volume using a 3D reconstruction algorithm. Digital and physical phantom validation was performed to investigate this approach. Noisy projection data simulating at least one fully 3D patient head movement during acquisition were constructed by projecting the digital Huffman brain phantom at various orientations. Motion correction was applied to the reconstructed studies. The importance of including attenuation effects in the estimation of motion and the need for implementing an iterated correction were assessed in the process. Correction success was assessed visually for artifact reduction, and quantitatively using a mean square difference (MSD) measure. Physical Huffman phantom studies with deliberate movements introduced during the acquisition were also acquired and motion corrected. Effective artifact reduction in the simulated corrupt studies was achieved by motion correction. Typically the MSD ratio between the corrected and reference studies, compared to that between the corrupted and reference studies, was > 2. Motion correction could be achieved without inclusion of attenuation effects in the motion estimation stage, providing simpler implementation and greater efficiency. Moreover, the additional improvement with multiple iterations of the approach was small. Improvement was also observed in the physical phantom data, though the technique appeared limited here by an object symmetry. Copyright (2002) The Australian and New Zealand Society of Nuclear Medicine Inc

  1. Data-Driven Cyber-Physical Systems via Real-Time Stream Analytics and Machine Learning

    OpenAIRE

    Akkaya, Ilge

    2016-01-01

    Emerging distributed cyber-physical systems (CPSs) integrate a wide range of heterogeneous components that need to be orchestrated in a dynamic environment. While model-based techniques are commonly used in CPS design, they become inadequate in capturing the complexity as systems become larger and extremely dynamic. The adaptive nature of the systems makes data-driven approaches highly desirable, if not necessary. Traditionally, data-driven systems utilize large volumes of static data sets t...

  2. Data-Driven Handover Optimization in Next Generation Mobile Communication Networks

    Directory of Open Access Journals (Sweden)

    Po-Chiang Lin

    2016-01-01

    Full Text Available Network densification is regarded as one of the important ingredients to increase capacity for next generation mobile communication networks. However, it also leads to mobility problems since users are more likely to hand over to another cell in dense or even ultradense mobile communication networks. Therefore, supporting seamless and robust connectivity through such networks becomes a very important issue. In this paper, we investigate handover (HO optimization in next generation mobile communication networks. We propose a data-driven handover optimization (DHO approach, which aims to mitigate mobility problems including too-late HO, too-early HO, HO to wrong cell, ping-pong HO, and unnecessary HO. The key performance indicator (KPI is defined as the weighted average of the ratios of these mobility problems. The DHO approach collects data from the mobile communication measurement results and provides a model to estimate the relationship between the KPI and features from the collected dataset. Based on the model, the handover parameters, including the handover margin and time-to-trigger, are optimized to minimize the KPI. Simulation results show that the proposed DHO approach could effectively mitigate mobility problems.
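
    A compact sketch of the DHO idea follows: learn a model of the KPI as a function of handover margin and time-to-trigger from (here simulated) measurement data, then search the parameter grid for the minimizer. The KPI surface and parameter grid are invented for illustration.

      import numpy as np
      from sklearn.ensemble import RandomForestRegressor

      rng = np.random.default_rng(0)

      # Hypothetical measurements: (margin [dB], TTT [ms]) -> observed KPI, the
      # weighted ratio of too-late/too-early/wrong-cell/ping-pong/unnecessary HOs.
      margin = rng.uniform(0.0, 10.0, 400)
      ttt = rng.uniform(40.0, 640.0, 400)
      kpi = 0.02 * (margin - 4.0) ** 2 + 1e-6 * (ttt - 250.0) ** 2 + rng.normal(0.0, 0.01, 400)

      model = RandomForestRegressor(n_estimators=200, random_state=0)
      model.fit(np.column_stack([margin, ttt]), kpi)

      # Exhaustive search over a 3GPP-style parameter grid.
      grid = np.array([(m, t) for m in np.linspace(0.0, 10.0, 41)
                              for t in (40, 64, 80, 100, 128, 160, 256, 320, 480, 640)])
      best = grid[np.argmin(model.predict(grid))]
      print("recommended margin/TTT:", best)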

  3. EEG-based functional networks evoked by acupuncture at ST 36: A data-driven thresholding study

    Science.gov (United States)

    Li, Huiyan; Wang, Jiang; Yi, Guosheng; Deng, Bin; Zhou, Hexi

    2017-10-01

    This paper investigates how acupuncture at ST 36 modulates the brain functional network. 20-channel EEG signals from 15 healthy subjects are recorded before, during, and after acupuncture. The correlation between two EEG channels is calculated using Pearson’s coefficient. A data-driven approach, based on the connected set, connected edges, and overall network connectivity, is applied to determine the threshold. Based on this thresholding approach, the functional network in each acupuncture period is built with graph theory, and the associated functional connectivity is determined. We show that acupuncture at ST 36 increases the connectivity of the EEG-based functional network, especially for long-distance connections between the two hemispheres. The properties of the functional network in five EEG sub-bands are also characterized. It is found that the delta and gamma bands are affected more obviously by acupuncture than the other sub-bands. These findings highlight the modulatory effects of acupuncture on EEG-based functional connectivity, which is helpful for understanding how acupuncture participates in cortical or subcortical activities. Further, the data-driven threshold provides an alternative approach to infer functional connectivity under other physiological conditions.
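
    One plausible reading of the connectivity-based threshold selection is sketched below on toy data: lower the threshold on the absolute Pearson correlation matrix until the binarized 20-channel graph first forms a single connected component. The EEG array and step size are placeholders.

      import networkx as nx
      import numpy as np

      rng = np.random.default_rng(0)
      eeg = rng.normal(size=(20, 5000))   # 20 channels x samples (toy stand-in)
      R = np.abs(np.corrcoef(eeg))        # channel-by-channel Pearson correlations

      def data_driven_threshold(R, step=0.001):
          thr = R[np.triu_indices_from(R, k=1)].max()
          while thr > 0.0:
              G = nx.from_numpy_array((R >= thr) & ~np.eye(len(R), dtype=bool))
              if nx.is_connected(G):      # all channels joined in one component
                  return thr, G
              thr -= step
          raise ValueError("no connecting threshold found")

      thr, G = data_driven_threshold(R)
      print(f"threshold={thr:.3f}, edges={G.number_of_edges()}")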

  4. IT-Driven Approaches to Fraud Detection and Control in Financial ...

    African Journals Online (AJOL)

    ... case-based reasoning, genetic algorithms and fuzzy logic. Its derivable benefits were examined. It has been concluded that there is a need for an IT-driven approach to fraud detection and control as a workable alternative to curb the increasing number and sophistication of fraudsters. KEYWORD: Data Mining, Artificial Neural Network, ...

  5. Advancing data reuse in phyloinformatics using an ontology-driven Semantic Web approach.

    Science.gov (United States)

    Panahiazar, Maryam; Sheth, Amit P; Ranabahu, Ajith; Vos, Rutger A; Leebens-Mack, Jim

    2013-01-01

    Phylogenetic analyses can resolve historical relationships among genes, organisms or higher taxa. Understanding such relationships can elucidate a wide range of biological phenomena, including, for example, the importance of gene and genome duplications in the evolution of gene function, the role of adaptation as a driver of diversification, or the evolutionary consequences of biogeographic shifts. Phyloinformaticists are developing data standards, databases and communication protocols (e.g. Application Programming Interfaces, APIs) to extend the accessibility of gene trees, species trees, and the metadata necessary to interpret these trees, thus enabling researchers across the life sciences to reuse phylogenetic knowledge. Specifically, Semantic Web technologies are being developed to make phylogenetic knowledge interpretable by web agents, thereby enabling intelligently automated, high-throughput reuse of results generated by phylogenetic research. This manuscript describes an ontology-driven, semantic problem-solving environment for phylogenetic analyses and introduces artefacts that can support phyloinformatic efforts to make trees and their underlying metadata more accessible. PhylOnt is an extensible ontology with concepts describing tree types and tree-building methodologies, including estimation methods, models and programs. In addition, we present the PhylAnt platform for annotating scientific articles and NeXML files with PhylOnt concepts. The novelty of this work is the annotation of NeXML files and phylogenetics-related documents with the PhylOnt ontology. This approach advances data reuse in phyloinformatics.

  6. Data-driven Discovery: A New Era of Exploiting the Literature and Data

    Directory of Open Access Journals (Sweden)

    Ying Ding

    2016-11-01

    Full Text Available In the current data-intensive era, the traditional hands-on method of conducting scientific research, exploring related publications to generate a testable hypothesis, is well on its way to becoming obsolete within just a year or two. Analyzing the literature and data to automatically generate a hypothesis might become the de facto approach to inform the core research efforts of those trying to master the exponentially rapid expansion of publications and datasets. Here, viewpoints are provided and discussed to aid understanding of the challenges of data-driven discovery.

  7. Knowledge-Driven Versus Data-Driven Logics

    Czech Academy of Sciences Publication Activity Database

    Dubois, D.; Hájek, Petr; Prade, H.

    2000-01-01

    Roč. 9, č. 1 (2000), s. 65-89 ISSN 0925-8531 R&D Projects: GA AV ČR IAA1030601 Grant - others:CNRS(FR) 4008 Institutional research plan: AV0Z1030915 Keywords : epistemic logic * possibility theory * data-driven reasoning * deontic logic Subject RIV: BA - General Mathematics

  8. Data-driven integration of genome-scale regulatory and metabolic network models

    Science.gov (United States)

    Imam, Saheed; Schäuble, Sascha; Brooks, Aaron N.; Baliga, Nitin S.; Price, Nathan D.

    2015-01-01

    Microbes are diverse and extremely versatile organisms that play vital roles in all ecological niches. Understanding and harnessing microbial systems will be key to the sustainability of our planet. One approach to improving our knowledge of microbial processes is through data-driven and mechanism-informed computational modeling. Individual models of biological networks (such as metabolism, transcription, and signaling) have played pivotal roles in driving microbial research through the years. These networks, however, are highly interconnected and function in concert—a fact that has led to the development of a variety of approaches aimed at simulating the integrated functions of two or more network types. Though the task of integrating these different models is fraught with new challenges, the large amounts of high-throughput data sets being generated, and the algorithms being developed, mean that the time is at hand for concerted efforts to build integrated regulatory-metabolic networks in a data-driven fashion. In this perspective, we review current approaches for constructing integrated regulatory-metabolic models and outline new strategies for future development of these network models for any microbial system. PMID:25999934

  9. The Stanford Data Miner: a novel approach for integrating and exploring heterogeneous immunological data.

    Science.gov (United States)

    Siebert, Janet C; Munsil, Wes; Rosenberg-Hasson, Yael; Davis, Mark M; Maecker, Holden T

    2012-03-28

    Systems-level approaches are increasingly common in both murine and human translational studies. These approaches employ multiple high information content assays. As a result, there is a need for tools to integrate heterogeneous types of laboratory and clinical/demographic data, and to allow the exploration of that data by aggregating and/or segregating results based on particular variables (e.g., mean cytokine levels by age and gender). Here we describe the application of standard data warehousing tools to create a novel environment for user-driven upload, integration, and exploration of heterogeneous data. The system presented here currently supports flow cytometry and immunoassays performed in the Stanford Human Immune Monitoring Center, but could be applied more generally. Users upload assay results contained in platform-specific spreadsheets of a defined format, and clinical and demographic data in spreadsheets of flexible format. Users then map sample IDs to connect the assay results with the metadata. An OLAP (on-line analytical processing) data exploration interface allows filtering and display of various dimensions (e.g., Luminex analytes in rows, treatment group in columns, filtered on a particular study). Statistics such as mean, median, and N can be displayed. The views can be expanded or contracted to aggregate or segregate data at various levels. Individual-level data is accessible with a single click. The result is a user-driven system that permits data integration and exploration in a variety of settings. We show how the system can be used to find gender-specific differences in serum cytokine levels, and compare them across experiments and assay types. We have used the tools and techniques of data warehousing, including open-source business intelligence software, to support investigator-driven data integration and mining of diverse immunological data.
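
    The OLAP-style exploration described here (aggregate or segregate by chosen dimensions, with statistics such as mean and N) maps naturally onto a pivot table; the pandas sketch below is an illustrative stand-in, not the Data Miner itself, and all column names are invented.

      import numpy as np
      import pandas as pd

      rng = np.random.default_rng(0)
      df = pd.DataFrame({
          "subject": range(200),
          "gender": rng.choice(["F", "M"], 200),
          "age_group": rng.choice(["<40", "40-60", ">60"], 200),
          "analyte": rng.choice(["IL6", "TNFa"], 200),
          "level": rng.lognormal(0.0, 1.0, 200),
      })

      # Rows = analyte, columns = (gender, age group); adding or removing keys
      # here corresponds to expanding or contracting the OLAP views described.
      cube = pd.pivot_table(df, values="level", index="analyte",
                            columns=["gender", "age_group"],
                            aggfunc=["mean", "count"])
      print(cube.round(2))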

  10. The Stanford Data Miner: a novel approach for integrating and exploring heterogeneous immunological data

    Directory of Open Access Journals (Sweden)

    Siebert Janet C

    2012-03-01

    Full Text Available Abstract Background Systems-level approaches are increasingly common in both murine and human translational studies. These approaches employ multiple high information content assays. As a result, there is a need for tools to integrate heterogeneous types of laboratory and clinical/demographic data, and to allow the exploration of that data by aggregating and/or segregating results based on particular variables (e.g., mean cytokine levels by age and gender). Methods Here we describe the application of standard data warehousing tools to create a novel environment for user-driven upload, integration, and exploration of heterogeneous data. The system presented here currently supports flow cytometry and immunoassays performed in the Stanford Human Immune Monitoring Center, but could be applied more generally. Results Users upload assay results contained in platform-specific spreadsheets of a defined format, and clinical and demographic data in spreadsheets of flexible format. Users then map sample IDs to connect the assay results with the metadata. An OLAP (on-line analytical processing) data exploration interface allows filtering and display of various dimensions (e.g., Luminex analytes in rows, treatment group in columns, filtered on a particular study). Statistics such as mean, median, and N can be displayed. The views can be expanded or contracted to aggregate or segregate data at various levels. Individual-level data is accessible with a single click. The result is a user-driven system that permits data integration and exploration in a variety of settings. We show how the system can be used to find gender-specific differences in serum cytokine levels, and compare them across experiments and assay types. Conclusions We have used the tools and techniques of data warehousing, including open-source business intelligence software, to support investigator-driven data integration and mining of diverse immunological data.

  11. Data-driven architectural production and operation

    NARCIS (Netherlands)

    Bier, H.H.; Mostafavi, S.

    2014-01-01

    Data-driven architectural production and operation as explored within Hyperbody rely heavily on system thinking implying that all parts of a system are to be understood in relation to each other. These relations are increasingly established bi-directionally so that data-driven architecture is not

  12. Data-driven Development of ROTEM and TEG Algorithms for the Management of Trauma Hemorrhage

    DEFF Research Database (Denmark)

    Baksaas-Aasen, Kjersti; Van Dieren, Susan; Balvers, Kirsten

    2018-01-01

    for ROTEM, TEG, and CCTs to be used in addition to ratio driven transfusion and tranexamic acid. CONCLUSIONS: We describe a systematic approach to define threshold parameters for ROTEM and TEG. These parameters were incorporated into algorithms to support data-driven adjustments of resuscitation...

  13. Data driven innovations in structural health monitoring

    Science.gov (United States)

    Rosales, M. J.; Liyanapathirana, R.

    2017-05-01

    At present, substantial investments are being allocated to civil infrastructures, which are considered valuable assets at a national or global scale. Structural Health Monitoring (SHM) is an indispensable tool required to ensure the performance and safety of these structures based on measured response parameters. Research to date on damage assessment has tended to focus on the utilization of wireless sensor networks (WSN), as they prove to be the best alternative to traditional visual inspections and tethered or wired counterparts. Over the last decade, the structural health and behaviour of numerous infrastructures have been measured and evaluated owing to several successful ventures in implementing these sensor networks. Various monitoring systems have the capability to rapidly transmit, measure, and store large volumes of data. The amount of data collected from these networks has eventually become unmanageable, which paved the way to other relevant issues such as data quality, relevance, re-use, and decision support. There is an increasing need to integrate new technologies in order to automate the evaluation processes as well as to enhance the objectivity of data assessment routines. This paper aims to identify feasible methodologies for the application of time-series analysis techniques to judiciously exploit the vast amount of readily available as well as upcoming data resources. It continues the momentum of a greater effort to collect and archive SHM approaches that will serve as data-driven innovations for the assessment of damage through efficient algorithms and data analytics.

  14. The Role of Guided Induction in Paper-Based Data-Driven Learning

    Science.gov (United States)

    Smart, Jonathan

    2014-01-01

    This study examines the role of guided induction as an instructional approach in paper-based data-driven learning (DDL) in the context of an ESL grammar course during an intensive English program at an American public university. Specifically, it examines whether corpus-informed grammar instruction is more effective through inductive, data-driven…

  15. LHC-GCS a model-driven approach for automatic PLC and SCADA code generation

    CERN Document Server

    Thomas, Geraldine; Barillère, Renaud; Cabaret, Sebastien; Kulman, Nikolay; Pons, Xavier; Rochez, Jacques

    2005-01-01

    The LHC experiments’ Gas Control System (LHC GCS) project [1] aims to provide the four LHC experiments (ALICE, ATLAS, CMS and LHCb) with control for their 23 gas systems. To ease the production and maintenance of 23 control systems, a model-driven approach has been adopted to generate automatically the code for the Programmable Logic Controllers (PLCs) and for the Supervision Control And Data Acquisition (SCADA) systems. The first milestones of the project have been achieved. The LHC GCS framework [4] and the generation tools have been produced. A first control application has actually been generated and is in production, and a second is in preparation. This paper describes the principle and the architecture of the model-driven solution. It will in particular detail how the model-driven solution fits with the LHC GCS framework and with the UNICOS [5] data-driven tools.

  16. Retrospective data-driven respiratory gating for PET/CT

    International Nuclear Information System (INIS)

    Schleyer, Paul J; O'Doherty, Michael J; Barrington, Sally F; Marsden, Paul K

    2009-01-01

    Respiratory motion can adversely affect both PET and CT acquisitions. Respiratory gating allows an acquisition to be divided into a series of motion-reduced bins according to the respiratory signal, which is typically hardware acquired. In order that the effects of motion can potentially be corrected for, we have developed a novel, automatic, data-driven gating method which retrospectively derives the respiratory signal from the acquired PET and CT data. PET data are acquired in listmode and analysed in sinogram space, and CT data are acquired in cine mode and analysed in image space. Spectral analysis is used to identify regions within the CT and PET data which are subject to respiratory motion, and the variation of counts within these regions is used to estimate the respiratory signal. Amplitude binning is then used to create motion-reduced PET and CT frames. The method was demonstrated with four patient datasets acquired on a 4-slice PET/CT system. To assess the accuracy of the data-derived respiratory signal, a hardware-based signal was acquired for comparison. Data-driven gating was successfully performed on PET and CT datasets for all four patients. Gated images demonstrated respiratory motion throughout the bin sequences for all PET and CT series, and image analysis and direct comparison of the traces derived from the data-driven method with the hardware-acquired traces indicated accurate recovery of the respiratory signal.
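
    The spectral-analysis step can be caricatured as follows: find image regions whose temporal power concentrates in the respiratory band, average them into a surrogate respiratory trace, and amplitude-bin the frames. Everything below (frame rate, band limits, the 2x power heuristic) is an assumption on toy data, not the authors' implementation.

      import numpy as np

      rng = np.random.default_rng(0)
      fs = 2.0                                      # frames per second (assumed)
      t = np.arange(200) / fs
      resp = np.sin(2 * np.pi * 0.25 * t)           # 15 breaths/min ground truth
      frames = rng.normal(0.0, 0.5, (200, 16, 16))
      frames[:, 6:10, 6:10] += 1.5 * resp[:, None, None]  # region modulated by breathing

      # Spectral mask: voxels whose power in 0.1-0.5 Hz dominates the rest.
      spec = np.abs(np.fft.rfft(frames, axis=0)) ** 2
      freqs = np.fft.rfftfreq(len(t), d=1.0 / fs)
      band = (freqs > 0.1) & (freqs < 0.5)
      mask = spec[band].sum(0) > 2 * spec[~band][1:].sum(0)  # [1:] drops the DC term

      signal = frames[:, mask].mean(axis=1)         # surrogate respiratory signal
      bins = np.digitize(signal, np.quantile(signal, [0.25, 0.5, 0.75]))
      print("frames per amplitude bin:", np.bincount(bins))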

  17. A Consumer-Driven Approach To Increase Suggestive Selling.

    Science.gov (United States)

    Rohn, Don; Austin, John; Sanford, Alison

    2003-01-01

    Discussion of the effectiveness of behavioral interventions in improving suggestive selling behavior of sales staff focuses on a study that examined the efficacy of a consumer-driven approach to improve suggestive selling behavior of three employees of a fast food franchise. Reports that consumer-driven intervention increased suggestive selling…

  18. Personalized Mortality Prediction Driven by Electronic Medical Data and a Patient Similarity Metric

    Science.gov (United States)

    Lee, Joon; Maslove, David M.; Dubin, Joel A.

    2015-01-01

    Background Clinical outcome prediction normally employs static, one-size-fits-all models that perform well for the average patient but are sub-optimal for individual patients with unique characteristics. In the era of digital healthcare, it is feasible to dynamically personalize decision support by identifying and analyzing similar past patients, in a way that is analogous to personalized product recommendation in e-commerce. Our objectives were: 1) to prove that analyzing only similar patients leads to better outcome prediction performance than analyzing all available patients, and 2) to characterize the trade-off between training data size and the degree of similarity between the training data and the index patient for whom prediction is to be made. Methods and Findings We deployed a cosine-similarity-based patient similarity metric (PSM) to an intensive care unit (ICU) database to identify patients that are most similar to each patient and subsequently to custom-build 30-day mortality prediction models. Rich clinical and administrative data from the first day in the ICU from 17,152 adult ICU admissions were analyzed. The results confirmed that using data from only a small subset of most similar patients for training improves predictive performance in comparison with using data from all available patients. The results also showed that when too few similar patients are used for training, predictive performance degrades due to the effects of small sample sizes. Our PSM-based approach outperformed well-known ICU severity of illness scores. Although the improved prediction performance is achieved at the cost of increased computational burden, Big Data technologies can help realize personalized data-driven decision support at the point of care. Conclusions The present study provides crucial empirical evidence for the promising potential of personalized data-driven decision support systems. With the increasing adoption of electronic medical record (EMR) systems, our

  19. Towards Data-Driven Simulations of Wildfire Spread using Ensemble-based Data Assimilation

    Science.gov (United States)

    Rochoux, M. C.; Bart, J.; Ricci, S. M.; Cuenot, B.; Trouvé, A.; Duchaine, F.; Morel, T.

    2012-12-01

    Real-time prediction of a propagating wildfire remains a challenging task because the problem involves both multiple physics and multiple scales. The propagation speed of wildfires, also called the rate of spread (ROS), is determined by complex interactions between pyrolysis, combustion and flow dynamics, and by atmospheric dynamics occurring at vegetation, topographical and meteorological scales. Current operational fire spread models are mainly based on a semi-empirical parameterization of the ROS in terms of vegetation, topographical and meteorological properties. For a fire spread simulation to be predictive and compatible with operational applications, the uncertainty on the ROS model should be reduced. As recent progress in remote sensing technology provides new ways to monitor the fire front position, a promising approach to overcome the difficulties found in wildfire spread simulations is to integrate fire modeling and fire sensing technologies using data assimilation (DA). For this purpose we have developed a prototype data-driven wildfire spread simulator that provides optimal estimates of poorly known model parameters [*]. The data-driven simulation capability is adapted for more realistic wildfire spread: it considers a regional-scale fire spread model that is informed by observations of the fire front location. An Ensemble Kalman Filter (EnKF) algorithm based on a parallel computing platform (OpenPALM) was implemented to perform a multi-parameter sequential estimation in which wind magnitude and direction are estimated in addition to vegetation properties. The EnKF algorithm shows a good ability to track a small-scale grassland fire experiment and properly accounts for the sensitivity of the simulation outcomes to the control parameters. In conclusion, data assimilation is a promising approach to more accurately forecast time-varying wildfire spread conditions as new airborne-like observations of
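
    For readers unfamiliar with the EnKF, the analysis step at the heart of such a data-driven simulator can be reduced to a few lines; the toy spread model (front position = ROS x time), the observation, and its error below are illustrative assumptions, not the authors' regional-scale setup.

      import numpy as np

      rng = np.random.default_rng(0)
      N = 50
      ros = rng.normal(0.5, 0.2, N)        # prior ensemble of rate of spread (m/s)

      def forecast_front(ros, dt=600.0):
          return ros * dt                  # toy spread model: position = ROS * t

      obs, obs_err = 270.0, 15.0           # observed fire front position (m)
      x_f = forecast_front(ros)            # forecasted observations

      # Kalman gain from ensemble statistics, then perturbed-observation update.
      K = np.cov(ros, x_f)[0, 1] / (np.var(x_f) + obs_err ** 2)
      ros_a = ros + K * (obs + rng.normal(0.0, obs_err, N) - x_f)
      print(f"prior ROS mean {ros.mean():.3f} -> posterior {ros_a.mean():.3f}")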

  20. Pipe break prediction based on evolutionary data-driven methods with brief recorded data

    International Nuclear Information System (INIS)

    Xu Qiang; Chen Qiuwen; Li Weifeng; Ma Jinfeng

    2011-01-01

    Pipe breaks often occur in water distribution networks, imposing great pressure on utility managers to secure a stable water supply. However, pipe breaks are hard to detect with conventional methods. It is therefore necessary to develop reliable and robust pipe break models to assess a pipe's probability of failure and then to optimize the pipe break detection scheme. In the absence of deterministic physical models for pipe breaks, data-driven techniques provide a promising approach to investigate the principles underlying pipe break. In this paper, two data-driven techniques, namely Genetic Programming (GP) and Evolutionary Polynomial Regression (EPR), are applied to develop pipe break models for the water distribution system of Beijing City. Comparison with the recorded pipe break data from 1987 to 2005 showed that the models are capable of producing reliable predictions. The models can be used to prioritize pipes for break inspection and thus improve detection efficiency.

  1. Data Driven Fault Tolerant Control : A Subspace Approach

    NARCIS (Netherlands)

    Dong, J.

    2009-01-01

    The main stream research on fault detection and fault tolerant control has been focused on model based methods. As far as a model is concerned, changes therein due to faults have to be extracted from measured data. Generally speaking, existing approaches process measured inputs and outputs either by

  2. Data-driven asthma endotypes defined from blood biomarker and gene expression data.

    Directory of Open Access Journals (Sweden)

    Barbara Jane George

    Full Text Available The diagnosis and treatment of childhood asthma is complicated by its mechanistically distinct subtypes (endotypes) driven by genetic susceptibility and modulating environmental factors. Clinical biomarkers and blood gene expression were collected from a stratified, cross-sectional study of asthmatic and non-asthmatic children from Detroit, MI. This study describes four distinct asthma endotypes identified via a purely data-driven method. Our method was specifically designed to integrate blood gene expression and clinical biomarkers in a way that provides new mechanistic insights regarding the different asthma endotypes. For example, we describe metabolic syndrome-induced systemic inflammation as an associated factor in three of the four asthma endotypes. Context provided by the clinical biomarker data was essential in interpreting gene expression patterns and identifying putative endotypes, which emphasizes the importance of integrated approaches when studying complex disease etiologies. These synthesized patterns of gene expression and clinical markers from our research may lead to development of novel serum-based biomarker panels.
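
    The record does not specify its algorithm beyond "purely data-driven"; as a generic stand-in, the sketch below clusters standardized gene expression combined with clinical biomarkers into four groups with k-means. All dimensions, and the choice of k-means itself, are assumptions.

      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(0)
      expression = rng.normal(size=(120, 50))  # toy blood gene-expression features
      biomarkers = rng.normal(size=(120, 8))   # toy clinical biomarkers

      X = StandardScaler().fit_transform(np.hstack([expression, biomarkers]))
      endotype = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
      print("children per putative endotype:", np.bincount(endotype))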

  3. Data-driven integration of genome-scale regulatory and metabolic network models

    Directory of Open Access Journals (Sweden)

    Saheed eImam

    2015-05-01

    Full Text Available Microbes are diverse and extremely versatile organisms that play vital roles in all ecological niches. Understanding and harnessing microbial systems will be key to the sustainability of our planet. One approach to improving our knowledge of microbial processes is through data-driven and mechanism-informed computational modeling. Individual models of biological networks (such as metabolism, transcription and signaling) have played pivotal roles in driving microbial research through the years. These networks, however, are highly interconnected and function in concert – a fact that has led to the development of a variety of approaches aimed at simulating the integrated functions of two or more network types. Though the task of integrating these different models is fraught with new challenges, the large amounts of high-throughput data sets being generated, and the algorithms being developed, mean that the time is at hand for concerted efforts to build integrated regulatory-metabolic networks in a data-driven fashion. In this perspective, we review current approaches for constructing integrated regulatory-metabolic models and outline new strategies for future development of these network models for any microbial system.

  4. A data-driven modeling approach to identify disease-specific multi-organ networks driving physiological dysregulation.

    Directory of Open Access Journals (Sweden)

    Warren D Anderson

    2017-07-01

    Full Text Available Multiple physiological systems interact throughout the development of a complex disease. Knowledge of the dynamics and connectivity of interactions across physiological systems could facilitate the prevention or mitigation of organ damage underlying complex diseases, many of which are currently refractory to available therapeutics (e.g., hypertension). We studied the regulatory interactions operating within and across organs throughout disease development by integrating in vivo analysis of gene expression dynamics with a reverse engineering approach to infer data-driven dynamic network models of multi-organ gene regulatory influences. We obtained experimental data on the expression of 22 genes across five organs, over a time span that encompassed the development of autonomic nervous system dysfunction and hypertension. We pursued a unique approach for identification of continuous-time models that jointly described the dynamics and structure of multi-organ networks by estimating a sparse subset of ∼12,000 possible gene regulatory interactions. Our analyses revealed that an autonomic dysfunction-specific multi-organ sequence of gene expression activation patterns was associated with a distinct gene regulatory network. We analyzed the model structures for adaptation motifs, and identified disease-specific network motifs involving genes that exhibited aberrant temporal dynamics. Bioinformatic analyses identified disease-specific single nucleotide variants within or near transcription factor binding sites upstream of key genes implicated in maintaining physiological homeostasis. Our approach illustrates a novel framework for investigating disease pathogenesis through model-based analysis of multi-organ system dynamics and network properties. Our results yielded novel candidate molecular targets driving the development of cardiovascular disease, metabolic syndrome, and immune dysfunction.
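
    One simple way to appreciate the sparse continuous-time identification problem is the regression sketch below: approximate dx/dt = A x from time-series expression and recover a sparse interaction matrix with the Lasso. The dimensions echo the study (22 genes), but the dynamics, noise, and penalty are invented.

      import numpy as np
      from sklearn.linear_model import Lasso

      rng = np.random.default_rng(0)
      G, T, dt = 22, 40, 1.0                   # genes, time points, step size

      A_true = np.zeros((G, G))
      links = rng.integers(0, G, (30, 2))      # 30 "true" regulatory links
      A_true[links[:, 0], links[:, 1]] = rng.normal(0.0, 0.2, 30)
      np.fill_diagonal(A_true, -0.5)           # self-degradation keeps dynamics stable

      x = np.zeros((T, G))
      x[0] = rng.normal(size=G)
      for k in range(T - 1):                   # Euler-integrate the "true" system
          x[k + 1] = x[k] + dt * (A_true @ x[k]) + rng.normal(0.0, 0.01, G)

      dxdt = np.gradient(x, dt, axis=0)        # finite-difference derivatives
      A_hat = np.vstack([Lasso(alpha=0.01).fit(x, dxdt[:, g]).coef_ for g in range(G)])
      print("nonzero inferred interactions:", np.count_nonzero(A_hat))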

  5. NEBULAS A High Performance Data-Driven Event-Building Architecture based on an Asynchronous Self-Routing Packet-Switching Network

    CERN Multimedia

    Costa, M; Letheren, M; Djidi, K; Gustafsson, L; Lazraq, T; Minerskjold, M; Tenhunen, H; Manabe, A; Nomachi, M; Watase, Y

    2002-01-01

    RD31 : The project is evaluating a new approach to event building for level-two and level-three processor farms at high rate experiments. It is based on the use of commercial switching fabrics to replace the traditional bus-based architectures used in most previous data acquisition systems. Switching fabrics permit the construction of parallel, expandable, hardware-driven event builders that can deliver higher aggregate throughput than the bus-based architectures. A standard industrial switching fabric technology is being evaluated. It is based on Asynchronous Transfer Mode (ATM) packet-switching network technology. Commercial, expandable ATM switching fabrics and processor interfaces, now being developed for the future Broadband ISDN infrastructure, could form the basis of an implementation. The goals of the project are to demonstrate the viability of this approach, to evaluate the trade-offs involved in make versus buy options, to study the interfacing of the physics frontend data buffers to such a fabric, a...

  6. KNMI DataLab experiences in serving data-driven innovations

    Science.gov (United States)

    Noteboom, Jan Willem; Sluiter, Raymond

    2016-04-01

    Climate change research and innovations in weather forecasting rely more and more on (Big) data. Besides increasing data from traditional sources (such as observation networks, radars and satellites), the use of open data, crowd-sourced data and the Internet of Things (IoT) is emerging. To deploy these sources of data optimally in our services and products, KNMI has established a DataLab to serve data-driven innovations in collaboration with public and private sector partners. Big data management, data integration, data analytics including machine learning, and data visualization techniques play an important role in the DataLab. Cross-domain data-driven innovations that arise from public-private collaborative projects and research programmes can be explored, experimented with and/or piloted by the KNMI DataLab. Furthermore, advice can be requested on (Big) data techniques and data sources. In support of collaborative (Big) data science activities, scalable environments are offered with facilities for data integration, data analysis and visualization. In addition, data science expertise is provided directly or from a pool of internal and external experts. At the EGU conference, we present experiences gained and best practices in operating the KNMI DataLab to optimally serve data-driven innovations for weather and climate applications.

  7. Network Model-Assisted Inference from Respondent-Driven Sampling Data.

    Science.gov (United States)

    Gile, Krista J; Handcock, Mark S

    2015-06-01

    Respondent-Driven Sampling is a widely-used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population.

  8. A copula-based sampling method for data-driven prognostics

    International Nuclear Information System (INIS)

    Xi, Zhimin; Jing, Rong; Wang, Pingfeng; Hu, Chao

    2014-01-01

    This paper develops a Copula-based sampling method for data-driven prognostics. The method essentially consists of an offline training process and an online prediction process: (i) the offline training process builds a statistical relationship between the failure time and the time realizations at specified degradation levels on the basis of off-line training data sets; and (ii) the online prediction process identifies probable failure times for online testing units based on the statistical model constructed in the offline process and the online testing data. Our contributions in this paper are three-fold, namely the definition of a generic health index system to quantify the health degradation of an engineering system, the construction of a Copula-based statistical model to learn the statistical relationship between the failure time and the time realizations at specified degradation levels, and the development of a simulation-based approach for the prediction of remaining useful life (RUL). Two engineering case studies, namely the electric cooling fan health prognostics and the 2008 IEEE PHM challenge problem, are employed to demonstrate the effectiveness of the proposed methodology. - Highlights: • We develop a novel mechanism for data-driven prognostics. • A generic health index system quantifies health degradation of engineering systems. • Off-line training model is constructed based on the Bayesian Copula model. • Remaining useful life is predicted from a simulation-based approach
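
    A stripped-down Gaussian-copula version of the offline/online split can be written in a few lines: offline, link the time to reach a degradation level with the failure time on the normal-score scale; online, sample failure times conditioned on a test unit's observed time. The marginals, the degradation level, and the observed time below are illustrative assumptions.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)

      # Offline training data: time to reach the degradation level vs. failure time.
      t_deg = rng.gamma(9.0, 10.0, 300)
      t_fail = t_deg * 1.8 + rng.normal(0.0, 8.0, 300)

      def to_z(v):
          # Map a margin to standard normal scores via its empirical CDF.
          return stats.norm.ppf((stats.rankdata(v) - 0.5) / len(v))

      rho = np.corrcoef(to_z(t_deg), to_z(t_fail))[0, 1]  # copula correlation

      # Online: a unit hit the degradation level at t = 110; sample failure times.
      z_d = stats.norm.ppf((t_deg < 110.0).mean())
      z_f = rho * z_d + np.sqrt(1.0 - rho ** 2) * rng.normal(size=5000)
      t_f = np.quantile(t_fail, stats.norm.cdf(z_f))      # back to the time scale
      print(f"median failure time {np.median(t_f):.1f}, median RUL {np.median(t_f) - 110.0:.1f}")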

  9. A data-driven predictive approach for drug delivery using machine learning techniques.

    Directory of Open Access Journals (Sweden)

    Yuanyuan Li

    Full Text Available In drug delivery, there is often a trade-off between effective killing of the pathogen, and harmful side effects associated with the treatment. Due to the difficulty in testing every dosing scenario experimentally, a computational approach will be helpful to assist with the prediction of effective drug delivery methods. In this paper, we have developed a data-driven predictive system, using machine learning techniques, to determine, in silico, the effectiveness of drug dosing. The system framework is scalable, autonomous, robust, and has the ability to predict the effectiveness of the current drug treatment and the subsequent drug-pathogen dynamics. The system consists of a dynamic model incorporating both the drug concentration and pathogen population into distinct states. These states are then analyzed using a temporal model to describe the drug-cell interactions over time. The dynamic drug-cell interactions are learned in an adaptive fashion and used to make sequential predictions on the effectiveness of the dosing strategy. Incorporated into the system is the ability to adjust the sensitivity and specificity of the learned models based on a threshold level determined by the operator for the specific application. As a proof-of-concept, the system was validated experimentally using the pathogen Giardia lamblia and the drug metronidazole in vitro.

  10. A data-driven weighting scheme for multivariate phenotypic endpoints recapitulates zebrafish developmental cascades

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Guozhu, E-mail: gzhang6@ncsu.edu [Bioinformatics Research Center, North Carolina State University, Raleigh, NC (United States); Roell, Kyle R., E-mail: krroell@ncsu.edu [Bioinformatics Research Center, North Carolina State University, Raleigh, NC (United States); Truong, Lisa, E-mail: lisa.truong@oregonstate.edu [Department of Environmental and Molecular Toxicology, Sinnhuber Aquatic Research Laboratory, Oregon State University, Corvallis, OR (United States); Tanguay, Robert L., E-mail: robert.tanguay@oregonstate.edu [Department of Environmental and Molecular Toxicology, Sinnhuber Aquatic Research Laboratory, Oregon State University, Corvallis, OR (United States); Reif, David M., E-mail: dmreif@ncsu.edu [Bioinformatics Research Center, North Carolina State University, Raleigh, NC (United States); Department of Biological Sciences, Center for Human Health and the Environment, North Carolina State University, Raleigh, NC (United States)

    2017-01-01

    Zebrafish have become a key alternative model for studying health effects of environmental stressors, partly due to their genetic similarity to humans, fast generation time, and the efficiency of generating high-dimensional systematic data. Studies aiming to characterize adverse health effects in zebrafish typically include several phenotypic measurements (endpoints). While there is a solid biomedical basis for capturing a comprehensive set of endpoints, making summary judgments regarding health effects requires thoughtful integration across endpoints. Here, we introduce a Bayesian method to quantify the informativeness of 17 distinct zebrafish endpoints as a data-driven weighting scheme for a multi-endpoint summary measure, called weighted Aggregate Entropy (wAggE). We implement wAggE using high-throughput screening (HTS) data from zebrafish exposed to five concentrations of all 1060 ToxCast chemicals. Our results show that our empirical weighting scheme provides better performance in terms of the Receiver Operating Characteristic (ROC) curve for identifying significant morphological effects and improves robustness over traditional curve-fitting approaches. From a biological perspective, our results suggest that developmental cascade effects triggered by chemical exposure can be recapitulated by analyzing the relationships among endpoints. Thus, wAggE offers a powerful approach for analysis of multivariate phenotypes that can reveal underlying etiological processes. - Highlights: • Introduced a data-driven weighting scheme for multiple phenotypic endpoints. • Weighted Aggregate Entropy (wAggE) implies differential importance of endpoints. • Endpoint relationships reveal developmental cascade effects triggered by exposure. • wAggE is generalizable to multi-endpoint data of different shapes and scales.
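
    The published wAggE is Bayesian; the loose, non-Bayesian sketch below conveys only the flavor of entropy-based endpoint weighting: endpoints with more decisive (lower-entropy) responses across chemicals receive larger weights in the aggregate score. All numbers are simulated.

      import numpy as np

      rng = np.random.default_rng(0)
      n_chem, n_end = 1060, 17
      p = rng.beta(0.5, 5.0, (n_chem, n_end))  # toy per-chemical endpoint hit rates

      def entropy(q):
          q = np.clip(q, 1e-12, 1 - 1e-12)
          return -(q * np.log2(q) + (1 - q) * np.log2(1 - q))

      w = 1.0 - entropy(p).mean(axis=0)        # informativeness weight per endpoint
      w /= w.sum()
      agg = (entropy(p) * w).sum(axis=1)       # one aggregate value per chemical
      print("most affected chemical index:", np.argmax(agg))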

  11. A data-driven framework for investigating customer retention

    OpenAIRE

    Mgbemena, Chidozie Simon

    2016-01-01

    This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. This study presents a data-driven simulation framework in order to understand customer behaviour and therefore improve customer retention. The overarching system design methodology used for this study is aligned with the design science paradigm. The Social Media Domain Analysis (SoMeDoA) approach is adopted and evaluated to build a model on the determinants of customer satisfaction ...

  12. End-to-end System Performance Simulation: A Data-Centric Approach

    Science.gov (United States)

    Guillaume, Arnaud; Laffitte de Petit, Jean-Luc; Auberger, Xavier

    2013-08-01

    In the early days of the space industry, the feasibility of Earth observation missions was driven directly by what the satellite could achieve. It was clear to everyone that the ground segment would be able to deal with the small amount of data sent by the payload. Over the years, the amount of data processed by spacecraft has increased drastically, placing more and more constraints on ground segment performance - in particular on timeliness. Nowadays, many space systems require high data throughputs and short response times, with information coming from multiple sources and involving complex algorithms. It has become necessary to perform thorough end-to-end analyses of the full system in order to optimise its cost and efficiency, and sometimes even to assess the feasibility of the mission. This paper presents a novel framework developed by Astrium Satellites to meet these needs of timeliness evaluation and optimisation. The framework, named ETOS (for “End-to-end Timeliness Optimisation of Space systems”), provides a modelling process with associated tools, models and GUIs. These are integrated through a common data model and suitable adapters, with the aim of building simulators of the full end-to-end chain. A major challenge for such an environment is to integrate heterogeneous tools (each one well-adapted to part of the chain) into a relevant timeliness simulation.

  13. Data-Driven Design of Intelligent Wireless Networks: An Overview and Tutorial

    Directory of Open Access Journals (Sweden)

    Merima Kulin

    2016-06-01

    Full Text Available Data science or “data-driven research” is a research approach that uses real-life data to gain insight about the behavior of systems. It enables the analysis of small and simple as well as large and more complex systems in order to assess whether they function according to the intended design and as seen in simulation. Data science approaches have been successfully applied to analyze networked interactions in several research areas such as large-scale social networks, advanced business and healthcare processes. Wireless networks can exhibit unpredictable interactions between algorithms from multiple protocol layers, interactions between multiple devices, and hardware-specific influences. These interactions can lead to a difference between real-world functioning and design-time functioning. Data science methods can help to detect the actual behavior and possibly help to correct it. Data science is increasingly used in wireless research. To support data-driven research in wireless networks, this paper illustrates the step-by-step methodology that has to be applied to extract knowledge from raw data traces. To this end, the paper (i) clarifies when, why and how to use data science in wireless network research; (ii) provides a generic framework for applying data science in wireless networks; (iii) gives an overview of existing research papers that utilized data science approaches in wireless networks; (iv) illustrates the overall knowledge discovery process through an extensive example in which device types are identified based on their traffic patterns; and (v) provides the reader with the necessary datasets and scripts to go through the tutorial steps themselves.
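
    As a flavor of step (iv), the sketch below trains a random forest to separate hypothetical device classes using simple per-device traffic features. The feature set, class structure, and synthetic data are all invented for illustration and are not the paper's dataset or pipeline.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    # hypothetical per-device features derived from raw traffic traces:
    # [mean packet size (B), mean inter-arrival (ms), bytes/s, distinct ports]
    centers = np.array([[120, 900, 200, 2],     # sensor-like device
                        [800, 30, 40000, 6],    # laptop-like device
                        [400, 120, 8000, 3]])   # phone-like device
    labels = rng.integers(0, 3, size=600)
    X = centers[labels] * rng.normal(1.0, 0.15, size=(600, 4))

    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, random_state=0)
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    print("held-out accuracy:", clf.score(X_te, y_te))
    ```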

  14. A Model-Driven Approach for 3D Modeling of Pylon from Airborne LiDAR Data

    Directory of Open Access Journals (Sweden)

    Qingquan Li

    2015-09-01

    Full Text Available Reconstructing three-dimensional models of pylons automatically from LiDAR (Light Detection And Ranging) point clouds is one of the key techniques for the facilities management GIS of a nationwide high-voltage transmission smart grid. This paper presents a model-driven three-dimensional pylon modeling (MD3DM) method using airborne LiDAR data. We start by constructing a parametric model of the pylon, based on its actual structure and the characteristics of the point cloud data. In this model, a pylon is divided into three parts: pylon legs, pylon body and pylon head. The modeling approach consists of four main steps. First, point clouds of individual pylons are detected and segmented automatically from massive high-voltage transmission corridor point clouds. Second, an individual pylon is divided into three relatively simple parts so that different parts can be reconstructed with different strategies; its position and direction are extracted by contour analysis of the pylon body at this stage. Third, the geometric features of the pylon head are extracted, from which the head type is derived with an SVM (Support Vector Machine) classifier. After that, the head is constructed by retrieving the corresponding model from a pre-built model library. Finally, the body is modeled by fitting the point cloud to planes. Experimental results on several point cloud datasets from the China Southern nationwide high-voltage transmission grid, from Yunnan Province to Guangdong Province, show that the proposed approach achieves automatic three-dimensional modeling of pylons effectively.

  15. Thermo-driven microcrawlers fabricated via a microfluidic approach

    International Nuclear Information System (INIS)

    Wang Wei; Yao Chen; Zhang Maojie; Ju Xiaojie; Xie Rui; Chu Liangyin

    2013-01-01

    A novel thermo-driven microcrawler that can transform thermal stimuli into directional mechanical motion is developed by a simple microfluidic approach combined with emulsion-template synthesis. The microcrawler is designed with a thermo-responsive poly(N-isopropylacrylamide) (PNIPAM) hydrogel body and a bell-like structure with an eccentric cavity. The asymmetric shrinking–swelling cycle of the microcrawlers enables thermo-driven locomotion in response to repeated temperature changes, providing a symmetry-breaking design principle for biomimetic soft microrobots. The microfluidic approach offers a novel and promising platform for the design and fabrication of such microrobots. (paper)

  16. A data-driven approach for denoising GNSS position time series

    Science.gov (United States)

    Li, Yanyan; Xu, Caijun; Yi, Lei; Fang, Rongxin

    2017-12-01

    Global navigation satellite system (GNSS) datasets suffer from common mode error (CME) and other unmodeled errors. To decrease the noise level in GNSS positioning, we propose a new data-driven adaptive multiscale denoising method in this paper. Both synthetic and real-world long-term GNSS datasets were employed to assess the performance of the proposed method, and its results were compared with those of stacking filtering, principal component analysis (PCA) and the recently developed multiscale multiway PCA. It is found that the proposed method can significantly eliminate high-frequency white noise and remove the low-frequency CME. Furthermore, the proposed method denoises GNSS signals more precisely than the other methods. In the real-world example, it reduces the mean standard deviation of the north, east and vertical components from 1.54 to 0.26, 1.64 to 0.21 and 4.80 to 0.72 mm, respectively. Noise analysis indicates that for the original signals, a combination of power-law plus white noise is identified as the best noise model. For the time series filtered with our method, the generalized Gauss-Markov model is the best noise model, with spectral indices close to −3, indicating that flicker walk noise can be identified. Moreover, the common mode error in the unfiltered time series is significantly reduced by the proposed method. After filtering with our method, a combination of power-law plus white noise is the best noise model for the CMEs in the study region.
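
    The adaptive multiscale method is the paper's own contribution; as a generic stand-in only, the sketch below applies standard wavelet shrinkage (universal soft threshold) to a synthetic daily position series. The signal, noise level, and wavelet choice are assumptions, and this is not the authors' algorithm.

    ```python
    import numpy as np
    import pywt

    def multiscale_denoise(x, wavelet="db4", level=5):
        """Generic wavelet shrinkage as a stand-in for adaptive multiscale denoising."""
        coeffs = pywt.wavedec(x, wavelet, level=level)
        sigma = np.median(np.abs(coeffs[-1])) / 0.6745    # noise scale, finest level
        thr = sigma * np.sqrt(2 * np.log(len(x)))         # universal threshold
        coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
        return pywt.waverec(coeffs, wavelet)[: len(x)]

    t = np.arange(2000)                                   # daily epochs
    clean = 5 * np.sin(2 * np.pi * t / 365.25)            # annual signal (mm)
    noisy = clean + np.random.default_rng(1).normal(0, 1.5, t.size)
    print(np.std(noisy - clean), np.std(multiscale_denoise(noisy) - clean))
    ```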

  17. Data-driven diagnostics of terrestrial carbon dynamics over North America

    Science.gov (United States)

    Jingfeng Xiao; Scott V. Ollinger; Steve Frolking; George C. Hurtt; David Y. Hollinger; Kenneth J. Davis; Yude Pan; Xiaoyang Zhang; Feng Deng; Jiquan Chen; Dennis D. Baldocchi; Beverly E. Law; M. Altaf Arain; Ankur R. Desai; Andrew D. Richardson; Ge Sun; Brian Amiro; Hank Margolis; Lianhong Gu; Russell L. Scott; Peter D. Blanken; Andrew E. Suyker

    2014-01-01

    The exchange of carbon dioxide is a key measure of ecosystem metabolism and a critical intersection between the terrestrial biosphere and the Earth's climate. Despite the general agreement that the terrestrial ecosystems in North America provide a sizeable carbon sink, the size and distribution of the sink remain uncertain. We use a data-driven approach to upscale...

  18. Data-driven Inference and Investigation of Thermosphere Dynamics and Variations

    Science.gov (United States)

    Mehta, P. M.; Linares, R.

    2017-12-01

    This paper presents a methodology for data-driven inference and investigation of thermosphere dynamics and variations. The approach uses data-driven modal analysis to extract the most energetic modes of variation for neutral thermospheric species using proper orthogonal decomposition, where the time-independent modes or basis represent the dynamics and the time-dependent coefficients or amplitudes represent the model parameters. The data-driven modal analysis approach, combined with sparse, discrete observations, is used to infer amplitudes for the dynamic modes and to calibrate the energy content of the system. In this work, two different data types, namely the number density measurements from TIMED/GUVI and the mass density measurements from CHAMP/GRACE, are simultaneously ingested for an accurate and self-consistent specification of the thermosphere. The assimilation process is achieved with a non-linear least squares solver and allows estimation/tuning of the model parameters or amplitudes rather than the driver. In this work, we use the Naval Research Lab's MSIS model to derive the most energetic modes for six different species: He, O, N2, O2, H, and N. We examine the dominant drivers of variation for helium in MSIS and observe that seasonal latitudinal variation accounts for about 80% of the dynamic energy, with a strong preference of helium for the winter hemisphere. We also observe enhanced helium presence near the poles at GRACE altitudes during periods of low solar activity (Feb 2007), as previously deduced. We will also examine the storm-time response of helium derived from observations. The results are expected to be useful in tuning/calibration of physics-based models.
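
    A minimal sketch of proper orthogonal decomposition via the SVD of a snapshot matrix: the left singular vectors are the time-independent spatial modes, the scaled right singular vectors are the time-dependent amplitudes. Random placeholder data stand in for model-derived density fields.

    ```python
    import numpy as np

    # snapshot matrix X: rows = spatial grid points, columns = time snapshots
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 200))          # placeholder for density fields

    X_mean = X.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(X - X_mean, full_matrices=False)

    modes = U[:, :10]                        # time-independent spatial modes (basis)
    amplitudes = np.diag(s[:10]) @ Vt[:10]   # time-dependent coefficients
    energy = s**2 / (s**2).sum()
    print("energy captured by 10 modes:", energy[:10].sum())
    ```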

  19. Distributed simulation a model driven engineering approach

    CERN Document Server

    Topçu, Okan; Oğuztüzün, Halit; Yilmaz, Levent

    2016-01-01

    Backed by substantive case studies, the novel approach to software engineering for distributed simulation outlined in this text demonstrates the potent synergies between model-driven techniques, simulation, intelligent agents, and computer systems development.

  20. Network Model-Assisted Inference from Respondent-Driven Sampling Data

    Science.gov (United States)

    Gile, Krista J.; Handcock, Mark S.

    2015-01-01

    Respondent-Driven Sampling is a widely used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher and partially implicitly defined. It is therefore not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population. PMID:26640328
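
    For context, a standard design-based baseline that such model-assisted estimators aim to improve on is the inverse-degree weighted mean (the Volz-Heckathorn, or RDS-II, estimator). A minimal sketch with made-up data:

    ```python
    import numpy as np

    def vh_estimate(values, degrees):
        """Volz-Heckathorn (RDS-II) estimator: an inverse-degree weighted mean,
        the standard design-based baseline (not the paper's model-assisted one)."""
        values = np.asarray(values, dtype=float)
        w = 1.0 / np.asarray(degrees, dtype=float)   # weight = 1 / reported degree
        return float(np.sum(w * values) / np.sum(w))

    # toy sample: binary trait (e.g. HIV status) and each respondent's network degree
    print(vh_estimate(values=[1, 0, 1, 1, 0], degrees=[10, 2, 5, 8, 3]))
    ```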

  1. Global retrieval of soil moisture and vegetation properties using data-driven methods

    Science.gov (United States)

    Rodriguez-Fernandez, Nemesio; Richaume, Philippe; Kerr, Yann

    2017-04-01

    Data-driven methods such as neural networks (NNs) are a powerful tool for retrieving soil moisture from multi-wavelength remote sensing observations at global scale. In this presentation we will review a number of recent results regarding the retrieval of soil moisture with the Soil Moisture and Ocean Salinity (SMOS) satellite, either using SMOS brightness temperatures as input data for the retrieval or using SMOS soil moisture retrievals as the reference dataset for training. The presentation will discuss several possibilities for both the input datasets and the datasets to be used as reference in the supervised learning phase. Regarding the input datasets, it will be shown that NNs take advantage of the synergy of SMOS data and data from other sensors such as the Advanced Scatterometer (ASCAT, active microwaves) and MODIS (visible and infrared). NNs have also been successfully used to construct long time series of soil moisture from the Advanced Microwave Scanning Radiometer - Earth Observing System (AMSR-E) and SMOS. A NN with input data from AMSR-E observations and SMOS soil moisture as the reference for training was used to construct a dataset sharing a similar climatology and without a significant bias with respect to SMOS soil moisture. Regarding the reference data used to train the data-driven retrievals, we will show different possibilities depending on the application. Using actual in situ measurements is challenging at global scale due to the scarce distribution of sensors. In contrast, in situ measurements have been successfully used to retrieve soil moisture at continental scale in North America, where the density of in situ measurement stations is high. Using global land surface models to train the NN constitutes an interesting alternative for implementing new remote sensing surface datasets. In addition, these datasets can be used to perform data assimilation into the model used as reference for the training. This approach has recently been tested at the European Centre
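
    A minimal sketch of the supervised setup: a small multilayer perceptron mapping multi-channel brightness temperatures to a soil moisture reference. The inputs, targets, and the simple linear relation used to generate them are synthetic placeholders, not SMOS data.

    ```python
    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    # hypothetical inputs: brightness temperatures (K) at several incidence
    # angles/polarizations; target: reference soil moisture (m3/m3)
    TB = rng.uniform(180, 290, size=(5000, 8))
    sm = 0.5 - 0.0015 * (TB.mean(axis=1) - 180) + rng.normal(0, 0.02, 5000)

    X_tr, X_te, y_tr, y_te = train_test_split(TB, sm, random_state=0)
    nn = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                      random_state=0).fit(X_tr, y_tr)
    print("R^2 on held-out data:", nn.score(X_te, y_te))
    ```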

  2. Model Driven Development of Data Sensitive Systems

    DEFF Research Database (Denmark)

    Olsen, Petur

    2014-01-01

    storage systems, where the actual values of the data are not relevant for the behavior of the system. For many systems the values are important. For instance, the control flow of the system can depend on the input values. We call this type of system data sensitive, as the execution is sensitive...... to the values of variables. This thesis strives to improve model-driven development of such data-sensitive systems. This is done by addressing three research questions. In the first we combine state-based modeling and abstract interpretation, in order to ease modeling of data-sensitive systems, while allowing...... efficient model-checking and model-based testing. In the second we develop automatic abstraction learning used together with model learning, in order to allow fully automatic learning of data-sensitive systems and thereby learning of larger systems. In the third we develop an approach for modeling and model-based...

  3. Test Driven Development: Performing Art

    Science.gov (United States)

    Bache, Emily

    The art of Test Driven Development (TDD) is a skill that needs to be learnt, and which needs time and practice to master. In this workshop a select number of conference participants with considerable skill and experience are invited to perform code katas [1]. The aim is for them to demonstrate excellence in the use of Test Driven Development, resulting in some high-quality code. This is for the benefit of the many programmers attending the conference, who can come along and witness high-quality code being written using TDD, and get a chance to ask questions and provide feedback.

  4. QoE-Driven Energy-Aware Multipath Content Delivery Approach for MPTCP-Based Mobile Phones

    Institute of Scientific and Technical Information of China (English)

    Yuanlong Cao; Shengyang Chen; Qinghua Liu; Yi Zuo; Hao Wang; Minghe Huang

    2017-01-01

    Mobile phones equipped with multiple wireless interfaces can increase their goodput performance by making use of concurrent transmissions over multiple paths, enabled by Multipath TCP (MPTCP). However, utilizing MPTCP for data delivery generally results in higher energy consumption, while the battery power of a mobile phone is limited. Thus, how to optimize energy usage becomes crucial and urgent. In this paper, we propose MPTCP-QE, a novel quality of experience (QoE)-driven energy-aware multipath content delivery approach for MPTCP-based mobile phones. The main idea of MPTCP-QE is as follows: it first provides an application rate-aware energy-efficient subflow management strategy to trade off throughput performance against energy consumption for mobile phones; it then uses an available bandwidth-aware congestion window fast recovery strategy to let a sender avoid unnecessary slow starts and utilize wireless resources quickly; and it further introduces a novel receiver-driven energy-efficient SACK strategy that helps a receiver detect SACK loss in a timely manner and trigger loss recovery in a more energy-efficient way. Simulation results show that with MPTCP-QE, energy efficiency is improved while the performance level is maintained compared to existing MPTCP solutions.
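
    The subflow management strategy itself is specified in the paper; as a loose illustration of the underlying idea (activate only as many subflows as the application's rate requires, preferring energy-efficient paths), here is a hypothetical greedy selector. Path names, capacities, and power figures are invented.

    ```python
    def select_subflows(app_rate_mbps, paths):
        """Greedy sketch: activate just enough subflows (ranked by energy
        efficiency) to cover the application's sending rate, keeping the rest
        idle to save energy. paths: (name, capacity_mbps, power_mw) tuples."""
        ranked = sorted(paths, key=lambda p: p[2] / p[1])   # mW per Mbps
        active, covered = [], 0.0
        for name, capacity, power in ranked:
            if covered >= app_rate_mbps:
                break
            active.append(name)
            covered += capacity
        return active

    # an 8 Mbps stream is covered by Wi-Fi alone; LTE stays idle
    print(select_subflows(8.0, [("wifi", 20.0, 800.0), ("lte", 30.0, 2500.0)]))
    ```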

  5. Cognitive Effects of Mindfulness Training: Results of a Pilot Study Based on a Theory Driven Approach.

    Science.gov (United States)

    Wimmer, Lena; Bellingrath, Silja; von Stockhausen, Lisa

    2016-01-01

    The present paper reports a pilot study which tested cognitive effects of mindfulness practice in a theory-driven approach. Thirty-four fifth graders received either a mindfulness training which was based on the mindfulness-based stress reduction approach (experimental group), a concentration training (active control group), or no treatment (passive control group). Based on the operational definition of mindfulness by Bishop et al. (2004), effects on sustained attention, cognitive flexibility, cognitive inhibition, and data-driven as opposed to schema-based information processing were predicted. These abilities were assessed in a pre-post design by means of a vigilance test, a reversible figures test, the Wisconsin Card Sorting Test, a Stroop test, a visual search task, and a recognition task of prototypical faces. Results suggest that the mindfulness training specifically improved cognitive inhibition and data-driven information processing.

  6. Cognitive effects of mindfulness training: Results of a pilot study based on a theory driven approach

    Directory of Open Access Journals (Sweden)

    Lena Wimmer

    2016-07-01

    Full Text Available The present paper reports a pilot study which tested cognitive effects of mindfulness practice in a theory-driven approach. Thirty-four fifth graders received either a mindfulness training which was based on the mindfulness-based stress reduction approach (experimental group), a concentration training (active control group), or no treatment (passive control group). Based on the operational definition of mindfulness by Bishop et al. (2004), effects on sustained attention, cognitive flexibility, cognitive inhibition and data-driven as opposed to schema-based information processing were predicted. These abilities were assessed in a pre-post design by means of a vigilance test, a reversible figures test, the Wisconsin Card Sorting Test, a Stroop test, a visual search task, and a recognition task of prototypical faces. Results suggest that the mindfulness training specifically improved cognitive inhibition and data-driven information processing.

  7. Performance-Driven Interface Contract Enforcement for Scientific Components

    Energy Technology Data Exchange (ETDEWEB)

    Dahlgren, Tamara Lynn [Univ. of California, Davis, CA (United States)

    2008-01-01

    Performance-driven interface contract enforcement research aims to improve the quality of programs built from plug-and-play scientific components. Interface contracts make the obligations on the caller and on all implementations of the specified methods explicit. Runtime contract enforcement is a well-known technique for enhancing testing and debugging. However, checking all of the associated constraints during deployment is generally considered too costly from a performance standpoint. Previous solutions enforced subsets of constraints without explicit consideration of their performance implications. Hence, this research measures the impacts of different interface contract sampling strategies and compares the results with new techniques driven by execution time estimates. Results from three studies indicate that automatically adjusting the level of checking based on performance constraints improves the likelihood of detecting contract violations under certain circumstances. Specifically, performance-driven enforcement is better suited to programs exercising constraints whose costs are at most moderately expensive relative to normal program execution.
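
    A toy sketch of the core idea: a decorator that keeps rough running estimates of the cost of a function and of its precondition check, always checks when the check is cheap relative to the call, and otherwise samples checks to stay within an overhead budget. The budget, estimator, and API are invented for illustration; the thesis's instrumentation is more sophisticated.

    ```python
    import random
    import time

    def enforce_contract(precondition, overhead_budget=0.05):
        """Check `precondition` only as often as an execution-time budget allows."""
        def wrap(fn):
            stats = {"fn": 1e-6, "chk": 1e-6}      # running time estimates (s)
            def inner(*args, **kwargs):
                ratio = stats["chk"] / stats["fn"]
                if ratio <= overhead_budget or random.random() < overhead_budget / ratio:
                    t0 = time.perf_counter()
                    assert precondition(*args, **kwargs), "contract violated"
                    stats["chk"] = time.perf_counter() - t0
                t0 = time.perf_counter()
                out = fn(*args, **kwargs)
                stats["fn"] = time.perf_counter() - t0
                return out
            return inner
        return wrap

    @enforce_contract(lambda xs: all(b >= a for a, b in zip(xs, xs[1:])))
    def binary_search_ready(xs):
        return xs  # placeholder for work that requires sorted input

    binary_search_ready([1, 2, 3])
    ```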

  8. Using data-driven approach for wind power prediction: A comparative study

    International Nuclear Information System (INIS)

    Taslimi Renani, Ehsan; Elias, Mohamad Fathi Mohamad; Rahim, Nasrudin Abd.

    2016-01-01

    Highlights:
    • Double exponential smoothing is the most accurate model in wind speed prediction.
    • A two-stage feature selection method is proposed to select the most important inputs.
    • Direct prediction shows better accuracy than indirect prediction.
    • Adaptive neuro fuzzy inference system outperforms data mining algorithms.
    • Random forest performs the worst among the compared data mining algorithms.

    Abstract: Although wind energy is intermittent and stochastic in nature, it is increasingly important in power generation due to its sustainability and freedom from pollution. Increased utilization of wind energy sources calls for more robust and efficient prediction models to mitigate the uncertainties associated with wind power. This research compares two different approaches to wind power forecasting: indirect and direct prediction. In the indirect method, several time series models are applied to forecast the wind speed, and a logistic function with five parameters is then used to forecast the wind power. In this study, a backtracking search algorithm with novel crossover and mutation operators is employed to find the best parameters of the five-parameter logistic function. A new feature selection technique, combining mutual information and a neural network, is proposed in this paper to extract the most informative features with maximum relevancy and minimum redundancy. From the comparative study, the results demonstrate that, in the direct prediction approach, where historical weather data are used to predict the wind power generation directly, the adaptive neuro fuzzy inference system outperforms five data mining algorithms, namely random forest, M5Rules, k-nearest neighbor, support vector machine and multilayer perceptron. Moreover, it is also found that the mean absolute percentage error of the direct prediction method using the adaptive neuro fuzzy inference system is 1.47%, which is approximately less than half of the error obtained with the
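
    The paper's two-stage selector combines mutual information with a neural network; the sketch below shows only the flavor of the first stage, a greedy maximum-relevance / minimum-redundancy ranking built on scikit-learn's mutual information estimator. The data, penalty weight, and redundancy formula are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.feature_selection import mutual_info_regression

    def select_features(X, y, k=5, redundancy_penalty=1.0):
        """Greedy max-relevance / min-redundancy feature selection."""
        relevance = mutual_info_regression(X, y, random_state=0)
        selected, remaining = [], list(range(X.shape[1]))
        while len(selected) < k and remaining:
            def score(j):
                if not selected:
                    return relevance[j]
                red = np.mean([mutual_info_regression(
                    X[:, [s]], X[:, j], random_state=0)[0] for s in selected])
                return relevance[j] - redundancy_penalty * red
            best = max(remaining, key=score)
            selected.append(best)
            remaining.remove(best)
        return selected

    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 10))                       # candidate inputs
    y = 2 * X[:, 0] - X[:, 3] + rng.normal(0, 0.1, 400)  # target (wind power proxy)
    print(select_features(X, y, k=3))                    # picks 0 and 3 first
    ```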

  9. Statistical multi-model approach for performance assessment of cooling tower

    International Nuclear Information System (INIS)

    Pan, Tian-Hong; Shieh, Shyan-Shu; Jang, Shi-Shang; Tseng, Wen-Hung; Wu, Chan-Wei; Ou, Jenq-Jang

    2011-01-01

    This paper presents a data-driven, model-based assessment strategy for investigating the performance of a cooling tower. To achieve this objective, the operation of the cooling tower is first characterized using a data-driven multiple-model method, which represents the process as a set of local models in the form of linear equations. A fuzzy c-means clustering algorithm is used to classify the operating data into several groups, from which the local models are built. The developed models are then applied to predict the performance of the system based on the design input parameters provided by the manufacturer. The tower characteristics are also investigated with the proposed models via the effect of the water/air flow ratio. The predicted results agree well with the tower characteristics calculated from actual operating data measured at an industrial plant. By comparison with the design characteristic curve provided by the manufacturer, the effectiveness of the cooling tower can then be obtained. A case study conducted in a commercial plant demonstrates the validity of the proposed approach. It should be noted that this is the first attempt to use operating data to assess how far the cooling efficiency of an industrial-scale process has deviated from its original design value. Moreover, the evaluation need not interrupt the normal operation of the cooling tower. This should be of particular interest in industrial applications.
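
    A compact sketch of the multiple-model idea: cluster the operating data, fit one linear model per cluster, and route each query to its local model. For brevity this uses k-means where the paper uses fuzzy c-means, and the "cooling performance" data are synthetic.

    ```python
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(600, 2))        # e.g. water flow, air flow (scaled)
    y = (np.where(X[:, 0] > 0.5, 3 * X[:, 0] + X[:, 1], 1 - X[:, 1])
         + rng.normal(0, 0.05, 600))            # placeholder performance measure

    km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
    local = [LinearRegression().fit(X[km.labels_ == c], y[km.labels_ == c])
             for c in range(4)]                 # one linear model per regime

    def predict(x):
        c = km.predict(x.reshape(1, -1))[0]     # route query to its local model
        return local[c].predict(x.reshape(1, -1))[0]

    print(predict(np.array([0.8, 0.2])))
    ```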

  10. The effects of data-driven learning activities on EFL learners' writing development.

    Science.gov (United States)

    Luo, Qinqin

    2016-01-01

    Data-driven learning has been proven an effective approach for helping learners solve various writing problems such as correcting lexical or grammatical errors, improving the use of collocations and generating ideas in writing. This article reports on an empirical study in which data-driven learning was accomplished with the assistance of the user-friendly BNCweb, and presents an evaluation of the outcome by comparing the effectiveness of BNCweb with that of the search engine Baidu, which is most commonly used as a reference resource by Chinese learners of English as a foreign language. The quantitative results for 48 Chinese college students revealed that the experimental group, which used BNCweb, performed significantly better in the post-test in terms of writing fluency and accuracy, as compared with the control group, which used the search engine Baidu. However, no significant difference was found between the two groups in terms of writing complexity. The qualitative results from the interviews revealed that learners generally showed a positive attitude toward the use of BNCweb, but there were still some problems with using corpora in the writing process; thus, the combined use of corpora and other types of reference resources was suggested as a possible way to counter the potential barriers for Chinese learners of English.

  11. Data mining, knowledge discovery and data-driven modelling

    NARCIS (Netherlands)

    Solomatine, D.P.; Velickov, S.; Bhattacharya, B.; Van der Wal, B.

    2003-01-01

    The project was aimed at exploring the possibilities of a new paradigm in modelling - data-driven modelling, often referred to as "data mining". Several application areas were considered: sedimentation problems in the Port of Rotterdam, automatic soil classification on the basis of cone penetration

  12. Interactive volume exploration of petascale microscopy data streams using a visualization-driven virtual memory approach

    KAUST Repository

    Hadwiger, Markus; Beyer, Johanna; Jeong, Wonki; Pfister, Hanspeter

    2012-01-01

    This paper presents the first volume visualization system that scales to petascale volumes imaged as a continuous stream of high-resolution electron microscopy images. Our architecture scales to dense, anisotropic petascale volumes because it: (1) decouples construction of the 3D multi-resolution representation required for visualization from data acquisition, and (2) decouples sample access time during ray-casting from the size of the multi-resolution hierarchy. Our system is designed around a scalable multi-resolution virtual memory architecture that handles missing data naturally, does not pre-compute any 3D multi-resolution representation such as an octree, and can accept a constant stream of 2D image tiles from the microscopes. A novelty of our system design is that it is visualization-driven: we restrict most computations to the visible volume data. Leveraging the virtual memory architecture, missing data are detected during volume ray-casting as cache misses, which are propagated backwards for on-demand out-of-core processing. 3D blocks of volume data are only constructed from 2D microscope image tiles when they have actually been accessed during ray-casting. We extensively evaluate our system design choices with respect to scalability and performance, compare to previous best-of-breed systems, and illustrate the effectiveness of our system for real microscopy data from neuroscience. © 1995-2012 IEEE.
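
    A toy of the cache-miss-driven design: an LRU cache of 3D blocks whose entries are built on demand, with a placeholder callback standing in for assembling blocks from 2D microscope tiles. The class name and capacity are invented; the actual system manages a multi-resolution virtual memory hierarchy on the GPU.

    ```python
    from collections import OrderedDict

    class VolumeBlockCache:
        """LRU cache of 3D volume blocks, constructed on demand from 2D tiles."""
        def __init__(self, capacity, build_block):
            self.capacity = capacity
            self.build_block = build_block      # out-of-core construction callback
            self.blocks = OrderedDict()

        def get(self, key):
            if key in self.blocks:              # cache hit during ray-casting
                self.blocks.move_to_end(key)
                return self.blocks[key]
            block = self.build_block(key)       # miss: assemble from 2D tiles
            self.blocks[key] = block
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False) # evict least recently used
            return block

    cache = VolumeBlockCache(capacity=2, build_block=lambda k: f"block{k}")
    for key in [(0, 0, 0), (1, 0, 0), (0, 0, 0), (2, 0, 0)]:
        cache.get(key)
    print(list(cache.blocks))   # (1, 0, 0) was evicted as least recently used
    ```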

  13. Interactive volume exploration of petascale microscopy data streams using a visualization-driven virtual memory approach

    KAUST Repository

    Hadwiger, Markus

    2012-12-01

    This paper presents the first volume visualization system that scales to petascale volumes imaged as a continuous stream of high-resolution electron microscopy images. Our architecture scales to dense, anisotropic petascale volumes because it: (1) decouples construction of the 3D multi-resolution representation required for visualization from data acquisition, and (2) decouples sample access time during ray-casting from the size of the multi-resolution hierarchy. Our system is designed around a scalable multi-resolution virtual memory architecture that handles missing data naturally, does not pre-compute any 3D multi-resolution representation such as an octree, and can accept a constant stream of 2D image tiles from the microscopes. A novelty of our system design is that it is visualization-driven: we restrict most computations to the visible volume data. Leveraging the virtual memory architecture, missing data are detected during volume ray-casting as cache misses, which are propagated backwards for on-demand out-of-core processing. 3D blocks of volume data are only constructed from 2D microscope image tiles when they have actually been accessed during ray-casting. We extensively evaluate our system design choices with respect to scalability and performance, compare to previous best-of-breed systems, and illustrate the effectiveness of our system for real microscopy data from neuroscience. © 1995-2012 IEEE.

  14. Dynamically adaptive data-driven simulation of extreme hydrological flows

    Science.gov (United States)

    Kumar Jain, Pushkar; Mandli, Kyle; Hoteit, Ibrahim; Knio, Omar; Dawson, Clint

    2018-02-01

    Hydrological hazards such as storm surges, tsunamis, and rainfall-induced flooding are physically complex events that are costly in loss of human life and economic productivity. Many such disasters could be mitigated through improved emergency evacuation in real-time and through the development of resilient infrastructure based on knowledge of how systems respond to extreme events. Data-driven computational modeling is a critical technology underpinning these efforts. This investigation focuses on the novel combination of methodologies in forward simulation and data assimilation. The forward geophysical model utilizes adaptive mesh refinement (AMR), a process by which a computational mesh can adapt in time and space based on the current state of a simulation. The forward solution is combined with ensemble based data assimilation methods, whereby observations from an event are assimilated into the forward simulation to improve the veracity of the solution, or used to invert for uncertain physical parameters. The novelty in our approach is the tight two-way coupling of AMR and ensemble filtering techniques. The technology is tested using actual data from the Chile tsunami event of February 27, 2010. These advances offer the promise of significantly transforming data-driven, real-time modeling of hydrological hazards, with potentially broader applications in other science domains.
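
    A minimal sketch of the ensemble filtering half of the approach: a stochastic ensemble Kalman filter analysis step that nudges each member toward perturbed observations. The state, observation operator, and noise levels are invented; the tight coupling with AMR is the paper's contribution and is not shown.

    ```python
    import numpy as np

    def enkf_update(ensemble, H, y_obs, obs_var, rng):
        """Stochastic EnKF analysis step. ensemble: (n_state, n_members)."""
        n_mem = ensemble.shape[1]
        Xp = ensemble - ensemble.mean(axis=1, keepdims=True)
        HX = H @ ensemble
        HXp = HX - HX.mean(axis=1, keepdims=True)
        P_hh = HXp @ HXp.T / (n_mem - 1) + obs_var * np.eye(len(y_obs))
        P_xh = Xp @ HXp.T / (n_mem - 1)
        K = P_xh @ np.linalg.inv(P_hh)                       # Kalman gain
        y_pert = y_obs[:, None] + rng.normal(0.0, np.sqrt(obs_var),
                                             (len(y_obs), n_mem))
        return ensemble + K @ (y_pert - HX)

    rng = np.random.default_rng(1)
    ens = rng.normal(1.0, 0.5, size=(3, 50))   # e.g. surge heights at 3 locations
    H = np.array([[1.0, 0.0, 0.0]])            # a gauge observes location 0 only
    print(enkf_update(ens, H, np.array([1.4]), 0.01, rng).mean(axis=1))
    ```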

  15. Dynamically adaptive data-driven simulation of extreme hydrological flows

    KAUST Repository

    Kumar Jain, Pushkar

    2017-12-27

    Hydrological hazards such as storm surges, tsunamis, and rainfall-induced flooding are physically complex events that are costly in loss of human life and economic productivity. Many such disasters could be mitigated through improved emergency evacuation in real-time and through the development of resilient infrastructure based on knowledge of how systems respond to extreme events. Data-driven computational modeling is a critical technology underpinning these efforts. This investigation focuses on the novel combination of methodologies in forward simulation and data assimilation. The forward geophysical model utilizes adaptive mesh refinement (AMR), a process by which a computational mesh can adapt in time and space based on the current state of a simulation. The forward solution is combined with ensemble based data assimilation methods, whereby observations from an event are assimilated into the forward simulation to improve the veracity of the solution, or used to invert for uncertain physical parameters. The novelty in our approach is the tight two-way coupling of AMR and ensemble filtering techniques. The technology is tested using actual data from the Chile tsunami event of February 27, 2010. These advances offer the promise of significantly transforming data-driven, real-time modeling of hydrological hazards, with potentially broader applications in other science domains.

  16. Robust PLS approach for KPI-related prediction and diagnosis against outliers and missing data

    Science.gov (United States)

    Yin, Shen; Wang, Guang; Yang, Xu

    2014-07-01

    In practical industrial applications, key performance indicator (KPI)-related prediction and diagnosis are quite important for product quality and economic benefits. To meet these requirements, many advanced prediction and monitoring approaches have been developed, which can be classified into model-based or data-driven techniques. Among these approaches, partial least squares (PLS) is one of the most popular data-driven methods due to its simplicity and easy implementation in large-scale industrial processes. As PLS is based entirely on the measured process data, the characteristics of the process data are critical for the success of PLS. Outliers and missing values are two common characteristics of the measured data which can severely affect the effectiveness of PLS. To ensure the applicability of PLS in practical industrial applications, this paper introduces a robust version of PLS that deals with outliers and missing values simultaneously. The effectiveness of the proposed method is demonstrated by the application results of KPI-related prediction and diagnosis on the Tennessee Eastman industrial benchmark process.
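
    For orientation, here is a baseline (non-robust) setup of the kind the paper improves on: standard PLS regression after naive mean imputation of missing entries, using scikit-learn. The process data and KPI relation are synthetic, and simple imputation is exactly the pre-treatment that a robust PLS aims to make unnecessary.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.impute import SimpleImputer

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 12))                            # process measurements
    y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 500)  # KPI

    X[rng.random(X.shape) < 0.05] = np.nan                    # 5% missing values
    X_f = SimpleImputer(strategy="mean").fit_transform(X)     # naive pre-treatment

    pls = PLSRegression(n_components=3).fit(X_f, y)
    print("R^2:", pls.score(X_f, y))
    ```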

  17. Performance driven IT management five practical steps to business success

    CERN Document Server

    Sachs, Ira

    2011-01-01

    This book argues that the Federal Government needs a new approach to IT management. Introducing a novel five-step process called performance-driven management (PDM), author Ira Sachs explains in detail how to reduce risk on large IT programs and projects. This is an essential tool for IT and business managers in government and for contractors doing business with the government, and it has much useful and actionable information for anyone who is interested in helping their business save money and adopt effective, successful practices.

  18. Data-driven non-Markovian closure models

    Science.gov (United States)

    Kondrashov, Dmitri; Chekroun, Mickaël D.; Ghil, Michael

    2015-03-01

    This paper has two interrelated foci: (i) obtaining stable and efficient data-driven closure models by using a multivariate time series of partial observations from a large-dimensional system; and (ii) comparing these closure models with the optimal closures predicted by the Mori-Zwanzig (MZ) formalism of statistical physics. Multilayer stochastic models (MSMs) are introduced as both a generalization and a time-continuous limit of existing multilevel, regression-based approaches to closure in a data-driven setting; these approaches include empirical model reduction (EMR), as well as more recent multi-layer modeling. It is shown that the multilayer structure of MSMs can provide a natural Markov approximation to the generalized Langevin equation (GLE) of the MZ formalism. A simple correlation-based stopping criterion for an EMR-MSM model is derived to assess how well it approximates the GLE solution. Sufficient conditions are derived on the structure of the nonlinear cross-interactions between the constitutive layers of a given MSM to guarantee the existence of a global random attractor. This existence ensures that no blow-up can occur for a broad class of MSM applications, a class that includes non-polynomial predictors and nonlinearities that do not necessarily preserve quadratic energy invariants. The EMR-MSM methodology is first applied to a conceptual, nonlinear, stochastic climate model of coupled slow and fast variables, in which only slow variables are observed. It is shown that the resulting closure model with energy-conserving nonlinearities efficiently captures the main statistical features of the slow variables, even when there is no formal scale separation and the fast variables are quite energetic. Second, an MSM is shown to successfully reproduce the statistics of a partially observed, generalized Lotka-Volterra model of population dynamics in its chaotic regime. The challenges here include the rarity of strange attractors in the model's parameter
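
    As a much-reduced illustration of multilevel, regression-based closure, the sketch below fits a main-level regression for the tendency of an observed variable and then models the residual at a second level as an AR(1) process with white-noise innovations. The synthetic random-walk data and single extra layer are assumptions; EMR and MSMs use multiple layers and richer predictors.

    ```python
    import numpy as np

    # two-layer sketch: deterministic regression for the resolved (slow) variable,
    # plus a learned stochastic layer for the unresolved residual "memory"
    rng = np.random.default_rng(0)
    x = np.cumsum(rng.normal(size=2000))                  # observed slow variable
    dx = np.diff(x)

    # layer 1: regress the tendency on the state
    a = np.polyfit(x[:-1], dx, 1)
    r = dx - np.polyval(a, x[:-1])                        # main-level residual

    # layer 2: residual modeled as AR(1) plus white-noise innovations
    phi = np.dot(r[1:], r[:-1]) / np.dot(r[:-1], r[:-1])
    sigma = np.std(r[1:] - phi * r[:-1])
    print(f"AR(1) coefficient {phi:.3f}, innovation std {sigma:.3f}")
    ```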

  19. Using data-driven agent-based models for forecasting emerging infectious diseases

    Directory of Open Access Journals (Sweden)

    Srinivasan Venkatramanan

    2018-03-01

    Full Text Available Producing timely, well-informed and reliable forecasts for an ongoing epidemic of an emerging infectious disease is a huge challenge. Epidemiologists and policy makers have to deal with poor data quality, limited understanding of the disease dynamics, a rapidly changing social environment and uncertainty about the effects of the various interventions in place. In this setting, detailed computational models provide a comprehensive framework for integrating diverse data sources into a well-defined model of disease dynamics and social behavior, potentially leading to better understanding and actions. In this paper, we describe one such agent-based model framework developed for forecasting the 2014–2015 Ebola epidemic in Liberia, and subsequently used during the Ebola forecasting challenge. We describe the various components of the model and the calibration process, and summarize the forecast performance across scenarios of the challenge. We conclude by highlighting how such a data-driven approach can be refined and adapted for future epidemics, and share the lessons learned over the course of the challenge.

    Keywords: Emerging infectious diseases, Agent-based models, Simulation optimization, Bayesian calibration, Ebola
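
    A toy agent-based transmission model, far simpler than the paper's framework, shows the mechanics such frameworks build on: individually tracked agents, random contacts, and stochastic transmission and recovery. All parameter values are invented.

    ```python
    import numpy as np

    def abm_sir_step(state, contact_rate, p_transmit, p_recover, rng):
        """One day of a toy agent-based SIR model (0=S, 1=I, 2=R)."""
        n = len(state)
        infectious = np.where(state == 1)[0]
        for i in infectious:
            contacts = rng.integers(0, n, size=contact_rate)   # random mixing
            for c in contacts:
                if state[c] == 0 and rng.random() < p_transmit:
                    state[c] = 1
        recovered = infectious[rng.random(len(infectious)) < p_recover]
        state[recovered] = 2
        return state

    rng = np.random.default_rng(0)
    state = np.zeros(5000, dtype=int)
    state[:10] = 1                               # seed infections
    for day in range(60):
        state = abm_sir_step(state, 8, 0.03, 0.1, rng)
    print("attack rate:", np.mean(state == 2))
    ```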

  20. Data-driven architectural design to production and operation

    NARCIS (Netherlands)

    Bier, H.H.; Mostafavi, S.

    2015-01-01

    Data-driven architectural production and operation explored within Hyperbody rely heavily on system thinking implying that all parts of a system are to be understood in relation to each other. These relations are established bi-directionally so that data-driven architecture is not only produced

  1. Bending of Euler-Bernoulli nanobeams based on the strain-driven and stress-driven nonlocal integral models: a numerical approach

    Science.gov (United States)

    Oskouie, M. Faraji; Ansari, R.; Rouhi, H.

    2018-04-01

    Eringen's nonlocal elasticity theory is extensively employed for the analysis of nanostructures because it is able to capture nanoscale effects. Previous studies have revealed that using the differential form of the strain-driven version of this theory leads to paradoxical results in some cases, such as the bending analysis of cantilevers, and recourse must be made to the integral version. In this article, a novel numerical approach is developed for the bending analysis of Euler-Bernoulli nanobeams in the context of strain- and stress-driven integral nonlocal models. The approach solves the integral governing equation directly, bypassing the difficulties of converting it into a differential equation. First, the governing equation is derived for both the strain-driven and stress-driven nonlocal models by means of the principle of minimum total potential energy; in each case, the governing equation is obtained in both strong and weak forms. To solve the derived equations numerically, matrix differential and integral operators are constructed based upon the finite difference technique and the trapezoidal integration rule. It is shown that the proposed numerical approach can be efficiently applied to the strain-driven nonlocal model with the aim of resolving the aforementioned paradoxes. It is also able to solve the problem based on the strain-driven model without the inconsistencies in the application of this model that are reported in the literature.
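
    A schematic of one ingredient, the matrix integral operator: discretize a 1D nonlocal operator with an Eringen-type exponential kernel using trapezoidal-rule weights. The kernel form, grid, and parameter values are illustrative assumptions, not the paper's exact discretization.

    ```python
    import numpy as np

    n, L, kappa = 101, 1.0, 0.1            # grid size, beam length, nonlocal length
    x = np.linspace(0.0, L, n)
    h = x[1] - x[0]

    w = np.full(n, h)
    w[0] = w[-1] = h / 2                   # trapezoidal quadrature weights
    kernel = np.exp(-np.abs(x[:, None] - x[None, :]) / kappa) / (2 * kappa)
    A = kernel * w[None, :]                # matrix integral operator

    f = np.sin(np.pi * x)                  # a local field (e.g. strain)
    f_nonlocal = A @ f                     # its nonlocal counterpart
    print(f_nonlocal[n // 2])              # mid-span value
    ```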

  2. A Knowledge-driven Approach to Composite Activity Recognition in Smart Environments

    OpenAIRE

    Chen, Liming; Wang, H.; Sterritt, Roy; Okeyo, George

    2012-01-01

    Knowledge-driven activity recognition has recently attracted increasing attention but mainly focused on simple activities. This paper extends previous work to introduce a knowledge-driven approach to recognition of composite activities such as interleaved and concurrent activities. The approach combines ontological and temporal knowledge modelling formalisms for composite activity modelling. It exploits ontological reasoning for simple activity recognition and rule-based temporal inference to...

  3. Effective Rating Scale Development for Speaking Tests: Performance Decision Trees

    Science.gov (United States)

    Fulcher, Glenn; Davidson, Fred; Kemp, Jenny

    2011-01-01

    Rating scale design and development for testing speaking is generally conducted using one of two approaches: the measurement-driven approach or the performance data-driven approach. The measurement-driven approach prioritizes the ordering of descriptors onto a single scale. Meaning is derived from the scaling methodology and the agreement of…

  4. Data to Decisions: Creating a Culture of Model-Driven Drug Discovery.

    Science.gov (United States)

    Brown, Frank K; Kopti, Farida; Chang, Charlie Zhenyu; Johnson, Scott A; Glick, Meir; Waller, Chris L

    2017-09-01

    Merck & Co., Inc., Kenilworth, NJ, USA, is undergoing a transformation in the way that it prosecutes R&D programs. Through the adoption of a "model-driven" culture, enhanced R&D productivity is anticipated, both in the form of decreased attrition at each stage of the process and by providing a rational framework for understanding and learning from the data generated along the way. This new approach focuses on the concept of a "Design Cycle" that makes use of all the data possible, internally and externally, to drive decision-making. These data can take the form of bioactivity, 3D structures, genomics, pathway, PK/PD, safety data, etc. Synthesis of high-quality data into models utilizing both well-established and cutting-edge methods has been shown to yield high confidence predictions to prioritize decision-making and efficiently reposition resources within R&D. The goal is to design an adaptive research operating plan that uses both modeled data and experiments, rather than just testing, to drive project decision-making. To support this emerging culture, an ambitious information management (IT) program has been initiated to implement a harmonized platform to facilitate the construction of cross-domain workflows to enable data-driven decision-making and the construction and validation of predictive models. These goals are achieved through depositing model-ready data, agile persona-driven access to data, a unified cross-domain predictive model lifecycle management platform, and support for flexible scientist-developed workflows that simplify data manipulation and consume model services. The end-to-end nature of the platform, in turn, not only supports but also drives the culture change by enabling scientists to apply predictive sciences throughout their work and over the lifetime of a project. This shift in mindset for both scientists and IT was driven by an early impactful demonstration of the potential benefits of the platform, in which expert-level early discovery

  5. Manuscript 101: A Data-Driven Writing Exercise For Beginning Scientists

    OpenAIRE

    Ralston, Amy; Halbisen, Michael

    2017-01-01

    Learning to write a scientific manuscript is one of the most important and rewarding scientific training experiences, yet most young scientists only embark on this experience relatively late in graduate school, after gathering sufficient data in the lab. Yet, familiarity with the process of writing a scientific manuscript and receiving peer reviews, often leads to a more focused and driven experimental approach. To jump-start this training, we developed a protocol for teaching manuscript writ...

  6. Data-Intensive Science Meets Inquiry-Driven Pedagogy: Interactive Big Data Exploration, Threshold Concepts, and Liminality

    Science.gov (United States)

    Ramachandran, R.; Nair, U. S.; Word, A.

    2014-12-01

    Threshold concepts in any discipline are the core concepts an individual must understand in order to master a discipline. By their very nature, these concepts are troublesome, irreversible, integrative, bounded, discursive, and reconstitutive. Although grasping threshold concepts can be extremely challenging for each learner as s/he moves through stages of cognitive development relative to a given discipline, the learner's grasp of these concepts determines the extent to which s/he is prepared to work competently and creatively within the field itself. The movement of individuals from a state of ignorance of these core concepts to one of mastery occurs not along a linear path but in iterative cycles of knowledge creation and adjustment in liminal spaces - conceptual spaces through which learners move from the vaguest awareness of concepts to mastery, accompanied by understanding of their relevance, connectivity, and usefulness relative to questions and constructs in a given discipline. With the explosive growth of data available in atmospheric science, driven largely by satellite Earth observations and high-resolution numerical simulations, paradigms such as that of data-intensive science have emerged. These paradigm shifts are based on the growing realization that current infrastructure, tools and processes will not allow us to analyze and fully utilize the complex and voluminous data being gathered. In this emerging paradigm, the scientific discovery process is driven by knowledge extracted from large volumes of data. In this presentation, we contend that this paradigm naturally lends itself to inquiry-driven pedagogy, where knowledge is discovered through inductive engagement with large volumes of data rather than reached through traditional, deductive, hypothesis-driven analyses. In particular, data-intensive techniques married with an inductive methodology allow for exploration on a scale that is not possible in the traditional classroom with its typical

  7. A clinically driven variant prioritization framework outperforms purely computational approaches for the diagnostic analysis of singleton WES data.

    Science.gov (United States)

    Stark, Zornitza; Dashnow, Harriet; Lunke, Sebastian; Tan, Tiong Y; Yeung, Alison; Sadedin, Simon; Thorne, Natalie; Macciocca, Ivan; Gaff, Clara; Oshlack, Alicia; White, Susan M; James, Paul A

    2017-11-01

    Rapid identification of clinically significant variants is key to the successful application of next generation sequencing technologies in clinical practice. The Melbourne Genomics Health Alliance (MGHA) variant prioritization framework employs a gene prioritization index based on clinician-generated a priori gene lists, and a variant prioritization index (VPI) based on rarity, conservation and protein effect. We used data from 80 patients who underwent singleton whole exome sequencing (WES) to test the ability of the framework to rank causative variants highly, and compared it against the performance of other gene and variant prioritization tools. Causative variants were identified in 59 of the patients. Using the MGHA prioritization framework, the average rank of the causative variant was 2.24, with 76% ranked as the top priority variant and 90% ranked within the top five. Using clinician-generated gene lists resulted in causative variants being ranked an average of 8.2 positions higher than prioritization based on variant properties alone. This clinically driven prioritization approach significantly outperformed purely computational tools, placing a greater proportion of causative variants first or in the top five (permutation P-value = 0.001). Clinicians included 40 of the 49 WES diagnoses in their a priori list of differential diagnoses (81%). The lists generated by PhenoTips and Phenomizer contained 14 (29%) and 18 (37%) of these diagnoses, respectively. These results highlight the benefits of clinically led variant prioritization in increasing the efficiency of singleton WES data analysis and have important implications for developing models for the funding and delivery of genomic services.
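
    A schematic of the two-index idea (not MGHA's actual scoring): combine a binary gene index from a clinician-generated gene list with a variant index built from rarity, conservation, and predicted effect. All field names, weights, and example variants are invented.

    ```python
    def prioritize(variants, a_priori_genes):
        """Rank variants by gene index (clinician list) plus variant index."""
        def score(v):
            gene_idx = 1.0 if v["gene"] in a_priori_genes else 0.0
            variant_idx = (1 - v["pop_freq"]) * v["conservation"] * v["severity"]
            return gene_idx + variant_idx
        return sorted(variants, key=score, reverse=True)

    variants = [
        {"id": "chr1:123A>G", "gene": "GENE_A", "pop_freq": 0.0001,
         "conservation": 0.9, "severity": 1.0},
        {"id": "chr2:456C>T", "gene": "GENE_B", "pop_freq": 0.02,
         "conservation": 0.4, "severity": 0.3},
    ]
    print([v["id"] for v in prioritize(variants, a_priori_genes={"GENE_A"})])
    ```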

  8. Data-Driven Assistance Functions for Industrial Automation Systems

    International Nuclear Information System (INIS)

    Windmann, Stefan; Niggemann, Oliver

    2015-01-01

    The increasing amount of data in industrial automation systems overburdens the user in process control and diagnosis tasks. One way to cope with these challenges is to use smart assistance systems that automatically monitor and optimize processes. This article deals with aspects of data-driven assistance systems such as assistance functions, process models and data acquisition. The paper describes novel approaches to self-diagnosis and self-optimization, and shows how these assistance functions can be integrated into different industrial environments. The assistance functions considered are based on process models that are automatically learned from process data. Fault detection and isolation is based on the comparison of observations of the real system with predictions obtained by applying the process models. The process models are further employed for energy efficiency optimization of industrial processes. Experimental results are presented for fault detection and energy efficiency optimization of a drive system. (paper)
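
    A minimal sketch of model-based fault detection as described: learn a process model from healthy data, then flag observations whose residual against the model's prediction exceeds a threshold. The drive-system data, linear model, and 4-sigma threshold are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    u = rng.uniform(0, 1, size=1000)                 # drive command (input)
    y = 2.0 * u + rng.normal(0, 0.02, 1000)          # measured speed (output)

    model = LinearRegression().fit(u.reshape(-1, 1), y)    # learned process model
    resid = y - model.predict(u.reshape(-1, 1))
    threshold = 4 * resid.std()                      # healthy-residual bound

    y_new = 2.0 * 0.7 + 0.3                          # observation with offset fault
    r = y_new - model.predict([[0.7]])[0]            # residual vs. prediction
    print("fault detected:", abs(r) > threshold)
    ```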

  9. Implementing NASA's Capability-Driven Approach: Insight into NASA's Processes for Maturing Exploration Systems

    Science.gov (United States)

    Williams-Byrd, Julie; Arney, Dale; Rodgers, Erica; Antol, Jeff; Simon, Matthew; Hay, Jason; Larman, Kevin

    2015-01-01

    NASA is engaged in transforming human spaceflight. The Agency is shifting from an exploration-based program with human activities focused on low Earth orbit (LEO) and targeted robotic missions in deep space to a more sustainable and integrated pioneering approach. Through pioneering, NASA seeks to address national goals to develop the capacity for people to work, learn, operate, live, and thrive safely beyond the Earth for extended periods of time. However, pioneering space involves more than the daunting technical challenges of transportation, maintaining health, and enabling crew productivity for long durations in remote, hostile, and alien environments. This shift also requires a change in operating processes for NASA. The Agency can no longer afford to engineer systems for specific missions and destinations and instead must focus on common capabilities that enable a range of destinations and missions. NASA has codified a capability-driven approach, which provides flexible guidance for the development and maturation of common capabilities necessary for human pioneers beyond LEO. This approach has been included in NASA policy and is captured in the Agency's strategic goals. It is currently being implemented across NASA's centers and programs. Throughout 2014, NASA engaged in an Agency-wide process to define and refine exploration-related capabilities and associated gaps, focusing only on those that are critical for human exploration beyond LEO. NASA identified 12 common capabilities, ranging from Environmental Control and Life Support Systems to Robotics, and established Agency-wide teams or working groups composed of subject matter experts that are responsible for the maturation of these exploration capabilities. These teams, called System Maturation Teams (SMTs), help formulate, guide and resolve performance gaps associated with the identified exploration capabilities. The SMTs are defining performance parameters and goals for each of the 12 capabilities

  10. Evidence-based and data-driven road safety management

    Directory of Open Access Journals (Sweden)

    Fred Wegman

    2015-07-01

    Full Text Available Over the past decades, road safety in highly-motorised countries has made significant progress. Although we have a fair understanding of the reasons for this progress, we don't have conclusive evidence for it. A new generation of road safety management approaches has entered road safety, starting when countries decided to guide themselves by setting quantitative targets (e.g. 50% fewer casualties in ten years' time). Setting realistic targets, designing strategies and action plans to achieve these targets, and monitoring progress have resulted in more scientific research to support decision-making on these topics. Three subjects are key in this new approach of evidence-based and data-driven road safety management: ex-post and ex-ante evaluation of both individual interventions and intervention packages in road safety strategies, and the transferability (external validity) of the research results. In this article, we explore these subjects based on recent experiences in four jurisdictions (Western Australia, the Netherlands, Sweden and Switzerland). All four apply similar approaches and tools; differences are considered marginal. It is concluded that policy-making and political decisions were influenced to a great extent by the results of analysis and research. Nevertheless, to compensate for a relatively weak theoretical basis and to improve the power of this new approach, a number of issues will need further research. This includes ex-post and ex-ante evaluation, a better understanding of the extrapolation of historical trends, and the transferability of research results. This new approach cannot be realized without high-quality road safety data. Good data and knowledge are indispensable for this new and very promising approach.

  11. Data-driven soft sensor design with multiple-rate sampled data

    DEFF Research Database (Denmark)

    Lin, Bao; Recke, Bodil; Knudsen, Jørgen K.H.

    2007-01-01

    Multi-rate systems are common in industrial processes, where quality measurements have a slower sampling rate than other process variables. Since inter-sample information is desirable for effective quality control, different approaches have been reported to estimate the quality between samples......, including numerical interpolation, polynomial transformation, data lifting and weighted partial least squares (WPLS). Two modifications to the original data lifting approach are proposed in this paper: reformulating the extraction of a fast model as an optimization problem and ensuring the desired model...... properties through Tikhonov Regularization. A comparative investigation of the four approaches is then performed. Their applicability, accuracy and robustness to process noise are evaluated on a single-input single-output (SISO) system. The regularized data lifting and WPLS approaches
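
    A minimal soft-sensor sketch of the multi-rate setting: fast-rate process variables at every step, a lab quality measurement at every tenth step, and a regression fitted on the slow samples that then predicts the quality at the fast rate. The data and linear relation are synthetic; the paper's data lifting and WPLS methods are more principled treatments of the same problem.

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    # fast-rate process variables (every minute); quality sampled every 10 minutes
    t_fast = np.arange(600)
    X = np.column_stack([np.sin(t_fast / 50), rng.normal(0, 0.1, 600)])
    quality = 3 * X[::10, 0] + rng.normal(0, 0.05, 60)    # lab samples only

    # soft sensor: fit on the slow samples, then predict at the fast rate
    model = LinearRegression().fit(X[::10], quality)
    quality_fast = model.predict(X)                       # inter-sample estimates
    print(quality_fast[:5])
    ```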

  12. A data-driven approach to patient blood management.

    Science.gov (United States)

    Cohn, Claudia S; Welbig, Julie; Bowman, Robert; Kammann, Susan; Frey, Katherine; Zantek, Nicole

    2014-02-01

    Patient blood management (PBM) has become a topic of intense interest; however, implementing a robust PBM system in a large academic hospital can be a challenge. In a joint effort between transfusion medicine and information technology, we have developed three overlapping databases that allow for a comprehensive, semiautomated approach to monitoring up-to-date red blood cell (RBC) usage in our hospital. Data derived from this work have allowed us to target our PBM efforts. Information on transfusions is collected using three databases: daily report, discharge database, and denominator database. The daily report collects data on all transfusions in the past 24 hours. The discharge database integrates transfusion data and diagnostic billing codes. The denominator database allows for rate calculations by tracking all patients with a hemoglobin test ordered. A set of algorithms is applied to automatically audit RBC transfusions. The transfusions that do not fit the algorithms' rules are manually reviewed. Data from audits are compiled into reports and distributed to medical directors. Data are also used to target education efforts. Since our PBM program began, the percentage of appropriate RBC orders increased from an initial 70%-80% to 90%-95%, and the overall RBC transfusions/1000 patient-days has decreased by 67% in targeted areas of the hospital. Our PBM program has shaved approximately 3% from our hospital's blood budget. Our semiautomated auditing system allows us to quickly and comprehensively analyze and track blood usage throughout our hospital. Using this technology, we have seen improvements in our hospital's PBM. © 2013 American Association of Blood Banks.
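
    The paper does not publish its audit rules, so the sketch below is a loose, hypothetical illustration of the semiautomated idea: a rule auto-passes RBC orders below a pre-transfusion haemoglobin threshold and routes the rest to manual review (threshold and column names invented):

```python
import pandas as pd

# Hypothetical audit rule: auto-pass RBC orders whose pre-transfusion
# haemoglobin is below 7 g/dL (purely illustrative; the paper does not
# publish its exact algorithm rules).
HB_THRESHOLD_G_DL = 7.0

def audit_rbc_orders(daily_report: pd.DataFrame) -> pd.DataFrame:
    """Return transfusions that fail the automatic rules and need manual review."""
    auto_pass = daily_report["pre_hb_g_dl"] < HB_THRESHOLD_G_DL
    return daily_report.loc[~auto_pass]

orders = pd.DataFrame({
    "patient_id": [101, 102, 103],
    "pre_hb_g_dl": [6.4, 8.1, 7.9],
})
print(audit_rbc_orders(orders))  # orders 102 and 103 go to manual review
```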

  13. Data-Driven Model Uncertainty Estimation in Hydrologic Data Assimilation

    Science.gov (United States)

    Pathiraja, S.; Moradkhani, H.; Marshall, L.; Sharma, A.; Geenens, G.

    2018-02-01

    The increasing availability of earth observations necessitates mathematical methods to optimally combine such data with hydrologic models. Several algorithms exist for such purposes, under the umbrella of data assimilation (DA). However, DA methods are often applied in a suboptimal fashion for complex real-world problems, due largely to several practical implementation issues. One such issue is error characterization, which is known to be critical for a successful assimilation. Mischaracterized errors lead to suboptimal forecasts, and in the worst case, to degraded estimates even compared to the no assimilation case. Model uncertainty characterization has received little attention relative to other aspects of DA science. Traditional methods rely on subjective, ad hoc tuning factors or parametric distribution assumptions that may not always be applicable. We propose a novel data-driven approach (named SDMU) to model uncertainty characterization for DA studies where (1) the system states are partially observed and (2) minimal prior knowledge of the model error processes is available, except that the errors display state dependence. It includes an approach for estimating the uncertainty in hidden model states, with the end goal of improving predictions of observed variables. The SDMU is therefore suited to DA studies where the observed variables are of primary interest. Its efficacy is demonstrated through a synthetic case study with low-dimensional chaotic dynamics and a real hydrologic experiment for one-day-ahead streamflow forecasting. In both experiments, the proposed method leads to substantial improvements in the hidden states and observed system outputs over a standard method involving perturbation with Gaussian noise.

  14. BMI cyberworkstation: enabling dynamic data-driven brain-machine interface research through cyberinfrastructure.

    Science.gov (United States)

    Zhao, Ming; Rattanatamrong, Prapaporn; DiGiovanna, Jack; Mahmoudi, Babak; Figueiredo, Renato J; Sanchez, Justin C; Príncipe, José C; Fortes, José A B

    2008-01-01

    Dynamic data-driven brain-machine interfaces (DDDBMI) have great potential to advance the understanding of neural systems and improve the design of brain-inspired rehabilitative systems. This paper presents a novel cyberinfrastructure that couples in vivo neurophysiology experimentation with massive computational resources to provide seamless and efficient support of DDDBMI research. Closed-loop experiments can be conducted with in vivo data acquisition, reliable network transfer, parallel model computation, and real-time robot control. Behavioral experiments with live animals are supported with real-time guarantees. Offline studies can be performed with various configurations for extensive analysis and training. A Web-based portal is also provided to allow users to conveniently interact with the cyberinfrastructure, conducting both experimentation and analysis. New motor control models are developed based on this approach, including recursive least squares based (RLS) and reinforcement learning based (RLBMI) algorithms. The results from an online RLBMI experiment show that the cyberinfrastructure can successfully support DDDBMI experiments and meet the desired real-time requirements.

  15. Making a case for a development-driven approach to law as a ...

    African Journals Online (AJOL)

    This article makes a case for a development-driven approach to law as a linchpin for the post-2015 development agenda. KEYWORDS: Law, development, development-driven law, development law, development goals, development strategy, Millennium Development Goals, Millennium ...

  16. Data-Driven Iterative Vibration Signal Enhancement Strategy Using Alpha Stable Distribution

    Directory of Open Access Journals (Sweden)

    Grzegorz Żak

    2017-01-01

    Full Text Available The authors propose a novel procedure for enhancement of the signal-to-noise ratio in vibration data acquired from machines working in the mining industry. The proposed method performs data-driven reduction of the deterministic, high-energy, low-frequency components, and thereby enhances the signal of interest. The procedure incorporates a time-frequency decomposition, α-stable-distribution-based signal modeling, and the stability parameter in the time domain as a stoppage criterion for the iterative part of the procedure. An advantage of the proposed algorithm is the data-driven, automatic detection of the informative frequency band, as well as of bands with high energy, owing to the properties of the distribution used. Furthermore, no knowledge of the machine kinematics, shaft speed, and so on is required. The proposed algorithm is applied to real data acquired from a belt conveyor pulley drive's gearbox.
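
    A minimal sketch of the iterative idea, assuming scipy's generic levy_stable fit (slow on long signals) and substituting a simple notch filter for the paper's full time-frequency decomposition; the stoppage criterion watches the fitted stability parameter α:

```python
import numpy as np
from scipy import signal, stats

def iterative_enhancement(x, fs, tol=0.02, max_iter=5):
    """Remove the dominant low-frequency line until the alpha-stable
    stability parameter of the residual stops changing (stoppage criterion).
    The notch-filter step is an invented stand-in for the paper's
    time-frequency decomposition."""
    alpha_prev = None
    for _ in range(max_iter):
        alpha, beta, loc, scale = stats.levy_stable.fit(x)  # generic MLE; slow on long signals
        if alpha_prev is not None and abs(alpha - alpha_prev) < tol:
            break  # alpha is stable: deterministic components are considered removed
        alpha_prev = alpha
        f, pxx = signal.welch(x, fs=fs)
        f0 = f[1:][np.argmax(pxx[1:])]                 # strongest non-DC spectral line
        b, a = signal.iirnotch(w0=f0, Q=30.0, fs=fs)   # suppress that line
        x = signal.filtfilt(b, a, x)
    return x
```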

  17. Data driven marketing for dummies

    CERN Document Server

    Semmelroth, David

    2013-01-01

    Embrace data and use it to sell and market your products Data is everywhere and it keeps growing and accumulating. Companies need to embrace big data and make it work harder to help them sell and market their products. Successful data analysis can help marketing professionals spot sales trends, develop smarter marketing campaigns, and accurately predict customer loyalty. Data Driven Marketing For Dummies helps companies use all the data at their disposal to make current customers more satisfied, reach new customers, and sell to their most important customer segments more efficiently. Identifyi

  18. DOE's Institute for Advanced Architecture and Algorithms: An application-driven approach

    International Nuclear Information System (INIS)

    Murphy, Richard C

    2009-01-01

    This paper describes an application-driven methodology for understanding the impact of future architecture decisions at the end of the MPP era. Fundamental transistor device limitations combined with application performance characteristics have driven the switch to multicore/multithreaded architectures. Designing large-scale supercomputers to match application demands is particularly challenging since performance characteristics are highly counter-intuitive; in fact, data movement, rather than FLOPS, dominates. This work discusses basic performance analysis for a set of DOE applications, the limits of CMOS technology, and the impact of both on future architectures.

  19. Smart energy households' pilot projects in the Netherlands with a design-driven approach

    NARCIS (Netherlands)

    Geelen, D.V.; Scheepens, A.; Kobus, C.; Obinna, U.; Mugge, R.; Schoormans, J.; Reinders, Angelina H.M.E.

    2013-01-01

    Residential smart grid projects can be evaluated by a design-driven approach, which focuses on gaining insights for successful product and service development by taking the end-users as a starting point. Because little experience exists with this design-driven approach, this paper addresses how ...

  20. Data-driven modeling and real-time distributed control for energy efficient manufacturing systems

    International Nuclear Information System (INIS)

    Zou, Jing; Chang, Qing; Arinez, Jorge; Xiao, Guoxian

    2017-01-01

    As manufacturers face the challenges of increasing global competition and energy saving requirements, it is imperative to seek out opportunities to reduce energy waste and overall cost. In this paper, a novel data-driven stochastic manufacturing system modeling method is proposed to identify and predict energy saving opportunities and their impact on production. A real-time distributed feedback production control policy, which integrates the current and predicted system performance, is established to improve the overall profit and energy efficiency. A case study is presented to demonstrate the effectiveness of the proposed control policy. - Highlights: • A data-driven stochastic manufacturing system model is proposed. • Real-time system performance and energy saving opportunity identification method is developed. • Prediction method for future potential system performance and energy saving opportunity is developed. • A real-time distributed feedback control policy is established to improve energy efficiency and overall system profit.

  1. Migraine Subclassification via a Data-Driven Automated Approach Using Multimodality Factor Mixture Modeling of Brain Structure Measurements.

    Science.gov (United States)

    Schwedt, Todd J; Si, Bing; Li, Jing; Wu, Teresa; Chong, Catherine D

    2017-07-01

    The current subclassification of migraine is according to headache frequency and aura status. The variability in migraine symptoms, disease course, and response to treatment suggests the presence of additional heterogeneity or subclasses within migraine. The study objective was to subclassify migraine via a data-driven approach, identifying latent factors by jointly exploiting multiple sets of brain structural features obtained via magnetic resonance imaging (MRI). Migraineurs (n = 66) and healthy controls (n = 54) had brain MRI measurements of cortical thickness, cortical surface area, and volumes for 68 regions. A multimodality factor mixture model was used to subclassify MRIs and to determine the brain structural factors that most contributed to the subclassification. Clinical characteristics of subjects in each subgroup were compared. Automated MRI classification divided the subjects into two subgroups. Migraineurs in subgroup #1 had more severe allodynia symptoms during migraines (6.1 ± 5.3 vs. 3.6 ± 3.2, P = .03), more years with migraine (19.2 ± 11.3 years vs 13 ± 8.3 years, P = .01), and higher Migraine Disability Assessment (MIDAS) scores (25 ± 22.9 vs 15.7 ± 12.2, P = .04). There were no differences in headache frequency or migraine aura status between the two subgroups. Data-driven subclassification of brain MRIs based upon structural measurements identified two subgroups. Amongst migraineurs, the subgroups differed in allodynia symptom severity, years with migraine, and migraine-related disability. Since allodynia is associated with this imaging-based subclassification of migraine and prior publications suggest that allodynia impacts migraine treatment response and disease prognosis, future migraine diagnostic criteria could consider allodynia when defining migraine subgroups. © 2017 American Headache Society.
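
    The multimodality factor mixture model itself is specialised; as a rough stand-in, the sketch below concatenates the three per-region feature sets and clusters subjects with a two-component Gaussian mixture (feature sizes and values are synthetic, and GMM clustering is a simplification of the paper's method):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the three MRI feature sets over 68 regions.
rng = np.random.default_rng(42)
n_subjects, n_regions = 120, 68
thickness = rng.normal(2.5, 0.2, (n_subjects, n_regions))
surface_area = rng.normal(1.0, 0.1, (n_subjects, n_regions))
volume = rng.normal(5.0, 0.5, (n_subjects, n_regions))

# Concatenate modalities, standardise, and split subjects into two subgroups.
X = StandardScaler().fit_transform(np.hstack([thickness, surface_area, volume]))
labels = GaussianMixture(n_components=2, covariance_type="diag",
                         random_state=0).fit_predict(X)
print(np.bincount(labels))  # subgroup sizes
```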

  2. LeadMine: a grammar and dictionary driven approach to entity recognition

    Science.gov (United States)

    2015-01-01

    Background Chemical entity recognition has traditionally been performed by machine learning approaches. Here we describe an approach using grammars and dictionaries. This approach has the advantage that the entities found can be directly related to a given grammar or dictionary, which allows the type of an entity to be known and, if an entity is misannotated, indicates which resource should be corrected. As recognition is driven by what is expected, if spelling errors occur, they can be corrected. Correcting such errors is highly useful when attempting to lookup an entity in a database or, in the case of chemical names, converting them to structures. Results Our system uses a mixture of expertly curated grammars and dictionaries, as well as dictionaries automatically derived from public resources. We show that the heuristics developed to filter our dictionary of trivial chemical names (from PubChem) yields a better performing dictionary than the previously published Jochem dictionary. Our final system performs post-processing steps to modify the boundaries of entities and to detect abbreviations. These steps are shown to significantly improve performance (2.6% and 4.0% F1-score respectively). Our complete system, with incremental post-BioCreative workshop improvements, achieves 89.9% precision and 85.4% recall (87.6% F1-score) on the CHEMDNER test set. Conclusions Grammar and dictionary approaches can produce results at least as good as the current state of the art in machine learning approaches. While machine learning approaches are commonly thought of as "black box" systems, our approach directly links the output entities to the input dictionaries and grammars. Our approach also allows correction of errors in detected entities, which can assist with entity resolution. PMID:25810776
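
    A toy illustration of the dictionary half of such a system: longest-match lookup with a small boundary-adjustment step (the grammars, spelling correction, and abbreviation detection of LeadMine are omitted; the dictionary here is invented):

```python
# Minimal longest-match dictionary tagger with a boundary post-processing step.
DICTIONARY = {"acetic acid": "CHEMICAL", "acetone": "CHEMICAL", "benzene": "CHEMICAL"}
MAX_WORDS = max(len(k.split()) for k in DICTIONARY)

def tag(text: str):
    tokens = text.lower().split()
    entities, i = [], 0
    while i < len(tokens):
        # Try the longest candidate span first, shrinking until a match is found.
        for span in range(min(MAX_WORDS, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + span])
            stripped = candidate.rstrip(".,;")        # boundary adjustment
            if stripped in DICTIONARY:
                entities.append((stripped, DICTIONARY[stripped]))
                i += span
                break
        else:
            i += 1
    return entities

print(tag("The sample was dissolved in acetic acid, then washed with acetone."))
```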

  3. Data-driven modeling of nano-nose gas sensor arrays

    DEFF Research Database (Denmark)

    Alstrøm, Tommy Sonne; Larsen, Jan; Nielsen, Claus Højgård

    2010-01-01

    We present a data-driven approach to classification of Quartz Crystal Microbalance (QCM) sensor data. The sensor is a nano-nose gas sensor that detects concentrations of analytes down to ppm levels using plasma polymerized coatings. Each sensor experiment takes approximately one hour, hence the number of available training examples is limited. We suggest a data-driven classification model which works from few examples. The paper compares a number of data-driven classification and quantification schemes able to detect the gas and the concentration level. The data-driven approaches are based on state...

  4. Examining Data Driven Decision Making via Formative Assessment: A Confluence of Technology, Data Interpretation Heuristics and Curricular Policy

    Science.gov (United States)

    Swan, Gerry; Mazur, Joan

    2011-01-01

    Although the term data-driven decision making (DDDM) is relatively new (Moss, 2007), the underlying concept of DDDM is not. For example, the practices of formative assessment and computer-managed instruction have historically involved the use of student performance data to guide what happens next in the instructional sequence (Morrison, Kemp, &…

  5. A program wide framework for evaluating data driven teaching and learning - earth analytics approaches, results and lessons learned

    Science.gov (United States)

    Wasser, L. A.; Gold, A. U.

    2017-12-01

    There is a deluge of earth systems data available to address cutting-edge science problems, yet specific skills are required to work with these data. The Earth Analytics education program, a core component of Earth Lab at the University of Colorado Boulder, is building a data-intensive program that provides training in realms including 1) interdisciplinary communication and collaboration, 2) earth science domain knowledge including geospatial science and remote sensing, and 3) reproducible, open science workflows ("earth analytics"). The earth analytics program includes an undergraduate internship, undergraduate and graduate level courses, and a professional certificate / degree program. All programs share the goal of preparing a STEM workforce for successful earth-analytics-driven careers. We are developing a program-wide evaluation framework that assesses the effectiveness of data-intensive instruction combined with domain science learning, to better understand and improve data-intensive teaching approaches using blends of online, in situ, asynchronous and synchronous learning. We are using targeted search engine optimization (SEO) to increase visibility and in turn program reach. Finally, our design targets longitudinal program impacts on participant career tracks over time. Here we present results from the evaluation of both an interdisciplinary undergraduate/graduate-level earth analytics course and an undergraduate internship. Early results suggest that a blended approach to learning and teaching, which includes both synchronous in-person teaching and active hands-on classroom learning combined with asynchronous learning in the form of online materials, leads to student success. Further, we will present our model for longitudinal tracking of participants' career focus over time to better understand long-term program impacts. We also demonstrate the impact of SEO on online content reach and program visibility.

  6. Data driven modelling of vertical atmospheric radiation

    International Nuclear Information System (INIS)

    Antoch, Jaromir; Hlubinka, Daniel

    2011-01-01

    In the Czech Hydrometeorological Institute (CHMI) there exists a unique set of meteorological measurements consisting of the values of vertical atmospheric levels of beta and gamma radiation. In this paper a stochastic data-driven model based on nonlinear regression and on nonhomogeneous Poisson process is suggested. In the first part of the paper, growth curves were used to establish an appropriate nonlinear regression model. For comparison we considered a nonhomogeneous Poisson process with its intensity based on growth curves. In the second part both approaches were applied to the real data and compared. Computational aspects are briefly discussed as well. The primary goal of this paper is to present an improved understanding of the distribution of environmental radiation as obtained from the measurements of the vertical radioactivity profiles by the radioactivity sonde system. - Highlights: → We model vertical atmospheric levels of beta and gamma radiation. → We suggest appropriate nonlinear regression model based on growth curves. → We compare nonlinear regression modelling with Poisson process based modeling. → We apply both models to the real data.
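
    To make the second modelling approach concrete: a nonhomogeneous Poisson process whose intensity follows a logistic growth curve of altitude can be simulated by Lewis-Shedler thinning. The curve form and all numbers below are invented, not the CHMI fit:

```python
import numpy as np

def logistic_intensity(h, lam_max=5.0, h0=10.0, k=0.8):
    """Hypothetical growth-curve intensity (events per km) versus altitude h."""
    return lam_max / (1.0 + np.exp(-k * (h - h0)))

def simulate_nhpp(intensity, h_end, lam_max, rng):
    """Lewis-Shedler thinning: draw candidate points from a homogeneous
    process of rate lam_max, then keep each with probability lambda(h)/lam_max."""
    events, h = [], 0.0
    while True:
        h += rng.exponential(1.0 / lam_max)
        if h > h_end:
            return np.array(events)
        if rng.uniform() < intensity(h) / lam_max:
            events.append(h)

rng = np.random.default_rng(1)
events_km = simulate_nhpp(logistic_intensity, h_end=30.0, lam_max=5.0, rng=rng)
print(len(events_km), "simulated radiation events below 30 km")
```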

  7. Using Performance Task Data to Improve Instruction

    Science.gov (United States)

    Abbott, Amy L.; Wren, Douglas G.

    2016-01-01

    Two well-accepted ideas among educators are (a) performance assessment is an effective means of assessing higher-order thinking skills and (b) data-driven instruction planning is a valuable tool for optimizing student learning. This article describes a locally developed performance task (LDPT) designed to measure critical thinking, problem…

  8. A Multiple Data Fusion Approach to Wheel Slip Control for Decentralized Electric Vehicles

    Directory of Open Access Journals (Sweden)

    Dejun Yin

    2017-04-01

    Full Text Available Currently, active safety control methods for cars, i.e., the antilock braking system (ABS), the traction control system (TCS), and electronic stability control (ESC), govern the wheel slip control based on the wheel slip ratio, which relies on the information from non-driven wheels. However, these methods are not applicable in the cases without non-driven wheels, e.g., a four-wheel decentralized electric vehicle. Therefore, this paper proposes a new wheel slip control approach based on a novel data fusion method to ensure good traction performance in any driving condition. Firstly, with the proposed data fusion algorithm, the acceleration estimator makes use of the data measured by the sensor installed near the vehicle center of mass (CM) to calculate the reference acceleration of each wheel center. Then, the wheel slip is constrained by controlling the acceleration deviation between the actual wheel and the reference wheel center. By comparison with non-control and model following control (MFC) cases in double lane change tests, the simulation results demonstrate that the proposed control method has significant anti-slip effectiveness and stabilizing control performance.
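
    A toy sketch of the slip-constraint idea described above: compare each wheel-centre acceleration against the reference fused from the CM sensor and cut motor torque in proportion to the excess deviation (the gain, limit, and torque interface are invented):

```python
# Toy slip constraint: reduce torque when the wheel accelerates away from
# the data-fused reference, i.e. when it begins to slip. Gains are invented.
def slip_limited_torque(torque_cmd, a_wheel, a_ref, dev_max=0.5, k_p=40.0):
    """Return a torque command limited by the wheel-vs-reference
    acceleration deviation."""
    deviation = a_wheel - a_ref
    excess = abs(deviation) - dev_max
    if excess <= 0.0:
        return torque_cmd                  # within the allowed slip band
    sign = 1.0 if deviation > 0 else -1.0
    return torque_cmd - sign * k_p * excess

print(slip_limited_torque(torque_cmd=120.0, a_wheel=4.2, a_ref=2.9))  # reduced torque
```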

  9. A data-driven prediction method for fast-slow systems

    Science.gov (United States)

    Groth, Andreas; Chekroun, Mickael; Kondrashov, Dmitri; Ghil, Michael

    2016-04-01

    In this work, we present a prediction method for processes that exhibit a mixture of variability on slow and fast scales. The method relies on combining empirical model reduction (EMR) with singular spectrum analysis (SSA). EMR is a data-driven methodology for constructing stochastic low-dimensional models that account for nonlinearity and serial correlation in the estimated noise, while SSA provides a decomposition of the complex dynamics into low-order components that capture spatio-temporal behavior on different time scales. Our study focuses on the data-driven modeling of partial observations from dynamical systems that exhibit power spectra with broad peaks. The main result in this talk is that the combination of SSA pre-filtering with EMR modeling improves, under certain circumstances, the modeling and prediction skill of such a system, as compared to a standard EMR prediction based on raw data. Specifically, it is the separation into "fast" and "slow" temporal scales by the SSA pre-filtering that achieves the improvement. We show, in particular, that the resulting EMR-SSA emulators help predict intermittent behavior such as rapid transitions between specific regions of the system's phase space. This capability of the EMR-SSA prediction will be demonstrated on two low-dimensional models: the Rössler system and a Lotka-Volterra model for interspecies competition. In either case, the chaotic dynamics is produced through a Shilnikov-type mechanism and we argue that the latter seems to be an important ingredient for the good prediction skills of EMR-SSA emulators. Shilnikov-type behavior has been shown to arise in various complex geophysical fluid models, such as baroclinic quasi-geostrophic flows in the mid-latitude atmosphere and wind-driven double-gyre ocean circulation models. This pervasiveness of the Shilnikov mechanism of fast-slow transition opens interesting perspectives for the extension of the proposed EMR-SSA approach to more realistic situations.
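
    The SSA building block can be sketched compactly: embed the series into a trajectory matrix, take an SVD, and reconstruct one component per singular triple by diagonal averaging. A basic implementation, assuming nothing from the authors' EMR code:

```python
import numpy as np

def ssa_decompose(x, window):
    """Basic singular spectrum analysis: embed, SVD, and reconstruct one
    series per singular triple by diagonal averaging (Hankelisation)."""
    n = len(x)
    k = n - window + 1
    X = np.column_stack([x[i:i + window] for i in range(k)])  # trajectory matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    components = []
    for j in range(len(s)):
        Xj = s[j] * np.outer(U[:, j], Vt[j])
        # Average each anti-diagonal of the rank-one matrix back into a series.
        comp = np.array([np.mean(Xj[::-1, :].diagonal(i - window + 1))
                         for i in range(n)])
        components.append(comp)
    return np.array(components)

t = np.linspace(0, 10, 500)
x = np.sin(2 * np.pi * 0.5 * t) + 0.3 * np.random.default_rng(0).normal(size=t.size)
slow = ssa_decompose(x, window=50)[:2].sum(axis=0)  # leading pair ~ slow oscillation
```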

  10. A Model-Driven Approach to e-Course Management

    Science.gov (United States)

    Savic, Goran; Segedinac, Milan; Milenkovic, Dušica; Hrin, Tamara; Segedinac, Mirjana

    2018-01-01

    This paper presents research on using a model-driven approach to the development and management of electronic courses. We propose a course management system which stores a course model represented as distinct machine-readable components containing domain knowledge of different course aspects. Based on this formally defined platform-independent…

  11. Parameterized data-driven fuzzy model based optimal control of a semi-batch reactor.

    Science.gov (United States)

    Kamesh, Reddi; Rani, K Yamuna

    2016-09-01

    A parameterized data-driven fuzzy (PDDF) model structure is proposed for semi-batch processes, and its application for optimal control is illustrated. The orthonormally parameterized input trajectories, initial states and process parameters are the inputs to the model, which predicts the output trajectories in terms of Fourier coefficients. Fuzzy rules are formulated based on the signs of a linear data-driven model, while the defuzzification step incorporates a linear regression model to shift the domain from input to output domain. The fuzzy model is employed to formulate an optimal control problem for single rate as well as multi-rate systems. Simulation study on a multivariable semi-batch reactor system reveals that the proposed PDDF modeling approach is capable of capturing the nonlinear and time-varying behavior inherent in the semi-batch system fairly accurately, and the results of operating trajectory optimization using the proposed model are found to be comparable to the results obtained using the exact first principles model, and are also found to be comparable to or better than parameterized data-driven artificial neural network model based optimization results. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.

  12. Enhanced Component Performance Study: Motor-Driven Pumps 1998-2014

    International Nuclear Information System (INIS)

    Schroeder, John Alton

    2016-01-01

    This report presents an enhanced performance evaluation of motor-driven pumps at U.S. commercial nuclear power plants. The data used in this study are based on the operating experience failure reports from fiscal year 1998 through 2014 for the component reliability as reported in the Institute of Nuclear Power Operations (INPO) Consolidated Events Database (ICES). The motor-driven pump failure modes considered for standby systems are failure to start, failure to run less than or equal to one hour, and failure to run more than one hour; for normally running systems, the failure modes considered are failure to start and failure to run. An eight-hour unreliability estimate is also calculated and trended. The component reliability estimates and the reliability data are trended for the most recent 10-year period, while yearly estimates for reliability are provided for the entire active period. Statistically significant increasing trends were identified in pump run hours per reactor year. Statistically significant decreasing trends were identified for standby systems in the industry-wide frequency of start demands, and in run hours per reactor year for runs of one hour or less.
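
    As a worked illustration of how an eight-hour unreliability estimate can be composed from a failure-to-start probability and a failure-to-run rate (the numbers are invented, not the report's estimates):

```python
import math

# Illustrative 8-hour mission unreliability for a standby motor-driven pump:
# U = p_FTS + (1 - p_FTS) * (1 - exp(-lam_FTR * t)). Numbers are invented.
p_fts = 1.5e-3          # probability of failure to start, per demand
lam_ftr = 2.0e-4        # failure-to-run rate, per hour
mission_hours = 8.0

u_8h = p_fts + (1.0 - p_fts) * (1.0 - math.exp(-lam_ftr * mission_hours))
print(f"8-hour unreliability ≈ {u_8h:.2e}")   # ≈ 3.1e-03
```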

  13. Data-Driven Machine-Learning Model in District Heating System for Heat Load Prediction: A Comparison Study

    Directory of Open Access Journals (Sweden)

    Fisnik Dalipi

    2016-01-01

    Full Text Available We present our data-driven supervised machine-learning (ML) model to predict heat load for buildings in a district heating system (DHS). Even though ML has been used as an approach to heat load prediction in the literature, it is hard to select an approach that will qualify as a solution for our case, as existing solutions are quite problem specific. For that reason, we compared and evaluated three ML algorithms within a framework on operational data from a DH system in order to generate the required prediction model. The algorithms examined are Support Vector Regression (SVR), Partial Least Squares (PLS), and Random Forest (RF). We use the data collected from buildings at several locations for a period of 29 weeks. Concerning the accuracy of predicting the heat load, we evaluate the performance of the proposed algorithms using mean absolute error (MAE), mean absolute percentage error (MAPE), and the correlation coefficient. In order to determine which algorithm had the best accuracy, we conducted a performance comparison among these ML algorithms. The comparison indicates that, for DH heat load prediction, the SVR method presented in this paper is the most efficient of the three, and it also compares favourably with other methods found in the literature.
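
    A skeleton of such a three-way comparison using scikit-learn on synthetic stand-in data (real inputs would be weather and operational DH measurements; hyperparameters are placeholders):

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.cross_decomposition import PLSRegression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Synthetic stand-in features (e.g. outdoor temperature, hour of day, flow...).
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 6))
y = 50 - 8 * X[:, 0] + 3 * X[:, 1] + rng.normal(scale=2, size=2000)  # heat load (kW)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {"SVR": SVR(C=10.0), "PLS": PLSRegression(n_components=4),
          "RF": RandomForestRegressor(n_estimators=200, random_state=0)}
for name, model in models.items():
    y_hat = model.fit(X_tr, y_tr).predict(X_te).ravel()
    mae = mean_absolute_error(y_te, y_hat)
    mape = np.mean(np.abs((y_te - y_hat) / y_te)) * 100
    print(f"{name}: MAE={mae:.2f} kW, MAPE={mape:.1f}%")
```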

  14. Data-driven system to predict academic grades and dropout

    Science.gov (United States)

    Rovira, Sergi; Puertas, Eloi

    2017-01-01

    Nowadays, the role of a tutor is more important than ever to prevent students dropout and improve their academic performance. This work proposes a data-driven system to extract relevant information hidden in the student academic data and, thus, help tutors to offer their pupils a more proactive personal guidance. In particular, our system, based on machine learning techniques, makes predictions of dropout intention and courses grades of students, as well as personalized course recommendations. Moreover, we present different visualizations which help in the interpretation of the results. In the experimental validation, we show that the system obtains promising results with data from the degree studies in Law, Computer Science and Mathematics of the Universitat de Barcelona. PMID:28196078

  15. Employment relations: A data driven analysis of job markets using online job boards and online professional networks

    CSIR Research Space (South Africa)

    Marivate, Vukosi N

    2017-08-01

    Full Text Available Data from online job boards and online professional networks present an opportunity to understand job markets as well as how professionals transition from one job/career to another. We propose a data driven approach to begin to understand a slice...

  16. A robust data-driven approach for gene ontology annotation.

    Science.gov (United States)

    Li, Yanpeng; Yu, Hong

    2014-01-01

    Gene ontology (GO) and GO annotation are important resources for biological information management and knowledge discovery, but the speed of manual annotation became a major bottleneck of database curation. The BioCreative IV GO annotation task aims to evaluate the performance of systems that automatically assign GO terms to genes based on the narrative sentences in biomedical literature. This article presents our work in this task as well as the experimental results after the competition. For the evidence sentence extraction subtask, we built a binary classifier to identify evidence sentences using reference distance estimator (RDE), a recently proposed semi-supervised learning method that learns new features from around 10 million unlabeled sentences, achieving an F1 of 19.3% in exact match and 32.5% in relaxed match. In the post-submission experiment, we obtained 22.1% and 35.7% F1 performance by incorporating bigram features in RDE learning. In both development and test sets, RDE-based method achieved over 20% relative improvement on F1 and AUC performance against classical supervised learning methods, e.g. support vector machine and logistic regression. For the GO term prediction subtask, we developed an information retrieval-based method to retrieve the GO term most relevant to each evidence sentence using a ranking function that combined cosine similarity and the frequency of GO terms in documents, and a filtering method based on high-level GO classes. The best performance of our submitted runs was 7.8% F1 and 22.2% hierarchy F1. We found that the incorporation of frequency information and hierarchy filtering substantially improved the performance. In the post-submission evaluation, we obtained a 10.6% F1 using a simpler setting. Overall, the experimental analysis showed our approaches were robust in both the two tasks. © The Author(s) 2014. Published by Oxford University Press.

  17. Initial Results from an Energy-Aware Airborne Dynamic, Data-Driven Application System Performing Sampling in Coherent Boundary-Layer Structures

    Science.gov (United States)

    Frew, E.; Argrow, B. M.; Houston, A. L.; Weiss, C.

    2014-12-01

    The energy-aware airborne dynamic, data-driven application system (EA-DDDAS) performs persistent sampling in complex atmospheric conditions by exploiting wind energy using the dynamic data-driven application system paradigm. The main challenge for future airborne sampling missions is operation with tight integration of physical and computational resources over wireless communication networks, in complex atmospheric conditions. The physical resources considered here include sensor platforms, particularly mobile Doppler radar and unmanned aircraft, the complex conditions in which they operate, and the region of interest. Autonomous operation requires distributed computational effort connected by layered wireless communication. Onboard decision-making and coordination algorithms can be enhanced by atmospheric models that assimilate input from physics-based models and wind fields derived from multiple sources. These models are generally too complex to be run onboard the aircraft, so they need to be executed in ground vehicles in the field, and connected over broadband or other wireless links back to the field. Finally, the wind field environment drives strong interaction between the computational and physical systems, both as a challenge to autonomous path planning algorithms and as a novel energy source that can be exploited to improve system range and endurance. Implementation details of a complete EA-DDDAS will be provided, along with preliminary flight test results targeting coherent boundary-layer structures.

  18. Data-driven performance evaluation method for CMS RPC trigger ...

    Indian Academy of Sciences (India)

    2012-10-06

    Oct 6, 2012 ... The trigger uses a hardware-implemented algorithm, which performs the task of combining and merging information from the muon subsystems; the efficiencies obtained with the two data-driven methods are compared.

  19. Econophysics and Data Driven Modelling of Market Dynamics

    CERN Document Server

    Aoyama, Hideaki; Chakrabarti, Bikas; Chakraborti, Anirban; Ghosh, Asim

    2015-01-01

    This book presents the works and research findings of physicists, economists, mathematicians, statisticians, and financial engineers who have undertaken data-driven modelling of market dynamics and other empirical studies in the field of Econophysics. During recent decades, the financial market landscape has changed dramatically with the deregulation of markets and the growing complexity of products. The ever-increasing speed and decreasing costs of computational power and networks have led to the emergence of huge databases. The availability of these data should permit the development of models that are better founded empirically, and econophysicists have accordingly been advocating that one should rely primarily on the empirical observations in order to construct models and validate them. The recent turmoil in financial markets and the 2008 crash appear to offer a strong rationale for new models and approaches. The Econophysics community accordingly has an important future role to play in market modelling....

  20. Analyzing Big Data in Psychology: A Split/Analyze/Meta-Analyze Approach

    Directory of Open Access Journals (Sweden)

    Mike W.-L. Cheung

    2016-05-01

    Full Text Available Big data is a field that has traditionally been dominated by disciplines such as computer science and business, where mainly data-driven analyses have been performed. Psychology, a discipline in which a strong emphasis is placed on behavioral theories and empirical research, has the potential to contribute greatly to the big data movement. However, one challenge to psychologists – and probably the most crucial one – is that most researchers may not have the necessary programming and computational skills to analyze big data. In this study we argue that psychologists can also conduct big data research and that, rather than trying to acquire new programming and computational skills, they should focus on their strengths, such as performing psychometric analyses and testing theories using multivariate analyses to explain phenomena. We propose a split/analyze/meta-analyze approach that allows psychologists to easily analyze big data. Two real datasets are used to demonstrate the proposed procedures in R. A new research agenda related to the analysis of big data in psychology is outlined at the end of the study.

  1. Analyzing Big Data in Psychology: A Split/Analyze/Meta-Analyze Approach.

    Science.gov (United States)

    Cheung, Mike W-L; Jak, Suzanne

    2016-01-01

    Big data is a field that has traditionally been dominated by disciplines such as computer science and business, where mainly data-driven analyses have been performed. Psychology, a discipline in which a strong emphasis is placed on behavioral theories and empirical research, has the potential to contribute greatly to the big data movement. However, one challenge to psychologists-and probably the most crucial one-is that most researchers may not have the necessary programming and computational skills to analyze big data. In this study we argue that psychologists can also conduct big data research and that, rather than trying to acquire new programming and computational skills, they should focus on their strengths, such as performing psychometric analyses and testing theories using multivariate analyses to explain phenomena. We propose a split/analyze/meta-analyze approach that allows psychologists to easily analyze big data. Two real datasets are used to demonstrate the proposed procedures in R. A new research agenda related to the analysis of big data in psychology is outlined at the end of the study.

  2. Data driven CAN node reliability assessment for manufacturing system

    Science.gov (United States)

    Zhang, Leiming; Yuan, Yong; Lei, Yong

    2017-01-01

    The reliability of the Controller Area Network (CAN) is critical to the performance and safety of the system. However, direct bus-off time assessment tools are lacking in practice due to inaccessibility of the node information and the complexity of the node interactions upon errors. In order to measure the mean time to bus-off (MTTB) of all the nodes, a novel data-driven node bus-off time assessment method for CAN networks is proposed which directly uses network error information. First, the corresponding network error event sequence for each node is constructed using multiple-layer network error information. Then, a generalized zero-inflated Poisson process (GZIP) model is established for each node based on the error event sequence. Finally, the stochastic model is constructed to predict the MTTB of the node. Accelerated case studies with different error injection rates were conducted on a laboratory network to demonstrate the proposed method, where the network errors are generated by a computer-controlled error injection system. Experimental results show that the MTTB of nodes predicted by the proposed method agrees well with observations in the case studies. The proposed data-driven node time-to-bus-off assessment method for CAN networks can successfully predict the MTTB of nodes by directly using network error event data.
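
    As a hedged illustration of the modelling step, a plain zero-inflated Poisson fit (simpler than the paper's generalized ZIP process) estimates a zero-inflation probability and an event rate from per-interval error counts by maximum likelihood:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

def zip_negloglik(params, counts):
    """Negative log-likelihood of a zero-inflated Poisson: with probability
    pi the count is a structural zero, otherwise it is Poisson(lam)."""
    pi, lam = params
    p_zero = pi + (1 - pi) * np.exp(-lam)
    ll = np.where(counts == 0,
                  np.log(p_zero),
                  np.log(1 - pi) + poisson.logpmf(counts, lam))
    return -ll.sum()

# Per-interval error-event counts for one CAN node (synthetic example data).
counts = np.array([0, 0, 3, 0, 1, 0, 0, 5, 0, 2, 0, 0, 4, 0, 1])
res = minimize(zip_negloglik, x0=[0.3, 2.0], args=(counts,),
               bounds=[(1e-6, 1 - 1e-6), (1e-6, None)])
pi_hat, lam_hat = res.x
print(f"zero-inflation={pi_hat:.2f}, event rate={lam_hat:.2f} per interval")
```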

  3. Data-Driven Innovation through Open Government Data

    DEFF Research Database (Denmark)

    Jetzek, Thorhildur; Avital, Michel; Bjørn-Andersen, Niels

    2014-01-01

    The exponentially growing production of data and the social trend towards openness and sharing are powerful forces that are changing the global economy and society. Governments around the world have become active participants in this evolution, opening up their data for access and reuse by public and private agents alike. The phenomenon of Open Government Data has spread around the world in the last four years, driven by the widely held belief that use of Open Government Data has the ability to generate both economic and social value. However, a cursory review of the popular press, as well as an investigation of academic research and empirical data, reveals the need to further understand the relationship between Open Government Data and value. In this paper, we focus on how use of Open Government Data can bring about new innovative solutions that can generate social and economic value. We apply...

  4. The Orion GN and C Data-Driven Flight Software Architecture for Automated Sequencing and Fault Recovery

    Science.gov (United States)

    King, Ellis; Hart, Jeremy; Odegard, Ryan

    2010-01-01

    The Orion Crew Exploration Vehicle (CEV) is being designed to include significantly more automation capability than either the Space Shuttle or the International Space Station (ISS). In particular, the vehicle flight software has requirements to accommodate increasingly automated missions throughout all phases of flight. A data-driven flight software architecture will provide an evolvable automation capability to sequence through Guidance, Navigation & Control (GN&C) flight software modes and configurations while maintaining the required flexibility and human control over the automation. This flexibility is a key aspect needed to address the maturation of operational concepts, to permit ground and crew operators to gain trust in the system and to mitigate unpredictability in human spaceflight. To allow for mission flexibility and reconfigurability, a data-driven approach is being taken to load the mission event plan as well as the flight software artifacts associated with the GN&C subsystem. A database of GN&C level sequencing data is presented which manages and tracks the mission specific and algorithm parameters to provide a capability to schedule GN&C events within mission segments. The flight software data schema for performing automated mission sequencing is presented with a concept of operations for interactions with ground and onboard crew members. A prototype architecture for fault identification, isolation and recovery interactions with the automation software is presented and discussed as a forward work item.

  5. Data-driven risk identification in phase III clinical trials using central statistical monitoring.

    Science.gov (United States)

    Timmermans, Catherine; Venet, David; Burzykowski, Tomasz

    2016-02-01

    Our interest lies in quality control for clinical trials, in the context of risk-based monitoring (RBM). We specifically study the use of central statistical monitoring (CSM) to support RBM. Under an RBM paradigm, we claim that CSM has a key role to play in identifying the "risks to the most critical data elements and processes" that will drive targeted oversight. In order to support this claim, we first see how to characterize the risks that may affect clinical trials. We then discuss how CSM can be understood as a tool for providing a set of data-driven key risk indicators (KRIs), which help to organize adaptive targeted monitoring. Several case studies are provided where issues in a clinical trial have been identified thanks to targeted investigation after the identification of a risk using CSM. Using CSM to build data-driven KRIs helps to identify different kinds of issues in clinical trials. This ability is directly linked with the exhaustiveness of the CSM approach and its flexibility in the definition of the risks that are searched for when identifying the KRIs. In practice, a CSM assessment of the clinical database seems essential to ensure data quality. The atypical data patterns found in some centers and variables are seen as KRIs under a RBM approach. Targeted monitoring or data management queries can be used to confirm whether the KRIs point to an actual issue or not.

  6. Data-driven design of fault diagnosis systems nonlinear multimode processes

    CERN Document Server

    Haghani Abandan Sari, Adel

    2014-01-01

    In many industrial applications early detection and diagnosis of abnormal behavior of the plant is of great importance. During the last decades, the complexity of process plants has been drastically increased, which imposes great challenges in development of model-based monitoring approaches and it sometimes becomes unrealistic for modern large-scale processes. The main objective of Adel Haghani Abandan Sari is to study efficient fault diagnosis techniques for complex industrial systems using process historical data and considering the nonlinear behavior of the process. To this end, different methods are presented to solve the fault diagnosis problem based on the overall behavior of the process and its dynamics. Moreover, a novel technique is proposed for fault isolation and determination of the root-cause of the faults in the system, based on the fault impacts on the process measurements. Contents: Process monitoring; Fault diagnosis and fault-tolerant control; Data-driven approaches and decision making; Target...

  7. Data-Driven Engineering of Social Dynamics: Pattern Matching and Profit Maximization.

    Science.gov (United States)

    Peng, Huan-Kai; Lee, Hao-Chih; Pan, Jia-Yu; Marculescu, Radu

    2016-01-01

    In this paper, we define a new problem related to social media, namely, the data-driven engineering of social dynamics. More precisely, given a set of observations from the past, we aim at finding the best short-term intervention that can lead to predefined long-term outcomes. Toward this end, we propose a general formulation that covers two useful engineering tasks as special cases, namely, pattern matching and profit maximization. By incorporating a deep learning model, we derive a solution using convex relaxation and quadratic-programming transformation. Moreover, we propose a data-driven evaluation method in place of the expensive field experiments. Using a Twitter dataset, we demonstrate the effectiveness of our dynamics engineering approach for both pattern matching and profit maximization, and study the multifaceted interplay among several important factors of dynamics engineering, such as solution validity, pattern-matching accuracy, and intervention cost. Finally, the method we propose is general enough to work with multi-dimensional time series, so it can potentially be used in many other applications.

  8. Data-Driven Engineering of Social Dynamics: Pattern Matching and Profit Maximization.

    Directory of Open Access Journals (Sweden)

    Huan-Kai Peng

    Full Text Available In this paper, we define a new problem related to social media, namely, the data-driven engineering of social dynamics. More precisely, given a set of observations from the past, we aim at finding the best short-term intervention that can lead to predefined long-term outcomes. Toward this end, we propose a general formulation that covers two useful engineering tasks as special cases, namely, pattern matching and profit maximization. By incorporating a deep learning model, we derive a solution using convex relaxation and quadratic-programming transformation. Moreover, we propose a data-driven evaluation method in place of the expensive field experiments. Using a Twitter dataset, we demonstrate the effectiveness of our dynamics engineering approach for both pattern matching and profit maximization, and study the multifaceted interplay among several important factors of dynamics engineering, such as solution validity, pattern-matching accuracy, and intervention cost. Finally, the method we propose is general enough to work with multi-dimensional time series, so it can potentially be used in many other applications.

  10. TOWARDS DEMAND DRIVEN PUBLISHING: APPROACHES TO THE PRIORITISATION OF DIGITISATION OF NATURAL HISTORY COLLECTIONS DATA

    Directory of Open Access Journals (Sweden)

    Vishwas Chavan

    2010-10-01

    Full Text Available Natural history collections represent a vast repository of biodiversity data of international significance. There is an imperative to capture the data through digitisation projects in order to expose the data to new and established users of biodiversity data. On the basis of a review of the current state of digitisation of natural history collections, a demand-driven approach is advocated through the use of metadata to promote and increase access to natural history collection data.

  11. Linear dynamical modes as new variables for data-driven ENSO forecast

    Science.gov (United States)

    Gavrilov, Andrey; Seleznev, Aleksei; Mukhin, Dmitry; Loskutov, Evgeny; Feigin, Alexander; Kurths, Juergen

    2018-05-01

    A new data-driven model for analysis and prediction of spatially distributed time series is proposed. The model is based on a linear dynamical mode (LDM) decomposition of the observed data which is derived from a recently developed nonlinear dimensionality reduction approach. The key point of this approach is its ability to take into account simple dynamical properties of the observed system by means of revealing the system's dominant time scales. The LDMs are used as new variables for empirical construction of a nonlinear stochastic evolution operator. The method is applied to the sea surface temperature anomaly field in the tropical belt where the El Nino Southern Oscillation (ENSO) is the main mode of variability. The advantage of LDMs versus traditionally used empirical orthogonal function decomposition is demonstrated for this data. Specifically, it is shown that the new model has a competitive ENSO forecast skill in comparison with the other existing ENSO models.

  12. Data-Driven Based Asynchronous Motor Control for Printing Servo Systems

    Science.gov (United States)

    Bian, Min; Guo, Qingyun

    Modern digital printing equipment aims at an environmentally friendly industry with high dynamic performance, high control precision, and low vibration and abrasion, which calls for a high-performance motion control system for the printing servo drives. A control system for the asynchronous motor based on data acquisition is proposed, and an iterative learning control (ILC) algorithm is studied. PID control is widely used in motion control, but it is sensitive to disturbances and to variation of the model parameters. ILC instead applies the error history and the present control signals to approximate the desired control signal directly, so that the expected trajectory is tracked without knowledge of the system model or structure. A motor control algorithm combining ILC and PID is constructed and simulation results are given. The results show that the data-driven control method deals effectively with bounded disturbances in the motion control of printing servo systems.
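
    The core ILC idea can be written as u_{k+1}(t) = u_k(t) + γ·e_k(t+1): the next trial's input is corrected by the previous trial's tracking error. A minimal P-type sketch on an invented first-order plant (not the paper's printing servo model):

```python
import numpy as np

# P-type iterative learning control on an invented first-order plant
# y[t+1] = a*y[t] + b*u[t]. Each trial reuses the previous trial's tracking
# error: u_{k+1}(t) = u_k(t) + gamma * e_k(t+1). No plant model is needed;
# convergence requires |1 - gamma*b| < 1 (here 0.5).
a, b, gamma, T = 0.8, 0.5, 1.0, 50
y_ref = np.sin(np.linspace(0.0, np.pi, T + 1))   # desired motion profile
u = np.zeros(T)

for trial in range(31):
    y = np.zeros(T + 1)
    for t in range(T):
        y[t + 1] = a * y[t] + b * u[t]           # run one trial of the plant
    e = y_ref - y
    u += gamma * e[1:]                           # learning update from the error
    if trial % 10 == 0:
        print(f"trial {trial:2d}: max |tracking error| = {np.abs(e).max():.5f}")
```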

  13. Data-driven outbreak forecasting with a simple nonlinear growth model.

    Science.gov (United States)

    Lega, Joceline; Brown, Heidi E

    2016-12-01

    Recent events have thrown the spotlight on infectious disease outbreak response. We developed a data-driven method, EpiGro, which can be applied to cumulative case reports to estimate the order of magnitude of the duration, peak and ultimate size of an ongoing outbreak. It is based on a surprisingly simple mathematical property of many epidemiological data sets, does not require knowledge or estimation of disease transmission parameters, is robust to noise and to small data sets, and runs quickly due to its mathematical simplicity. Using data from historic and ongoing epidemics, we present the model. We also provide modeling considerations that justify this approach and discuss its limitations. In the absence of other information or in conjunction with other models, EpiGro may be useful to public health responders. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
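
    EpiGro itself works in an incidence-versus-cumulative-cases phase plane; as a simpler stand-in that conveys the same data-driven spirit, one can fit a logistic growth curve directly to cumulative case reports (synthetic data below):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    """Cumulative-case growth curve: K is the final outbreak size."""
    return K / (1.0 + np.exp(-r * (t - t0)))

# Synthetic cumulative case reports every three days.
days = np.arange(0, 60, 3)
cases = (logistic(days, K=1200, r=0.18, t0=30)
         + np.random.default_rng(2).normal(0, 15, days.size))

(K, r, t0), _ = curve_fit(logistic, days, cases,
                          p0=[cases[-1] * 2, 0.1, days.mean()])
print(f"projected final size ≈ {K:.0f}, peak near day {t0:.0f}")
```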

  14. The Impact of Data-Based Science Instruction on Standardized Test Performance

    Science.gov (United States)

    Herrington, Tia W.

    Increased teacher accountability efforts have resulted in the use of data to improve student achievement. This study addressed teachers' inconsistent use of data-driven instruction in middle school science. Evidence of the impact of data-based instruction on student achievement and school and district practices has been well documented by researchers. In science, less information has been available on teachers' use of data for classroom instruction. Drawing on data-driven decision making theory, the purpose of this study was to examine whether data-based instruction impacted performance on the science Criterion Referenced Competency Test (CRCT) and to explore the factors that impeded its use by a purposeful sample of 12 science teachers at a data-driven school. The research questions addressed in this study included understanding: (a) the association between student performance on the science portion of the CRCT and data-driven instruction professional development, (b) middle school science teachers' perception of the usefulness of data, and (c) the factors that hindered the use of data for science instruction. This study employed a mixed methods sequential explanatory design. Data collected included 8th grade CRCT data, survey responses, and individual teacher interviews. A chi-square test revealed no improvement in the CRCT scores following the implementation of professional development on data-driven instruction (χ²(1) = 0.183, p = .67). Results from surveys and interviews revealed that teachers used data to inform their instruction, indicating time as the major hindrance to their use. Implications for social change include the development of lesson plans that will empower science teachers to deliver data-based instruction and students to achieve identified academic goals.

  15. A new data-driven controllability measure with application in intelligent buildings

    DEFF Research Database (Denmark)

    Shaker, Hamid Reza; Lazarova-Molnar, Sanja

    2017-01-01

    and instrumentation within today's intelligent buildings enable collecting high-quality data which could be used directly in data-based analysis and control methods. The area of data-based systems analysis and control is concentrating on developing analysis and control methods that rely on data collected from meters and sensors, and information obtained by data processing. This differs from the traditional model-based approaches that are based on mathematical models of systems. We propose and describe a data-driven controllability measure for discrete-time linear systems. The concept is developed within a data-based system analysis and control framework. Therefore, only measured data is used to obtain the proposed controllability measure. The proposed controllability measure not only shows if the system is controllable or not, but also reveals the level of controllability, which is the information its previous...

  16. Data Driven Constraints for the SVM

    DEFF Research Database (Denmark)

    Darkner, Sune; Clemmensen, Line Katrine Harder

    2012-01-01

    We propose a generalized data driven constraint for support vector machines exemplified by classification of paired observations in general and specifically on the human ear canal. This is particularly interesting in dynamic cases such as tissue movement or pathologies developing over time. Assum...

  17. A data driven method to measure electron charge mis-identification rate

    CERN Document Server

    Bakhshiansohi, Hamed

    2009-01-01

    Electron charge mis-measurement is an important challenge in analyses which depend on the charge of the electron. To estimate the probability of electron charge mis-measurement, a data-driven method is introduced and good agreement with MC-based methods is achieved. The third moment of the φ distribution of hits in the electron SuperCluster is studied. The correlation between this variable and the electron charge is also investigated. Using this 'new' variable and some other variables, the electron charge measurement is improved by two different approaches.
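
    A small sketch of the discriminating variable: the third central moment (skew) of hit φ positions, computed here relative to a seed φ with an assumed wrap-around convention; both conventions are guesses, not taken from the note:

```python
import numpy as np

def phi_third_moment(phi_hits, phi_seed):
    """Third central moment of hit phi positions relative to the seed phi.
    Bremsstrahlung bends the electron track in phi, so the sign of the skew
    carries charge information. The wrap-around handling and the
    'relative to seed' convention are assumptions, not taken from the note."""
    dphi = (np.asarray(phi_hits) - phi_seed + np.pi) % (2.0 * np.pi) - np.pi
    return np.mean((dphi - dphi.mean()) ** 3)

hits = np.array([0.010, 0.020, 0.035, 0.060, 0.110])  # toy phi values (rad)
print(phi_third_moment(hits, phi_seed=0.0))           # sign suggests a charge hypothesis
```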

  18. Data-driven forecasting of high-dimensional chaotic systems with long short-term memory networks.

    Science.gov (United States)

    Vlachas, Pantelis R; Byeon, Wonmin; Wan, Zhong Y; Sapsis, Themistoklis P; Koumoutsakos, Petros

    2018-05-01

    We introduce a data-driven forecasting method for high-dimensional chaotic systems using long short-term memory (LSTM) recurrent neural networks. The proposed LSTM neural networks perform inference of high-dimensional dynamical systems in their reduced order space and are shown to be an effective set of nonlinear approximators of their attractor. We demonstrate the forecasting performance of the LSTM and compare it with Gaussian processes (GPs) in time series obtained from the Lorenz 96 system, the Kuramoto-Sivashinsky equation and a prototype climate model. The LSTM networks outperform the GPs in short-term forecasting accuracy in all applications considered. A hybrid architecture, extending the LSTM with a mean stochastic model (MSM-LSTM), is proposed to ensure convergence to the invariant measure. This novel hybrid method is fully data-driven and extends the forecasting capabilities of LSTM networks.
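
    As a flavor of the approach, the sketch below defines a one-step-ahead LSTM predictor for a low-dimensional (reduced-order) state together with a single training step; multi-step forecasts are produced by feeding predictions back in autoregressively. This is a minimal PyTorch sketch under assumed shapes and hyperparameters, not the authors' implementation:

        import torch
        import torch.nn as nn

        class LSTMForecaster(nn.Module):
            """One-step-ahead predictor for a dim-dimensional state."""
            def __init__(self, dim, hidden=64):
                super().__init__()
                self.lstm = nn.LSTM(dim, hidden, batch_first=True)
                self.head = nn.Linear(hidden, dim)

            def forward(self, x):                 # x: (batch, window, dim)
                out, _ = self.lstm(x)
                return self.head(out[:, -1, :])   # next-state prediction

        model = LSTMForecaster(dim=8)
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        x = torch.randn(32, 20, 8)    # 32 windows of 20 past states
        y = torch.randn(32, 8)        # the state following each window
        loss = nn.functional.mse_loss(model(x), y)
        opt.zero_grad(); loss.backward(); opt.step()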

  19. Assigning clinical codes with data-driven concept representation on Dutch clinical free text.

    Science.gov (United States)

    Scheurwegs, Elyne; Luyckx, Kim; Luyten, Léon; Goethals, Bart; Daelemans, Walter

    2017-05-01

    Clinical codes are used for public reporting purposes, are fundamental to determining public financing for hospitals, and form the basis for reimbursement claims to insurance providers. They are assigned to a patient stay to reflect the diagnosis and performed procedures during that stay. This paper aims to enrich algorithms for automated clinical coding by taking a data-driven approach and by using unsupervised and semi-supervised techniques for the extraction of multi-word expressions that convey a generalisable medical meaning (referred to as concepts). Several methods for extracting concepts from text are compared, two of which are constructed from a large unannotated corpus of clinical free text. A distributional semantic model (i.e., the word2vec skip-gram model) is used to generalize over concepts and retrieve relations between them. These methods are validated on three sets of patient stay data, in the disease areas of urology, cardiology, and gastroenterology. The datasets are in Dutch, which introduces a limitation on available concept definitions from expert-based ontologies (e.g. UMLS). The results show that when expert-based knowledge in ontologies is unavailable, concepts derived from raw clinical texts are a reliable alternative. Both concepts derived from raw clinical texts and concepts derived from expert-created dictionaries outperform a bag-of-words approach in clinical code assignment. Adding features based on tokens that appear in a semantically similar context has a positive influence on predicting diagnostic codes. Furthermore, the experiments indicate that a distributional semantics model can find relations between semantically related concepts in texts but also introduces erroneous and redundant relations, which can undermine clinical coding performance. Copyright © 2017. Published by Elsevier Inc.
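
    A skip-gram model of this kind can be trained on tokenized clinical notes with a few lines of gensim; the corpus below is a tiny hypothetical stand-in and the hyperparameters are illustrative, not the paper's settings:

        from gensim.models import Word2Vec

        # One tokenized clinical note per list; a real corpus would hold
        # millions of tokens of clinical free text.
        notes = [["patient", "presents", "with", "renal", "colic"],
                 ["renal", "calculus", "confirmed", "on", "ct"]]

        # sg=1 selects the skip-gram architecture.
        model = Word2Vec(notes, vector_size=100, window=5, min_count=1, sg=1)
        print(model.wv.most_similar("renal", topn=3))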

  20. Simulation-Driven Development and Optimization of a High-Performance Six-Dimensional Wrist Force/Torque Sensor

    Directory of Open Access Journals (Sweden)

    Qiaokang LIANG

    2010-05-01

    This paper describes the Simulation-Driven Development and Optimization (SDDO) of a six-dimensional force/torque sensor with high performance. Through the implementation of the SDDO, the developed sensor simultaneously possesses high sensitivity, linearity, stiffness and repeatability, which is hard to achieve with traditional force/torque sensors. The integrated approach provided by the ANSYS software was used to streamline and speed up the process chain and thereby to deliver results significantly faster than traditional approaches. The calibration experiment shows impressive characteristics, so the developed force/torque sensor can be usefully applied in industry, and the design methods can also be used to develop other industrial products.

  1. Prediction of In-hospital Mortality in Emergency Department Patients With Sepsis: A Local Big Data-Driven, Machine Learning Approach.

    Science.gov (United States)

    Taylor, R Andrew; Pare, Joseph R; Venkatesh, Arjun K; Mowafi, Hani; Melnick, Edward R; Fleischman, William; Hall, M Kennedy

    2016-03-01

    Predictive analytics in emergency care has mostly been limited to the use of clinical decision rules (CDRs) in the form of simple heuristics and scoring systems. In the development of CDRs, limitations in analytic methods and concerns with usability have generally constrained models to a preselected small set of variables judged to be clinically relevant and to rules that are easily calculated. Furthermore, CDRs frequently suffer from questions of generalizability, take years to develop, and lack the ability to be updated as new information becomes available. Newer analytic and machine learning techniques capable of harnessing the large number of variables that are already available through electronic health records (EHRs) may better predict patient outcomes and facilitate automation and deployment within clinical decision support systems. In this proof-of-concept study, a local, big data-driven, machine learning approach is compared to existing CDRs and traditional analytic methods using the prediction of sepsis in-hospital mortality as the use case. This was a retrospective study of adult ED visits admitted to the hospital meeting criteria for sepsis from October 2013 to October 2014. Sepsis was defined as meeting criteria for systemic inflammatory response syndrome with an infectious admitting diagnosis in the ED. ED visits were randomly partitioned into an 80%/20% split for training and validation. A random forest model (machine learning approach) was constructed using over 500 clinical variables from data available within the EHRs of four hospitals to predict in-hospital mortality. The machine learning prediction model was then compared to a classification and regression tree (CART) model, logistic regression model, and previously developed prediction tools on the validation data set using area under the receiver operating characteristic curve (AUC) and chi-square statistics. There were 5,278 visits among 4,676 unique patients who met criteria for sepsis. Of
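
    The modeling pipeline described (an 80%/20% split, a random forest over many EHR variables, AUC comparison on the validation set) maps directly onto standard scikit-learn calls. A self-contained sketch on synthetic stand-in data, not the study's cohort:

        import numpy as np
        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.metrics import roc_auc_score
        from sklearn.model_selection import train_test_split

        # Synthetic stand-in for an EHR feature matrix with a rare
        # in-hospital mortality outcome.
        X, y = make_classification(n_samples=5000, n_features=50,
                                   weights=[0.9], random_state=0)
        X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)

        rf = RandomForestClassifier(n_estimators=500, random_state=0)
        rf.fit(X_tr, y_tr)
        auc = roc_auc_score(y_val, rf.predict_proba(X_val)[:, 1])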

  2. Deriving albedo maps for HAPEX-Sahel from ASAS data using kernel-driven BRDF models

    Directory of Open Access Journals (Sweden)

    P. Lewis

    1999-01-01

    This paper describes the application and testing of a method for deriving spatial estimates of albedo from multi-angle remote sensing data. Linear kernel-driven models of surface bi-directional reflectance have been inverted against high spatial resolution multi-angular, multi-spectral airborne data of the principal cover types within the HAPEX-Sahel study site in Niger, West Africa. The airborne data are obtained from the NASA Airborne Solid-state Imaging Spectrometer (ASAS) instrument, flown in Niger in September and October 1992. The maps of model parameters produced are used to estimate integrated reflectance properties related to spectral albedo. Broadband albedo has been estimated from this by weighting the spectral albedo for each pixel within the map as a function of the appropriate spectral solar irradiance and the proportion of direct and diffuse illumination. Partial validation of the results was performed by comparing ASAS reflectance and derived directional-hemispherical reflectance with simulations of a millet canopy made with a complex geometric canopy reflectance model, the Botanical Plant Modelling System (BPMS). Both were found to agree well in magnitude. Broadband albedo values derived from the ASAS data were compared with ground-based (point sample) albedo measurements and found to agree extremely well. These results indicate that the linear kernel-driven modelling approach, which is to be used operationally to produce global 16-day, 1 km albedo maps from forthcoming NASA Earth Observing System spaceborne data, is both sound and practical for the estimation of angle-integrated spectral reflectance quantities related to albedo. Results for broadband albedo are dependent on spectral sampling and on obtaining the correct spectral weightings.
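
    A linear kernel-driven BRDF model expresses each observed reflectance as R = f_iso + f_vol·K_vol + f_geo·K_geo, so per-pixel inversion reduces to ordinary least squares over the multi-angular observations. A sketch with hypothetical kernel values (real K_vol/K_geo values come from the viewing and illumination geometry):

        import numpy as np

        # Multi-angular observations of one pixel: reflectances R with the
        # corresponding precomputed kernel values. Numbers are placeholders.
        R = np.array([0.21, 0.25, 0.19, 0.23, 0.27])
        K_vol = np.array([0.05, 0.12, -0.02, 0.08, 0.15])
        K_geo = np.array([-0.31, -0.25, -0.40, -0.28, -0.22])

        A = np.column_stack([np.ones_like(R), K_vol, K_geo])
        (f_iso, f_vol, f_geo), *_ = np.linalg.lstsq(A, R, rcond=None)
        # Albedo-like quantities follow by integrating the fitted model
        # over the viewing/illumination hemispheres.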

  3. Data-driven regionalization of housing markets

    NARCIS (Netherlands)

    Helbich, M.; Brunauer, W.; Hagenauer, J.; Leitner, M.

    2013-01-01

    This article presents a data-driven framework for housing market segmentation. Local marginal house price surfaces are investigated by means of mixed geographically weighted regression and are reduced to a set of principal component maps, which in turn serve as input for spatial regionalization. The

  4. Locative media and data-driven computing experiments

    Directory of Open Access Journals (Sweden)

    Sung-Yueh Perng

    2016-06-01

    Over the past two decades urban social life has undergone a rapid and pervasive geocoding, becoming mediated, augmented and anticipated by location-sensitive technologies and services that generate and utilise big, personal, locative data. The production of these data has prompted the development of exploratory data-driven computing experiments that seek to find ways to extract value and insight from them. These projects often start from the data, rather than from a question or theory, and try to imagine and identify their potential utility. In this paper, we explore the desires and mechanics of data-driven computing experiments. We demonstrate how both locative media data and computing experiments are ‘staged’ to create new values and computing techniques, which in turn are used to try and derive possible futures that are ridden with unintended consequences. We argue that using computing experiments to imagine potential urban futures produces effects that often have little to do with creating new urban practices. Instead, these experiments promote Big Data science and the prospect that data produced for one purpose can be recast for another and act as alternative mechanisms of envisioning urban futures.

  5. Writing through Big Data: New Challenges and Possibilities for Data-Driven Arguments

    Science.gov (United States)

    Beveridge, Aaron

    2017-01-01

    As multimodal writing continues to shift and expand in the era of Big Data, writing studies must confront the new challenges and possibilities emerging from data mining, data visualization, and data-driven arguments. Often collected under the broad banner of "data literacy," students' experiences of data visualization and data-driven…

  6. A Novel Online Data-Driven Algorithm for Detecting UAV Navigation Sensor Faults

    OpenAIRE

    Rui Sun; Qi Cheng; Guanyu Wang; Washington Yotto Ochieng

    2017-01-01

    The use of Unmanned Aerial Vehicles (UAVs) has increased significantly in recent years. On-board integrated navigation sensors are a key component of UAVs' flight control systems and are essential for flight safety. In order to ensure flight safety, timely and effective navigation sensor fault detection capability is required. In this paper, a novel data-driven Adaptive Neuro-Fuzzy Inference System (ANFIS)-based approach is presented for the detection of on-board navigation sensor faults in ...

  7. Enhanced Component Performance Study: Motor-Driven Pumps 1998–2014

    Energy Technology Data Exchange (ETDEWEB)

    Schroeder, John Alton [Idaho National Lab. (INL), Idaho Falls, ID (United States)

    2016-02-01

    This report presents an enhanced performance evaluation of motor-driven pumps at U.S. commercial nuclear power plants. The data used in this study are based on the operating experience failure reports from fiscal year 1998 through 2014 for component reliability as reported in the Institute of Nuclear Power Operations (INPO) Consolidated Events Database (ICES). The motor-driven pump failure modes considered for standby systems are failure to start, failure to run for one hour or less, and failure to run for more than one hour; for normally running systems, the failure modes considered are failure to start and failure to run. An eight-hour unreliability estimate is also calculated and trended. The component reliability estimates and the reliability data are trended for the most recent 10-year period, while yearly estimates for reliability are provided for the entire active period. Statistically significant increasing trends were identified in pump run hours per reactor year. Statistically significant decreasing trends were identified for standby systems in the industry-wide frequency of start demands and in run hours per reactor year for runs of one hour or less.

  8. Alaska/Yukon Geoid Improvement by a Data-Driven Stokes's Kernel Modification Approach

    Science.gov (United States)

    Li, Xiaopeng; Roman, Daniel R.

    2015-04-01

    Geoid modeling over Alaska (USA) and Yukon (Canada), being a trans-national issue, faces a great challenge primarily due to the inhomogeneous surface gravity data (Saleh et al., 2013) and the dynamic geology (Freymueller et al., 2008), as well as the complex geological rheology. A previous study (Roman and Li, 2014) used updated satellite models (Bruinsma et al., 2013) and newly acquired aerogravity data from the GRAV-D project (Smith, 2007) to capture the gravity field changes in the target areas, primarily at middle-to-long wavelengths. In CONUS, the geoid model was largely improved. However, the precision of the resulting geoid model in Alaska was still at the decimeter level: 19 cm at the 32 tide bench marks and 24 cm at the 202 GPS/leveling bench marks, giving a total of 23.8 cm over all of these calibrated surface control points, where the datum bias was removed. Conventional kernel modification methods in this area (Li and Wang, 2011) had limited effect on improving the precision of the geoid models. To compensate for the geoid misfits, a new Stokes's kernel modification method based on a data-driven technique is presented in this study. First, the method was tested on simulated data sets (Fig. 1), where the geoid errors were reduced by two orders of magnitude (Fig. 2). For the real data sets, some iteration steps are required to overcome the rank-deficiency problem caused by the limited control data that are irregularly distributed in the target area. For instance, after three iterations, the standard deviation dropped by about 2.7 cm (Fig. 3). Modification at other critical degrees can further minimize the geoid model misfits caused either by the gravity error or by the remaining datum error in the control points.

  9. Data-driven modeling and predictive control for boiler-turbine unit using fuzzy clustering and subspace methods.

    Science.gov (United States)

    Wu, Xiao; Shen, Jiong; Li, Yiguo; Lee, Kwang Y

    2014-05-01

    This paper develops a novel data-driven fuzzy modeling strategy and predictive controller for a boiler-turbine unit using fuzzy clustering and subspace identification (SID) methods. To deal with the nonlinear behavior of the boiler-turbine unit, fuzzy clustering is used to provide an appropriate division of the operating region and to develop the structure of the fuzzy model. Then, by combining the input data with the corresponding fuzzy membership functions, the SID method is extended to extract the local state-space model parameters. Owing to the advantages of both methods, the resulting fuzzy model can represent the boiler-turbine unit very closely, and a fuzzy model predictive controller is designed based on this model. As an alternative approach, a direct data-driven fuzzy predictive control is also developed following the same clustering and subspace methods, where intermediate subspace matrices developed during the identification procedure are utilized directly as the predictor. Simulation results show the advantages and effectiveness of the proposed approach. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
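
    The fuzzy-clustering half of such a scheme assigns each operating-point sample a graded membership in every local model. The sketch below computes standard fuzzy c-means memberships, which can then weight the identification of, and the blending between, local models; this is a generic textbook formula, not the paper's full SID algorithm:

        import numpy as np

        def fcm_memberships(X, centers, m=2.0):
            """Fuzzy c-means memberships u[i, k] of sample i in cluster k,
            with fuzzifier m > 1. X: (N, dim); centers: (c, dim)."""
            d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
            inv = (d + 1e-12) ** (-2.0 / (m - 1.0))
            return inv / inv.sum(axis=1, keepdims=True)

        # Each local state-space model is identified with samples weighted
        # by membership, and predictions are blended by the same weights.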

  10. Process-Structure Linkages Using a Data Science Approach: Application to Simulated Additive Manufacturing Data

    International Nuclear Information System (INIS)

    Popova, Evdokia; Rodgers, Theron M.; Gong, Xinyi; Cecen, Ahmet; Madison, Jonathan D.; Kalidindi, Surya R.

    2017-01-01

    A novel data science workflow is developed and demonstrated to extract process-structure linkages (i.e., a reduced-order model) for microstructure evolution problems when the final microstructure depends on (simulation or experimental) processing parameters. Our workflow consists of four main steps: data pre-processing, microstructure quantification, dimensionality reduction, and extraction/validation of process-structure linkages. The methods that can be employed within each step vary based on the type and amount of available data. In this paper, this data-driven workflow is applied to a set of synthetic additive manufacturing microstructures obtained using the Potts kinetic Monte Carlo (kMC) approach. Additive manufacturing techniques inherently produce complex microstructures that can vary significantly with processing conditions. Using the developed workflow, a low-dimensional data-driven model was established to correlate process parameters with the predicted final microstructure. In addition, the modular workflows developed and presented in this work facilitate easy dissemination and curation by the broader community.

  11. Augmenting Bag-of-Words: Data-Driven Discovery of Temporal and Structural Information for Activity Recognition

    OpenAIRE

    Bettadapura, Vinay; Schindler, Grant; Plotz, Thomaz; Essa, Irfan

    2015-01-01

    We present data-driven techniques to augment Bag of Words (BoW) models, which allow for more robust modeling and recognition of complex long-term activities, especially when the structure and topology of the activities are not known a priori. Our approach specifically addresses the limitations of standard BoW approaches, which fail to represent the underlying temporal and causal information that is inherent in activity streams. In addition, we also propose the use of randomly sampled regular ...

  12. A New Path-Constrained Rendezvous Planning Approach for Large-Scale Event-Driven Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Ahmadreza Vajdi

    2018-05-01

    We study the problem of employing a mobile sink in a large-scale Event-Driven Wireless Sensor Network (EWSN) for the purpose of data harvesting from sensor nodes. Generally, this employment mitigates the main weakness of WSNs, namely the energy consumption of battery-driven sensor nodes. The main motivation of our work is to address challenges related to the network's topology by adopting a mobile sink that moves along a predefined trajectory in the environment. Since, in this fashion, it is not possible to gather data from sensor nodes individually, we adopt the approach of defining some of the sensor nodes as Rendezvous Points (RPs) in the network. We argue that RP planning in this case is a trade-off between minimizing the number of RPs and decreasing the number of hops for a sensor node that needs to transfer data to the related RP, which leads to minimizing the average energy consumption in the network. We address the problem by formulating the challenges and expectations as a Mixed Integer Linear Program (MILP). Then, after proving the NP-hardness of the problem, we propose three effective and distributed heuristics for RP planning, identifying sojourn locations, and constructing routing trees. Finally, experimental results prove the effectiveness of our approach.
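
    The described trade-off (fewer RPs versus fewer hops to each RP) has the shape of a facility-location MILP. The toy sketch below, using the PuLP modeling library, captures that flavor with hypothetical hop counts and a trade-off weight alpha; it is not the paper's exact formulation:

        from pulp import LpBinary, LpMinimize, LpProblem, LpVariable, lpSum

        nodes = range(6)
        hops = {(i, j): abs(i - j) for i in nodes for j in nodes}  # toy metric
        alpha = 0.5   # weight between RP count and total hop distance

        prob = LpProblem("rp_planning", LpMinimize)
        rp = {j: LpVariable(f"rp_{j}", cat=LpBinary) for j in nodes}
        a = {(i, j): LpVariable(f"a_{i}_{j}", cat=LpBinary)
             for i in nodes for j in nodes}

        prob += lpSum(rp.values()) + alpha * lpSum(
            hops[i, j] * a[i, j] for i in nodes for j in nodes)
        for i in nodes:
            prob += lpSum(a[i, j] for j in nodes) == 1   # every node served
            for j in nodes:
                prob += a[i, j] <= rp[j]                 # only by open RPs
        prob.solve()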

  13. A New Path-Constrained Rendezvous Planning Approach for Large-Scale Event-Driven Wireless Sensor Networks.

    Science.gov (United States)

    Vajdi, Ahmadreza; Zhang, Gongxuan; Zhou, Junlong; Wei, Tongquan; Wang, Yongli; Wang, Tianshu

    2018-05-04

    We study the problem of employing a mobile sink in a large-scale Event-Driven Wireless Sensor Network (EWSN) for the purpose of data harvesting from sensor nodes. Generally, this employment mitigates the main weakness of WSNs, namely the energy consumption of battery-driven sensor nodes. The main motivation of our work is to address challenges related to the network's topology by adopting a mobile sink that moves along a predefined trajectory in the environment. Since, in this fashion, it is not possible to gather data from sensor nodes individually, we adopt the approach of defining some of the sensor nodes as Rendezvous Points (RPs) in the network. We argue that RP planning in this case is a trade-off between minimizing the number of RPs and decreasing the number of hops for a sensor node that needs to transfer data to the related RP, which leads to minimizing the average energy consumption in the network. We address the problem by formulating the challenges and expectations as a Mixed Integer Linear Program (MILP). Then, after proving the NP-hardness of the problem, we propose three effective and distributed heuristics for RP planning, identifying sojourn locations, and constructing routing trees. Finally, experimental results prove the effectiveness of our approach.

  14. A New Path-Constrained Rendezvous Planning Approach for Large-Scale Event-Driven Wireless Sensor Networks

    Science.gov (United States)

    Zhang, Gongxuan; Wang, Yongli; Wang, Tianshu

    2018-01-01

    We study the problem of employing a mobile sink in a large-scale Event-Driven Wireless Sensor Network (EWSN) for the purpose of data harvesting from sensor nodes. Generally, this employment mitigates the main weakness of WSNs, namely the energy consumption of battery-driven sensor nodes. The main motivation of our work is to address challenges related to the network's topology by adopting a mobile sink that moves along a predefined trajectory in the environment. Since, in this fashion, it is not possible to gather data from sensor nodes individually, we adopt the approach of defining some of the sensor nodes as Rendezvous Points (RPs) in the network. We argue that RP planning in this case is a trade-off between minimizing the number of RPs and decreasing the number of hops for a sensor node that needs to transfer data to the related RP, which leads to minimizing the average energy consumption in the network. We address the problem by formulating the challenges and expectations as a Mixed Integer Linear Program (MILP). Then, after proving the NP-hardness of the problem, we propose three effective and distributed heuristics for RP planning, identifying sojourn locations, and constructing routing trees. Finally, experimental results prove the effectiveness of our approach. PMID:29734718

  15. Challenges and Limitations of Applying an Emotion-driven Design Approach on Elderly Users

    DEFF Research Database (Denmark)

    Andersen, Casper L.; Gudmundsson, Hjalte P.; Achiche, Sofiane

    2011-01-01

    … a competitive advantage for companies. In this paper, challenges of applying an emotion-driven design approach to elderly people, in order to identify their user needs towards walking frames, are discussed. The discussion will be based on the experiences and results obtained from the case study … related to the participants' age and cognitive abilities. The challenges encountered are discussed, and guidelines on what should be taken into account to facilitate an emotion-driven design approach for elderly people are proposed.

  16. High performance data transfer

    Science.gov (United States)

    Cottrell, R.; Fang, C.; Hanushevsky, A.; Kreuger, W.; Yang, W.

    2017-10-01

    The exponentially increasing need for high-speed data transfer is driven by big data and cloud computing, together with the needs of data-intensive science, High Performance Computing (HPC), defense, the oil and gas industry, etc. We report on the Zettar ZX software. This has been developed since 2013 to meet these growing needs by providing high-performance data transfer and encryption in a scalable, balanced, easy-to-deploy-and-use way while minimizing power and space utilization. In collaboration with several commercial vendors, Proofs of Concept (PoC) consisting of clusters have been put together using off-the-shelf components to test the ZX scalability and its ability to balance services using multiple cores and links. The PoCs are based on SSD flash storage that is managed by a parallel file system. Each cluster occupies 4 rack units. Using the PoCs, between clusters we have achieved almost 200 Gbps memory to memory over two 100 Gbps links, and 70 Gbps parallel file to parallel file with encryption over a 5000-mile 100 Gbps link.

  17. Current Approaches to Tactical Performance Analyses in Soccer Using Position Data.

    Science.gov (United States)

    Memmert, Daniel; Lemmink, Koen A P M; Sampaio, Jaime

    2017-01-01

    Successful tactical match performance depends on the quality of actions of individual players or teams in space and time during match-play. Technological innovations have led to new possibilities to capture accurate spatio-temporal information on all players and to unravel the dynamics and complexity of soccer matches. The main aim of this article is to give an overview of the current state of development of the analysis of position data in soccer. Based on the same single set of position data from a high-level 11 versus 11 match (Bayern Munich against FC Barcelona), three different promising approaches from the perspective of dynamical systems and neural networks are presented: tactical performance analysis revealed inter-player coordination, inter-team and inter-line coordination before critical events, as well as team-team interaction and compactness coefficients. This could lead to a multi-disciplinary discussion on match analyses in sport science and new avenues for theoretical and practical implications in soccer.
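
    Quantities like the compactness coefficients mentioned above can be computed directly from frame-by-frame position data. A minimal sketch using one plausible definition (mean player distance to the team centroid); the article's exact coefficient definitions may differ:

        import numpy as np

        def compactness(positions):
            """positions: (n_players, 2) pitch coordinates for one team at
            one time frame. Returns the centroid and the mean distance of
            the players to it (smaller = more compact)."""
            centroid = positions.mean(axis=0)
            spread = np.linalg.norm(positions - centroid, axis=1).mean()
            return centroid, spread

        # Tracking both teams' centroids frame by frame exposes inter-team
        # coordination, e.g. coupled movements before critical events.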

  18. Oracle database performance and scalability a quantitative approach

    CERN Document Server

    Liu, Henry H

    2011-01-01

    A data-driven, fact-based, quantitative text on Oracle performance and scalability With database concepts and theories clearly explained in Oracle's context, readers quickly learn how to fully leverage Oracle's performance and scalability capabilities at every stage of designing and developing an Oracle-based enterprise application. The book is based on the author's more than ten years of experience working with Oracle, and is filled with dependable, tested, and proven performance optimization techniques. Oracle Database Performance and Scalability is divided into four parts that enable reader

  19. Query-Driven Visualization and Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Ruebel, Oliver; Bethel, E. Wes; Prabhat, Mr.; Wu, Kesheng

    2012-11-01

    This report focuses on an approach to high performance visualization and analysis, termed query-driven visualization and analysis (QDV). QDV aims to reduce the amount of data that needs to be processed by the visualization, analysis, and rendering pipelines. The goal of the data reduction process is to separate out data that is "scientifically interesting" and to focus visualization, analysis, and rendering on that interesting subset. The premise is that for any given visualization or analysis task, the data subset of interest is much smaller than the larger, complete data set. This strategy, extracting smaller data subsets of interest and focusing the visualization processing on these subsets, is complementary to the approach of increasing the capacity of the visualization, analysis, and rendering pipelines through parallelism. This report discusses the fundamental concepts in QDV and their relationship to different stages in the visualization and analysis pipelines, and presents QDV's application to problems in diverse areas, ranging from forensic cybersecurity to high energy physics.

  20. Simulation of shallow groundwater levels: Comparison of a data-driven and a conceptual model

    Science.gov (United States)

    Fahle, Marcus; Dietrich, Ottfried; Lischeid, Gunnar

    2015-04-01

    Despite an abundance of models aimed at simulating shallow groundwater levels, application of such models is often hampered by a lack of appropriate input data. Difficulties especially arise with regard to soil data, which are typically hard to obtain and prone to spatial variability, eventually leading to uncertainties in the model results. Modelling approaches relying entirely on easily measured quantities are therefore an alternative that encourages the applicability of models. We present and compare two models for calculating 1-day-ahead predictions of the groundwater level that are based only on measurements of potential evapotranspiration, precipitation and groundwater levels. The first model is a newly developed conceptual model that is parametrized using the White method (which estimates the actual evapotranspiration on the basis of diurnal groundwater fluctuations) and a rainfall-response ratio. Inverted versions of the two latter approaches are then used to calculate the predictions of the groundwater level. Furthermore, as a completely data-driven alternative, a simple feed-forward multilayer perceptron neural network was trained on the same inputs and outputs. Data from four growing periods (April to October) at a study site situated in the Spreewald wetland in north-east Germany were taken to set up the models and compare their performance. In addition, response surfaces that relate model outputs to combinations of different input variables are used to reveal those aspects in which the two approaches coincide and those in which they differ. Finally, it is evaluated whether the conceptual approach can be enhanced by extracting knowledge from the neural network. This is done by replacing, in the conceptual model, the default function relating groundwater recharge and groundwater level, which is assumed to be linear, by the non-linear function extracted from the neural network.
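
    The data-driven alternative described (a feed-forward perceptron mapping today's level, precipitation and potential evapotranspiration to tomorrow's level) is a few lines in scikit-learn; the synthetic data below is a placeholder for the Spreewald measurements:

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        # Columns: today's groundwater level, precipitation, potential
        # evapotranspiration; target: tomorrow's level (all synthetic).
        rng = np.random.default_rng(0)
        X = rng.normal(size=(500, 3))
        y = X[:, 0] + 0.10 * X[:, 1] - 0.05 * X[:, 2] \
            + rng.normal(scale=0.01, size=500)

        mlp = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000,
                           random_state=0)
        mlp.fit(X[:400], y[:400])
        pred = mlp.predict(X[400:])   # 1-day-ahead predictions, held out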

  1. A Data-Driven Stochastic Reactive Power Optimization Considering Uncertainties in Active Distribution Networks and Decomposition Method

    DEFF Research Database (Denmark)

    Ding, Tao; Yang, Qingrun; Yang, Yongheng

    2018-01-01

    To address the uncertain output of distributed generators (DGs) for reactive power optimization in active distribution networks, the stochastic programming model is widely used. The model is employed to find an optimal control strategy with minimum expected network loss while satisfying all … In this paper, a data-driven modeling approach is introduced which assumes that the probability distribution from the historical data is uncertain within a confidence set. Furthermore, a data-driven stochastic programming model is formulated as a two-stage problem, where the first-stage variables find the optimal control for discrete reactive power compensation equipment under the worst probability distribution of the second-stage recourse. The second-stage variables are adjusted to the uncertain probability distribution. In particular, this two-stage problem has a special structure so that the second-stage problem …

  2. Multiscale-Driven approach to detecting change in Synthetic Aperture Radar (SAR) imagery

    Science.gov (United States)

    Gens, R.; Hogenson, K.; Ajadi, O. A.; Meyer, F. J.; Myers, A.; Logan, T. A.; Arnoult, K., Jr.

    2017-12-01

    Detecting changes between Synthetic Aperture Radar (SAR) images can be a useful but challenging exercise. SAR, with its all-weather capabilities, can be an important resource in identifying and estimating the extent of events such as flooding, river ice breakup, earthquake damage, oil spills, and forest growth, as it can overcome shortcomings of optical methods related to cloud cover. However, detecting change in SAR imagery can be impeded by many factors, including speckle, complex scattering responses, low temporal sampling, and difficulty delineating boundaries. In this presentation we use a change detection method based on a multiscale-driven approach. By using information at different resolution levels, we attempt to obtain more accurate change detection maps in both heterogeneous and homogeneous regions. Integrated within the processing flow are processes that 1) improve classification performance by combining Expectation-Maximization algorithms with mathematical morphology, 2) achieve high accuracy in preserving boundaries using measurement-level fusion techniques, and 3) combine modern non-local filtering and the 2D discrete stationary wavelet transform to provide robustness against noise. This multiscale-driven approach to change detection has recently been incorporated into the Alaska Satellite Facility (ASF) Hybrid Pluggable Processing Pipeline (HyP3) using radiometrically terrain corrected SAR images. Examples primarily from natural hazards are presented to illustrate the capabilities and limitations of the change detection method.

  3. A data fusion approach for track monitoring from multiple in-service trains

    Science.gov (United States)

    Lederman, George; Chen, Siheng; Garrett, James H.; Kovačević, Jelena; Noh, Hae Young; Bielak, Jacobo

    2017-10-01

    We present a data fusion approach for enabling data-driven rail-infrastructure monitoring from multiple in-service trains. A number of researchers have proposed using vibration data collected from in-service trains as a low-cost method to monitor track geometry. The majority of this work has focused on developing novel features to extract information about the tracks from data produced by individual sensors on individual trains. We extend this work by presenting a technique to combine extracted features from multiple passes over the tracks from multiple sensors aboard multiple vehicles. There are a number of challenges in combining multiple data sources, like different relative position coordinates depending on the location of the sensor within the train. Furthermore, as the number of sensors increases, the likelihood that some will malfunction also increases. We use a two-step approach that first minimizes position offset errors through data alignment, then fuses the data with a novel adaptive Kalman filter that weights data according to its estimated reliability. We show the efficacy of this approach both through simulations and on a data-set collected from two instrumented trains operating over a one-year period. Combining data from numerous in-service trains allows for more continuous and more reliable data-driven monitoring than analyzing data from any one train alone; as the number of instrumented trains increases, the proposed fusion approach could facilitate track monitoring of entire rail-networks.
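
    The reliability weighting in the second step can be pictured with a scalar Kalman update: each train pass contributes a measurement whose gain shrinks as its estimated noise variance grows. A minimal sketch of that idea (the paper's adaptive filter estimates the per-source variances online):

        def fuse(estimate, variance, measurements, meas_vars):
            """Fold measurements of one track feature from several passes
            into a scalar estimate; noisier sources get smaller gains."""
            for z, r in zip(measurements, meas_vars):
                k = variance / (variance + r)     # Kalman gain
                estimate += k * (z - estimate)
                variance *= (1.0 - k)
            return estimate, variance

        # Three passes over the same track section, with the second judged
        # least reliable (largest variance) and hence downweighted.
        x, p = fuse(0.0, 1.0, [0.8, 1.1, 0.9], [0.2, 1.5, 0.3])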

  4. A data-driven approach to π0, η and η′ single and double Dalitz decays

    Science.gov (United States)

    Escribano, Rafel; Gonzàlez-Solís, Sergi

    2018-01-01

    The dilepton invariant mass spectra and integrated branching ratios of the single and double Dalitz decays P → l+l−γ and P → l+l−l+l− (P = π0, η, η′; l = e or μ) are predicted by means of a data-driven approach based on the use of rational approximants applied to π0, η and η′ transition form factor experimental data in the space-like region. Supported by the FPI scholarship BES-2012-055371 (S.G-S), the Secretaria d'Universitats i Recerca del Departament d'Economia i Coneixement de la Generalitat de Catalunya under grant 2014 SGR 1450, the Ministerio de Ciencia e Innovación under grant FPA2011-25948, the Ministerio de Economía y Competitividad under grants CICYT-FEDER-FPA 2014-55613-P and SEV-2012-0234, the Spanish Consolider-Ingenio 2010 Program CPAN (CSD2007-00042), and the European Commission under program FP7-INFRASTRUCTURES-2011-1 (283286). S.G-S also received support from the CAS President's International Fellowship Initiative for Young International Scientists (2017PM0031).
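
    The rational-approximant idea can be illustrated with the lowest-order case: fitting F(Q²) ≈ a / (1 + b·Q²) to space-like transition form factor data, which becomes linear in (a, b) after multiplying through. The numbers below are hypothetical placeholders, and the paper uses higher-order approximants:

        import numpy as np

        # Hypothetical space-like transition form factor samples F(Q^2).
        Q2 = np.array([0.5, 1.0, 2.0, 4.0, 8.0])     # GeV^2
        F = np.array([0.21, 0.17, 0.12, 0.08, 0.05])

        # F = a / (1 + b*Q2)  <=>  F = a - b*(Q2*F): linear least squares.
        A = np.column_stack([np.ones_like(F), -Q2 * F])
        (a, b), *_ = np.linalg.lstsq(A, F, rcond=None)

        # The fitted approximant is then used to predict dilepton spectra
        # and the Dalitz branching ratios by integration over phase space.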

  5. BIG DATA RESOURCES, MARKETING CAPABILITIES, AND FIRM PERFORMANCE.

    OpenAIRE

    Suoniemi, Samppa; Meyer-Waarden, Lars; Munzel, Andreas

    2017-01-01

    RESEARCH QUESTION: Big data may significantly improve the efficiency and effectiveness of a firm's marketing capabilities. However, firms must overcome technological, skill-based and organisational challenges to become data-driven. Academic research has not empirically investigated how, and to what extent, strategic big data resources influence strategic marketing capabilities and, by extension, firm performance. The primary objective of this research is to remedy this crucial knowledge ...

  6. Data-Intensive Science meets Inquiry-Driven Pedagogy: Interactive Big Data Exploration, Threshold Concepts, and Liminality

    Science.gov (United States)

    Ramachandran, Rahul; Word, Andrea; Nair, Udasysankar

    2014-01-01

    … us to analyze and fully utilize the complex and voluminous data that is being gathered. In this emerging paradigm, the scientific discovery process is driven by knowledge extracted from large volumes of data. In this presentation, we contend that this paradigm naturally lends itself to inquiry-driven pedagogy, where knowledge is discovered through inductive engagement with large volumes of data rather than reached through traditional, deductive, hypothesis-driven analyses. In particular, data-intensive techniques married with an inductive methodology allow for exploration on a scale that is not possible in the traditional classroom with its typical problem sets and static, limited data samples. In addition, we identify existing gaps and possible solutions for addressing the infrastructure and tools, as well as a pedagogical framework through which to implement this inductive approach.

  7. A Task-Driven Disaster Data Link Approach

    Science.gov (United States)

    Qiu, L. Y.; Zhu, Q.; Gu, J. Y.; Du, Z. Q.

    2015-08-01

    With the rapid development of sensor networks and Earth observation technology, a large quantity of disaster-related data is available, such as remotely sensed data, historical data, case data, simulation data, disaster products and so on. However, the efficiency of current data management and service systems has become an increasingly serious issue due to the variety of tasks and the heterogeneity of the data. For emergency task-oriented applications, data searching mainly relies on human experience over simple metadata indexes, whose long search times and low accuracy cannot satisfy the velocity and veracity requirements of disaster products. In this paper, a task-oriented linking method is proposed for efficient disaster data management and intelligent service, with the objectives of 1) putting forward ontologies of disaster tasks and data to unify the different semantics of multi-source information, 2) identifying the semantic mapping from emergency tasks to multiple sources on the basis of the uniform description in 1), and 3) linking task-related data automatically and calculating the degree of correlation between each dataset and a target task. The method breaks through the traditional static management of disaster data and establishes a basis for intelligent retrieval and active push of disaster information. The case study presented in this paper illustrates the use of the method for a flood emergency relief task.

  8. Meta-Analysis for Sociology – A Measure-Driven Approach

    Science.gov (United States)

    Roelfs, David J.; Shor, Eran; Falzon, Louise; Davidson, Karina W.; Schwartz, Joseph E.

    2013-01-01

    Meta-analytic methods are becoming increasingly important in sociological research. In this article we present an approach for meta-analysis which is especially helpful for sociologists. Conventional approaches to meta-analysis often prioritize “concept-driven” literature searches. However, in disciplines with high theoretical diversity, such as sociology, this search approach might constrain the researcher’s ability to fully exploit the entire body of relevant work. We explicate a “measure-driven” approach, in which iterative searches and new computerized search techniques are used to increase the range of publications found (and thus the range of possible analyses) and to traverse time and disciplinary boundaries. We demonstrate this measure-driven search approach with two meta-analytic projects, examining the effects of various social variables on all-cause mortality. PMID:24163498

  9. Development of flexible process-centric web applications: An integrated model driven approach

    NARCIS (Netherlands)

    Bernardi, M.L.; Cimitile, M.; Di Lucca, G.A.; Maggi, F.M.

    2012-01-01

    In recent years, Model Driven Engineering (MDE) approaches have been proposed and used to develop and evolve Web applications (WAs). However, the definition of appropriate MDE approaches for the development of flexible process-centric WAs is still limited. In particular, (flexible) workflow models have never been

  10. A Model-Driven Approach for Hybrid Power Estimation in Embedded Systems Design

    Directory of Open Access Journals (Sweden)

    Ben Atitallah Rabie

    2011-01-01

    As technology scales for increased circuit density and performance, the management of power consumption in systems-on-chip (SoC) is becoming critical. Today, having the appropriate electronic system level (ESL) tools for power estimation in the design flow is mandatory. The main challenge for the design of such dedicated tools is to achieve a better trade-off between accuracy and speed. This paper presents a consumption estimation approach that allows taking the consumption criterion into account early in the design flow, during system cosimulation. The originality of this approach is that it allows power estimation for both white-box intellectual properties (IPs), using annotated power models, and black-box IPs, using standalone power estimators. In order to obtain accurate power estimates, our simulations were performed at the cycle-accurate bit-accurate (CABA) level, using SystemC. To make our approach fast and not tedious for users, the simulated architectures, including the standalone power estimators, were generated automatically using a model driven engineering (MDE) approach. Both annotated power models and standalone power estimators can be used together to estimate the consumption of the same architecture, which makes them complementary. The simulation results showed that the power estimates given by both estimation techniques for a hardware component are very close, with a difference that does not exceed 0.3%. This proves that, even when the IP code is not accessible or not modifiable, our approach allows obtaining quite accurate power estimates early in the design flow, thanks to the automation offered by the MDE approach.

  11. The effect of increasing strength and approach velocity on triple jump performance.

    Science.gov (United States)

    Allen, Sam J; Yeadon, M R Fred; King, Mark A

    2016-12-08

    The triple jump is an athletic event comprising three phases in which the optimal phase ratio (the proportion of each phase to the total distance jumped) is unknown. This study used a planar whole body torque-driven computer simulation model of the ground contact parts of all three phases of the triple jump to investigate the effect of strength and approach velocity on optimal performance. The strength and approach velocity of the simulation model were each increased by up to 30% in 10% increments from baseline data collected from a national standard triple jumper. Increasing strength always resulted in an increased overall jump distance. Increasing approach velocity also typically resulted in an increased overall jump distance but there was a point past which increasing approach velocity without increasing strength did not lead to an increase in overall jump distance. Increasing both strength and approach velocity by 10%, 20%, and 30% led to roughly equivalent increases in overall jump distances. Distances ranged from 14.05m with baseline strength and approach velocity, up to 18.49m with 30% increases in both. Optimal phase ratios were either hop-dominated or balanced, and typically became more balanced when the strength of the model was increased by a greater percentage than its approach velocity. The range of triple jump distances that resulted from the optimisation process suggests that strength and approach velocity are of great importance for triple jump performance. Copyright © 2016 Elsevier Ltd. All rights reserved.

  12. Influence of input data on airflow network accuracy in residential buildings with natural wind- and stack-driven ventilation

    Institute of Scientific and Technical Information of China (English)

    Krzysztof Arendt; Marek Krzaczek; Jacek Tejchman

    2017-01-01

    The airflow network (AFN) modeling approach provides an attractive balance between accuracy and computational demand for naturally ventilated buildings. Its accuracy depends on input parameters such as wind pressure and opening discharge coefficients. In most cases, these parameters are obtained from secondary sources which are only representative of very simplified buildings, i.e. buildings without facade details. Although studies comparing wind pressure coefficients or discharge coefficients from different sources exist, knowledge regarding the effect of input data on AFN accuracy is still limited. In this paper, the influence of wind pressure data on the accuracy of a coupled AFN-BES model for a real building with natural wind- and stack-driven ventilation was analyzed. The results of eight computation cases with different wind pressure data from secondary sources were compared with the measured data. Both the indoor temperatures and the airflow were taken into account. The outcomes indicated that the source of wind pressure data has a significant influence on model performance.

  13. A data-driven emulation framework for representing water-food nexus in a changing cold region

    Science.gov (United States)

    Nazemi, A.; Zandmoghaddam, S.; Hatami, S.

    2017-12-01

    Water resource systems are under increasing pressure globally. Growing population along with competition between water demands and the emerging effects of climate change have caused enormous vulnerabilities in water resource management across many regions. Diagnosing such vulnerabilities and providing effective adaptation strategies requires simulation tools that can adequately represent the interactions between competing demands for limited water resources and inform decision makers about the critical vulnerability thresholds under a range of potential natural and anthropogenic conditions. Despite significant progress in integrated modeling of water resource systems, regional models are often unable to fully represent the dynamics within the key elements of water resource systems locally. Here we propose a data-driven approach to emulate a complex regional water resource system model developed for the Oldman River Basin in southern Alberta, Canada. The aim of the emulation is to provide a detailed understanding of the trade-offs and interactions at the Oldman Reservoir, which is key to flood control and irrigated agriculture in this over-allocated, semi-arid cold region. Different surrogate models are developed to represent the dynamics of irrigation demand and withdrawal as well as reservoir evaporation and release individually. The non-falsified offline models are then integrated through the water balance equation at the reservoir location to provide a coupled model representing the dynamics of reservoir operation and water allocation at the local scale. The performance of the individual and integrated models is rigorously examined and sources of uncertainty are highlighted. To demonstrate the practical utility of such a surrogate modeling approach, we use the integrated data-driven model to examine the trade-offs in irrigation water supply, reservoir storage and release under a range of changing climate, upstream

  14. A web-based data-querying tool based on ontology-driven methodology and flowchart-based model.

    Science.gov (United States)

    Ping, Xiao-Ou; Chung, Yufang; Tseng, Yi-Ju; Liang, Ja-Der; Yang, Pei-Ming; Huang, Guan-Tarn; Lai, Feipei

    2013-10-08

    Because of the increased adoption rate of electronic medical record (EMR) systems, more health care records have been accumulating in clinical data repositories. Querying the data stored in these repositories is therefore crucial for retrieving knowledge from such large volumes of clinical data. The aim of this study is to develop a Web-based approach for enriching the capabilities of the data-querying system along the following three considerations: (1) the interface design used for query formulation, (2) the representation of query results, and (3) the models used for formulating query criteria. The Guideline Interchange Format version 3.5 (GLIF3.5), an ontology-driven clinical guideline representation language, was used for formulating the query tasks based on the GLIF3.5 flowchart in the Protégé environment. The flowchart-based data-querying model (FBDQM) query execution engine was developed and implemented for executing queries and presenting the results through a visual and graphical interface. To examine a broad variety of patient data, a clinical data generator was implemented to automatically generate the clinical data in the repository, and the generated data were then employed to evaluate the system. The accuracy and time performance of the system for three medical query tasks relevant to liver cancer were evaluated based on the clinical data generator in experiments with varying numbers of patients. In this study, a prototype system was developed to test the feasibility of applying a methodology for building a query execution engine using FBDQMs by formulating query tasks using the existing GLIF. The FBDQM-based query execution engine was used to successfully retrieve the clinical data based on the query tasks formatted using the GLIF3.5 in the experiments with varying numbers of patients. The accuracy of the three queries (i.e., "degree of liver damage," "degree of liver damage when applying a mutually exclusive setting

  15. Input variable selection for data-driven models of Coriolis flowmeters for two-phase flow measurement

    International Nuclear Information System (INIS)

    Wang, Lijuan; Yan, Yong; Wang, Xue; Wang, Tao

    2017-01-01

    Input variable selection is an essential step in the development of data-driven models for environmental, biological and industrial applications. Through input variable selection to eliminate the irrelevant or redundant variables, a suitable subset of variables is identified as the input of a model. Meanwhile, through input variable selection the complexity of the model structure is simplified and the computational efficiency is improved. This paper describes the procedures of the input variable selection for the data-driven models for the measurement of liquid mass flowrate and gas volume fraction under two-phase flow conditions using Coriolis flowmeters. Three advanced input variable selection methods, including partial mutual information (PMI), genetic algorithm-artificial neural network (GA-ANN) and tree-based iterative input selection (IIS) are applied in this study. Typical data-driven models incorporating support vector machine (SVM) are established individually based on the input candidates resulting from the selection methods. The validity of the selection outcomes is assessed through an output performance comparison of the SVM based data-driven models and sensitivity analysis. The validation and analysis results suggest that the input variables selected from the PMI algorithm provide more effective information for the models to measure liquid mass flowrate while the IIS algorithm provides a fewer but more effective variables for the models to predict gas volume fraction.
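
    As a flavor of the ranking step, univariate mutual information between each candidate input and the target can be estimated with scikit-learn; the PMI algorithm in the paper goes further by conditioning on already-selected inputs to discount redundancy. A sketch on synthetic stand-in data:

        import numpy as np
        from sklearn.feature_selection import mutual_info_regression

        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 8))     # candidate Coriolis signals
        y = X[:, 0] + np.sin(X[:, 2]) + 0.1 * rng.normal(size=1000)

        mi = mutual_info_regression(X, y, random_state=0)
        ranked = np.argsort(mi)[::-1]      # most informative inputs first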

  16. A Model-Driven Methodology for Big Data Analytics-as-a-Service

    OpenAIRE

    Damiani, Ernesto; Ardagna, Claudio Agostino; Ceravolo, Paolo; Bellandi, Valerio; Bezzi, Michele; Hebert, Cedric

    2017-01-01

    The Big Data revolution has promised to build a data-driven ecosystem where better decisions are supported by enhanced analytics and data management. However, critical issues still need to be solved on the road that leads to the commoditization of Big Data Analytics, such as the management of Big Data complexity and the protection of data security and privacy. In this paper, we focus on the first issue and propose a methodology based on Model Driven Engineering (MDE) that aims to substantially lowe...

  17. Performance Analysis of Waste Heat Driven Pressurized Adsorption Chiller

    KAUST Repository

    LOH, Wai Soong; SAHA, Bidyut Baran; CHAKRABORTY, Anutosh; NG, Kim Choon; CHUN, Won Gee

    2010-01-01

    This article presents the transient modeling and performance of waste heat driven pressurized adsorption chillers for refrigeration at subzero applications. This innovative adsorption chiller employs pitch-based activated carbon of type Maxsorb III

  18. Quantifying and reducing model-form uncertainties in Reynolds-averaged Navier–Stokes simulations: A data-driven, physics-informed Bayesian approach

    International Nuclear Information System (INIS)

    Xiao, H.; Wu, J.-L.; Wang, J.-X.; Sun, R.; Roy, C.J.

    2016-01-01

    Despite their well-known limitations, Reynolds-Averaged Navier–Stokes (RANS) models are still the workhorse tools for turbulent flow simulations in today's engineering analysis, design and optimization. While the predictive capability of RANS models depends on many factors, for many practical flows the turbulence models are by far the largest source of uncertainty. As RANS models are used in the design and safety evaluation of many mission-critical systems such as airplanes and nuclear power plants, quantifying their model-form uncertainties has significant implications in enabling risk-informed decision-making. In this work we develop a data-driven, physics-informed Bayesian framework for quantifying model-form uncertainties in RANS simulations. Uncertainties are introduced directly to the Reynolds stresses and are represented with compact parameterization accounting for empirical prior knowledge and physical constraints (e.g., realizability, smoothness, and symmetry). An iterative ensemble Kalman method is used to assimilate the prior knowledge and observation data in a Bayesian framework, and to propagate them to posterior distributions of velocities and other Quantities of Interest (QoIs). We use two representative cases, the flow over periodic hills and the flow in a square duct, to evaluate the performance of the proposed framework. Both cases are challenging for standard RANS turbulence models. Simulation results suggest that, even with very sparse observations, the obtained posterior mean velocities and other QoIs have significantly better agreement with the benchmark data compared to the baseline results. At most locations the posterior distribution adequately captures the true model error within the developed model form uncertainty bounds. The framework is a major improvement over existing black-box, physics-neutral methods for model-form uncertainty quantification, where prior knowledge and details of the models are not exploited. This approach has
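
    The ensemble Kalman machinery at the core of the framework can be illustrated with a single stochastic analysis step: propagate an ensemble, then shift each member toward perturbed observations through the sample-covariance Kalman gain. A generic numpy sketch (a plain EnKF update with a linear observation operator H, not the authors' iterative scheme):

        import numpy as np

        def enkf_update(ens, obs, obs_err_std, H):
            """One stochastic EnKF analysis step.
            ens: (n_members, n_state); obs: (n_obs,); H: (n_obs, n_state)."""
            n = ens.shape[0]
            A = ens - ens.mean(axis=0)               # state anomalies
            Y = A @ H.T                              # obs-space anomalies
            Pyy = Y.T @ Y / (n - 1) + obs_err_std**2 * np.eye(obs.size)
            Pxy = A.T @ Y / (n - 1)
            K = Pxy @ np.linalg.inv(Pyy)             # Kalman gain
            rng = np.random.default_rng(0)
            zs = obs + obs_err_std * rng.normal(size=(n, obs.size))
            return ens + (zs - ens @ H.T) @ K.T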

  19. Quantifying and reducing model-form uncertainties in Reynolds-averaged Navier–Stokes simulations: A data-driven, physics-informed Bayesian approach

    Energy Technology Data Exchange (ETDEWEB)

    Xiao, H., E-mail: hengxiao@vt.edu; Wu, J.-L.; Wang, J.-X.; Sun, R.; Roy, C.J.

    2016-11-01

    Despite their well-known limitations, Reynolds-Averaged Navier–Stokes (RANS) models are still the workhorse tools for turbulent flow simulations in today's engineering analysis, design and optimization. While the predictive capability of RANS models depends on many factors, for many practical flows the turbulence models are by far the largest source of uncertainty. As RANS models are used in the design and safety evaluation of many mission-critical systems such as airplanes and nuclear power plants, quantifying their model-form uncertainties has significant implications in enabling risk-informed decision-making. In this work we develop a data-driven, physics-informed Bayesian framework for quantifying model-form uncertainties in RANS simulations. Uncertainties are introduced directly to the Reynolds stresses and are represented with compact parameterization accounting for empirical prior knowledge and physical constraints (e.g., realizability, smoothness, and symmetry). An iterative ensemble Kalman method is used to assimilate the prior knowledge and observation data in a Bayesian framework, and to propagate them to posterior distributions of velocities and other Quantities of Interest (QoIs). We use two representative cases, the flow over periodic hills and the flow in a square duct, to evaluate the performance of the proposed framework. Both cases are challenging for standard RANS turbulence models. Simulation results suggest that, even with very sparse observations, the obtained posterior mean velocities and other QoIs have significantly better agreement with the benchmark data compared to the baseline results. At most locations the posterior distribution adequately captures the true model error within the developed model form uncertainty bounds. The framework is a major improvement over existing black-box, physics-neutral methods for model-form uncertainty quantification, where prior knowledge and details of the models are not exploited. This approach

  20. General Purpose Data-Driven Monitoring for Space Operations

    Science.gov (United States)

    Iverson, David L.; Martin, Rodney A.; Schwabacher, Mark A.; Spirkovska, Liljana; Taylor, William McCaa; Castle, Joseph P.; Mackey, Ryan M.

    2009-01-01

    As modern space propulsion and exploration systems improve in capability and efficiency, their designs are becoming increasingly sophisticated and complex. Determining the health state of these systems, using traditional parameter limit checking, model-based, or rule-based methods, is becoming more difficult as the number of sensors and component interactions grow. Data-driven monitoring techniques have been developed to address these issues by analyzing system operations data to automatically characterize normal system behavior. System health can be monitored by comparing real-time operating data with these nominal characterizations, providing detection of anomalous data signatures indicative of system faults or failures. The Inductive Monitoring System (IMS) is a data-driven system health monitoring software tool that has been successfully applied to several aerospace applications. IMS uses a data mining technique called clustering to analyze archived system data and characterize normal interactions between parameters. The scope of IMS based data-driven monitoring applications continues to expand with current development activities. Successful IMS deployment in the International Space Station (ISS) flight control room to monitor ISS attitude control systems has led to applications in other ISS flight control disciplines, such as thermal control. It has also generated interest in data-driven monitoring capability for Constellation, NASA's program to replace the Space Shuttle with new launch vehicles and spacecraft capable of returning astronauts to the moon, and then on to Mars. Several projects are currently underway to evaluate and mature the IMS technology and complementary tools for use in the Constellation program. These include an experiment on board the Air Force TacSat-3 satellite, and ground systems monitoring for NASA's Ares I-X and Ares I launch vehicles. The TacSat-3 Vehicle System Management (TVSM) project is a software experiment to integrate fault
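
    The clustering idea behind this kind of data-driven monitoring can be illustrated in a few lines. The sketch below is not IMS itself; the telemetry dimensions, cluster count, and threshold percentile are all arbitrary assumptions:

        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        # Characterize nominal behavior: cluster archived multi-sensor data.
        train = rng.random((5000, 8))            # stand-in for archived telemetry
        model = KMeans(n_clusters=20, n_init=10, random_state=0).fit(train)

        # Calibrate an anomaly threshold from nominal deviations.
        d_train = np.min(model.transform(train), axis=1)
        threshold = np.percentile(d_train, 99.5)

        def anomaly_score(x):
            # Distance from a new sensor vector to the closest nominal cluster.
            return np.min(model.transform(np.atleast_2d(x)), axis=1)[0]

        live = rng.random(8)
        if anomaly_score(live) > threshold:
            print("anomalous signature: flag for flight controllers")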

  1. The power of event-driven analytics in Large Scale Data Processing


    CERN Multimedia

    CERN. Geneva; Marques, Paulo

    2011-01-01

    FeedZai is a software company specialized in creating high-throughput, low-latency data processing solutions. FeedZai develops a product called "FeedZai Pulse" for continuous event-driven analytics that makes application development easier for end users. It automatically calculates key performance indicators and baselines, showing how current performance differs from previous history, creating timely business intelligence updated to the second. The tool does predictive analytics and trend analysis, displaying data on real-time web-based graphics. In 2010 FeedZai won the European EBN Smart Entrepreneurship Competition, in the Digital Models category, being considered one of the "top-20 smart companies in Europe". The main objective of this seminar/workshop is to explore the topic of large-scale data processing using Complex Event Processing and, in particular, the possible uses of Pulse in...

  2. A model-driven approach for representing clinical archetypes for Semantic Web environments.

    Science.gov (United States)

    Martínez-Costa, Catalina; Menárguez-Tortosa, Marcos; Fernández-Breis, Jesualdo Tomás; Maldonado, José Alberto

    2009-02-01

    The life-long clinical information of any person supported by electronic means configures his Electronic Health Record (EHR). This information is usually distributed among several independent and heterogeneous systems that may be syntactically or semantically incompatible. There are currently different standards for representing and exchanging EHR information among different systems. In advanced EHR approaches, clinical information is represented by means of archetypes. Most of these approaches use the Archetype Definition Language (ADL) to specify archetypes. However, ADL has some drawbacks when attempting to perform semantic activities in Semantic Web environments. In this work, Semantic Web technologies are used to specify clinical archetypes for advanced EHR architectures. The advantages of using the Ontology Web Language (OWL) instead of ADL are described and discussed in this work. Moreover, a solution combining Semantic Web and Model-driven Engineering technologies is proposed to transform ADL into OWL for the CEN EN13606 EHR architecture.

  3. Data-Driven Identification of Risk Factors of Patient Satisfaction at a Large Urban Academic Medical Center.

    Science.gov (United States)

    Li, Li; Lee, Nathan J; Glicksberg, Benjamin S; Radbill, Brian D; Dudley, Joel T

    2016-01-01

    The Hospital Consumer Assessment of Healthcare Providers and Systems (HCAHPS) survey is the first publicly reported nationwide survey to evaluate and compare hospitals. Increasing patient satisfaction is an important goal as it aims to achieve a more effective and efficient healthcare delivery system. In this study, we develop and apply an integrative, data-driven approach to identify clinical risk factors that associate with patient satisfaction outcomes. We included 1,771 unique adult patients who completed the HCAHPS survey and were discharged from the inpatient Medicine service from 2010 to 2012. We collected 266 clinical features including patient demographics, lab measurements, medications, disease categories, and procedures, and identified 102 significant risk factors associating with 18 surveyed questions. The most significantly recurrent clinical risk factors were: self-evaluation of health, education level, Asian, White, treatment in the BMT oncology division, and being prescribed a new medication. Patients who were prescribed pregabalin were less satisfied, particularly in relation to communication with nurses and pain management. Explanation of medication usage was associated with communication with nurses (q = 0.001); however, explanation of medication side effects was associated with communication with doctors (q = 0.003). Overall hospital rating was associated with hospital environment, communication with doctors, and communication about medicines. However, patient likelihood to recommend the hospital was associated with hospital environment, communication about medicines, pain management, and communication with nurses. Our study identified a number of putatively novel clinical risk factors for patient satisfaction that suggest new opportunities to better understand and manage patient satisfaction. Hospitals can use a data-driven approach to

  4. Scalable data-driven short-term traffic prediction

    NARCIS (Netherlands)

    Friso, K.; Wismans, L. J.J.; Tijink, M. B.

    2017-01-01

    Short-term traffic prediction has a lot of potential for traffic management. However, most research has traditionally focused on either traffic models, which do not scale very well to large networks computationally, or on data-driven methods for freeways, leaving out urban arterials completely. Urban

  5. A data mining approach to analyze occupant behavior motivation

    NARCIS (Netherlands)

    Ren, X.; Zhao, Y.; Zeiler, W.; Boxem, G.; Li, T.

    2017-01-01

    Occupants' behavior can have a significant impact on the performance of the built environment, yet methods of analyzing people's behavior have not been adequately developed. Traditional methods such as surveys or interviews are not efficient. This study proposed a data-driven method to analyze the

  6. Gauging Skills of Hospital Security Personnel: a Statistically-driven, Questionnaire-based Approach.

    Science.gov (United States)

    Rinkoo, Arvind Vashishta; Mishra, Shubhra; Rahesuddin; Nabi, Tauqeer; Chandra, Vidha; Chandra, Hem

    2013-01-01

    This study aims to gauge the technical and soft skills of hospital security personnel so as to enable prioritization of their training needs. A cross-sectional, questionnaire-based study was conducted in December 2011. Two separate predesigned and pretested questionnaires were used for gauging the soft skills and technical skills of the security personnel. Extensive statistical analysis, including multivariate analysis (Pillai-Bartlett trace along with multi-factorial ANOVA) and post-hoc tests (Bonferroni test), was applied. The 143 participants performed better on the soft-skills front, with an average score of 6.43 and a standard deviation of 1.40. The average technical-skills score was 5.09, with a standard deviation of 1.44. The study revealed a need for formal hands-on training, with greater emphasis on technical skills. Multivariate analysis of the available data further helped in identifying 20 security personnel who should be prioritized for soft-skills training and a group of 36 security personnel who should receive maximum attention during technical-skills training. This statistically driven approach can be used as a prototype by healthcare delivery institutions worldwide, after situation-specific customizations, to identify the training needs of any category of healthcare staff.

  7. Fork-join and data-driven execution models on multi-core architectures: Case study of the FMM

    KAUST Repository

    Amer, Abdelhalim; Maruyama, Naoya; Pericà s, Miquel; Taura, Kenjiro; Yokota, Rio; Matsuoka, Satoshi

    2013-01-01

    Extracting maximum performance from multi-core architectures is a difficult task, primarily due to bandwidth limitations of the memory subsystem and its complex hierarchy. In this work, we study the implications of fork-join and data-driven execution

  8. An evaluation of data-driven motion estimation in comparison to the usage of external-surrogates in cardiac SPECT imaging

    International Nuclear Information System (INIS)

    Mukherjee, Joyeeta Mitra; Johnson, Karen L; Pretorius, P Hendrik; King, Michael A; Hutton, Brian F

    2013-01-01

    Motion estimation methods in single photon emission computed tomography (SPECT) can be classified into methods which depend on just the emission data (data-driven), or those that use some other source of information such as an external surrogate. The surrogate-based methods estimate the motion exhibited externally, which may not correlate exactly with the movement of organs inside the body. The accuracy of data-driven strategies, on the other hand, is affected by the type and timing of motion occurrence during acquisition, the source distribution, and various degrading factors such as attenuation, scatter, and system spatial resolution. The goal of this paper is to investigate the performance of two data-driven motion estimation schemes based on the rigid-body registration of projections of motion-transformed source distributions to the acquired projection data for cardiac SPECT studies. Comparison is also made of six intensity-based registration metrics to an external surrogate-based method. In the data-driven schemes, a partially reconstructed heart is used as the initial source distribution. The partially reconstructed heart has inaccuracies due to limited-angle artifacts resulting from using only a part of the SPECT projections acquired while the patient maintained the same pose. The performance of different cost functions in quantifying consistency with the SPECT projection data in the data-driven schemes was compared for clinically realistic patient motion occurring as discrete pose changes, one or two times during acquisition. The six intensity-based metrics studied were mean-squared difference, mutual information, normalized mutual information (NMI), pattern intensity (PI), normalized cross-correlation and entropy of the difference. Quantitative and qualitative analysis of the performance is reported using Monte-Carlo simulations of a realistic heart phantom including degradation factors such as attenuation, scatter and system spatial resolution. Further the
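
    As an illustration of one of the intensity-based metrics compared above, the following sketch computes normalized mutual information between two images; the bin count and test images are assumptions, not details from the paper:

        import numpy as np

        def normalized_mutual_information(a, b, bins=32):
            # NMI = (H(A) + H(B)) / H(A, B); higher values indicate better alignment.
            hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
            pxy = hist / hist.sum()
            px, py = pxy.sum(axis=1), pxy.sum(axis=0)
            def entropy(p):
                p = p[p > 0]
                return -np.sum(p * np.log(p))
            return (entropy(px) + entropy(py)) / entropy(pxy.ravel())

        rng = np.random.default_rng(0)
        reference = rng.random((64, 64))
        shifted = np.roll(reference, 3, axis=0) + 0.05 * rng.random((64, 64))
        print(normalized_mutual_information(reference, reference))  # high when aligned
        print(normalized_mutual_information(reference, shifted))    # lower when misaligned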

  9. Data-driven quantification of the robustness and sensitivity of cell signaling networks

    International Nuclear Information System (INIS)

    Mukherjee, Sayak; Seok, Sang-Cheol; Vieland, Veronica J; Das, Jayajit

    2013-01-01

    Robustness and sensitivity of responses generated by cell signaling networks have been associated with survival and evolvability of organisms. However, existing methods analyzing robustness and sensitivity of signaling networks ignore the experimentally observed cell-to-cell variations of protein abundances and cell functions or contain ad hoc assumptions. We propose and apply a data-driven maximum entropy based method to quantify robustness and sensitivity of the Escherichia coli (E. coli) chemotaxis signaling network. Our analysis correctly rank-orders different models of E. coli chemotaxis based on their robustness and suggests that parameters regulating cell signaling are evolutionarily selected to vary in individual cells according to their abilities to perturb cell functions. Furthermore, predictions from our approach regarding distribution of protein abundances and properties of chemotactic responses in individual cells based on cell population averaged data are in excellent agreement with their experimental counterparts. Our approach is general and can be used to evaluate robustness as well as generate predictions of single cell properties based on population averaged experimental data in a wide range of cell signaling systems. (paper)

  10. Data-driven analysis of functional brain interactions during free listening to music and speech.

    Science.gov (United States)

    Fang, Jun; Hu, Xintao; Han, Junwei; Jiang, Xi; Zhu, Dajiang; Guo, Lei; Liu, Tianming

    2015-06-01

    Natural stimulus functional magnetic resonance imaging (N-fMRI), such as fMRI acquired when participants were watching video streams or listening to audio streams, has been increasingly used to investigate functional mechanisms of the human brain in recent years. One of the fundamental challenges in functional brain mapping based on N-fMRI is to model the brain's functional responses to continuous, naturalistic and dynamic natural stimuli. To address this challenge, in this paper we present a data-driven approach to exploring functional interactions in the human brain during free listening to music and speech streams. Specifically, we model the brain responses using N-fMRI by measuring the functional interactions on large-scale brain networks with intrinsically established structural correspondence, and perform music and speech classification tasks to guide the systematic identification of consistent and discriminative functional interactions when multiple subjects were listening to music and speech in multiple categories. The underlying premise is that the functional interactions derived from N-fMRI data of multiple subjects should exhibit both consistency and discriminability. Our experimental results show that a variety of brain systems including attention, memory, auditory/language, emotion, and action networks are among the most relevant brain systems involved in classic music, pop music and speech differentiation. Our study provides an alternative approach to investigating the human brain's mechanism in comprehension of complex natural music and speech.

  11. The current status of exposure-driven approaches for chemical safety assessment: A cross-sector perspective.

    Science.gov (United States)

    Sewell, Fiona; Aggarwal, Manoj; Bachler, Gerald; Broadmeadow, Alan; Gellatly, Nichola; Moore, Emma; Robinson, Sally; Rooseboom, Martijn; Stevens, Alexander; Terry, Claire; Burden, Natalie

    2017-08-15

    For the purposes of chemical safety assessment, the value of using non-animal (in silico and in vitro) approaches and generating mechanistic information on toxic effects is being increasingly recognised. For sectors where in vivo toxicity tests continue to be a regulatory requirement, there has been a parallel focus on how to refine studies (i.e. reduce suffering and improve animal welfare) and increase the value that in vivo data adds to the safety assessment process, as well as where to reduce animal numbers where possible. A key element necessary to ensure the transition towards successfully utilising both non-animal and refined safety testing is the better understanding of chemical exposure. This includes approaches such as measuring chemical concentrations within cell-based assays and during in vivo studies, understanding how predicted human exposures relate to levels tested, and using existing information on human exposures to aid in toxicity study design. Such approaches promise to increase the human relevance of safety assessment, and shift the focus from hazard-driven to risk-driven strategies similar to those used in the pharmaceutical sectors. Human exposure-based safety assessment offers scientific and 3Rs benefits across all sectors marketing chemical or medicinal products. The UK's National Centre for the Replacement, Refinement and Reduction of Animals in Research (NC3Rs) convened an expert working group of scientists across the agrochemical, industrial chemical and pharmaceutical industries plus a contract research organisation (CRO) to discuss the current status of the utilisation of exposure-driven approaches, and the challenges and potential next steps for wider uptake and acceptance. This paper summarises these discussions, highlights the challenges - particularly those identified by industry - and proposes initial steps for moving the field forward. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.

  12. Data-Driven Planning: Using Assessment in Strategic Planning

    Science.gov (United States)

    Bresciani, Marilee J.

    2010-01-01

    Data-driven planning or evidence-based decision making represents nothing new in its concept. For years, business leaders have claimed they have implemented planning informed by data that have been strategically and systematically gathered. Within higher education and student affairs, there may be less evidence of the actual practice of…

  13. Preface [HD3-2015: International meeting on high-dimensional data-driven science

    International Nuclear Information System (INIS)

    2016-01-01

    A never-ending series of innovations in measurement technology and evolutions in information and communication technologies have led to the ongoing generation and accumulation of large quantities of high-dimensional data every day. While detailed data-centric approaches have been pursued in respective research fields, situations have been encountered where the same mathematical framework of high-dimensional data analysis can be found in a wide variety of seemingly unrelated research fields, such as estimation on the basis of undersampled Fourier transform in nuclear magnetic resonance spectroscopy in chemistry, in magnetic resonance imaging in medicine, and in astronomical interferometry in astronomy. In such situations, bringing diverse viewpoints together therefore becomes a driving force for the creation of innovative developments in various different research fields. This meeting focuses on “Sparse Modeling” (SpM) as a methodology for creation of innovative developments through the incorporation of a wide variety of viewpoints in various research fields. The objective of this meeting is to offer a forum where researchers with interest in SpM can assemble and exchange information on the latest results and newly established methodologies, and discuss future directions of the interdisciplinary studies for High-Dimensional Data-Driven science (HD³). The meeting was held in Kyoto from 14-17 December 2015. We are pleased to publish 22 papers contributed by invited speakers in this volume of Journal of Physics: Conference Series. We hope that this volume will promote further development of High-Dimensional Data-Driven science. (paper)
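
    The shared mathematical framework alluded to above, estimation from undersampled Fourier measurements via sparse modeling, can be illustrated with a toy L1-regularized reconstruction; the problem sizes and regularization weight below are arbitrary assumptions:

        import numpy as np
        from sklearn.linear_model import Lasso

        rng = np.random.default_rng(1)
        n, m, k = 256, 64, 5                  # signal length, measurements, nonzeros

        x_true = np.zeros(n)
        x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

        # Undersampled DFT rows (real/imag stacked so the solver stays real-valued).
        F = np.fft.fft(np.eye(n))[rng.choice(n, m, replace=False)] / np.sqrt(n)
        A = np.vstack([F.real, F.imag])
        y = A @ x_true

        # L1-regularized least squares, the workhorse of sparse modeling.
        x_hat = Lasso(alpha=1e-3, max_iter=50000).fit(A, y).coef_
        print("relative error:",
              np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))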

  14. Comparison of Different Approaches to Predict the Performance of Pumps As Turbines (PATs)

    Directory of Open Access Journals (Sweden)

    Mauro Venturini

    2018-04-01

    Full Text Available This paper deals with the comparison of different methods which can be used for the prediction of the performance curves of pumps as turbines (PATs). Four approaches are considered: one physics-based simulation model (a "white box" model), two "gray box" models, which integrate theory on turbomachines with specific data correlations, and one "black box" model. In more detail, the modeling approaches are: (1) a physics-based simulation model developed by the same authors, which includes the equations for estimating head, power, and efficiency and uses loss coefficients and specific parameters; (2) a model developed by Derakhshan and Nourbakhsh, which first predicts the best efficiency point of a PAT and then reconstructs the complete characteristic curves by means of two ad hoc equations; (3) the prediction model developed by Singh and Nestmann, which predicts the complete turbine characteristics based on pump shape and size; (4) an Evolutionary Polynomial Regression model, which represents a data-driven hybrid scheme that can be used for identifying the explicit mathematical relationship between PAT and pump curves. All approaches are applied to literature data, relying on both pump and PAT performance curves of head, power, and efficiency over the entire range of operation. The experimental data were provided by Derakhshan and Nourbakhsh for four different turbomachines, working in both pump and PAT mode, with specific speed values in the range 1.53–5.82. This paper provides a quantitative assessment of the predictions made by means of the considered approaches and also analyzes their consistency from a physical point of view. Advantages and drawbacks of each method are also analyzed and discussed.

  15. Human body segmentation via data-driven graph cut.

    Science.gov (United States)

    Li, Shifeng; Lu, Huchuan; Shao, Xingqing

    2014-11-01

    Human body segmentation is a challenging and important problem in computer vision. Existing methods usually entail a time-consuming training phase for prior knowledge learning, with complex shape matching for body segmentation. In this paper, we propose a data-driven method that integrates top-down body pose information and bottom-up low-level visual cues for segmenting humans in static images within the graph cut framework. The key idea of our approach is first to exploit human kinematics to search for body part candidates via dynamic programming, providing high-level evidence. Then, body-part classifiers are used to obtain bottom-up cues of human body distribution as low-level evidence. All the evidence collected from the top-down and bottom-up procedures is integrated in a graph cut framework for human body segmentation. Qualitative and quantitative experimental results demonstrate the merits of the proposed method in segmenting human bodies with arbitrary poses from cluttered backgrounds.

  16. Data-driven, Interpretable Photometric Redshifts Trained on Heterogeneous and Unrepresentative Data

    International Nuclear Information System (INIS)

    Leistedt, Boris; Hogg, David W.

    2017-01-01

    We present a new method for inferring photometric redshifts in deep galaxy and quasar surveys, based on a data-driven model of latent spectral energy distributions (SEDs) and a physical model of photometric fluxes as a function of redshift. This conceptually novel approach combines the advantages of both machine learning methods and template fitting methods by building template SEDs directly from the spectroscopic training data. This is made computationally tractable with Gaussian processes operating in flux–redshift space, encoding the physics of redshifts and the projection of galaxy SEDs onto photometric bandpasses. This method alleviates the need to acquire representative training data or to construct detailed galaxy SED models; it requires only that the photometric bandpasses and calibrations be known or have parameterized unknowns. The training data can consist of a combination of spectroscopic and deep many-band photometric data with reliable redshifts, which do not need to entirely spatially overlap with the target survey of interest or even involve the same photometric bands. We showcase the method on the i-magnitude-selected, spectroscopically confirmed galaxies in the COSMOS field. The model is trained on the deepest bands (from SUBARU and HST) and photometric redshifts are derived using the shallower SDSS optical bands only. We demonstrate that we obtain accurate redshift point estimates and probability distributions despite the training and target sets having very different redshift distributions, noise properties, and even photometric bands. Our model can also be used to predict missing photometric fluxes or to simulate populations of galaxies with realistic fluxes and redshifts, for example.
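
    The following toy sketch conveys only the flavor of using Gaussian processes on photometric data; unlike the paper, which models fluxes as a function of redshift with latent SEDs, it simply regresses redshift on mock band fluxes, and all quantities are synthetic assumptions:

        import numpy as np
        from sklearn.gaussian_process import GaussianProcessRegressor
        from sklearn.gaussian_process.kernels import RBF, WhiteKernel

        rng = np.random.default_rng(2)
        z = rng.uniform(0.0, 2.0, 200)              # spectroscopic redshifts

        def mock_fluxes(z, noise=0.02):
            # Stand-in for band fluxes as a smooth function of redshift.
            bands = np.stack([np.exp(-(z - c) ** 2) for c in (0.4, 0.9, 1.4, 1.9)], 1)
            return bands + noise * rng.standard_normal(bands.shape)

        X = mock_fluxes(z)
        gp = GaussianProcessRegressor(kernel=RBF(0.5) + WhiteKernel(1e-3),
                                      normalize_y=True).fit(X, z)

        z_hat, z_std = gp.predict(mock_fluxes(np.array([0.7, 1.2])), return_std=True)
        print(z_hat, z_std)      # point estimates with per-object uncertainties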

  17. Data-driven, Interpretable Photometric Redshifts Trained on Heterogeneous and Unrepresentative Data

    Energy Technology Data Exchange (ETDEWEB)

    Leistedt, Boris; Hogg, David W., E-mail: boris.leistedt@nyu.edu, E-mail: david.hogg@nyu.edu [Center for Cosmology and Particle Physics, Department of Physics, New York University, New York, NY 10003 (United States)

    2017-03-20

    We present a new method for inferring photometric redshifts in deep galaxy and quasar surveys, based on a data-driven model of latent spectral energy distributions (SEDs) and a physical model of photometric fluxes as a function of redshift. This conceptually novel approach combines the advantages of both machine learning methods and template fitting methods by building template SEDs directly from the spectroscopic training data. This is made computationally tractable with Gaussian processes operating in flux–redshift space, encoding the physics of redshifts and the projection of galaxy SEDs onto photometric bandpasses. This method alleviates the need to acquire representative training data or to construct detailed galaxy SED models; it requires only that the photometric bandpasses and calibrations be known or have parameterized unknowns. The training data can consist of a combination of spectroscopic and deep many-band photometric data with reliable redshifts, which do not need to entirely spatially overlap with the target survey of interest or even involve the same photometric bands. We showcase the method on the i-magnitude-selected, spectroscopically confirmed galaxies in the COSMOS field. The model is trained on the deepest bands (from SUBARU and HST) and photometric redshifts are derived using the shallower SDSS optical bands only. We demonstrate that we obtain accurate redshift point estimates and probability distributions despite the training and target sets having very different redshift distributions, noise properties, and even photometric bands. Our model can also be used to predict missing photometric fluxes or to simulate populations of galaxies with realistic fluxes and redshifts, for example.

  18. Using XML Configuration-Driven Development to Create a Customizable Ground Data System

    Science.gov (United States)

    Nash, Brent; DeMore, Martha

    2009-01-01

    The Mission Data Processing and Control Subsystem (MPCS) is being developed as a multi-mission Ground Data System, with the Mars Science Laboratory (MSL) as the first fully supported mission. MPCS is a fully featured, Java-based Ground Data System (GDS) for telecommand and telemetry processing based on Configuration-Driven Development (CDD). The eXtensible Markup Language (XML) is the ideal language for CDD because it is easily readable and editable by all levels of users and is also backed by a World Wide Web Consortium (W3C) standard and numerous powerful processing tools that make it uniquely flexible. The CDD approach adopted by MPCS minimizes changes to compiled code by using XML to create a series of configuration files that provide both coarse- and fine-grained control over all aspects of GDS operation.
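
    The configuration-driven principle, behavior defined by editable XML rather than compiled code, can be illustrated with a minimal sketch; the channel schema below is hypothetical and not the actual MPCS configuration format:

        import xml.etree.ElementTree as ET

        # Hypothetical telemetry-channel configuration; in a CDD system this file,
        # not compiled code, defines what the ground data system processes.
        CONFIG = """
        <telemetry>
          <channel id="BATT_V" type="float" units="V" alarm_high="33.6"/>
          <channel id="TANK_P" type="float" units="kPa" alarm_high="410.0"/>
        </telemetry>
        """

        channels = {
            c.get("id"): {"units": c.get("units"),
                          "alarm_high": float(c.get("alarm_high"))}
            for c in ET.fromstring(CONFIG).findall("channel")
        }

        def check(channel_id, value):
            spec = channels[channel_id]
            status = "ALARM" if value > spec["alarm_high"] else "ok"
            print(f"{channel_id} = {value} {spec['units']}: {status}")

        check("BATT_V", 34.1)   # behavior changes by editing the XML, not the code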

  19. Measuring energy performance with sectoral heterogeneity: A non-parametric frontier approach

    International Nuclear Information System (INIS)

    Wang, H.; Ang, B.W.; Wang, Q.W.; Zhou, P.

    2017-01-01

    Evaluating economy-wide energy performance is an integral part of assessing the effectiveness of a country's energy efficiency policy. The non-parametric frontier approach has been widely used by researchers for this purpose. This paper proposes an extended non-parametric frontier approach to studying economy-wide energy efficiency and productivity performance by accounting for sectoral heterogeneity. Relevant techniques in index number theory are incorporated to quantify the driving forces behind changes in the economy-wide energy productivity index. The proposed approach facilitates flexible modelling of different sectors' production processes, and helps to examine sectors' impact on the aggregate energy performance. A case study of China's economy-wide energy efficiency and productivity performance in its 11th five-year plan period (2006–2010) is presented. It is found that sectoral heterogeneities in terms of energy performance are significant in China. Meanwhile, China's economy-wide energy productivity increased slightly during the study period, mainly driven by technical efficiency improvement. A number of other findings have also been reported. - Highlights: • We model economy-wide energy performance by considering sectoral heterogeneity. • The proposed approach can identify sectors' impact on the aggregate energy performance. • Obvious sectoral heterogeneities are identified in evaluating China's energy performance.
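
    Non-parametric frontier analysis of this kind builds on data envelopment analysis (DEA). As a hedged illustration of the underlying machinery (not the paper's extended model), the sketch below computes standard input-oriented efficiency scores for a few hypothetical sectors:

        import numpy as np
        from scipy.optimize import linprog

        # Toy data: 5 sectors, inputs = [energy, labor], output = value added.
        X = np.array([[10., 5.], [8., 6.], [12., 4.], [9., 9.], [7., 7.]])  # inputs
        Y = np.array([[20.], [18.], [22.], [15.], [16.]])                   # outputs

        def efficiency(k):
            # Input-oriented CCR efficiency of unit k: minimize theta such that a
            # lambda-combination of peers dominates unit k's input/output mix.
            n = len(X)
            c = np.zeros(n + 1)            # decision variables: [theta, lambdas]
            c[0] = 1.0
            A_ub, b_ub = [], []
            for j in range(X.shape[1]):    # sum_i lam_i * x_ij <= theta * x_kj
                A_ub.append(np.r_[-X[k, j], X[:, j]]); b_ub.append(0.0)
            for r in range(Y.shape[1]):    # sum_i lam_i * y_ir >= y_kr
                A_ub.append(np.r_[0.0, -Y[:, r]]); b_ub.append(-Y[k, r])
            res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
                          bounds=[(0, None)] * (n + 1))
            return res.fun

        for k in range(len(X)):
            print(f"unit {k}: efficiency = {efficiency(k):.3f}")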

  20. A suggested approach toward measuring sorption and applying sorption data to repository performance assessment

    International Nuclear Information System (INIS)

    Rundberg, R.S.

    1992-01-01

    The prediction of radionuclide migration for the purpose of assessing the safety of a nuclear waste repository will be based on a collective knowledge of the hydrologic and geochemical properties of the surrounding rock and groundwater. This knowledge, along with assumptions about the interactions of radionuclides with groundwater and minerals, forms the scientific basis for a model capable of accurately predicting the repository's performance. Because the interaction of radionuclides in geochemical systems is known to be complicated, several fundamental and empirical approaches to measuring the interaction between radionuclides and the geologic barrier have been developed. The approaches applied to the measurement of sorption involve the use of pure minerals and intact or crushed rock in dynamic and static experiments. Each approach has its advantages and disadvantages. There is no single best method for providing sorption data for performance assessment models which can be applied without invoking information derived from multiple experiments. 53 refs., 12 figs

  1. First-principles data-driven discovery of transition metal oxides for artificial photosynthesis

    Science.gov (United States)

    Yan, Qimin

    We develop a first-principles data-driven approach for rapid identification of transition metal oxide (TMO) light absorbers and photocatalysts for artificial photosynthesis using the Materials Project. Initially focusing on Cr-, V-, and Mn-based ternary TMOs in the database, we design a broadly applicable multiple-layer screening workflow automating density functional theory (DFT) and hybrid functional calculations of bulk and surface electronic and magnetic structures. We further assess the electrochemical stability of TMOs in aqueous environments from computed Pourbaix diagrams. Several promising earth-abundant low band-gap TMO compounds with desirable band edge energies and electrochemical stability are identified by our computational efforts and then synergistically evaluated using high-throughput synthesis and photoelectrochemical screening techniques by our experimental collaborators at Caltech. Our joint theory-experiment effort has successfully identified new earth-abundant copper and manganese vanadate complex oxides that meet highly demanding requirements for photoanodes, substantially expanding the known space of such materials. By integrating theory and experiment, we validate our approach and develop important new insights into structure-property relationships for TMO oxygen evolution photocatalysts, paving the way for use of first-principles data-driven techniques in future applications. This work is supported by the Materials Project Predictive Modeling Center and the Joint Center for Artificial Photosynthesis through the U.S. Department of Energy, Office of Basic Energy Sciences, Materials Sciences and Engineering Division, under Contract No. DE-AC02-05CH11231. Computational resources were also provided by the Department of Energy through the National Energy Research Scientific Computing Center (NERSC).

  2. VLAM-G: Interactive Data Driven Workflow Engine for Grid-Enabled Resources

    Directory of Open Access Journals (Sweden)

    Vladimir Korkhov

    2007-01-01

    Full Text Available Grid brings the power of many computers to scientists. However, the development of Grid-enabled applications requires knowledge about Grid infrastructure and low-level APIs to Grid services. In turn, workflow management systems provide a high-level environment for rapid prototyping of experimental computing systems. Coupling Grid and workflow paradigms is important for the scientific community: it makes the power of the Grid easily available to the end user. The paradigm of data-driven workflow execution is one of the ways to enable distributed workflow on the Grid. The work presented in this paper is carried out in the context of the Virtual Laboratory for e-Science project. We present the VLAM-G workflow management system and its core component: the Run-Time System (RTS). The RTS is a dataflow-driven workflow engine which utilizes Grid resources, hiding the complexity of the Grid from the scientist. Special attention is paid to the concept of dataflow and direct data streaming between distributed workflow components. We present the architecture and components of the RTS, describe the features of VLAM-G workflow execution, and evaluate the system by performance measurements and a real-life use case.
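
    The dataflow-driven execution concept, where downstream components consume data as soon as upstream components produce it, can be sketched with Python generators; this is a conceptual illustration, not VLAM-G code:

        # Each component consumes an input stream and yields results as soon as
        # they are ready, so downstream components start before upstream finishes.
        def source(n):
            for i in range(n):
                yield i

        def transform(stream):
            for item in stream:
                yield item * item

        def sink(stream):
            for item in stream:
                print("result:", item)

        sink(transform(source(5)))   # data-driven: execution is pulled by the data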

  3. Data Driven Tuning of Inventory Controllers

    DEFF Research Database (Denmark)

    Huusom, Jakob Kjøbsted; Santacoloma, Paloma Andrade; Poulsen, Niels Kjølstad

    2007-01-01

    A systematic method for criterion-based tuning of inventory controllers based on data-driven iterative feedback tuning is presented. This tuning method circumvents problems with modeling bias. The process model used for the design of the inventory control is utilized in the tuning... as an approximation to reduce the time required on experiments. The method is illustrated in an application with a multivariable inventory control implementation on a four-tank system....

  4. Data-driven discovery of Koopman eigenfunctions using deep learning

    Science.gov (United States)

    Lusch, Bethany; Brunton, Steven L.; Kutz, J. Nathan

    2017-11-01

    Koopman operator theory transforms any autonomous non-linear dynamical system into an infinite-dimensional linear system. Since linear systems are well-understood, a mapping of non-linear dynamics to linear dynamics provides a powerful approach to understanding and controlling fluid flows. However, finding the correct change of variables remains an open challenge. We present a strategy to discover an approximate mapping using deep learning. Our neural networks find this change of variables, its inverse, and a finite-dimensional linear dynamical system defined on the new variables. Our method is completely data-driven and only requires measurements of the system, i.e. it does not require derivatives or knowledge of the governing equations. We find a minimal set of approximate Koopman eigenfunctions that are sufficient to reconstruct and advance the system to future states. We demonstrate the method on several dynamical systems.
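
    A fixed-dictionary analogue of this idea (EDMD-style, with hand-chosen observables standing in for the learned encoder network) can be sketched as follows; the example system and dictionary are assumptions chosen so the lifted dynamics are exactly linear:

        import numpy as np

        # Discrete-time analogue of a nonlinear system with known Koopman structure:
        #   x1' = x1 + dt*mu*x1,  x2' = x2 + dt*lam*(x2 - x1^2)
        mu, lam, dt = -0.05, -1.0, 0.1
        def step(x):
            return np.array([x[0] + dt * mu * x[0],
                             x[1] + dt * lam * (x[1] - x[0] ** 2)])

        rng = np.random.default_rng(3)
        X = rng.uniform(-1, 1, (2, 500))
        Y = np.apply_along_axis(step, 0, X)

        # Dictionary of observables (the role the encoder network plays).
        psi = lambda X: np.vstack([X[0], X[1], X[0] ** 2])

        # Least-squares fit of a linear operator on the lifted state: psi(Y) ~ K psi(X).
        K = psi(Y) @ np.linalg.pinv(psi(X))
        print(np.round(K, 3))    # nearly linear dynamics in the lifted coordinates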

  5. Data-driven forward model inference for EEG brain imaging

    DEFF Research Database (Denmark)

    Hansen, Sofie Therese; Hauberg, Søren; Hansen, Lars Kai

    2016-01-01

    Electroencephalography (EEG) is a flexible and accessible tool with excellent temporal resolution but with a spatial resolution hampered by volume conduction. Reconstruction of the cortical sources of measured EEG activity partly alleviates this problem and effectively turns EEG into a brain......-of-concept study, we show that, even when anatomical knowledge is unavailable, a suitable forward model can be estimated directly from the EEG. We propose a data-driven approach that provides a low-dimensional parametrization of head geometry and compartment conductivities, built using a corpus of forward models....... Combined with only a recorded EEG signal, we are able to estimate both the brain sources and a person-specific forward model by optimizing this parametrization. We thus not only solve an inverse problem, but also optimize over its specification. Our work demonstrates that personalized EEG brain imaging...

  6. The paradigm of consumer-driven and responsive supply chains: An integrated project approach

    NARCIS (Netherlands)

    Zimmermann, K.L.; Lans, van der I.A.

    2009-01-01

    This paper describes an integrated project approach that forms the basis of the studies on consumer-driven innovative and responsive supply chains in ISAFRUIT Pillar 1. This integrated approach leads to a wide range of in-depth results on trends, preferences, and innovativeness of the European

  7. Data-driven models of dominantly-inherited Alzheimer's disease progression.

    Science.gov (United States)

    Oxtoby, Neil P; Young, Alexandra L; Cash, David M; Benzinger, Tammie L S; Fagan, Anne M; Morris, John C; Bateman, Randall J; Fox, Nick C; Schott, Jonathan M; Alexander, Daniel C

    2018-03-22

    Dominantly-inherited Alzheimer's disease is widely hoped to hold the key to developing interventions for sporadic late onset Alzheimer's disease. We use emerging techniques in generative data-driven disease progression modelling to characterize dominantly-inherited Alzheimer's disease progression with unprecedented resolution, and without relying upon familial estimates of years until symptom onset. We retrospectively analysed biomarker data from the sixth data freeze of the Dominantly Inherited Alzheimer Network observational study, including measures of amyloid proteins and neurofibrillary tangles in the brain, regional brain volumes and cortical thicknesses, brain glucose hypometabolism, and cognitive performance from the Mini-Mental State Examination (all adjusted for age, years of education, sex, and head size, as appropriate). Data included 338 participants with known mutation status (211 mutation carriers in three subtypes: 163 PSEN1, 17 PSEN2, and 31 APP) and a baseline visit (age 19-66; up to four visits each, 1.1 ± 1.9 years in duration; spanning 30 years before, to 21 years after, parental age of symptom onset). We used an event-based model to estimate sequences of biomarker changes from baseline data across disease subtypes (mutation groups), and a differential equation model to estimate biomarker trajectories from longitudinal data (up to 66 mutation carriers, all subtypes combined). The two models concur that biomarker abnormality proceeds as follows: amyloid deposition in cortical then subcortical regions (∼24 ± 11 years before onset); phosphorylated tau (17 ± 8 years), tau and amyloid-β changes in cerebrospinal fluid; neurodegeneration first in the putamen and nucleus accumbens (up to 6 ± 2 years); then cognitive decline (7 ± 6 years), cerebral hypometabolism (4 ± 4 years), and further regional neurodegeneration. Our models predicted symptom onset more accurately than predictions that used familial estimates: root mean squared error of 1

  8. A Dynamic Remote Sensing Data-Driven Approach for Oil Spill Simulation in the Sea

    Directory of Open Access Journals (Sweden)

    Jining Yan

    2015-05-01

    Full Text Available In view of the fact that oil spill remote sensing can only generate oil slick information at a specific time, and that traditional oil spill simulation models were not designed to deal with dynamic conditions, a dynamic data-driven application system (DDDAS) was introduced. The DDDAS entails both the ability to incorporate additional data into an executing application and, in reverse, the ability of applications to dynamically steer the measurement process. Based on the DDDAS, combining a remote sensor system that detects oil spills with a numerical simulation, an integrated data processing, analysis, forecasting and emergency response system was established. Once an oil spill accident occurs, the DDDAS-based oil spill model receives information about the oil slick extracted from the dynamic remote sensor data in the simulation. Through comparison, information fusion and feedback updates, continuous and more precise oil spill simulation results can be obtained. Then, the simulation results can provide help for disaster control and clean-up. The Penglai, Xingang and Suizhong oil spill results showed our simulation model could increase the prediction accuracy and reduce the error caused by empirical parameters in existing simulation systems. Therefore, the DDDAS-based detection and simulation system can effectively improve oil spill simulation and diffusion forecasting, as well as provide decision-making information and technical support for emergency responses to oil spills.
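
    The feedback loop that distinguishes a DDDAS from a conventional one-shot simulation can be sketched as follows; the drift factor, assimilation weight, and satellite revisit interval are all illustrative assumptions:

        import numpy as np

        def advect(slick, wind, hours):
            # Toy transport: particles drift at ~3% of wind speed (m/s -> km/h: x3.6).
            return slick + 0.03 * wind * 3.6 * hours

        def assimilate(predicted, observed, weight=0.7):
            # Blend the model state toward the slick extracted from remote sensing.
            return weight * observed + (1 - weight) * predicted

        rng = np.random.default_rng(4)
        slick = rng.standard_normal((100, 2))   # particle positions (km)
        wind = np.array([4.0, 1.5])             # steady wind (m/s), an assumption

        for hour in range(12):
            slick = advect(slick, wind, hours=1.0)
            if hour % 6 == 5:                   # a satellite pass every 6 h
                retrieval = slick + rng.normal(0, 0.2, slick.shape)  # stand-in for
                slick = assimilate(slick, retrieval)                 # extracted slick
        print(slick.mean(axis=0))               # forecast slick centroid (km)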

  9. Dynamic Performance of the Standalone Wind Power Driven Heat Pump

    OpenAIRE

    H. Li; P.E. Campana; S. Berretta; Y. Tan; J. Yan

    2016-01-01

    Reducing energy consumption and increasing the use of renewable energy in the building sector are crucial to the mitigation of climate change. Wind power driven heat pumps have been considered as a sustainable measure to supply heat for detached houses, especially those that do not even have access to the grid. This work investigates the dynamic performance of a heat pump system directly driven by a wind turbine. The heat demand of a detached single-family house was simulated in detail. Accord...

  10. A Hypothesis-Driven Approach to Site Investigation

    Science.gov (United States)

    Nowak, W.

    2008-12-01

    Variability of subsurface formations and the scarcity of data lead to the notion of aquifer parameters as geostatistical random variables. Given an information need and limited resources for field campaigns, site investigation is often put into the context of optimal design. In optimal design, the types, numbers and positions of samples are optimized under case-specific objectives to meet the information needs. Past studies feature optimal data worth (balancing maximum financial profit in an engineering task versus the cost of additional sampling), or aim at a minimum prediction uncertainty of stochastic models for a prescribed investigation budget. Recent studies also account for other sources of uncertainty outside the hydrogeological range, such as uncertain toxicity, ingestion and behavioral parameters of the affected population when predicting the human health risk from groundwater contaminations. The current study looks at optimal site investigation from a new angle. Answering a yes/no question under uncertainty directly requires recasting the original question as a hypothesis test. Otherwise, false confidence in the resulting answer would be pretended. A straightforward example is whether a recent contaminant spill will cause contaminant concentrations in excess of a legal limit at a nearby drinking water well. This question can only be answered down to a specified chance of error, i.e., based on the significance level used in hypothesis tests. Optimal design is placed into the hypothesis-driven context by using the chance of providing a false yes/no answer as new criterion to be minimized. Different configurations apply for one-sided and two-sided hypothesis tests. If a false answer entails financial liability, the hypothesis-driven context can be re-cast in the context of data worth. The remaining difference is that failure is a hard constraint in the data worth context versus a monetary punishment term in the hypothesis-driven context. The basic principle

  11. Prototype Development: Context-Driven Dynamic XML Ophthalmologic Data Capture Application

    Science.gov (United States)

    Schwei, Kelsey M; Kadolph, Christopher; Finamore, Joseph; Cancel, Efrain; McCarty, Catherine A; Okorie, Asha; Thomas, Kate L; Allen Pacheco, Jennifer; Pathak, Jyotishman; Ellis, Stephen B; Denny, Joshua C; Rasmussen, Luke V; Tromp, Gerard; Williams, Marc S; Vrabec, Tamara R; Brilliant, Murray H

    2017-01-01

    Background The capture and integration of structured ophthalmologic data into electronic health records (EHRs) has historically been a challenge. However, the importance of this activity for patient care and research is critical. Objective The purpose of this study was to develop a prototype of a context-driven dynamic extensible markup language (XML) ophthalmologic data capture application for research and clinical care that could be easily integrated into an EHR system. Methods Stakeholders in the medical, research, and informatics fields were interviewed and surveyed to determine data and system requirements for ophthalmologic data capture. On the basis of these requirements, an ophthalmology data capture application was developed to collect and store discrete data elements with important graphical information. Results The context-driven data entry application supports several features, including ink-over drawing capability for documenting eye abnormalities, context-based Web controls that guide data entry based on preestablished dependencies, and an adaptable database or XML schema that stores Web form specifications and allows for immediate changes in form layout or content. The application utilizes Web services to enable data integration with a variety of EHRs for retrieval and storage of patient data. Conclusions This paper describes the development process used to create a context-driven dynamic XML data capture application for optometry and ophthalmology. The list of ophthalmologic data elements identified as important for care and research can be used as a baseline list for future ophthalmologic data collection activities. PMID:28903894

  12. Forecasting wind-driven wildfires using an inverse modelling approach

    Directory of Open Access Journals (Sweden)

    O. Rios

    2014-06-01

    Full Text Available A technology able to rapidly forecast wildfire dynamics would lead to a paradigm shift in the response to emergencies, providing the Fire Service with essential information about the ongoing fire. This paper presents and explores a novel methodology to forecast wildfire dynamics in wind-driven conditions, using real-time data assimilation and inverse modelling. The forecasting algorithm combines Rothermel's rate of spread theory with a perimeter expansion model based on Huygens principle and solves the optimisation problem with a tangent linear approach and forward automatic differentiation. Its potential is investigated using synthetic data and evaluated in different wildfire scenarios. The results show the capacity of the method to quickly predict the location of the fire front with a positive lead time (ahead of the event) in the order of 10 min for a spatial scale of 100 m. The greatest strengths of our method are lightness, speed and flexibility. We specifically tailor the forecast to be efficient and computationally cheap so it can be used in mobile systems for field deployment and operation. Thus, we put emphasis on producing a positive lead time and the means to maximise it.
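
    The inverse-modelling step can be illustrated with a much-simplified spread law in place of Rothermel's model; the parameter values and noise level below are assumptions:

        import numpy as np
        from scipy.optimize import least_squares

        def rate_of_spread(wind, a, b):
            # Toy stand-in for a Rothermel-type spread law: R = a * (1 + b * U).
            return a * (1.0 + b * wind)

        # Front displacement observed over successive 10-minute assimilation windows,
        # each with a different mean wind speed (m/s).
        dt, wind = 600.0, np.array([4.0, 6.0, 8.0, 7.0, 5.0])
        rng = np.random.default_rng(0)
        observed = rate_of_spread(wind, a=0.05, b=0.4) * dt \
                   + rng.normal(0, 5.0, wind.size)

        # Inverse step: recover the spread parameters by matching model to data,
        # then run the calibrated model forward to obtain the forecast lead time.
        res = least_squares(lambda p: rate_of_spread(wind, *p) * dt - observed,
                            x0=[0.01, 0.1])
        print(res.x)    # recovered parameters, close to [0.05, 0.4]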

  13. Data-driven battery product development: Turn battery performance into a competitive advantage.

    Energy Technology Data Exchange (ETDEWEB)

    Sholklapper, Tal [Voltaiq, Inc.

    2016-04-19

    Poor battery performance is a primary source of user dissatisfaction across a broad range of applications, and is a key bottleneck hindering the growth of mobile technology, wearables, electric vehicles, and grid energy storage. Engineering battery systems is difficult, requiring extensive testing for vendor selection, BMS programming, and application-specific lifetime testing. This work also generates huge quantities of data. This presentation will explain how to leverage this data to help ship quality products faster using fewer resources while ensuring safety and reliability in the field, ultimately turning battery performance into a competitive advantage.

  14. Performance of a data-driven technique to changes in wave height and its effect on beach response

    Directory of Open Access Journals (Sweden)

    Jose M. Horrillo-Caraballo

    2016-01-01

    Full Text Available In this study the medium-term response of beach profiles was investigated at two sites: a gently sloping sandy beach and a steeper mixed sand and gravel beach. The former is the Duck site in North Carolina, on the east coast of the USA, which is exposed to Atlantic Ocean swells and storm waves, and the latter is the Milford-on-Sea site at Christchurch Bay, on the south coast of England, which is partially sheltered from Atlantic swells but has a directionally bimodal wave exposure. The data sets comprise detailed bathymetric surveys of beach profiles covering a period of more than 25 years for the Duck site and over 18 years for the Milford-on-Sea site. The structure of the data sets and the data-driven methods are described. Canonical correlation analysis (CCA was used to find linkages between the wave characteristics and beach profiles. The sensitivity of the linkages was investigated by deploying a wave height threshold to filter out the smaller waves incrementally. The results of the analysis indicate that, for the gently sloping sandy beach, waves of all heights are important to the morphological response. For the mixed sand and gravel beach, filtering the smaller waves improves the statistical fit and it suggests that low-height waves do not play a primary role in the medium-term morphological response, which is primarily driven by the intermittent larger storm waves.
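
    A schematic of the CCA linkage analysis (not the study's actual data pipeline) is sketched below; the descriptor and profile dimensions are arbitrary, and a wave-height threshold would simply filter the wave descriptors in X before fitting:

        import numpy as np
        from sklearn.cross_decomposition import CCA

        rng = np.random.default_rng(5)
        n_surveys = 120

        # X: wave-climate descriptors between surveys (e.g., binned wave-height
        # histograms); Y: beach-profile elevations at fixed cross-shore positions.
        X = rng.standard_normal((n_surveys, 10))
        Y = 0.5 * X @ rng.standard_normal((10, 25)) \
            + rng.standard_normal((n_surveys, 25))

        cca = CCA(n_components=3).fit(X, Y)
        Xc, Yc = cca.transform(X, Y)

        # Canonical correlations quantify the wave-profile linkage.
        for k in range(3):
            print(f"mode {k}: r = {np.corrcoef(Xc[:, k], Yc[:, k])[0, 1]:.2f}")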

  15. On Lack of Robustness in Hydrological Model Development Due to Absence of Guidelines for Selecting Calibration and Evaluation Data: Demonstration for Data-Driven Models

    Science.gov (United States)

    Zheng, Feifei; Maier, Holger R.; Wu, Wenyan; Dandy, Graeme C.; Gupta, Hoshin V.; Zhang, Tuqiao

    2018-02-01

    Hydrological models are used for a wide variety of engineering purposes, including streamflow forecasting and flood-risk estimation. To develop such models, it is common to allocate the available data to calibration and evaluation data subsets. Surprisingly, the issue of how this allocation can affect model evaluation performance has been largely ignored in the research literature. This paper discusses the evaluation performance bias that can arise from how available data are allocated to calibration and evaluation subsets. As a first step to assessing this issue in a statistically rigorous fashion, we present a comprehensive investigation of the influence of data allocation on the development of data-driven artificial neural network (ANN) models of streamflow. Four well-known formal data splitting methods are applied to 754 catchments from Australia and the U.S. to develop 902,483 ANN models. Results clearly show that the choice of the method used for data allocation has a significant impact on model performance, particularly for runoff data that are more highly skewed, highlighting the importance of considering the impact of data splitting when developing hydrological models. The statistical behavior of the data splitting methods investigated is discussed and guidance is offered on the selection of the most appropriate data splitting methods to achieve representative evaluation performance for streamflow data with different statistical properties. Although our results are obtained for data-driven models, they highlight the fact that this issue is likely to have a significant impact on all types of hydrological models, especially conceptual rainfall-runoff models.
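
    The sensitivity to data allocation can be illustrated by contrasting a purely random split with a simple systematic split on skewed synthetic flows; the systematic scheme below is a crude stand-in for formal methods such as DUPLEX, not an implementation of them:

        import numpy as np

        rng = np.random.default_rng(6)
        flow = rng.lognormal(mean=1.0, sigma=1.2, size=2000)   # skewed streamflow

        def random_split(y, frac=0.7):
            idx = rng.permutation(len(y))
            cut = int(frac * len(y))
            return idx[:cut], idx[cut:]

        def systematic_split(y, frac=0.7):
            # Sort by magnitude and deal records out so both subsets span the range.
            order = np.argsort(y)
            calib = order[np.arange(len(y)) % 10 < int(frac * 10)]
            return calib, np.setdiff1d(order, calib)

        for name, split in [("random", random_split),
                            ("systematic", systematic_split)]:
            c, e = split(flow)
            print(f"{name:10s} calib max {flow[c].max():8.1f}"
                  f"  eval max {flow[e].max():8.1f}")

    Whether the evaluation subset contains the extreme events that the calibration subset saw (or vice versa) changes the apparent model skill, which is the bias the paper quantifies.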

  16. Data-Driven User Feedback: An Improved Neurofeedback Strategy considering the Interindividual Variability of EEG Features.

    Science.gov (United States)

    Han, Chang-Hee; Lim, Jeong-Hwan; Lee, Jun-Hak; Kim, Kangsan; Im, Chang-Hwan

    2016-01-01

    It has frequently been reported that some users of conventional neurofeedback systems can experience only a small portion of the total feedback range due to the large interindividual variability of EEG features. In this study, we proposed a data-driven neurofeedback strategy considering the individual variability of electroencephalography (EEG) features to permit users of the neurofeedback system to experience a wider range of auditory or visual feedback without a customization process. The main idea of the proposed strategy is to adjust the ranges of each feedback level using the density in the offline EEG database acquired from a group of individuals. Twenty-two healthy subjects participated in offline experiments to construct an EEG database, and five subjects participated in online experiments to validate the performance of the proposed data-driven user feedback strategy. Using the optimized bin sizes, the number of feedback levels that each individual experienced was significantly increased to 139% and 144% of the original results with uniform bin sizes in the offline and online experiments, respectively. Our results demonstrated that the use of our data-driven neurofeedback strategy could effectively increase the overall range of feedback levels that each individual experienced during neurofeedback training.
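
    The core idea, sizing feedback bins by population density rather than by equal widths, reduces to quantile-based bin edges; the sketch below uses a synthetic feature distribution and an assumed number of levels:

        import numpy as np

        rng = np.random.default_rng(7)
        # Offline database: a band-power feature pooled across many individuals.
        database = rng.lognormal(mean=0.0, sigma=0.5, size=20000)

        n_levels = 10
        # Equal-density bins: each feedback level covers the same share of the
        # population distribution, instead of equal-width bins on the raw scale.
        edges = np.quantile(database, np.linspace(0, 1, n_levels + 1))

        def feedback_level(feature_value):
            level = np.searchsorted(edges, feature_value) - 1
            return int(np.clip(level, 0, n_levels - 1))

        print(feedback_level(0.4), feedback_level(1.0), feedback_level(3.0))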

  17. Data-driven classification of patients with primary progressive aphasia.

    Science.gov (United States)

    Hoffman, Paul; Sajjadi, Seyed Ahmad; Patterson, Karalyn; Nestor, Peter J

    2017-11-01

    Current diagnostic criteria classify primary progressive aphasia (PPA) into three variants, semantic (sv), nonfluent (nfv) and logopenic (lv) PPA, though the adequacy of this scheme is debated. This study took a data-driven approach, applying k-means clustering to data from 43 PPA patients. The algorithm grouped patients based on similarities in language, semantic and non-linguistic cognitive scores. The optimum solution consisted of three groups. One group, almost exclusively those diagnosed as svPPA, displayed a selective semantic impairment. A second cluster, with impairments to speech production, repetition and syntactic processing, contained a majority of patients with nfvPPA but also some lvPPA patients. The final group exhibited more severe deficits to speech, repetition and syntax as well as semantic and other cognitive deficits. These results suggest that, amongst cases of non-semantic PPA, differentiation mainly reflects overall degree of language/cognitive impairment. The observed patterns were scarcely affected by inclusion/exclusion of non-linguistic cognitive scores. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
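
    A hedged sketch of this kind of data-driven grouping, with an internal-validity criterion used to choose the number of clusters, is shown below on synthetic stand-in scores (the paper's actual feature set and preprocessing are not reproduced):

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.metrics import silhouette_score
        from sklearn.preprocessing import StandardScaler

        rng = np.random.default_rng(8)
        # Stand-in for 43 patients x standardized language/cognitive test scores.
        scores = StandardScaler().fit_transform(rng.standard_normal((43, 12)))

        for k in range(2, 6):
            labels = KMeans(n_clusters=k, n_init=20,
                            random_state=0).fit_predict(scores)
            print(k, round(silhouette_score(scores, labels), 3))
        # The k with the best internal validity (k = 3 in the paper) defines the
        # data-driven grouping, which is then compared against clinical diagnoses.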

  18. Academic Performance: An Approach From Data Mining

    Directory of Open Access Journals (Sweden)

    David L. La Red Martinez

    2012-02-01

    Full Text Available The relatively low percentage of students promoted and regularized in the Operating Systems course of the LSI (Bachelor's Degree in Information Systems) of FaCENA (Faculty of Exact and Natural Sciences and Surveying, Facultad de Ciencias Exactas, Naturales y Agrimensura) of UNNE (academic success) prompted this work, whose objective is to determine the variables that affect academic performance, considering the final status of the student according to Res. 185/03 CD (scheme for evaluation and promotion: promoted, regular or free). The variables considered are: status of the student, educational level of parents, secondary education, socio-economic level, and others. Data warehouse (DW) and data mining (DM) techniques were used to search for profiles of students and determine potential situations of academic success or failure. Classifications were made through clustering techniques according to different criteria, including: classification mining according to academic program, according to final status of the student, and according to importance given to the study, as well as demographic clustering and Kohonen clustering according to final status of the student. Statistics were produced on partitions, details of partitions, details of clusters, details of fields and field frequencies, overall and detailed quality of each process (precision, classification, reliability), confusion matrices, gain/elevation diagrams, trees, node distributions, field importance, field correspondence tables, and cluster statistics. Once profiles of students with low academic performance have been determined, actions may be taken aimed at avoiding potential academic failures. This work aims to provide a brief description of aspects related to the data warehouse built and some of the data mining processes developed on it.

  19. Data-driven gradient algorithm for high-precision quantum control

    Science.gov (United States)

    Wu, Re-Bing; Chu, Bing; Owens, David H.; Rabitz, Herschel

    2018-04-01

    In the quest to achieve scalable quantum information processing technologies, gradient-based optimal control algorithms (e.g., grape) are broadly used for implementing high-precision quantum gates, but their performance is often hindered by deterministic or random errors in the system model and the control electronics. In this paper, we show that grape can be taught to be more effective by jointly learning from the design model and the experimental data obtained from process tomography. The resulting data-driven gradient optimization algorithm (d-grape) can in principle correct all deterministic gate errors, with a mild efficiency loss. The d-grape algorithm may become more powerful with broadband controls that involve a large number of control parameters, while other algorithms usually slow down due to the increased size of the search space. These advantages are demonstrated by simulating the implementation of a two-qubit controlled-not gate.
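
    The flavour of such gradient-based pulse optimization can be seen in a toy, model-only sketch below: plain grape-style ascent on a single qubit, where the Hamiltonian, time grid and finite-difference gradients are simplifying assumptions, and the data-driven correction from process tomography described above is not implemented.

    ```python
    import numpy as np

    sx = np.array([[0, 1], [1, 0]], dtype=complex)   # Pauli X (also the target gate)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)  # Pauli Z (drift term)

    def propagator(u, dt=0.1):
        """Piecewise-constant evolution U = prod_k exp(-i (H0 + u_k Hc) dt)."""
        U = np.eye(2, dtype=complex)
        for uk in u:
            H = 0.5 * sz + uk * sx
            w, V = np.linalg.eigh(H)                 # exact exponential via eigendecomposition
            U = (V * np.exp(-1j * w * dt)) @ V.conj().T @ U
        return U

    def fidelity(u):
        """Gate fidelity |Tr(G^dagger U)|^2 / d^2 against the target G = X."""
        return abs(np.trace(sx.conj().T @ propagator(u))) ** 2 / 4.0

    u = 0.1 * np.ones(20)                            # 20 control amplitudes
    for step in range(300):                          # finite-difference gradient ascent
        base = fidelity(u)
        grad = np.array([(fidelity(u + 1e-6 * np.eye(20)[k]) - base) / 1e-6
                         for k in range(20)])
        u += 1.0 * grad
    print("gate fidelity:", round(fidelity(u), 4))
    ```

    In the d-grape setting, the gradients computed from the design model would additionally be corrected with tomography measurements, which is what allows deterministic gate errors to be compensated.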

  20. The Hypothesis-Driven Physical Examination.

    Science.gov (United States)

    Garibaldi, Brian T; Olson, Andrew P J

    2018-05-01

    The physical examination remains a vital part of the clinical encounter. However, physical examination skills have declined in recent years, in part because of decreased time at the bedside. Many clinicians question the relevance of physical examinations in the age of technology. A hypothesis-driven approach to teaching and practicing the physical examination emphasizes the performance of maneuvers that can alter the likelihood of disease. Likelihood ratios are diagnostic weights that allow clinicians to estimate the post-test probability of disease. This hypothesis-driven approach to the physical examination increases its value and efficiency, while preserving its cultural role in the patient-physician relationship. Copyright © 2017 Elsevier Inc. All rights reserved.
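
    As a worked illustration of that arithmetic, post-test odds are simply pre-test odds multiplied by the likelihood ratio; the sketch below converts between probabilities and odds (all numbers are invented for the example, not taken from the article).

    ```python
    def post_test_probability(pre_test_prob, likelihood_ratio):
        """Update the probability of disease after an exam finding, using the
        finding's likelihood ratio: post-test odds = pre-test odds * LR."""
        pre_odds = pre_test_prob / (1.0 - pre_test_prob)
        post_odds = pre_odds * likelihood_ratio
        return post_odds / (1.0 + post_odds)

    # A positive finding with LR+ = 8 raises a 20% pre-test probability to ~67%;
    # a negative finding with LR- = 0.25 lowers it to ~6%.
    print(round(post_test_probability(0.20, 8.0), 2))   # 0.67
    print(round(post_test_probability(0.20, 0.25), 2))  # 0.06
    ```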

  1. An Event-Driven Classifier for Spiking Neural Networks Fed with Synthetic or Dynamic Vision Sensor Data

    Directory of Open Access Journals (Sweden)

    Evangelos Stromatias

    2017-06-01

    Full Text Available This paper introduces a novel methodology for training an event-driven classifier within a Spiking Neural Network (SNN) system capable of yielding good classification results when using both synthetic input data and real data captured from Dynamic Vision Sensor (DVS) chips. The proposed supervised method uses the spiking activity provided by an arbitrary topology of prior SNN layers to build histograms and train the classifier in the frame domain using the stochastic gradient descent algorithm. In addition, this approach can cope with leaky integrate-and-fire neuron models within the SNN, a desirable feature for real-world SNN applications, where neural activation must fade away after some time in the absence of inputs. Consequently, this way of building histograms captures the dynamics of spikes immediately before the classifier. We tested our method on the MNIST data set using different synthetic encodings and real DVS sensory data sets such as N-MNIST, MNIST-DVS, and Poker-DVS using the same network topology and feature maps. We demonstrate the effectiveness of our approach by achieving the highest classification accuracy reported on the N-MNIST (97.77%) and Poker-DVS (100%) real DVS data sets to date with a spiking convolutional network. Moreover, by using the proposed method we were able to retrain the output layer of a previously reported spiking neural network and increase its performance by 2%, suggesting that the proposed classifier can be used as the output layer in works where features are extracted using unsupervised spike-based learning methods. In addition, we also analyze SNN performance figures such as total event activity and network latencies, which are relevant for eventual hardware implementations. In summary, the paper aggregates unsupervised-trained SNNs with a supervised-trained SNN classifier, combining and applying them to heterogeneous sets of benchmarks, both synthetic and from real DVS chips.

  2. An Event-Driven Classifier for Spiking Neural Networks Fed with Synthetic or Dynamic Vision Sensor Data.

    Science.gov (United States)

    Stromatias, Evangelos; Soto, Miguel; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabé

    2017-01-01

    This paper introduces a novel methodology for training an event-driven classifier within a Spiking Neural Network (SNN) System capable of yielding good classification results when using both synthetic input data and real data captured from Dynamic Vision Sensor (DVS) chips. The proposed supervised method uses the spiking activity provided by an arbitrary topology of prior SNN layers to build histograms and train the classifier in the frame domain using the stochastic gradient descent algorithm. In addition, this approach can cope with leaky integrate-and-fire neuron models within the SNN, a desirable feature for real-world SNN applications, where neural activation must fade away after some time in the absence of inputs. Consequently, this way of building histograms captures the dynamics of spikes immediately before the classifier. We tested our method on the MNIST data set using different synthetic encodings and real DVS sensory data sets such as N-MNIST, MNIST-DVS, and Poker-DVS using the same network topology and feature maps. We demonstrate the effectiveness of our approach by achieving the highest classification accuracy reported on the N-MNIST (97.77%) and Poker-DVS (100%) real DVS data sets to date with a spiking convolutional network. Moreover, by using the proposed method we were able to retrain the output layer of a previously reported spiking neural network and increase its performance by 2%, suggesting that the proposed classifier can be used as the output layer in works where features are extracted using unsupervised spike-based learning methods. In addition, we also analyze SNN performance figures such as total event activity and network latencies, which are relevant for eventual hardware implementations. In summary, the paper aggregates unsupervised-trained SNNs with a supervised-trained SNN classifier, combining and applying them to heterogeneous sets of benchmarks, both synthetic and from real DVS chips.
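
    To make the histogram idea concrete, here is a minimal sketch: fabricated spike events carry a planted class structure, and a single softmax layer stands in for the classifier. The SNN front end, leaky integrate-and-fire dynamics and the real DVS datasets from the paper are not modelled.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n_samples, n_neurons, n_classes = 200, 64, 10
    labels = rng.integers(0, n_classes, n_samples)

    def spike_histogram(events, n_neurons):
        """Accumulate per-neuron spike counts over one presentation window."""
        hist = np.zeros(n_neurons)
        for _, neuron in events:
            hist[neuron] += 1
        return hist / max(hist.sum(), 1.0)

    def make_events(label, n_events=120):
        """Fabricated (timestamp, neuron) events: neurons near label*6 fire more."""
        preferred = rng.integers(label * 6, label * 6 + 6, size=n_events // 2)
        background = rng.integers(0, n_neurons, size=n_events - n_events // 2)
        return list(enumerate(np.concatenate([preferred, background])))

    X = np.stack([spike_histogram(make_events(y), n_neurons) for y in labels])

    # Softmax classifier trained on the histograms with plain SGD.
    W = np.zeros((n_neurons, n_classes))
    for epoch in range(50):
        for x, y in zip(X, labels):
            logits = x @ W
            p = np.exp(logits - logits.max())
            p /= p.sum()
            W -= 0.5 * np.outer(x, p - np.eye(n_classes)[y])
    print("train accuracy:", np.mean((X @ W).argmax(axis=1) == labels))
    ```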

  3. A Simulation Approach for Performance Validation during Embedded Systems Design

    Science.gov (United States)

    Wang, Zhonglei; Haberl, Wolfgang; Herkersdorf, Andreas; Wechs, Martin

    Due to the time-to-market pressure, it is highly desirable to design hardware and software of embedded systems in parallel. However, hardware and software are developed mostly using very different methods, so that performance evaluation and validation of the whole system is not an easy task. In this paper, we propose a simulation approach to bridge the gap between model-driven software development and simulation based hardware design, by merging hardware and software models into a SystemC based simulation environment. An automated procedure has been established to generate software simulation models from formal models, while the hardware design is originally modeled in SystemC. As the simulation models are annotated with timing information, performance issues are tackled in the same pass as system functionality, rather than in a dedicated approach.

  4. High Performance Multivariate Visual Data Exploration for Extremely Large Data

    International Nuclear Information System (INIS)

    Ruebel, Oliver; Wu, Kesheng; Childs, Hank; Meredith, Jeremy; Geddes, Cameron G.R.; Cormier-Michel, Estelle; Ahern, Sean; Weber, Gunther H.; Messmer, Peter; Hagen, Hans; Hamann, Bernd; Bethel, E. Wes; Prabhat

    2008-01-01

    One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that deals with this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. This approach is demonstrated within the context of gaining insight from complex, time-varying datasets produced by a laser wakefield accelerator simulation. Our approach leverages histogram-based parallel coordinates for both visual information display as well as a vehicle for guiding a data mining operation. Data extraction and subsetting are implemented with state-of-the-art index/query technology. This approach, while applied here to accelerator science, is generally applicable to a broad set of science applications, and is implemented in a production-quality visual data analysis infrastructure. We conduct a detailed performance analysis and demonstrate good scalability on a distributed memory Cray XT4 system

  5. High Performance Multivariate Visual Data Exploration for Extremely Large Data

    Energy Technology Data Exchange (ETDEWEB)

    Rubel, Oliver; Wu, Kesheng; Childs, Hank; Meredith, Jeremy; Geddes, Cameron G.R.; Cormier-Michel, Estelle; Ahern, Sean; Weber, Gunther H.; Messmer, Peter; Hagen, Hans; Hamann, Bernd; Bethel, E. Wes; Prabhat,

    2008-08-22

    One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that deals with this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. This approach is demonstrated within the context of gaining insight from complex, time-varying datasets produced by a laser wakefield accelerator simulation. Our approach leverages histogram-based parallel coordinates for both visual information display as well as a vehicle for guiding a data mining operation. Data extraction and subsetting are implemented with state-of-the-art index/query technology. This approach, while applied here to accelerator science, is generally applicable to a broad set of science applications, and is implemented in a production-quality visual data analysis infrastructure. We conduct a detailed performance analysis and demonstrate good scalability on a distributed memory Cray XT4 system.

  6. A new practice-driven approach to develop software in a cyber-physical system environment

    Science.gov (United States)

    Jiang, Yiping; Chen, C. L. Philip; Duan, Junwei

    2016-02-01

    Cyber-physical system (CPS) is an emerging area, which cannot work efficiently without proper software handling of the data and business logic. Software and middleware is the soul of the CPS. The software development of CPS is a critical issue because of its complexity in a large-scale realistic system. Furthermore, the object-oriented approach (OOA) is often used to develop CPS software, which needs some improvements according to the characteristics of CPS. To develop software in a CPS environment, a new systematic approach is proposed in this paper. It comes from practice and has evolved within software companies. It consists of (A) requirement analysis in an event-oriented way, (B) architecture design in a data-oriented way, (C) detailed design and coding in an object-oriented way and (D) testing in an event-oriented way. It is a new approach based on OOA; the difference compared with OOA is that the proposed approach has different emphases and measures at every stage. It accords better with the characteristics of event-driven CPS. In CPS software development, one should focus on the events more than on the functions or objects. A case study of a smart home system is designed to reveal the effectiveness of the approach. It shows that the approach is also easy to operate in practice owing to some simplifications. The running result illustrates the validity of this approach.

  7. Big Data: An Opportunity for Collaboration with Computer Scientists on Data-Driven Science

    Science.gov (United States)

    Baru, C.

    2014-12-01

    Big data technologies are evolving rapidly, driven by the need to manage ever increasing amounts of historical data; process relentless streams of human and machine-generated data; and integrate data of heterogeneous structure from extremely heterogeneous sources of information. Big data is inherently an application-driven problem. Developing the right technologies requires an understanding of the application domain. However, an intriguing aspect of this phenomenon is that the availability of the data itself enables new applications not previously conceived of! In this talk, we will discuss how the big data phenomenon creates an imperative for collaboration among domain scientists (in this case, geoscientists) and computer scientists. Domain scientists provide the application requirements as well as insights about the data involved, while computer scientists help assess whether problems can be solved with currently available technologies or require adaptation of existing technologies and/or development of new technologies. The synergy can create vibrant collaborations potentially leading to new science insights as well as development of new data technologies and systems. The area of interface between geosciences and computer science, also referred to as geoinformatics, is, we believe, a fertile area for interdisciplinary research.

  8. A manifold learning approach to data-driven computational materials and processes

    Science.gov (United States)

    Ibañez, Ruben; Abisset-Chavanne, Emmanuelle; Aguado, Jose Vicente; Gonzalez, David; Cueto, Elias; Duval, Jean Louis; Chinesta, Francisco

    2017-10-01

    Standard simulation in classical mechanics is based on the use of two very different types of equations. The first one, of axiomatic character, is related to balance laws (momentum, mass, energy, …), whereas the second one consists of models that scientists have extracted from collected, natural or synthetic data. In this work we propose a new method, able to directly link data to computers in order to perform numerical simulations. These simulations will employ universal laws while minimizing the need of explicit, often phenomenological, models. They are based on manifold learning methodologies.
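
    As a toy illustration of the manifold-learning ingredient (not the authors' formulation), the sketch below uses scikit-learn's locally linear embedding to recover the single intrinsic coordinate of synthetic "material response" samples; a simulator could then interpolate responses along this manifold while the balance laws are enforced separately.

    ```python
    import numpy as np
    from sklearn.manifold import LocallyLinearEmbedding

    rng = np.random.default_rng(0)

    # Synthetic "material response" samples: points that actually live on a
    # 1-D curve (a nonlinear constitutive law) embedded in 3-D, standing in
    # for collected experimental or synthetic data.
    strain = rng.uniform(0.0, 1.0, 400)
    data = np.column_stack([
        strain,
        np.tanh(3.0 * strain),   # stress-like response
        0.5 * strain ** 2,       # a dependent internal variable
    ])

    # Manifold learning recovers the low-dimensional structure; simulations can
    # then interpolate on this manifold instead of evaluating an explicit,
    # phenomenological model.
    embedding = LocallyLinearEmbedding(n_neighbors=10, n_components=1)
    coords = embedding.fit_transform(data)
    print(coords.shape)  # (400, 1): one intrinsic coordinate per sample
    ```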

  9. Data and analytics to inform energy retrofit of high performance buildings

    International Nuclear Information System (INIS)

    Hong, Tianzhen; Yang, Le; Hill, David; Feng, Wei

    2014-01-01

    Highlights: • High performance buildings can be retrofitted using measured data and analytics. • Data of energy use, systems operating and environmental conditions are needed. • An energy data model based on the ISO Standard 12655 is key for energy benchmarking. • Three types of analytics are used: energy profiling, benchmarking, and diagnostics. • The case study shows 20% of electricity can be saved by retrofit. - Abstract: Buildings consume more than one-third of the world's primary energy. Reducing energy use in buildings with energy efficient technologies is feasible and also driven by energy policies such as energy benchmarking, disclosure, rating, and labeling in both the developed and developing countries. Current energy retrofits focus on the existing building stock, especially older buildings, but the growing number of new high performance buildings built around the world raises the question of how these buildings perform and whether there are retrofit opportunities to further reduce their energy use. This is a new and unique problem for the building industry. Traditional energy audit or analysis methods are inadequate to look deeply into the energy use of high performance buildings. This study aims to tackle this problem with a new holistic approach powered by building performance data and analytics. First, three types of measured data are introduced, including time series energy use, building systems operating conditions, and indoor and outdoor environmental parameters. An energy data model based on the ISO Standard 12655 is used to represent the energy use in buildings in a three-level hierarchy. Secondly, a suite of analytics was proposed to analyze energy use and to identify retrofit measures for high performance buildings. The data-driven analytics are based on monitored data at short time intervals, and cover three levels of analysis - energy profiling, benchmarking and diagnostics. Thirdly, the analytics were applied to a high

  10. From Brand Management to Global Business Management in Market-Driven Companies

    OpenAIRE

    Emilio Zito

    2009-01-01

    Over the past several years, the most competitive mass-market companies (automobile, high-tech, consumer and retail, etc.) have been experiencing a new strategic approach around the concept of Market-Driven strategy, as opposed to a pure marketing-focused approach known as Customer-Driven strategy. A fast-moving, mass-market global company would likely have a precise performance measurement system in place with broad performance indicators based on: project economics, ratios analysis (ROI, in...

  11. Thermochemical performance analysis of solar driven CO_2 methane reforming

    International Nuclear Information System (INIS)

    Fuqiang, Wang; Jianyu, Tan; Huijian, Jin; Yu, Leng

    2015-01-01

    Increasing CO_2 emission problems create urgent challenges for alleviating global warming, and the capture of CO_2 has become an essential field of scientific research. In this study, a finite volume method (FVM) coupled with thermochemical kinetics was developed to analyze the solar driven CO_2 methane reforming process in a metallic foam reactor. The local thermal non-equilibrium (LTNE) model coupled with radiative heat transfer was developed to provide more temperature information. A joint inversion method based on chemical process software and the FVM coupled with thermochemical kinetics was developed to obtain the thermochemical reaction parameters and guarantee the calculation accuracy. The detailed thermal and thermochemical performance in the metal foam reactor was analyzed. In addition, the effects of heat flux distribution and porosity on the solar driven CO_2 methane reforming process were analyzed. The numerical results can serve as theoretical guidance for the solar driven CO_2 methane reforming application. - Highlights: • Solar driven CO_2 methane reforming process in metal foam reactor is analyzed. • FVM with chemical reactions was developed to analyze solar CO_2 methane reforming. • A joint inversion method was developed to obtain thermochemical reaction parameters. • Results can be a guidance for the solar driven CO_2 methane reforming application.

  12. A model-driven approach to information security compliance

    Science.gov (United States)

    Correia, Anacleto; Gonçalves, António; Teodoro, M. Filomena

    2017-06-01

    The availability, integrity and confidentiality of information are fundamental to the long-term survival of any organization. Information security is a complex issue that must be holistically approached, combining assets that support corporate systems, in an extended network of business partners, vendors, customers and other stakeholders. This paper addresses the conception and implementation of information security systems, conforming to the ISO/IEC 27000 set of standards, using the model-driven approach. The process begins with the conception of a domain level model (computation independent model) based on the information security vocabulary present in the ISO/IEC 27001 standard. Based on this model, after embedding in the model mandatory rules for attaining ISO/IEC 27001 conformance, a platform independent model is derived. Finally, a platform specific model serves as the basis for testing the compliance of information security systems with the ISO/IEC 27000 set of standards.

  13. Interdisciplinary process driven performative morphologies : A morphogenomic approach towards developing context aware spatial formations

    NARCIS (Netherlands)

    Biloria, N.M.

    2011-01-01

    Architectural praxis is in continuous state of change. The introduction of information technology driven design techniques, constantly updating building information modeling protocols, new policy demands coupled together with environmental regulations and cultural fluctuations are all open-ended

  14. A Data-Driven Response Virtual Sensor Technique with Partial Vibration Measurements Using Convolutional Neural Network

    Science.gov (United States)

    Sun, Shan-Bin; He, Yuan-Yuan; Zhou, Si-Da; Yue, Zhen-Jiang

    2017-01-01

    Measurement of dynamic responses plays an important role in structural health monitoring, damage detection and other fields of research. However, in aerospace engineering, the physical sensors are limited in the operational conditions of spacecraft, due to the severe environment in outer space. This paper proposes a virtual sensor model with partial vibration measurements using a convolutional neural network. The transmissibility function is employed as prior knowledge. A four-layer neural network with two convolutional layers, one fully connected layer, and an output layer is proposed as the predicting model. Numerical examples of two different structural dynamic systems demonstrate the performance of the proposed approach. The excellence of the novel technique is further indicated using a simply supported beam experiment comparing to a modal-model-based virtual sensor, which uses modal parameters, such as mode shapes, for estimating the responses of the faulty sensors. The results show that the presented data-driven response virtual sensor technique can predict structural response with high accuracy. PMID:29231868

  15. A Data-Driven Response Virtual Sensor Technique with Partial Vibration Measurements Using Convolutional Neural Network.

    Science.gov (United States)

    Sun, Shan-Bin; He, Yuan-Yuan; Zhou, Si-Da; Yue, Zhen-Jiang

    2017-12-12

    Measurement of dynamic responses plays an important role in structural health monitoring, damage detection and other fields of research. However, in aerospace engineering, the physical sensors are limited in the operational conditions of spacecraft, due to the severe environment in outer space. This paper proposes a virtual sensor model with partial vibration measurements using a convolutional neural network. The transmissibility function is employed as prior knowledge. A four-layer neural network with two convolutional layers, one fully connected layer, and an output layer is proposed as the predicting model. Numerical examples of two different structural dynamic systems demonstrate the performance of the proposed approach. The excellence of the novel technique is further indicated using a simply supported beam experiment comparing to a modal-model-based virtual sensor, which uses modal parameters, such as mode shapes, for estimating the responses of the faulty sensors. The results show that the presented data-driven response virtual sensor technique can predict structural response with high accuracy.
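
    A minimal sketch of such a network is given below in PyTorch. The two 1-D convolutional layers, one fully connected layer and linear output mirror the four-layer layout described above, but the channel counts, kernel sizes and window length are invented assumptions, and the transmissibility-function prior from the paper is not encoded.

    ```python
    import torch
    import torch.nn as nn

    class VirtualSensorCNN(nn.Module):
        """Regress the response window of an unmeasured ("virtual") channel
        from windows measured at the physical sensors."""

        def __init__(self, n_inputs=4, window=64):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(n_inputs, 16, kernel_size=5, padding=2), nn.ReLU(),
                nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            )
            self.head = nn.Sequential(
                nn.Flatten(),
                nn.Linear(32 * window, 128), nn.ReLU(),
                nn.Linear(128, window),      # predicted response time window
            )

        def forward(self, x):                # x: (batch, n_inputs, window)
            return self.head(self.features(x))

    model = VirtualSensorCNN()
    measured = torch.randn(8, 4, 64)         # vibration windows from real sensors
    predicted = model(measured)              # (8, 64): virtual sensor response
    loss = nn.functional.mse_loss(predicted, torch.randn(8, 64))
    loss.backward()                          # ready for an optimizer step
    ```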

  16. Data-driven Regulation and Governance in Smart Cities

    NARCIS (Netherlands)

    Ranchordás, Sofia; Klop, Abram; Mak, Vanessa; Berlee, Anna; Tjong Tjin Tai, Eric

    2018-01-01

    This chapter discusses the concept of data-driven regulation and governance in the context of smart cities by describing how these urban centres harness these technologies to collect and process information about citizens, traffic, urban planning or waste production. It describes how several smart

  17. Data-driven design of fault diagnosis and fault-tolerant control systems

    CERN Document Server

    Ding, Steven X

    2014-01-01

    Data-driven Design of Fault Diagnosis and Fault-tolerant Control Systems presents basic statistical process monitoring, fault diagnosis, and control methods, and introduces advanced data-driven schemes for the design of fault diagnosis and fault-tolerant control systems catering to the needs of dynamic industrial processes. With ever increasing demands for reliability, availability and safety in technical processes and assets, process monitoring and fault-tolerance have become important issues surrounding the design of automatic control systems. This text shows the reader how, thanks to the rapid development of information technology, key techniques of data-driven and statistical process monitoring and control can now become widely used in industrial practice to address these issues. To allow for self-contained study and facilitate implementation in real applications, important mathematical and control theoretical knowledge and tools are included in this book. Major schemes are presented in algorithm form and...

  18. Limited angle CT reconstruction by simultaneous spatial and Radon domain regularization based on TV and data-driven tight frame

    Science.gov (United States)

    Zhang, Wenkun; Zhang, Hanming; Wang, Linyuan; Cai, Ailong; Li, Lei; Yan, Bin

    2018-02-01

    Limited angle computed tomography (CT) reconstruction is widely performed in medical diagnosis and industrial testing because of the size of objects, engine/armor inspection requirements, and limited scan flexibility. Limited angle reconstruction necessitates the use of optimization-based methods that utilize additional sparse priors. However, most conventional methods solely exploit sparsity priors in the spatial domain. When the CT projection suffers from serious data deficiency or various noises, obtaining reconstructed images that meet quality requirements becomes difficult and challenging. To solve this problem, this paper develops an adaptive reconstruction method for the limited angle CT problem. The proposed method simultaneously uses a spatial and Radon domain regularization model based on total variation (TV) and a data-driven tight frame. The data-driven tight frame, derived from wavelet transformation, aims at exploiting sparsity priors of the sinogram in the Radon domain. Unlike existing works that utilize a pre-constructed sparse transformation, the framelets of the data-driven regularization model can be adaptively learned from the latest projection data in the process of iterative reconstruction, to provide optimal sparse approximations for a given sinogram. At the same time, an effective alternating direction method is designed to solve the simultaneous spatial and Radon domain regularization model. The experiments on both simulated and real data demonstrate that the proposed algorithm shows better performance in artifact suppression and detail preservation than algorithms solely using a spatial-domain regularization model. Quantitative evaluations of the results also indicate that the proposed algorithm, applying the learning strategy, performs better than dual-domain algorithms without a learned regularization model.
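
    The flavour of the dual-domain iteration can be caricatured as follows. This heavily simplified sketch alternates a Radon-domain wavelet shrinkage (a fixed db4 frame standing in for the adaptively learned tight frame), an unfiltered backprojection gradient step, and a spatial TV denoising step; the step size, threshold and the scikit-image/PyWavelets calls are all assumptions, and the paper's alternating direction method is not reproduced.

    ```python
    import numpy as np
    import pywt
    from skimage.data import shepp_logan_phantom
    from skimage.transform import radon, iradon, resize
    from skimage.restoration import denoise_tv_chambolle

    image = resize(shepp_logan_phantom(), (128, 128))
    theta = np.linspace(0.0, 120.0, 120, endpoint=False)   # limited 120-degree arc
    sinogram = radon(image, theta=theta)

    def wavelet_shrink(s, frac=0.9):
        """Soft-threshold sinogram wavelet coefficients (a fixed-frame stand-in
        for the learned data-driven tight frame)."""
        coeffs = pywt.wavedec2(s, "db4", level=2)
        details = np.concatenate([np.abs(d).ravel() for c in coeffs[1:] for d in c])
        t = np.quantile(details, frac)
        new = [coeffs[0]] + [tuple(pywt.threshold(d, t, mode="soft") for d in c)
                             for c in coeffs[1:]]
        return pywt.waverec2(new, "db4")[:s.shape[0], :s.shape[1]]

    s_reg = wavelet_shrink(sinogram)                        # Radon-domain step
    x = np.zeros_like(image)
    for it in range(20):
        residual = s_reg - radon(x, theta=theta)            # data-fidelity gradient
        x = x + 0.1 * iradon(residual, theta=theta, filter_name=None)
        x = denoise_tv_chambolle(np.clip(x, 0, None), weight=0.05)  # spatial TV step
    ```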

  19. Product design pattern based on big data-driven scenario

    Directory of Open Access Journals (Sweden)

    Conggang Yu

    2016-07-01

    Full Text Available This article discusses new product design patterns in the big data era, gives designers a new rational way of thinking, and offers a new way to understand the design of products. Based on the key criteria of the product design process, category, element, and product are used to input the data, which comprises concrete data and abstract data, as an enlargement of the criteria of the product design process for the establishment of a big data-driven product design pattern model. Moreover, an experiment and a product design case are conducted to verify the feasibility of the new pattern. Ultimately, we conclude that data-driven product design has two patterns: one is concrete data supporting the product design, namely the “product–data–product” pattern, and the second is based on the value of abstract data for product design, namely the “data–product–data” pattern. Through the data, users involve themselves in the design development process. Data and product form a huge network, and data plays the role of connection or node. So the essence of the design is to find a new connection based on element, and to find a new node based on category.

  20. Data-Driven User Feedback: An Improved Neurofeedback Strategy considering the Interindividual Variability of EEG Features

    Directory of Open Access Journals (Sweden)

    Chang-Hee Han

    2016-01-01

    Full Text Available It has frequently been reported that some users of conventional neurofeedback systems can experience only a small portion of the total feedback range due to the large interindividual variability of EEG features. In this study, we proposed a data-driven neurofeedback strategy considering the individual variability of electroencephalography (EEG) features to permit users of the neurofeedback system to experience a wider range of auditory or visual feedback without a customization process. The main idea of the proposed strategy is to adjust the ranges of each feedback level using the density in the offline EEG database acquired from a group of individuals. Twenty-two healthy subjects participated in offline experiments to construct an EEG database, and five subjects participated in online experiments to validate the performance of the proposed data-driven user feedback strategy. Using the optimized bin sizes, the number of feedback levels that each individual experienced was significantly increased to 139% and 144% of the original results with uniform bin sizes in the offline and online experiments, respectively. Our results demonstrated that the use of our data-driven neurofeedback strategy could effectively increase the overall range of feedback levels that each individual experienced during neurofeedback training.

  1. Data-driven remaining useful life prognosis techniques stochastic models, methods and applications

    CERN Document Server

    Si, Xiao-Sheng; Hu, Chang-Hua

    2017-01-01

    This book introduces data-driven remaining useful life prognosis techniques, and shows how to utilize the condition monitoring data to predict the remaining useful life of stochastic degrading systems and to schedule maintenance and logistics plans. It is also the first book that describes the basic data-driven remaining useful life prognosis theory systematically and in detail. The emphasis of the book is on the stochastic models, methods and applications employed in remaining useful life prognosis. It includes a wealth of degradation monitoring experiment data, practical prognosis methods for remaining useful life in various cases, and a series of applications incorporated into prognostic information in decision-making, such as maintenance-related decisions and ordering spare parts. It also highlights the latest advances in data-driven remaining useful life prognosis techniques, especially in the contexts of adaptive prognosis for linear stochastic degrading systems, nonlinear degradation modeling based pro...

  2. Data-driven analysis of collections of big datasets by the Bi-CoPaM method yields field-specific novel insights

    DEFF Research Database (Denmark)

    Abu-Jamous, Basel; Liu, Chao; Roberts, David, J.

    2017-01-01

    Massive amounts of data have recently been, and are increasingly being, generated from various fields, such as bioinformatics, neuroscience and social networks. Many of these big datasets were generated to answer specific research questions, and were analysed accordingly. However, the scope ... not commonly considered. To bridge this gap between the fast pace of data generation and the slower pace of data analysis, and to exploit the massive amounts of existing data, we suggest employing data-driven explorations to analyse collections of related big datasets. This approach aims at extracting field... clusters of consistently correlated objects. We demonstrate the power of data-driven explorations by applying the Bi-CoPaM to two collections of big datasets from two distinct fields, namely bioinformatics and neuroscience. In the first application, the collective analysis of forty yeast gene expression

  3. A Data-Driven Noise Reduction Method and Its Application for the Enhancement of Stress Wave Signals

    Directory of Open Access Journals (Sweden)

    Hai-Lin Feng

    2012-01-01

    Full Text Available Ensemble empirical mode decomposition (EEMD) has recently been used to recover a signal from observed noisy data. Typically this is performed by partial reconstruction or a thresholding operation. In this paper we describe an efficient noise reduction method. EEMD is used to decompose a signal into several intrinsic mode functions (IMFs). The time intervals between two adjacent zero-crossings within an IMF, called the instantaneous half period (IHP), are used as a criterion to detect and classify the noise oscillations. The undesirable waveforms with a larger IHP are set to zero. Furthermore, the optimum threshold in this approach can be derived from the signal itself using the consecutive mean square error (CMSE). The method is fully data driven and requires no prior knowledge of the target signals. The method was verified with simulations in Matlab, and the denoising results are satisfactory. In comparison with other EEMD-based methods, it is concluded that the approach adopted in this paper is suitable for preprocessing stress wave signals in wood nondestructive testing.
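
    A compact sketch of the IHP criterion follows, using the third-party PyEMD package for the EEMD step. The toy burst signal and the fixed threshold are illustrative assumptions; the paper derives the optimum threshold from the signal itself via the CMSE rather than fixing it by hand.

    ```python
    import numpy as np
    from PyEMD import EEMD  # pip install EMD-signal

    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 2000)
    # Toy stand-in for a stress wave: a decaying burst buried in broadband noise.
    signal = (np.exp(-8 * t) * np.sin(2 * np.pi * 40 * t)
              + 0.3 * rng.standard_normal(t.size))

    imfs = EEMD(trials=50).eemd(signal, t)

    def zero_by_ihp(imf, max_half_period):
        """Zero every half-oscillation (segment between adjacent zero crossings)
        whose instantaneous half period exceeds the threshold, following the
        criterion described above."""
        out = imf.copy()
        zc = np.where(np.diff(np.signbit(imf)))[0]          # zero-crossing indices
        bounds = np.concatenate(([0], zc, [imf.size - 1]))
        for a, b in zip(bounds[:-1], bounds[1:]):
            if (b - a) > max_half_period:
                out[a:b + 1] = 0.0
        return out

    denoised = np.sum([zero_by_ihp(imf, max_half_period=40) for imf in imfs], axis=0)
    ```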

  4. Dynamic model reduction using data-driven Loewner-framework applied to thermally morphing structures

    Science.gov (United States)

    Phoenix, Austin A.; Tarazaga, Pablo A.

    2017-05-01

    The work herein proposes the use of the data-driven Loewner-framework for reduced order modeling as applied to dynamic Finite Element Models (FEM) of thermally morphing structures. The Loewner-based modeling approach is computationally efficient and accurately constructs reduced models using analytical output data from a FEM. This paper details the two-step process proposed in the Loewner approach. First, a random vibration FEM simulation is used as the input for the development of a Single Input Single Output (SISO) data-based dynamic Loewner state space model. Second, an SVD-based truncation is used on the Loewner state space model, such that the minimal, dynamically representative, state space model is achieved. For this second part, varying levels of reduction are generated and compared. The work herein can be extended to model generation using experimental measurements by replacing the FEM output data in the first step and following the same procedure. This method will be demonstrated on two thermally morphing structures, a rigidly fixed hexapod in multiple geometric configurations and a low mass anisotropic morphing boom. This paper is working to detail the method and identify the benefits of the reduced model methodology.
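
    The core Loewner construction is small enough to show directly. In this hedged sketch the "measured" samples come from a known third-order transfer function rather than an FEM, the sample points are split into left and right interpolation sets, and the numerical rank stands in for the SVD-based truncation described above.

    ```python
    import numpy as np

    def H(s):
        """Illustrative 3rd-order SISO transfer function providing the data."""
        return 1.0 / (s**3 + 2 * s**2 + 3 * s + 1)

    s = 1j * np.linspace(0.1, 10.0, 20)      # sample points on the imaginary axis
    mu, lam = s[0::2], s[1::2]               # left / right interpolation points
    v, w = H(mu), H(lam)

    # Loewner matrix L_ij = (v_i - w_j) / (mu_i - lam_j); its rank reveals the
    # order of the minimal model that interpolates the frequency-response data.
    L = (v[:, None] - w[None, :]) / (mu[:, None] - lam[None, :])
    print("numerical rank:", np.linalg.matrix_rank(L, tol=1e-8))  # expect 3
    ```

    In the full framework, an SVD of this (and the shifted Loewner) matrix yields the reduced state space matrices directly from data, which is what makes the method equation-free.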

  5. Data-Driven Approaches for Computation in Intelligent Biomedical Devices: A Case Study of EEG Monitoring for Chronic Seizure Detection

    Directory of Open Access Journals (Sweden)

    Naveen Verma

    2011-04-01

    Full Text Available Intelligent biomedical devices are systems that are able to detect specific physiological processes in patients so that particular responses can be generated. This closed-loop capability can have enormous clinical value when we consider the unprecedented modalities that are beginning to emerge for sensing and stimulating patient physiology. Both delivering therapy (e.g., deep-brain stimulation, vagus nerve stimulation, etc.) and treating impairments (e.g., neural prostheses) require computational devices that can make clinically relevant inferences, especially using minimally-intrusive patient signals. The key to such devices is algorithms that are based on data-driven signal modeling as well as hardware structures that are specialized to these. This paper discusses the primary application-domain challenges that must be overcome and analyzes the most promising methods for this that are emerging. We then look at how these methods are being incorporated in ultra-low-energy computational platforms and systems. The case study for this is a seizure-detection SoC that includes instrumentation and computation blocks in support of a system that exploits patient-specific modeling to achieve accurate performance for chronic detection. The SoC samples each EEG channel at a rate of 600 Hz and performs processing to derive signal features on every two-second epoch, consuming 9 μJ/epoch/channel. Signal feature extraction reduces the data rate by a factor of over 40×, permitting wireless communication from the patient's head while reducing the total power on the head by 14×.

  6. Data-driven analysis of blood glucose management effectiveness

    NARCIS (Netherlands)

    Nannings, B.; Abu-Hanna, A.; Bosman, R. J.

    2005-01-01

    The blood-glucose-level (BGL) of Intensive Care (IC) patients requires close monitoring and control. In this paper we describe a general data-driven analytical method for studying the effectiveness of BGL management. The method is based on developing and studying a clinical outcome reflecting the

  7. CEREF: A hybrid data-driven model for forecasting annual streamflow from a socio-hydrological system

    Science.gov (United States)

    Zhang, Hongbo; Singh, Vijay P.; Wang, Bin; Yu, Yinghao

    2016-09-01

    Hydrological forecasting is complicated by flow regime alterations in a coupled socio-hydrologic system, encountering increasingly non-stationary, nonlinear and irregular changes, which make decision support difficult for future water resources management. Currently, many hybrid data-driven models, based on the decomposition-prediction-reconstruction principle, have been developed to improve the ability to make predictions of annual streamflow. However, there exist many problems that require further investigation, chief among which is that the direction of the trend components decomposed from an annual streamflow series is always difficult to ascertain. In this paper, a hybrid data-driven model was proposed to address this issue, which combined empirical mode decomposition (EMD), radial basis function neural networks (RBFNN), and an external forces (EF) variable; it is called the CEREF model. The hybrid model employed EMD for decomposition and RBFNN for intrinsic mode function (IMF) forecasting, and determined future trend component directions by regression with EF as basin water demand, representing the social component in the socio-hydrologic system. The Wuding River basin was considered for the case study, and two standard statistical measures, root mean squared error (RMSE) and mean absolute error (MAE), were used to evaluate the performance of the CEREF model and compare it with other models: the autoregressive (AR), RBFNN and EMD-RBFNN. Results indicated that the CEREF model had lower RMSE and MAE statistics, 42.8% and 7.6% respectively, than the other models, and provided a superior alternative for forecasting annual runoff in the Wuding River basin. Moreover, the CEREF model can enlarge the effective intervals of streamflow forecasting compared to the EMD-RBFNN model by introducing the water demand planned by the government department to improve long-term prediction accuracy. In addition, we considered the high-frequency component, a frequent subject of concern in EMD

  8. The Use of Linking Adverbials in Academic Essays by Non-Native Writers: How Data-Driven Learning Can Help

    Science.gov (United States)

    Garner, James Robert

    2013-01-01

    Over the past several decades, the TESOL community has seen an increased interest in the use of data-driven learning (DDL) approaches. Most studies of DDL have focused on the acquisition of vocabulary items, including a wide range of information necessary for their correct usage. One type of vocabulary that has yet to be properly investigated has…

  9. Performance of a data-driven technique applied to changes in wave height and its effect on beach response

    Directory of Open Access Journals (Sweden)

    José M. Horrillo-Caraballo

    2016-01-01

    Full Text Available In this study the medium-term response of beach profiles was investigated at two sites: a gently sloping sandy beach and a steeper mixed sand and gravel beach. The former is the Duck site in North Carolina, on the east coast of the USA, which is exposed to Atlantic Ocean swells and storm waves, and the latter is the Milford-on-Sea site at Christchurch Bay, on the south coast of England, which is partially sheltered from Atlantic swells but has a directionally bimodal wave exposure. The data sets comprise detailed bathymetric surveys of beach profiles covering a period of more than 25 years for the Duck site and over 18 years for the Milford-on-Sea site. The structure of the data sets and the data-driven methods are described. Canonical correlation analysis (CCA) was used to find linkages between the wave characteristics and beach profiles. The sensitivity of the linkages was investigated by deploying a wave height threshold to filter out the smaller waves incrementally. The results of the analysis indicate that, for the gently sloping sandy beach, waves of all heights are important to the morphological response. For the mixed sand and gravel beach, filtering the smaller waves improves the statistical fit and it suggests that low-height waves do not play a primary role in the medium-term morphological response, which is primarily driven by the intermittent larger storm waves.
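
    The statistical machinery can be sketched as follows, using scikit-learn's CCA on invented wave-statistic and profile matrices (the study itself used measured wave climates and surveyed bathymetry); filtering out small waves amounts to dropping rows before the fit.

    ```python
    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(0)

    # Invented stand-ins: per-survey wave statistics (e.g. mean/max height,
    # period) and beach-profile elevations at fixed cross-shore positions.
    n_surveys = 120
    waves = rng.standard_normal((n_surveys, 3))
    profiles = (waves @ rng.standard_normal((3, 25))
                + 0.5 * rng.standard_normal((n_surveys, 25)))

    cca = CCA(n_components=2)
    wave_scores, profile_scores = cca.fit_transform(waves, profiles)

    # Correlation of the first canonical pair measures how strongly wave
    # forcing and profile shape co-vary.
    r = np.corrcoef(wave_scores[:, 0], profile_scores[:, 0])[0, 1]
    print(round(r, 3))
    ```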

  10. Data Prospecting Framework - a new approach to explore "big data" in Earth Science

    Science.gov (United States)

    Ramachandran, R.; Rushing, J.; Lin, A.; Kuo, K.

    2012-12-01

    Due to advances in sensors, computation and storage, the cost and effort required to produce large datasets have been significantly reduced. As a result, we are seeing a proliferation of large-scale data sets being assembled in almost every science field, especially in geosciences. Opportunities to exploit the "big data" are enormous, as new hypotheses can be generated by combining and analyzing large amounts of data. However, such a data-driven approach to science discovery assumes that scientists can find and isolate relevant subsets from vast amounts of available data. Current Earth Science data systems only provide data discovery through simple metadata and keyword-based searches and are not designed to support data exploration capabilities based on the actual content. Consequently, scientists often find themselves downloading large volumes of data, struggling with large amounts of storage and learning new analysis technologies that will help them separate the wheat from the chaff. New mechanisms of data exploration are needed to help scientists discover the relevant subsets. We present data prospecting, a new content-based data analysis paradigm to support data-intensive science. Data prospecting allows the researchers to explore big data in determining and isolating data subsets for further analysis. This is akin to geo-prospecting, in which mineral sites of interest are determined over the landscape through screening methods. The resulting "data prospects" only provide an interaction with and feel for the data through first-look analytics; the researchers would still have to download the relevant datasets and analyze them deeply using their favorite analytical tools to determine if the datasets will yield new hypotheses. Data prospecting combines two traditional categories of data analysis, data exploration and data mining, within the discovery step. Data exploration utilizes manual/interactive methods for data analysis such as standard statistical analysis and

  11. Data-Driven Diffusion Of Innovations: Successes And Challenges In 3 Large-Scale Innovative Delivery Models.

    Science.gov (United States)

    Dorr, David A; Cohen, Deborah J; Adler-Milstein, Julia

    2018-02-01

    Failed diffusion of innovations may be linked to an inability to use and apply data, information, and knowledge to change perceptions of current practice and motivate change. Using qualitative and quantitative data from three large-scale health care delivery innovations (accountable care organizations, advanced primary care practice, and EvidenceNOW), we assessed where data-driven innovation is occurring and where challenges lie. We found that implementation of some technological components of innovation (for example, electronic health records) has occurred among health care organizations, but core functions needed to use data to drive innovation are lacking. Deficits include the inability to extract and aggregate data from the records; gaps in sharing data; and challenges in adopting advanced data functions, particularly those related to timely reporting of performance data. The unexpectedly high costs and burden incurred during implementation of the innovations have limited organizations' ability to address these and other deficits. Solutions that could help speed progress in data-driven innovation include facilitating peer-to-peer technical assistance, providing tailored feedback reports to providers from data aggregators, and using practice facilitators skilled in using data technology for quality improvement to help practices transform. Policy efforts that promote these solutions may enable more rapid uptake of and successful participation in innovative delivery system reforms.

  12. A data-driven soft sensor for needle deflection in heterogeneous tissue using just-in-time modelling.

    Science.gov (United States)

    Rossa, Carlos; Lehmann, Thomas; Sloboda, Ronald; Usmani, Nawaid; Tavakoli, Mahdi

    2017-08-01

    Global modelling has traditionally been the approach taken to estimate needle deflection in soft tissue. In this paper, we propose a new method based on local data-driven modelling of needle deflection. External measurement of needle-tissue interactions is collected from several insertions in ex vivo tissue to form a cloud of data. Inputs to the system are the needle insertion depth, axial rotations, and the forces and torques measured at the needle base by a force sensor. When a new insertion is performed, the just-in-time learning method estimates the model outputs given the current inputs to the needle-tissue system and the historical database. The query is compared to every observation in the database and is given weights according to some similarity criteria. Only a subset of historical data that is most relevant to the query is selected and a local linear model is fit to the selected points to estimate the query output. The model outputs the 3D deflection of the needle tip and the needle insertion force. The proposed approach is validated in ex vivo multilayered biological tissue in different needle insertion scenarios. Experimental results in five different case studies indicate an accuracy in predicting needle deflection of 0.81 and 1.24 mm in the horizontal and vertical lanes, respectively, and an accuracy of 0.5 N in predicting the needle insertion force over 216 needle insertions.
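
    A minimal sketch of the just-in-time step follows: weight historical observations by similarity to the query, keep the most relevant ones, and fit a local weighted linear model. The synthetic database and the single scalar output are stand-ins for the multi-output needle-tissue data described above.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic historical database: inputs (depth, rotation, base force/torque)
    # mapped to a scalar output standing in for tip deflection.
    X = rng.standard_normal((500, 4))
    y = X @ np.array([0.8, -0.3, 0.5, 0.1]) + 0.05 * rng.standard_normal(500)

    def jit_predict(query, X, y, k=30, bandwidth=1.0):
        """Just-in-time learning: select the k observations most similar to the
        query, weight them by similarity, and fit a local affine model."""
        d = np.linalg.norm(X - query, axis=1)
        idx = np.argsort(d)[:k]                          # most relevant subset
        w = np.exp(-d[idx] ** 2 / (2 * bandwidth ** 2))  # similarity weights
        A = np.column_stack([X[idx], np.ones(k)])        # affine local model
        W = np.diag(w)
        theta, *_ = np.linalg.lstsq(W @ A, W @ y[idx], rcond=None)
        return np.append(query, 1.0) @ theta

    print(round(jit_predict(np.array([0.2, -0.1, 0.4, 0.0]), X, y), 3))
    ```

    Because the model is refit for every query, the estimator adapts naturally to heterogeneous tissue layers without committing to a single global model.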

  13. Prototype Development: Context-Driven Dynamic XML Ophthalmologic Data Capture Application.

    Science.gov (United States)

    Peissig, Peggy; Schwei, Kelsey M; Kadolph, Christopher; Finamore, Joseph; Cancel, Efrain; McCarty, Catherine A; Okorie, Asha; Thomas, Kate L; Allen Pacheco, Jennifer; Pathak, Jyotishman; Ellis, Stephen B; Denny, Joshua C; Rasmussen, Luke V; Tromp, Gerard; Williams, Marc S; Vrabec, Tamara R; Brilliant, Murray H

    2017-09-13

    The capture and integration of structured ophthalmologic data into electronic health records (EHRs) has historically been a challenge. However, the importance of this activity for patient care and research is critical. The purpose of this study was to develop a prototype of a context-driven dynamic extensible markup language (XML) ophthalmologic data capture application for research and clinical care that could be easily integrated into an EHR system. Stakeholders in the medical, research, and informatics fields were interviewed and surveyed to determine data and system requirements for ophthalmologic data capture. On the basis of these requirements, an ophthalmology data capture application was developed to collect and store discrete data elements with important graphical information. The context-driven data entry application supports several features, including ink-over drawing capability for documenting eye abnormalities, context-based Web controls that guide data entry based on preestablished dependencies, and an adaptable database or XML schema that stores Web form specifications and allows for immediate changes in form layout or content. The application utilizes Web services to enable data integration with a variety of EHRs for retrieval and storage of patient data. This paper describes the development process used to create a context-driven dynamic XML data capture application for optometry and ophthalmology. The list of ophthalmologic data elements identified as important for care and research can be used as a baseline list for future ophthalmologic data collection activities. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 13.09.2017.

  14. Dynamically adaptive data-driven simulation of extreme hydrological flows

    KAUST Repository

    Kumar Jain, Pushkar; Mandli, Kyle; Hoteit, Ibrahim; Knio, Omar; Dawson, Clint

    2017-01-01

    evacuation in real-time and through the development of resilient infrastructure based on knowledge of how systems respond to extreme events. Data-driven computational modeling is a critical technology underpinning these efforts. This investigation focuses

  15. Data-Driven Exercises for Chemistry: A New Digital Collection

    Science.gov (United States)

    Grubbs, W. Tandy

    2007-01-01

    The article presents a new digital collection of data-driven exercises used for teaching chemistry. Such exercises are expected to help students think in a more scientific manner.

  16. Data-Driven Model Order Reduction for Bayesian Inverse Problems

    KAUST Repository

    Cui, Tiangang; Youssef, Marzouk; Willcox, Karen

    2014-01-01

    One of the major challenges in using MCMC for the solution of inverse problems is the repeated evaluation of computationally expensive numerical models. We develop a data-driven projection- based model order reduction technique to reduce

  17. Metadata-Driven SOA-Based Application for Facilitation of Real-Time Data Warehousing

    Science.gov (United States)

    Pintar, Damir; Vranić, Mihaela; Skočir, Zoran

    Service-oriented architecture (SOA) has already been widely recognized as an effective paradigm for achieving integration of diverse information systems. SOA-based applications can cross boundaries of platforms, operating systems and proprietary data standards, commonly through the usage of Web Services technology. On the other hand, metadata is also commonly referred to as a potential integration tool, given the fact that standardized metadata objects can provide useful information about the specifics of unknown information systems with which one has an interest in communicating, using an approach commonly called "model-based integration". This paper presents the result of research regarding possible synergy between those two integration facilitators. This is accomplished with a vertical example of a metadata-driven SOA-based business process that provides ETL (Extraction, Transformation and Loading) and metadata services to a data warehousing system in need of real-time ETL support.

  18. Neutron data for accelerator-driven transmutation technologies. Annual Report 2003/2004

    International Nuclear Information System (INIS)

    Blomgren, J.; Hildebrand, A.; Nilsson, L.; Mermod, P.; Olsson, N.; Pomp, S.; Oesterlund, M.

    2004-08-01

    The project NATT, Neutron data for Accelerator-driven Transmutation Technology, is performed within the nuclear reactions group of the Dept. of Neutron Research, Uppsala univ. The activities of the group are directed towards experimental studies of nuclear reaction probabilities of importance for various applications, like transmutation of nuclear waste, biomedical effects and electronics reliability. The experimental work is primarily undertaken at the The Svedberg Laboratory (TSL) in Uppsala, where the group has previously developed two world-unique instruments, MEDLEY and SCANDAL. Highlights from the past year: Analysis and documentation has been finalized of previously performed measurements of elastic neutron scattering from hydrogen at 96 MeV. The results corroborate the normalization of previously obtained data at TSL, which have been under debate. This is of importance since this reaction serves as reference for many other measurements. Compelling evidence of the existence of three-body forces in nuclei has been obtained. Within the project, one PhD exam and one licentiate exam has been awarded. One PhD exam and one licentiate exam has been awarded for work closely related to the project. A new neutron beam facility with significantly improved performance has been built and commissioned at TSL

  19. Neutron data for accelerator-driven transmutation technologies. Annual Report 2003/2004

    Energy Technology Data Exchange (ETDEWEB)

    Blomgren, J.; Hildebrand, A.; Nilsson, L.; Mermod, P.; Olsson, N.; Pomp, S.; Oesterlund, M. [Uppsala Univ. (Sweden). Dept. for Neutron Research

    2004-08-01

    The project NATT, Neutron data for Accelerator-driven Transmutation Technology, is performed within the nuclear reactions group of the Dept. of Neutron Research, Uppsala univ. The activities of the group are directed towards experimental studies of nuclear reaction probabilities of importance for various applications, like transmutation of nuclear waste, biomedical effects and electronics reliability. The experimental work is primarily undertaken at the The Svedberg Laboratory (TSL) in Uppsala, where the group has previously developed two world-unique instruments, MEDLEY and SCANDAL. Highlights from the past year: Analysis and documentation has been finalized of previously performed measurements of elastic neutron scattering from hydrogen at 96 MeV. The results corroborate the normalization of previously obtained data at TSL, which have been under debate. This is of importance since this reaction serves as reference for many other measurements. Compelling evidence of the existence of three-body forces in nuclei has been obtained. Within the project, one PhD exam and one licentiate exam has been awarded. One PhD exam and one licentiate exam has been awarded for work closely related to the project. A new neutron beam facility with significantly improved performance has been built and commissioned at TSL.

  20. Peripheral visual feedback: a powerful means of supporting effective attention allocation in event-driven, data-rich environments.

    Science.gov (United States)

    Nikolic, M I; Sarter, N B

    2001-01-01

    Breakdowns in human-automation coordination in data-rich, event-driven domains such as aviation can be explained in part by a mismatch between the high degree of autonomy yet low observability of modern technology. To some extent, the latter is the result of an increasing reliance in feedback design on foveal vision--an approach that fails to support pilots in tracking system-induced changes and events in parallel with performing concurrent flight-related tasks. One possible solution to the problem is the distribution of tasks and information across sensory modalities and processing channels. A simulator study is presented that compared the effectiveness of current foveal feedback and two implementations of peripheral visual feedback for keeping pilots informed about uncommanded changes in the status of an automated cockpit system. Both peripheral visual displays resulted in higher detection rates and faster response times, without interfering with the performance of concurrent visual tasks any more than does currently available automation feedback. Potential applications include improved display designs that support effective attention allocation in a variety of complex dynamic environments, such as aviation, process control, and medicine.

  1. PHYCAA: Data-driven measurement and removal of physiological noise in BOLD fMRI

    DEFF Research Database (Denmark)

    Churchill, Nathan W.; Yourganov, Grigori; Spring, Robyn

    2012-01-01

…autocorrelated physiological noise sources with reproducible spatial structure, using an adaptation of Canonical Correlation Analysis performed in a split-half resampling framework. The technique is able to identify physiological effects with vascular-linked spatial structure, and an intrinsic dimensionality … with physiological noise, and real data-driven model prediction and reproducibility, for both block and event-related task designs. This is demonstrated compared to no physiological noise correction, and to the widely used RETROICOR (Glover et al., 2000) physiological denoising algorithm, which uses externally…

  2. Central focused convolutional neural networks: Developing a data-driven model for lung nodule segmentation.

    Science.gov (United States)

    Wang, Shuo; Zhou, Mu; Liu, Zaiyi; Liu, Zhenyu; Gu, Dongsheng; Zang, Yali; Dong, Di; Gevaert, Olivier; Tian, Jie

    2017-08-01

Accurate lung nodule segmentation from computed tomography (CT) images is of great importance for image-driven lung cancer analysis. However, the heterogeneity of lung nodules and the presence of similar visual characteristics between nodules and their surroundings make robust nodule segmentation difficult. In this study, we propose a data-driven model, termed the Central Focused Convolutional Neural Networks (CF-CNN), to segment lung nodules from heterogeneous CT images. Our approach combines two key insights: 1) the proposed model captures a diverse set of nodule-sensitive features from both 3-D and 2-D CT images simultaneously; 2) when classifying an image voxel, the effects of its neighbor voxels can vary according to their spatial locations. We describe this phenomenon by proposing a novel central pooling layer retaining much information on the voxel patch center, followed by a multi-scale patch learning strategy. Moreover, we design a weighted sampling to facilitate the model training, where training samples are selected according to their degree of segmentation difficulty. The proposed method has been extensively evaluated on the public LIDC dataset including 893 nodules and an independent dataset with 74 nodules from Guangdong General Hospital (GDGH). We showed that CF-CNN achieved superior segmentation performance with average Dice scores of 82.15% and 80.02% for the two datasets, respectively. Moreover, we compared our results with the inter-radiologist consistency on the LIDC dataset, showing a difference in average Dice score of only 1.98%. Copyright © 2017. Published by Elsevier B.V.
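
    A rough illustration of the central-pooling idea from this abstract: neighbor voxels contribute to the pooled value according to their distance from the patch center. The following is a minimal numpy sketch with an assumed Gaussian weighting, not the paper's actual layer:

        import numpy as np

        def central_weighted_pool(patch, sigma=2.0):
            # Weight each pixel by its closeness to the patch center so that
            # central voxels dominate the pooled value (toy analogue of the
            # CF-CNN central pooling idea; the Gaussian form is an assumption).
            h, w = patch.shape
            cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
            y, x = np.mgrid[0:h, 0:w]
            weights = np.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2 * sigma ** 2))
            return float((patch * weights).sum() / weights.sum())

        print(central_weighted_pool(np.random.rand(9, 9)))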

  3. Performance comparison between a solar driven rotary desiccant cooling system and conventional vapor compression system (performance study of desiccant cooling)

    International Nuclear Information System (INIS)

    Ge, T.S.; Ziegler, F.; Wang, R.Z.; Wang, H.

    2010-01-01

    Solar driven rotary desiccant cooling systems have been widely recognized as alternatives to conventional vapor compression systems for their merits of energy-saving and being eco-friendly. In the previous paper, the basic performance features of desiccant wheel have been discussed. In this paper, a solar driven two-stage rotary desiccant cooling system and a vapor compression system are simulated to provide cooling for one floor in a commercial office building in two cities with different climates: Berlin and Shanghai. The model developed in the previous paper is adopted to predict the performance of the desiccant wheel. The objectives of this paper are to evaluate and compare the thermodynamic and economic performance of the two systems and to obtain useful data for practical application. Results show that the desiccant cooling system is able to meet the cooling demand and provide comfortable supply air in both of the two regions. The required regeneration temperatures are 55 deg. C in Berlin and 85 deg. C in Shanghai. As compared to the vapor compression system, the desiccant cooling system has better supply air quality and consumes less electricity. The results of the economic analysis demonstrate that the dynamic investment payback periods are 4.7 years in Berlin and 7.2 years in Shanghai.
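
    The dynamic payback periods quoted above follow from discounting yearly savings against the extra capital cost of the desiccant system. A minimal sketch of that arithmetic (the investment, saving and discount-rate figures below are invented placeholders, not the paper's inputs):

        def dynamic_payback_years(extra_investment, annual_saving, discount_rate=0.05):
            # Count years until cumulative discounted savings repay the
            # additional capital cost.
            cumulative, year = 0.0, 0
            while cumulative < extra_investment and year < 100:
                year += 1
                cumulative += annual_saving / (1 + discount_rate) ** year
            return year

        print(dynamic_payback_years(extra_investment=20000.0, annual_saving=5000.0))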

  4. Developing Annotation Solutions for Online Data Driven Learning

    Science.gov (United States)

    Perez-Paredes, Pascual; Alcaraz-Calero, Jose M.

    2009-01-01

    Although "annotation" is a widely-researched topic in Corpus Linguistics (CL), its potential role in Data Driven Learning (DDL) has not been addressed in depth by Foreign Language Teaching (FLT) practitioners. Furthermore, most of the research in the use of DDL methods pays little attention to annotation in the design and implementation…

  5. Developing a Data Driven Process-Based Model for Remote Sensing of Ecosystem Production

    Science.gov (United States)

    Elmasri, B.; Rahman, A. F.

    2010-12-01

Estimating ecosystem carbon fluxes at various spatial and temporal scales is essential for quantifying the global carbon cycle. Numerous models have been developed for this purpose using several environmental variables as well as vegetation indices derived from remotely sensed data. Here we present a data driven modeling approach for gross primary production (GPP) that is based on the process-based model BIOME-BGC. The proposed model was run using available remote sensing data and it does not depend on look-up tables. Furthermore, this approach combines the merits of both empirical and process models, and empirical models were used to estimate certain input variables such as light use efficiency (LUE). This was achieved by applying remotely sensed data to the mathematical equations that represent biophysical photosynthesis processes in the BIOME-BGC model. Moreover, a new spectral index for estimating maximum photosynthetic activity, the maximum photosynthetic rate index (MPRI), is also developed and presented here. This new index is based on the ratio between the near infrared and the green bands (ρ858.5/ρ555). The model was tested and validated against the MODIS GPP product and flux measurements from two eddy covariance flux towers located at Morgan Monroe State Forest (MMSF) in Indiana and Harvard Forest in Massachusetts. Satellite data acquired by the Advanced Microwave Scanning Radiometer (AMSR-E) and MODIS were used. The data driven model showed a strong correlation between the predicted and measured GPP at the two eddy covariance flux tower sites. This methodology produced better predictions of GPP than did the MODIS GPP product. Moreover, the proportion of error in the predicted GPP for MMSF and Harvard Forest was dominated by unsystematic errors, suggesting that the results are unbiased. The analysis indicated that maintenance respiration is one of the main factors that dominate the overall model outcome errors, and improvement in maintenance respiration estimation…
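
    The MPRI defined in this abstract is a simple band ratio, so it can be computed directly from reflectance arrays. A short numpy sketch (the argument names are placeholders for MODIS-style band reflectances):

        import numpy as np

        def mpri(nir_858, green_555):
            # Maximum photosynthetic rate index: near-infrared (~858.5 nm)
            # reflectance divided by green (~555 nm) reflectance.
            nir = np.asarray(nir_858, dtype=float)
            green = np.asarray(green_555, dtype=float)
            return np.where(green > 0, nir / green, np.nan)

        print(mpri([0.45, 0.50], [0.08, 0.10]))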

  6. Nuclear data requirements for accelerator driven sub-critical systems

    Indian Academy of Sciences (India)

    The development of accelerator driven sub-critical systems (ADSS) require significant amount of new nuclear data in extended energy regions as well as for a variety of new materials. This paper reviews these perspectives in the Indian context.

  7. A model-driven approach to designing cross-enterprise business processes

    OpenAIRE

    Bauer, Bernhard (Prof.)

    2004-01-01

    A model-driven approach to designing cross-enterprise business processes / Bernhard Bauer, Jörg P. Müller, Stephan Roser. - In: On the move to meaningful internet systems 2004: OTM 2004 workshops : OTM Confederated International Workshops and Posters, GADA, JTRES, MIOS, WORM, WOSE, PhDS, and INTEROP 2004, Agia Napa, Cyprus, October 25 - 29, 2004 ; proceedings / Robert Meersman ... (eds.). - Berlin u.a. : Springer, 2004. - S. 544-555. - (Lecture Notes in Computer Science ; 3292)

  8. Big Data Innovation Challenge : Pioneering Approaches to Data-Driven Development

    OpenAIRE

    World Bank Group

    2016-01-01

    Big data can sound remote and lacking a human dimension, with few obvious links to development and impacting the lives of the poor. Concepts such as anti-poverty targeting, market access or rural electrification seem far more relevant – and easier to grasp. And yet some of today’s most groundbreaking initiatives in these areas rely on big data. This publication profiles these and more, sho...

  9. KIPT accelerator-driven system design and performance

    International Nuclear Information System (INIS)

    Gohar, Y.; Bolshinsky, I.; Karnaukhov, I.

    2015-01-01

Argonne National Laboratory (ANL) of the US is collaborating with the Kharkov Institute of Physics and Technology (KIPT) of Ukraine to develop and construct a neutron source facility. The facility is planned to produce medical isotopes, train young nuclear professionals, support Ukraine's nuclear industry and provide capability to perform reactor physics, material research, and basic science experiments. It consists of a subcritical assembly with low-enriched uranium fuel driven with an electron accelerator. The target design utilises tungsten or natural uranium for neutron production through photonuclear reactions from the Bremsstrahlung radiation generated by 100-MeV electrons. The accelerator electron beam power is 100 kW. The neutron source intensity, spectrum, and spatial distribution have been studied as a function of the electron beam parameters to maximise the neutron yield and satisfy different engineering requirements. Physics, thermal-hydraulics, and thermal-stress analyses were performed and iterated to maximise the neutron source strength and to minimise the maximum temperature and the thermal stress in the target materials. The subcritical assembly is designed to obtain the highest possible neutron flux intensity with an effective neutron multiplication factor of <0.98. Different fuel and reflector materials are considered for the subcritical assembly design. The mechanical design of the facility has been developed to maximise its utility and minimise the time for replacing the target, fuel, and irradiation cassettes by using simple and efficient procedures. Shielding analyses were performed to define the dose map around the facility during operation as a function of the heavy concrete shield thickness. Safety, reliability and environmental considerations are included in the facility design. The facility is configured to accommodate future design upgrades and new missions. In addition, it has unique features relative to the other international…

  10. NERI PROJECT 99-119. TASK 2. DATA-DRIVEN PREDICTION OF PROCESS VARIABLES. FINAL REPORT

    Energy Technology Data Exchange (ETDEWEB)

    Upadhyaya, B.R.

    2003-04-10

This report describes the detailed results for Task 2 of DOE-NERI project number 99-119 entitled "Automatic Development of Highly Reliable Control Architecture for Future Nuclear Power Plants". This project is a collaborative effort between the Oak Ridge National Laboratory (ORNL), The University of Tennessee, Knoxville (UTK) and the North Carolina State University (NCSU). UTK is the lead organization for Task 2 under contract number DE-FG03-99SF21906. Under Task 2 we completed the development of data-driven models for the characterization of sub-system dynamics for predicting state variables, control functions, and expected control actions. We have also developed the "Principal Component Analysis (PCA)" approach for mapping system measurements, and a nonlinear system modeling approach called the "Group Method of Data Handling (GMDH)" with rational functions, which includes temporal data information for transient characterization. The majority of the results are presented in detailed reports for Phases 1 through 3 of our research, which are attached to this report.
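
    The PCA mapping of system measurements mentioned here can be sketched in a few lines; the reconstruction residual is what signals departure from the training data. A toy example with random stand-in data (scikit-learn assumed available; not the project's actual plant data):

        import numpy as np
        from sklearn.decomposition import PCA

        X = np.random.rand(500, 12)            # stand-in for plant sensor channels
        pca = PCA(n_components=3).fit(X)
        scores = pca.transform(X)              # low-dimensional mapping
        X_hat = pca.inverse_transform(scores)  # reconstruction from the map
        print(pca.explained_variance_ratio_, np.abs(X - X_hat).max())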

  11. Data-driven simulation methodology using DES 4-layer architecture

    Directory of Open Access Journals (Sweden)

    Aida Saez

    2016-05-01

Full Text Available In this study, we present a methodology to build data-driven simulation models of manufacturing plants. We go further than other research proposals and suggest focusing simulation model development under a 4-layer architecture (network, logic, database and visual reality). The Network layer includes the system infrastructure. The Logic layer covers the operations planning and control system, and the material handling equipment system. The Database holds all the information needed to perform the simulation, the results used for analysis and the values that the Logic layer is using to manage the Plant. Finally, the Visual Reality layer displays an augmented reality system including not only the machinery and the movement but also blackboards and other Andon elements. This architecture provides numerous advantages, as it helps to build a simulation model that consistently considers the internal logistics in a very flexible way.

  12. Parallel Landscape Driven Data Reduction & Spatial Interpolation Algorithm for Big LiDAR Data

    Directory of Open Access Journals (Sweden)

    Rahil Sharma

    2016-06-01

Full Text Available Airborne Light Detection and Ranging (LiDAR) topographic data provide highly accurate digital terrain information, which is used widely in applications like creating flood insurance rate maps, forest and tree studies, coastal change mapping, soil and landscape classification, 3D urban modeling, river bank management, agricultural crop studies, etc. In this paper, we focus mainly on the use of LiDAR data in terrain modeling/Digital Elevation Model (DEM) generation. Technological advancements in building LiDAR sensors have enabled highly accurate and highly dense LiDAR point clouds, which have made possible high resolution modeling of terrain surfaces. However, high density data result in massive data volumes, which pose computing issues. Computational time required for dissemination, processing and storage of these data is directly proportional to the volume of the data. We describe a novel technique based on the slope map of the terrain, which addresses the challenging problem, in the area of spatial data analysis, of reducing this dense LiDAR data without sacrificing its accuracy. To the best of our knowledge, this is the first ever landscape-driven data reduction algorithm. We also perform an empirical study, which shows that there is no significant loss in accuracy for the DEM generated from a 52% reduced LiDAR dataset produced by our algorithm, compared to the DEM generated from the original, complete LiDAR dataset. For the accuracy of our statistical analysis, we compute the Root Mean Square Error (RMSE) comparing all of the grid points of the original DEM to the DEM generated from the reduced data, instead of comparing a few random control points. Besides, our multi-core data reduction algorithm is highly scalable. We also describe a modified parallel Inverse Distance Weighted (IDW) spatial interpolation method and show that the DEMs it generates are time-efficient and have better accuracy than the ones generated by the traditional IDW method.
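
    For reference, the baseline (non-parallel) IDW interpolation that the authors modify, together with the grid-wide RMSE comparison pattern used in their accuracy study, looks roughly like this in numpy (a sketch with synthetic points, not their parallel implementation):

        import numpy as np

        def idw(xy_known, z_known, xy_query, power=2.0):
            # Classic inverse-distance-weighted interpolation.
            d = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
            w = 1.0 / np.maximum(d, 1e-12) ** power
            return (w @ z_known) / w.sum(axis=1)

        rng = np.random.default_rng(0)
        pts, z = rng.random((200, 2)), rng.random(200)
        grid = rng.random((400, 2))
        z_full = idw(pts, z, grid)
        z_reduced = idw(pts[::2], z[::2], grid)            # "reduced data" DEM
        print(np.sqrt(np.mean((z_full - z_reduced) ** 2)))  # RMSE over all grid points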

  13. Full field reservoir modeling of shale assets using advanced data-driven analytics

    Directory of Open Access Journals (Sweden)

    Soodabeh Esmaili

    2016-01-01

Full Text Available Hydrocarbon production from shale has attracted much attention in recent years. When applied to these prolific and hydrocarbon-rich resource plays, our understanding of the complexities of the flow mechanism (sorption process and flow behavior in complex fracture systems, induced or natural) leaves much to be desired. In this paper, we present and discuss a novel approach to modeling and history matching of hydrocarbon production from a Marcellus shale asset in southwestern Pennsylvania using advanced data mining, pattern recognition and machine learning technologies. In this new approach, instead of imposing our understanding of the flow mechanism, the impact of multi-stage hydraulic fractures, and the production process on the reservoir model, we allow the production history, well log, completion and hydraulic fracturing data to guide our model and determine its behavior. The uniqueness of this technology is that it incorporates the so-called “hard data” directly into the reservoir model, so that the model can be used to optimize the hydraulic fracture process. The “hard data” refers to field measurements during the hydraulic fracturing process such as fluid and proppant type and amount, injection pressure and rate as well as proppant concentration. This novel approach contrasts with the current industry focus on the use of “soft data” (non-measured, interpretive data such as frac length, width, height and conductivity) in the reservoir models. The study focuses on a Marcellus shale asset that includes 135 wells with multiple pads, different landing targets, well lengths and reservoir properties. The full field history matching process was successfully completed using this data-driven approach, thus capturing the production behavior with acceptable accuracy for individual wells and for the entire asset.

  14. Data driven information system for supervision of judicial openness

    Directory of Open Access Journals (Sweden)

    Ming LI

    2016-08-01

Full Text Available Aiming at four outstanding problems in the informationized supervision of judicial publicity, judicial public data are classified in a data-driven way to form the final valuable data. Then, the functional structure, technical structure and business structure of the data processing system are put forward, including a data collection module, data reduction module, data analysis module, data application module and data security module, etc. The development of a data processing system based on these structures can effectively reduce the work intensity of judicial open information management, summarize the work state, find problems, and promote the level of judicial publicity.

  15. A new approach to configurable primary data collection.

    Science.gov (United States)

    Stanek, J; Babkin, E; Zubov, M

    2016-09-01

The formats, semantics and operational rules of data processing tasks in genomics (and health in general) are highly divergent and can change rapidly. In such an environment, the problem of consistent transformation and loading of heterogeneous input data to various target repositories becomes a critical success factor. The objective of the project was to design a new conceptual approach to configurable data transformation, de-identification, and submission of health and genomic data sets. The main motivation was to facilitate automated or human-driven data uploading, as well as consolidation of heterogeneous sources in large genomic or health projects. Modern methods of on-demand specialization of generic software components were applied. For specification of input-output data and required data collection activities, we propose a simple data model of flat tables as well as a domain-oriented graphical interface and a portable representation of transformations in XML. Using such methods, the prototype of the Configurable Data Collection System (CDCS) was implemented in the Java programming language with Swing graphical interfaces. The core logic of transformations was implemented as a library of reusable plugins. The solution is implemented as a software prototype for a configurable service-oriented system for semi-automatic data collection, transformation, sanitization and safe uploading to heterogeneous data repositories: CDCS. To address the dynamic nature of data schemas and data collection processes, the CDCS prototype facilitates interactive, user-driven configuration of the data collection process and extends basic functionality with a wide range of third-party plugins. Notably, our solution also allows for the reduction of manual data entry for data originally missing in the output data sets. First experiments and feedback from domain experts confirm the prototype is flexible, configurable and extensible; runs well on data owners' systems; and is not dependent on…

  16. Building Data-Driven Pathways From Routinely Collected Hospital Data: A Case Study on Prostate Cancer

    Science.gov (United States)

    Clark, Jeremy; Cooper, Colin S; Mills, Robert; Rayward-Smith, Victor J; de la Iglesia, Beatriz

    2015-01-01

Background Routinely collected data in hospitals is complex, typically heterogeneous, and scattered across multiple Hospital Information Systems (HIS). This big data, created as a byproduct of health care activities, has the potential to provide a better understanding of diseases, unearth hidden patterns, and improve services and cost. The extent and uses of such data rely on its quality, which is not consistently checked, nor fully understood. Nevertheless, using routine data for the construction of data-driven clinical pathways, describing processes and trends, is a key topic receiving increasing attention in the literature. Traditional algorithms do not cope well with unstructured processes or data, and do not produce clinically meaningful visualizations. Supporting systems that provide additional information, context, and quality assurance inspection are needed. Objective The objective of the study is to explore how routine hospital data can be used to develop data-driven pathways that describe the journeys that patients take through care, and their potential uses in biomedical research; it proposes a framework for the construction, quality assessment, and visualization of patient pathways for clinical studies and decision support using a case study on prostate cancer. Methods Data pertaining to prostate cancer patients were extracted from a large UK hospital from eight different HIS, validated, and complemented with information from the local cancer registry. Data-driven pathways were built for each of the 1904 patients and an expert knowledge base, containing rules on the prostate cancer biomarker, was used to assess the completeness and utility of the pathways for a specific clinical study. Software components were built to provide meaningful visualizations for the constructed pathways. Results The proposed framework and pathway formalism enable the summarization, visualization, and querying of complex patient-centric clinical information, as well as the…

  17. High-Performance Computing in Neuroscience for Data-Driven Discovery, Integration, and Dissemination

    International Nuclear Information System (INIS)

    Bouchard, Kristofer E.

    2016-01-01

    A lack of coherent plans to analyze, manage, and understand data threatens the various opportunities offered by new neuro-technologies. High-performance computing will allow exploratory analysis of massive datasets stored in standardized formats, hosted in open repositories, and integrated with simulations.

  18. Product design pattern based on big data-driven scenario

    OpenAIRE

    Conggang Yu; Lusha Zhu

    2016-01-01

This article discusses new product design patterns in the big data era, gives designers a new rational way of thinking, and offers a new way to understand product design. Based on the key criteria of the product design process, category, element, and product are used to input the data, which comprises concrete data and abstract data, as an enlargement of the criteria of the product design process for the establishment of a big data-driven product design pattern model. Moreover, an exper…

  19. Exploring Techniques of Developing Writing Skill in IELTS Preparatory Courses: A Data-Driven Study

    Science.gov (United States)

    Ostovar-Namaghi, Seyyed Ali; Safaee, Seyyed Esmail

    2017-01-01

    Being driven by the hypothetico-deductive mode of inquiry, previous studies have tested the effectiveness of theory-driven interventions under controlled experimental conditions to come up with universally applicable generalizations. To make a case in the opposite direction, this data-driven study aims at uncovering techniques and strategies…

  20. Testing the Accuracy of Data-driven MHD Simulations of Active Region Evolution

    Energy Technology Data Exchange (ETDEWEB)

    Leake, James E.; Linton, Mark G. [U.S. Naval Research Laboratory, 4555 Overlook Avenue, SW, Washington, DC 20375 (United States); Schuck, Peter W., E-mail: james.e.leake@nasa.gov [NASA Goddard Space Flight Center, 8800 Greenbelt Road, Greenbelt, MD 20771 (United States)

    2017-04-01

    Models for the evolution of the solar coronal magnetic field are vital for understanding solar activity, yet the best measurements of the magnetic field lie at the photosphere, necessitating the development of coronal models which are “data-driven” at the photosphere. We present an investigation to determine the feasibility and accuracy of such methods. Our validation framework uses a simulation of active region (AR) formation, modeling the emergence of magnetic flux from the convection zone to the corona, as a ground-truth data set, to supply both the photospheric information and to perform the validation of the data-driven method. We focus our investigation on how the accuracy of the data-driven model depends on the temporal frequency of the driving data. The Helioseismic and Magnetic Imager on NASA’s Solar Dynamics Observatory produces full-disk vector magnetic field measurements at a 12-minute cadence. Using our framework we show that ARs that emerge over 25 hr can be modeled by the data-driving method with only ∼1% error in the free magnetic energy, assuming the photospheric information is specified every 12 minutes. However, for rapidly evolving features, under-sampling of the dynamics at this cadence leads to a strobe effect, generating large electric currents and incorrect coronal morphology and energies. We derive a sampling condition for the driving cadence based on the evolution of these small-scale features, and show that higher-cadence driving can lead to acceptable errors. Future work will investigate the source of errors associated with deriving plasma variables from the photospheric magnetograms as well as other sources of errors, such as reduced resolution, instrument bias, and noise.
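
    The strobe effect described here is an aliasing problem, so the required driving cadence can be bounded Nyquist-style by the evolution time of the smallest resolved features. The paper derives its own sampling condition; the sketch below only illustrates the general form, and the factor of two and the example numbers are assumptions:

        def max_driving_cadence_s(feature_size_km, speed_km_per_s, safety=2.0):
            # Sample the photospheric driver at least 'safety' times per
            # feature crossing time, dt < L / (safety * v), to avoid strobing.
            return feature_size_km / (safety * speed_km_per_s)

        # A 1000 km feature evolving at ~1 km/s would need driving data roughly
        # every 500 s, i.e., finer than a 12-minute (720 s) cadence.
        print(max_driving_cadence_s(1000.0, 1.0))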

  1. Data-driven importance distributions for articulated tracking

    DEFF Research Database (Denmark)

    Hauberg, Søren; Pedersen, Kim Steenstrup

    2011-01-01

We present two data-driven importance distributions for particle filter-based articulated tracking; one based on background subtraction, another on depth information. In order to keep the algorithms efficient, we represent human poses in terms of spatial joint positions. To ensure constant bone lengths… filter, where they improve both accuracy and efficiency of the tracker. In fact, they triple the effective number of samples compared to the most commonly used importance distribution at little extra computational cost.

  2. An Open Framework for Dynamic Big-data-driven Application Systems (DBDDAS) Development

    KAUST Repository

    Douglas, Craig

    2014-01-01

In this paper, we outline key features that dynamic data-driven application systems (DDDAS) have. A DDDAS is an application with data assimilation that can change the models and/or scales of the computation, and in which the application controls the data collection based on the computational results. The term Big Data (BD) has come into use in recent years and is highly applicable to most DDDAS, since most applications use networks of sensors that generate an overwhelming amount of data over the lifespan of the application runs. We describe what a dynamic big-data-driven application system (DBDDAS) toolkit must have in order to provide all of the essential building blocks that are necessary to easily create new DDDAS without re-inventing the building blocks.


  4. EXPLORING DATA-DRIVEN SPECTRAL MODELS FOR APOGEE M DWARFS

    Science.gov (United States)

    Lua Birky, Jessica; Hogg, David; Burgasser, Adam J.

    2018-01-01

The Cannon (Ness et al. 2015; Casey et al. 2016) is a flexible, data-driven spectral modeling and parameter inference framework, demonstrated on high-resolution Apache Point Galactic Evolution Experiment (APOGEE; λ/Δλ~22,500, 1.5-1.7µm) spectra of giant stars to estimate stellar labels (Teff, logg, [Fe/H], and chemical abundances) to precisions higher than the model-grid pipeline. The lack of reliable stellar parameters reported by the APOGEE pipeline for temperatures less than ~3550K motivates extension of this approach to M dwarf stars. Using a training set of 51 M dwarfs with spectral types ranging M0-M9 obtained from SDSS optical spectra, we demonstrate that the Cannon can infer spectral types to a precision of +/-0.6 types, making it an effective tool for classifying high-resolution near-infrared spectra. We discuss the potential for extending this work to determine the physical stellar labels Teff, logg, and [Fe/H]. This work is supported by the SDSS Faculty and Student (FAST) initiative.
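
    At its core, The Cannon fits each spectral pixel as a low-order polynomial in the stellar labels, then inverts the fit to label new spectra. A self-contained toy version with a single label (spectral type encoded as a number) and synthetic spectra, sketching the idea rather than the published pipeline:

        import numpy as np

        rng = np.random.default_rng(4)
        n_stars, n_pix = 51, 300
        labels = rng.uniform(0, 9, n_stars)          # M0..M9 encoded as 0..9
        design = np.vander(labels, 3)                # quadratic in the label
        flux = design @ rng.normal(size=(3, n_pix)) \
               + rng.normal(scale=0.01, size=(n_stars, n_pix))

        coeffs, *_ = np.linalg.lstsq(design, flux, rcond=None)   # training step
        new_flux = np.vander([4.3], 3) @ coeffs                  # "observed" star
        grid = np.linspace(0, 9, 901)                            # label search grid
        chi2 = ((np.vander(grid, 3) @ coeffs - new_flux) ** 2).sum(axis=1)
        print(grid[chi2.argmin()])                               # recovered type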

  5. Minimizing cache misses in an event-driven network server: A case study of TUX

    DEFF Research Database (Denmark)

    Bhatia, Sapan; Consel, Charles; Lawall, Julia Laetitia

    2006-01-01

We analyze the performance of CPU-bound network servers and demonstrate experimentally that the degradation in the performance of these servers under high-concurrency workloads is largely due to inefficient use of the hardware caches. We then describe an approach to speeding up event-driven network servers by optimizing their use of the L2 CPU cache in the context of the TUX Web server, known for its robustness to heavy load. Our approach is based on a novel cache-aware memory allocator and a specific scheduling strategy that together ensure that the total working data set of the server stays…

  6. Monitoring a robot swarm using a data-driven fault detection approach

    KAUST Repository

    Khaldi, Belkacem

    2017-06-30

Using a swarm robotics system with one or more faulty robots to accomplish specific tasks may lead to degraded performance relative to the target requirements. In such circumstances, robot swarms require continuous monitoring to detect abnormal events and to sustain normal operations. In this paper, an innovative exogenous fault detection method for monitoring robot swarms is presented. The method merges the flexibility of principal component analysis (PCA) models and the greater sensitivity of the exponentially-weighted moving average (EWMA) and cumulative sum (CUSUM) control charts to insidious changes. The method is tested and evaluated on a swarm of simulated foot-bot robots performing a circle formation task via the viscoelastic control model. We illustrate through simulated data collected from the ARGoS simulator that a significant improvement in fault detection can be obtained by using the proposed method when compared to the conventional PCA-based methods (i.e., T2 and Q).
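
    A condensed sketch of the detection pipeline this abstract combines: a PCA model of fault-free data supplies a residual (Q) statistic, and an EWMA filter sharpens sensitivity to small shifts. The data, fault injection and control limit below are synthetic and crude; the paper's charts and limits are more careful:

        import numpy as np
        from sklearn.decomposition import PCA

        def ewma(x, lam=0.2):
            # Exponentially-weighted moving average of a residual stream.
            z = np.zeros_like(x)
            for t in range(len(x)):
                z[t] = lam * x[t] + (1 - lam) * (z[t - 1] if t else 0.0)
            return z

        rng = np.random.default_rng(1)
        pca = PCA(n_components=3).fit(rng.normal(size=(300, 6)))  # fault-free model
        test = rng.normal(size=(100, 6))
        test[60:] += 1.5                                          # injected fault
        q = ((test - pca.inverse_transform(pca.transform(test))) ** 2).sum(axis=1)
        limit = q[:50].mean() + 3 * q[:50].std()                  # crude control limit
        print(np.where(ewma(q) > limit)[0][:5])                   # first alarm indices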

  7. Experimental Study of a natural ventilation strategy in a Full-Scale Enclosure Under Meteorological Conditions: A Buoyancy-Driven Approach

    OpenAIRE

    Austin, Miguel Chen; Bruneau, Denis; Sempey, Alain; Mora, Laurent; Sommier, Alain

    2018-01-01

The performance of a natural ventilation strategy in a full-scale enclosure under meteorological conditions is studied experimentally, using a buoyancy-driven approach, by means of the estimation of the air exchange rate per hour and the ventilation power. A theoretical and an empirical model are proposed based on airflow theory in buildings and blower-door tests. A preliminary validation, comparing our results with standards for air leakage rate determination, is made. The experi…

  8. The Effects of Open Enrollment, Curriculum Alignment, and Data-Driven Instruction on the Test Performance of English Language Learners (ELLs) and Re-Designated Fluent English Proficient Students (RFEPs) at Shangri-La High School

    Science.gov (United States)

    Miles, Eva

    2013-01-01

    The purpose of this study was to examine the impact of open enrollment, curriculum alignment, and data-driven instruction on the test performance of English Language Learners (ELLs) and Re-designated Fluent English Proficient students (RFEPs) at Shangri-la High School. Participants of this study consisted of the student population enrolled in…

  9. Idiopathic Pulmonary Fibrosis: Data-driven Textural Analysis of Extent of Fibrosis at Baseline and 15-Month Follow-up.

    Science.gov (United States)

    Humphries, Stephen M; Yagihashi, Kunihiro; Huckleberry, Jason; Rho, Byung-Hak; Schroeder, Joyce D; Strand, Matthew; Schwarz, Marvin I; Flaherty, Kevin R; Kazerooni, Ella A; van Beek, Edwin J R; Lynch, David A

    2017-10-01

Purpose To evaluate associations between pulmonary function and both quantitative analysis and visual assessment of thin-section computed tomography (CT) images at baseline and at 15-month follow-up in subjects with idiopathic pulmonary fibrosis (IPF). Materials and Methods This retrospective analysis of preexisting anonymized data, collected prospectively between 2007 and 2013 in a HIPAA-compliant study, was exempt from additional institutional review board approval. The extent of lung fibrosis at baseline inspiratory chest CT in 280 subjects enrolled in the IPF Network was evaluated. Visual analysis was performed by using a semiquantitative scoring system. Computer-based quantitative analysis included CT histogram-based measurements and a data-driven textural analysis (DTA). Follow-up CT images in 72 of these subjects were also analyzed. Univariate comparisons were performed by using Spearman rank correlation. Multivariate and longitudinal analyses were performed by using a linear mixed model approach, in which models were compared by using asymptotic χ2 tests. Results At baseline, all CT-derived measures showed moderate significant correlation (P < .05) with pulmonary function. At follow-up CT, changes in DTA scores showed significant correlation with changes in both forced vital capacity percentage predicted (ρ = -0.41, P < .05) and other measures of pulmonary function (P < .05). Conclusion Data-driven textural analysis of the extent of fibrosis at CT yields an index of severity that correlates with visual assessment and functional change in subjects with IPF. © RSNA, 2017.
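
    The univariate analysis reported here is a Spearman rank correlation between score changes and lung-function changes. A minimal scipy sketch on synthetic stand-in data (seeded to resemble, not reproduce, the reported ρ of about -0.41):

        import numpy as np
        from scipy.stats import spearmanr

        rng = np.random.default_rng(2)
        delta_dta = rng.normal(size=72)          # change in DTA fibrosis score
        delta_fvc = -0.4 * delta_dta + rng.normal(scale=0.9, size=72)
        rho, p = spearmanr(delta_dta, delta_fvc)
        print(f"rho={rho:.2f}, p={p:.3g}")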

  10. Performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data.

    Science.gov (United States)

    Yelland, Lisa N; Salter, Amy B; Ryan, Philip

    2011-10-15

    Modified Poisson regression, which combines a log Poisson regression model with robust variance estimation, is a useful alternative to log binomial regression for estimating relative risks. Previous studies have shown both analytically and by simulation that modified Poisson regression is appropriate for independent prospective data. This method is often applied to clustered prospective data, despite a lack of evidence to support its use in this setting. The purpose of this article is to evaluate the performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data, by using generalized estimating equations to account for clustering. A simulation study is conducted to compare log binomial regression and modified Poisson regression for analyzing clustered data from intervention and observational studies. Both methods generally perform well in terms of bias, type I error, and coverage. Unlike log binomial regression, modified Poisson regression is not prone to convergence problems. The methods are contrasted by using example data sets from 2 large studies. The results presented in this article support the use of modified Poisson regression as an alternative to log binomial regression for analyzing clustered prospective data when clustering is taken into account by using generalized estimating equations.
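
    In Python, the modified Poisson approach with clustering handled by generalized estimating equations can be sketched with statsmodels: a Poisson family with a log link plus an exchangeable working correlation yields relative risks with robust standard errors. The clustered data below are synthetic placeholders, not the article's simulations:

        import numpy as np
        import pandas as pd
        import statsmodels.api as sm
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(3)
        n, k = 1000, 40                          # 40 clusters of 25 subjects
        df = pd.DataFrame({"cluster": np.repeat(np.arange(k), n // k),
                           "treat": rng.integers(0, 2, n)})
        u = rng.normal(scale=0.3, size=k)[df["cluster"]]   # cluster effect
        df["y"] = rng.binomial(1, 1 / (1 + np.exp(1.0 - 0.4 * df["treat"] - u)))

        res = smf.gee("y ~ treat", groups="cluster", data=df,
                      family=sm.families.Poisson(),
                      cov_struct=sm.cov_struct.Exchangeable()).fit()
        print(np.exp(res.params["treat"]))       # estimated relative risk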

  11. A Gold Standards Approach to Training Instructors to Evaluate Crew Performance

    Science.gov (United States)

    Baker, David P.; Dismukes, R. Key

    2003-01-01

    The Advanced Qualification Program requires that airlines evaluate crew performance in Line Oriented Simulation. For this evaluation to be meaningful, instructors must observe relevant crew behaviors and evaluate those behaviors consistently and accurately against standards established by the airline. The airline industry has largely settled on an approach in which instructors evaluate crew performance on a series of event sets, using standardized grade sheets on which behaviors specific to event set are listed. Typically, new instructors are given a class in which they learn to use the grade sheets and practice evaluating crew performance observed on videotapes. These classes emphasize reliability, providing detailed instruction and practice in scoring so that all instructors within a given class will give similar scores to similar performance. This approach has value but also has important limitations; (1) ratings within one class of new instructors may differ from those of other classes; (2) ratings may not be driven primarily by the specific behaviors on which the company wanted the crews to be scored; and (3) ratings may not be calibrated to company standards for level of performance skill required. In this paper we provide a method to extend the existing method of training instructors to address these three limitations. We call this method the "gold standards" approach because it uses ratings from the company's most experienced instructors as the basis for training rater accuracy. This approach ties the training to the specific behaviors on which the experienced instructors based their ratings.

  12. Enhanced Component Performance Study: Turbine-Driven Pumps 1998–2014

    Energy Technology Data Exchange (ETDEWEB)

    Schroeder, John Alton [Idaho National Lab. (INL), Idaho Falls, ID (United States)

    2015-11-01

This report presents an enhanced performance evaluation of turbine-driven pumps (TDPs) at U.S. commercial nuclear power plants. The data used in this study are based on the operating experience failure reports from fiscal year 1998 through 2014 for component reliability as reported in the Institute of Nuclear Power Operations (INPO) Consolidated Events Database (ICES). The TDP failure modes considered are failure to start (FTS), failure to run for less than or equal to one hour (FTR≤1H), failure to run for more than one hour (FTR>1H), and, for normally running systems, FTS and failure to run (FTR). The component reliability estimates and the reliability data are trended for the most recent 10-year period, while yearly estimates for reliability are provided for the entire active period. Statistically significant increasing trends were identified for TDP unavailability, for frequency of start demands for standby TDPs, and for run hours in the first hour after start. Statistically significant decreasing trends were identified for start demands for normally running TDPs, and for run hours per reactor critical year for normally running TDPs.
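
    Failure-on-demand probabilities in component studies of this kind are commonly estimated with a Jeffreys prior update on the observed failure and demand counts. A one-function sketch of that estimator (the counts below are invented, not taken from the report):

        def jeffreys_failure_prob(failures, demands):
            # Posterior mean of a Beta(0.5, 0.5) prior updated with binomial
            # failure/demand counts, a common convention in reliability studies.
            return (failures + 0.5) / (demands + 1.0)

        print(jeffreys_failure_prob(failures=4, demands=950))   # e.g., an FTS estimate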

  13. Data-Driven Model Order Reduction for Bayesian Inverse Problems

    KAUST Repository

    Cui, Tiangang

    2014-01-06

    One of the major challenges in using MCMC for the solution of inverse problems is the repeated evaluation of computationally expensive numerical models. We develop a data-driven projection- based model order reduction technique to reduce the computational cost of numerical PDE evaluations in this context.

  14. Robust Data-Driven Inference for Density-Weighted Average Derivatives

    DEFF Research Database (Denmark)

    Cattaneo, Matias D.; Crump, Richard K.; Jansson, Michael

This paper presents a new data-driven bandwidth selector compatible with the small bandwidth asymptotics developed in Cattaneo, Crump, and Jansson (2009) for density-weighted average derivatives. The new bandwidth selector is of the plug-in variety, and is obtained based on a mean squared error…

  15. Data-driven modelling of LTI systems using symbolic regression

    NARCIS (Netherlands)

    Khandelwal, D.; Toth, R.; Van den Hof, P.M.J.

    2017-01-01

The aim of this project is to automate the task of data-driven identification of dynamical systems. The underlying goal is to develop an identification tool that models a physical system without distinguishing between classes of systems such as linear, nonlinear or possibly even hybrid systems. Such…

  16. Dynamic Data-Driven Prediction of Lean Blowout in a Swirl-Stabilized Combustor

    Directory of Open Access Journals (Sweden)

    Soumalya Sarkar

    2015-09-01

Full Text Available This paper addresses dynamic data-driven prediction of lean blowout (LBO) phenomena in confined combustion processes, which are prevalent in many physical applications (e.g., land-based and aircraft gas-turbine engines). The underlying concept is built upon pattern classification and is validated for LBO prediction with time series of chemiluminescence sensor data from a laboratory-scale swirl-stabilized dump combustor. The proposed method of LBO prediction makes use of the theory of symbolic dynamics, where (finite-length) time series data are partitioned to produce symbol strings that, in turn, generate a special class of probabilistic finite state automata (PFSA). These PFSA, called D-Markov machines, have a deterministic algebraic structure and their states are represented by symbol blocks of length D or less, where D is a positive integer. The D-Markov machines are constructed in two steps: (i) state splitting, i.e., the states are split based on their information contents, and (ii) state merging, i.e., two or more states (of possibly different lengths) are merged together to form a new state without any significant loss of the embedded information. The modeling complexity (e.g., number of states) of a D-Markov machine model is observed to be drastically reduced as the combustor approaches LBO. An anomaly measure, based on Kullback-Leibler divergence, is constructed to predict the proximity of LBO. The problem of LBO prediction is posed in a pattern classification setting and the underlying algorithms have been tested on experimental data at different extents of fuel-air premixing and fuel/air ratio. It is shown that, over a wide range of fuel-air premixing, D-Markov machines with D > 1 perform better as predictors of LBO than those with D = 1.
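
    The pipeline sketched in this abstract (partition the time series into symbols, estimate a Markov machine, score drift with a KL-type divergence) can be condensed as follows. This is a depth-1 (D = 1) toy with AR(1) stand-in data; the paper's state splitting/merging is not reproduced:

        import numpy as np

        def ar1(phi, n, rng):
            # Simple AR(1) surrogate for chemiluminescence sensor data.
            x = np.zeros(n)
            for t in range(1, n):
                x[t] = phi * x[t - 1] + rng.normal()
            return x

        def transition_matrix(sym, n_symbols=4):
            # Depth-1 Markov machine: symbol-to-symbol probabilities.
            P = np.full((n_symbols, n_symbols), 1e-6)   # light smoothing
            for a, b in zip(sym[:-1], sym[1:]):
                P[a, b] += 1
            return P / P.sum(axis=1, keepdims=True)

        rng = np.random.default_rng(5)
        nominal = ar1(0.1, 5000, rng)                    # nominal regime
        drifted = ar1(0.9, 5000, rng)                    # toy regime nearing LBO
        edges = np.quantile(nominal, [0.25, 0.5, 0.75])  # fixed 4-symbol partition
        P0 = transition_matrix(np.digitize(nominal, edges))
        P1 = transition_matrix(np.digitize(drifted, edges))
        print(np.sum(P0 * np.log(P0 / P1)))              # KL-type anomaly measure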

  17. Off-equatorial current-driven instabilities ahead of approaching dipolarization fronts

    Science.gov (United States)

    Zhang, Xu; Angelopoulos, V.; Pritchett, P. L.; Liu, Jiang

    2017-05-01

    Recent kinetic simulations have revealed that electromagnetic instabilities near the ion gyrofrequency and slightly away from the equatorial plane can be driven by a current parallel to the magnetic field prior to the arrival of dipolarization fronts. Such instabilities are important because of their potential contribution to global electromagnetic energy conversion near dipolarization fronts. Of the several instabilities that may be consistent with such waves, the most notable are the current-driven electromagnetic ion cyclotron instability and the current-driven kink-like instability. To confirm the existence and characteristics of these instabilities, we used observations by two Time History of Events and Macroscale Interactions during Substorms satellites, one near the neutral sheet observing dipolarization fronts and the other at the boundary layer observing precursor waves and currents. We found that such instabilities with monochromatic signatures are rare, but one of the few cases was selected for further study. Two different instabilities, one at about 0.3 Hz and the other at a much lower frequency, 0.02 Hz, were seen in the data from the off-equatorial spacecraft. A parallel current attributed to an electron beam coexisted with the waves. Our instability analysis attributes the higher-frequency instability to a current-driven ion cyclotron instability and the lower frequency instability to a kink-like instability. The current-driven kink-like instability we observed is consistent with the instabilities observed in the simulation. We suggest that the currents needed to excite these low-frequency instabilities are so intense that the associated electron beams are easily thermalized and hence difficult to observe.

  18. Data driven fault detection and isolation: a wind turbine scenario

    Directory of Open Access Journals (Sweden)

    Rubén Francisco Manrique Piramanrique

    2015-04-01

Full Text Available One of the greatest drawbacks in wind energy generation is the high maintenance cost associated with mechanical faults. This problem becomes more evident in utility-scale wind turbines, where the increased size and nominal capacity comes with additional problems associated with structural vibrations and aeroelastic effects in the blades. Due to the increased operation capability, it is imperative to detect system degradation and faults in an efficient manner, maintaining system integrity, reliability and reducing operation costs. This paper presents a comprehensive comparison of four different Fault Detection and Isolation (FDI) filters based on “Data Driven” (DD) techniques. In order to enhance FDI performance, a multi-level strategy is used where: the first level detects the occurrence of any given fault (detection), while the second identifies the source of the fault (isolation). Four different DD classification techniques (namely Support Vector Machines, Artificial Neural Networks, K Nearest Neighbors and Gaussian Mixture Models) were studied and compared for each of the proposed classification levels. The best strategy at each level could be selected to build the final data-driven FDI system. The performance of the proposed scheme is evaluated on a benchmark model of a commercial wind turbine.

  19. Towards Building a High Performance Spatial Query System for Large Scale Medical Imaging Data.

    Science.gov (United States)

    Aji, Ablimit; Wang, Fusheng; Saltz, Joel H

    2012-11-06

    Support of high performance queries on large volumes of scientific spatial data is becoming increasingly important in many applications. This growth is driven by not only geospatial problems in numerous fields, but also emerging scientific applications that are increasingly data- and compute-intensive. For example, digital pathology imaging has become an emerging field during the past decade, where examination of high resolution images of human tissue specimens enables more effective diagnosis, prediction and treatment of diseases. Systematic analysis of large-scale pathology images generates tremendous amounts of spatially derived quantifications of micro-anatomic objects, such as nuclei, blood vessels, and tissue regions. Analytical pathology imaging provides high potential to support image based computer aided diagnosis. One major requirement for this is effective querying of such enormous amount of data with fast response, which is faced with two major challenges: the "big data" challenge and the high computation complexity. In this paper, we present our work towards building a high performance spatial query system for querying massive spatial data on MapReduce. Our framework takes an on demand index building approach for processing spatial queries and a partition-merge approach for building parallel spatial query pipelines, which fits nicely with the computing model of MapReduce. We demonstrate our framework on supporting multi-way spatial joins for algorithm evaluation and nearest neighbor queries for microanatomic objects. To reduce query response time, we propose cost based query optimization to mitigate the effect of data skew. Our experiments show that the framework can efficiently support complex analytical spatial queries on MapReduce.

  20. Emotion-driven level generation

    OpenAIRE

    Togelius, Julian; Yannakakis, Georgios N.

    2016-01-01

    This chapter examines the relationship between emotions and level generation. Grounded in the experience-driven procedural content generation framework we focus on levels and introduce a taxonomy of approaches for emotion-driven level generation. We then review four characteristic level generators of our earlier work that exemplify each one of the approaches introduced. We conclude the chapter with our vision on the future of emotion-driven level generation.

  1. An applicable approach for performance auditing in ERP

    Directory of Open Access Journals (Sweden)

    Wan Jian Guo

    2016-01-01

Full Text Available This paper addresses the practical problem of performance auditing in an ERP environment. Traditional performance auditing methods and existing approaches for performance evaluation of ERP implementations do not work well, because they are either difficult to apply or contain subjective elements. This paper proposes an applicable performance auditing approach for SAP ERP based on quantitative analysis. The approach consists of three parts: system utilization, data quality and the effectiveness of system control. For each part, we provide the main process for conducting the audit, in particular how to calculate the online settlement rate of the SAP system. This approach has played an important role in practical auditing work. A practical case is provided at the end of this paper to demonstrate the effectiveness of the approach.

  2. Striving for Excellence Sometimes Hinders High Achievers: Performance-Approach Goals Deplete Arithmetical Performance in Students with High Working Memory Capacity

    Science.gov (United States)

    Crouzevialle, Marie; Smeding, Annique; Butera, Fabrizio

    2015-01-01

    We tested whether the goal to attain normative superiority over other students, referred to as performance-approach goals, is particularly distractive for high-Working Memory Capacity (WMC) students—that is, those who are used to being high achievers. Indeed, WMC is positively related to high-order cognitive performance and academic success, a record of success that confers benefits on high-WMC as compared to low-WMC students. We tested whether such benefits may turn out to be a burden under performance-approach goal pursuit. Indeed, for high achievers, aiming to rise above others may represent an opportunity to reaffirm their positive status—a stake susceptible to trigger disruptive outcome concerns that interfere with task processing. Results revealed that with performance-approach goals—as compared to goals with no emphasis on social comparison—the higher the students’ WMC, the lower their performance at a complex arithmetic task (Experiment 1). Crucially, this pattern appeared to be driven by uncertainty regarding the chances to outclass others (Experiment 2). Moreover, an accessibility measure suggested the mediational role played by status-related concerns in the observed disruption of performance. We discuss why high-stake situations can paradoxically lead high-achievers to sub-optimally perform when high-order cognitive performance is at play. PMID:26407097

  3. A priori data-driven multi-clustered reservoir generation algorithm for echo state network.

    Directory of Open Access Journals (Sweden)

    Xiumin Li

Full Text Available Echo state networks (ESNs) with multi-clustered reservoir topology perform better in reservoir computing and robustness than those with random reservoir topology. However, these ESNs have a complex reservoir topology, which leads to difficulties in reservoir generation. This study focuses on the reservoir generation problem when an ESN is used in environments with sufficient a priori data available. Accordingly, an a priori data-driven multi-cluster reservoir generation algorithm is proposed. The a priori data in the proposed algorithm are used to evaluate reservoirs by calculating the precision and standard deviation of ESNs. The reservoirs are produced using the clustering method; only a reservoir with a better evaluation performance takes the place of a previous one. The final reservoir is obtained when its evaluation score reaches the preset requirement. The prediction experiment results obtained using the Mackey-Glass chaotic time series show that the proposed reservoir generation algorithm provides ESNs with extra prediction precision and increases the structure complexity of the network. Further experiments also reveal the appropriate values of the number of clusters and time window size to obtain optimal performance. The information entropy of the reservoir reaches the maximum when the ESN gains the greatest precision.

  4. Enabling High-performance Interactive Geoscience Data Analysis Through Data Placement and Movement Optimization

    Science.gov (United States)

    Zhu, F.; Yu, H.; Rilee, M. L.; Kuo, K. S.; Yu, L.; Pan, Y.; Jiang, H.

    2017-12-01

Since the establishment of data archive centers and the standardization of file formats, scientists have been required to search metadata catalogs for the data they need and download the data files to their local machines to carry out data analysis. This approach has facilitated data discovery and access for decades, but it inevitably leads to data transfer from data archive centers to scientists' computers through low-bandwidth Internet connections. Data transfer becomes a major performance bottleneck in such an approach. Combined with generally constrained local compute/storage resources, it limits the extent of scientists' studies and deprives them of timely outcomes. Thus, this conventional approach is not scalable with respect to both the volume and variety of geoscience data. A much more viable solution is to couple analysis and storage systems to minimize data transfer. In our study, we compare loosely coupled approaches (exemplified by Spark and Hadoop) and tightly coupled approaches (exemplified by parallel distributed database management systems, e.g., SciDB). In particular, we investigate the optimization of data placement and movement to effectively tackle the variety challenge, and boost the popularization of parallelization to address the volume challenge. Our goal is to enable high-performance interactive analysis for a good portion of the geoscience data analysis exercise. We show that tightly coupled approaches can concentrate data traffic between local storage systems and compute units, thereby optimizing bandwidth utilization to achieve better throughput. Based on our observations, we develop a geoscience data analysis system that tightly couples analysis engines with storage, which has direct access to the detailed map of data partition locations. Through an innovative data partitioning and distribution scheme, our system has demonstrated scalable and interactive performance in real-world geoscience data analysis applications.

  5. Data warehouse model for monitoring key performance indicators (KPIs) using goal oriented approach

    Science.gov (United States)

    Abdullah, Mohammed Thajeel; Ta'a, Azman; Bakar, Muhamad Shahbani Abu

    2016-08-01

The growth and development of universities, just as for other organizations, depend on their ability to strategically plan and implement development blueprints which are in line with their vision and mission statements. The actualization of these statements, which are often designed into goals and sub-goals and linked to their respective actors, is better measured by defining key performance indicators (KPIs) of the university. This paper proposes ReGADaK, an extension of the GRAnD approach, which highlights the facts, dimensions, attributes, measures and KPIs of the organization. The measures from the goal analysis of this unit serve as the basis for developing the related university's KPIs. The proposed data warehouse schema is evaluated through expert review, prototyping and usability evaluation. The findings from the evaluation processes suggest that the proposed data warehouse schema is suitable for monitoring the university's KPIs.

  6. Using the Dynamic Model to develop an evidence-based and theory-driven approach to school improvement

    NARCIS (Netherlands)

    Creemers, B.P.M.; Kyriakides, L.

    2010-01-01

    This paper refers to a dynamic perspective of educational effectiveness and improvement stressing the importance of using an evidence-based and theory-driven approach. Specifically, an approach to school improvement based on the dynamic model of educational effectiveness is offered. The recommended

  7. Data-Driven Modeling of Complex Systems by means of a Dynamical ANN

    Science.gov (United States)

    Seleznev, A.; Mukhin, D.; Gavrilov, A.; Loskutov, E.; Feigin, A.

    2017-12-01

    Data-driven methods for modeling and prognosis of complex dynamical systems are becoming increasingly popular in various fields due to the growth of high-resolution data. We distinguish two basic steps in such an approach: (i) determining the phase subspace of the system, or embedding, from available time series, and (ii) constructing an evolution operator acting in this reduced subspace. In this work we suggest a novel approach that combines these two steps by constructing an artificial neural network (ANN) with a special topology. The proposed ANN-based model, on the one hand, projects the data onto a low-dimensional manifold and, on the other hand, models a dynamical system on this manifold. It is a recurrent multilayer ANN which has internal dynamics and is capable of generating time series. A key point of the proposed methodology is the optimization of the model to avoid overfitting: we use a Bayesian criterion to optimize the ANN structure and estimate both the degree of evolution operator nonlinearity and the complexity of the nonlinear manifold onto which the data are projected. The proposed modeling technique will be applied to the analysis of high-dimensional dynamical systems: the Lorenz'96 model of atmospheric turbulence, producing high-dimensional space-time chaos, and a quasi-geostrophic three-layer model of the Earth's atmosphere with natural orography, describing the dynamics of synoptic vortexes as well as mesoscale blocking systems. The possibility of applying the proposed methodology to real measured data is also discussed. The study was supported by the Russian Science Foundation (grant #16-12-10198).

  8. Management and Nonlinear Analysis of Disinfection System of Water Distribution Networks Using Data Driven Methods

    Directory of Open Access Journals (Sweden)

    Mohammad Zounemat-Kermani

    2018-03-01

    Full Text Available Chlorination units are widely used to supply safe drinking water and remove pathogens from water distribution networks. A data-driven approach is an appropriate method for analyzing the performance of chlorine in a water supply network. In this study, a multi-layer perceptron neural network (MLP) with three training algorithms (gradient descent, conjugate gradient and BFGS) and a support vector machine (SVM) with an RBF kernel function were used to predict the concentration of residual chlorine in the water supply networks of the Ahmadabad Dafeh and Ahruiyeh villages in Kerman Province. Daily data including discharge (flow), chlorine consumption and residual chlorine were employed from the beginning of 1391 until the end of 1393 in the Hijri calendar (three years). To assess the performance of the studied models, criteria such as Nash-Sutcliffe efficiency (NS), root mean square error (RMSE), mean absolute percentage error (MAPE) and correlation coefficient (CORR) were used; in the best modeling situation these were 0.9484, 0.0255, 1.081, and 0.974, respectively, obtained with the BFGS algorithm. The criteria indicated that the MLP model with the BFGS and conjugate gradient algorithms was better than all other models in 90 and 10 percent of cases, respectively, while the MLP model based on the gradient descent algorithm and the SVM model were better in none of the cases. According to the results of this study, proper management of chlorine concentration can be implemented using predicted values of residual chlorine in the water supply network. Thus, the weaker performance of the perceptron network and support vector machine in the water supply network of Ahruiyeh compared to Ahmadabad Dafeh can be attributed to improper management of chlorination.
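
    As a rough illustration of this kind of model, the sketch below trains an MLP regressor with scikit-learn's L-BFGS solver (the closest built-in analogue to the BFGS training above) on daily records; the file name and column names are hypothetical, and the train/test split and network size are arbitrary choices, not the paper's setup.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical daily records: discharge and chlorine consumption -> residual chlorine
df = pd.read_csv("chlorination_daily.csv")            # file/column names assumed
X = df[["discharge", "chlorine_consumption"]].to_numpy()
y = df["residual_chlorine"].to_numpy()

# Chronological split to mimic forecasting on later days
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, shuffle=False)
scaler = StandardScaler().fit(X_tr)

mlp = MLPRegressor(hidden_layer_sizes=(10,), solver="lbfgs",
                   max_iter=2000, random_state=0)
mlp.fit(scaler.transform(X_tr), y_tr)

pred = mlp.predict(scaler.transform(X_te))
rmse = np.sqrt(mean_squared_error(y_te, pred))
ns = r2_score(y_te, pred)   # r2_score computes the same formula as Nash-Sutcliffe
print(f"NS = {ns:.4f}, RMSE = {rmse:.4f}")
```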

  9. Nursing Theory, Terminology, and Big Data: Data-Driven Discovery of Novel Patterns in Archival Randomized Clinical Trial Data.

    Science.gov (United States)

    Monsen, Karen A; Kelechi, Teresa J; McRae, Marion E; Mathiason, Michelle A; Martin, Karen S

    The growth and diversification of nursing theory, nursing terminology, and nursing data enable a convergence of theory- and data-driven discovery in the era of big data research. Existing datasets can be viewed through theoretical and terminology perspectives using visualization techniques in order to reveal new patterns and generate hypotheses. The Omaha System is a standardized terminology and metamodel that makes explicit the theoretical perspective of the nursing discipline and enables terminology-theory testing research. The purpose of this paper is to illustrate the approach by exploring a large research dataset consisting of 95 variables (demographics, temperature measures, anthropometrics, and standardized instruments measuring quality of life and self-efficacy) from a theory-based perspective using the Omaha System. Aims were to (a) examine the Omaha System dataset to understand the sample at baseline relative to Omaha System problem terms and outcome measures, (b) examine relationships within the normalized Omaha System dataset at baseline in predicting adherence, and (c) examine relationships within the normalized Omaha System dataset at baseline in predicting incident venous ulcer. Variables from a randomized clinical trial of a cryotherapy intervention for the prevention of venous ulcers were mapped onto Omaha System terms and measures to derive a theoretical framework for the terminology-theory testing study. The original dataset was recoded using the mapping to create an Omaha System dataset, which was then examined using visualization to generate hypotheses. The hypotheses were tested using standard inferential statistics. Logistic regression was used to predict adherence and incident venous ulcer. Findings revealed novel patterns in the psychosocial characteristics of the sample that were discovered to be drivers of both adherence (Mental health Behavior: OR = 1.28, 95% CI [1.02, 1.60]; AUC = .56) and incident venous ulcer (Mental health Behavior
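
    The final inferential step (logistic regression reporting odds ratios and an AUC, as in the adherence model above) looks roughly like the scikit-learn sketch below. The data are synthetic placeholders: neither the trial dataset nor the actual Omaha System mapping is reproduced, and the row count and five measures are arbitrary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Synthetic stand-in for the recoded Omaha System dataset: one column per
# baseline measure (e.g. a 'Mental health' Behavior score), binary outcome.
X = rng.normal(size=(276, 5))
y = (0.25 * X[:, 0] + rng.normal(size=276)) > 0       # adherence indicator

model = LogisticRegression().fit(X, y)
odds_ratios = np.exp(model.coef_[0])                  # OR per one-unit increase
auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
print(np.round(odds_ratios, 2), f"AUC = {auc:.2f}")
```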

  10. Data Albums: An Event Driven Search, Aggregation and Curation Tool for Earth Science

    Science.gov (United States)

    Ramachandran, Rahul; Kulkarni, Ajinkya; Maskey, Manil; Bakare, Rohan; Basyal, Sabin; Li, Xiang; Flynn, Shannon

    2014-01-01

    Approaches used in Earth science research, such as case study analysis and climatology studies, involve discovering and gathering diverse data sets and information to support the research goals. Gathering relevant data and information for case studies and climatology analysis is both tedious and time consuming. Current Earth science data systems are designed with the assumption that researchers access data primarily by instrument or geophysical parameter. When researchers are interested in studying a significant event, they have to manually assemble a variety of datasets relevant to it by searching the different distributed data systems. This paper presents a specialized search, aggregation and curation tool for Earth science to address these challenges. The search tool automatically creates curated 'Data Albums': aggregated collections of information related to a specific event, containing links to relevant data files [granules] from different instruments, tools and services for visualization and analysis, and information about the event contained in news reports, images or videos to supplement research analysis. Curation in the tool is driven by an ontology-based relevancy ranking algorithm that filters out non-relevant information and data.

  11. Overview of nonlinear theory of kinetically driven instabilities

    International Nuclear Information System (INIS)

    Berk, H.L.; Breizman, B.N.

    1998-09-01

    An overview is presented of the theory for the nonlinear behavior of instabilities driven by the resonant wave-particle interaction. The approach should be applicable to a wide variety of kinetic systems in magnetic fusion devices and accelerators. Here the authors emphasize application to Alfven-wave-driven instability, and the principles of the theory are used to interpret experimental data.

  12. NASA Reverb: Standards-Driven Earth Science Data and Service Discovery

    Science.gov (United States)

    Cechini, M. F.; Mitchell, A.; Pilone, D.

    2011-12-01

    After a yearlong design, development, and testing process, the ECHO team successfully released "Reverb - The Next Generation Earth Science Discovery Tool." Reverb relies heavily on the information contained in dataset and granule metadata, such as ISO 19115, to provide a dynamic experience to users based on search facet values extracted from science metadata. Such an approach allows users to perform cross-dataset correlation and searches, discovering additional data that they may not previously have been aware of. In addition to data discovery, Reverb users may discover services associated with their data of interest. When services utilize supported standards and/or protocols, Reverb can facilitate the invocation of both synchronous and asynchronous data processing services. This greatly enhances a user's ability to discover data of interest and accomplish their research goals. Extrapolating from the current movement towards interoperable standards and the increase in available services, data service invocation and chaining will become a natural part of data discovery. Reverb is one example of a discovery tool that provides a mechanism for transforming the Earth science data discovery paradigm.

  13. Input-driven versus turnover-driven controls of simulated changes in soil carbon due to land-use change

    Science.gov (United States)

    Nyawira, S. S.; Nabel, J. E. M. S.; Brovkin, V.; Pongratz, J.

    2017-08-01

    Historical changes in soil carbon associated with land-use change (LUC) result mainly from the changes in the quantity of litter inputs to the soil and the turnover of carbon in soils. We use a factor separation technique to assess how the input-driven and turnover-driven controls, as well as their synergies, have contributed to historical changes in soil carbon associated with LUC. We apply this approach to equilibrium simulations of present-day and pre-industrial land use performed using the dynamic global vegetation model JSBACH. Our results show that both the input-driven and turnover-driven changes generally contribute to a gain in soil carbon in afforested regions and a loss in deforested regions. However, in regions where grasslands have been converted to croplands, we find an input-driven loss that is partly offset by a turnover-driven gain, which stems from a decrease in the fire-related carbon losses. Omitting land management through crop and wood harvest substantially reduces the global losses through the input-driven changes. Our study thus suggests that the dominating control of soil carbon losses is via the input-driven changes, which are more directly accessible to human management than the turnover-driven ones.

  14. NOvA Event Building, Buffering and Data-Driven Triggering From Within the DAQ System

    Energy Technology Data Exchange (ETDEWEB)

    Fischler, M. [Fermilab; Green, C. [Fermilab; Kowalkowski, J. [Fermilab; Norman, A. [Fermilab; Paterno, M. [Fermilab; Rechenmacher, R. [Fermilab

    2012-06-22

    To make its core measurements, the NOvA experiment needs to make real-time data-driven decisions involving beam-spill time correlation and other triggering issues. NOvA-DDT is a prototype Data-Driven Triggering system, built using the Fermilab artdaq generic DAQ/event-building toolset. This provides the advantages of sharing online software infrastructure with other Intensity Frontier experiments, and of being able to use any offline analysis module, unchanged, as a component of the online triggering decisions. The NOvA-artdaq architecture chosen has significant advantages, including graceful degradation if the triggering decision software fails or cannot finish quickly enough for some fraction of the time-slice "events." We have tested and measured the performance and overhead of NOvA-DDT using an actual Hough-transform-based trigger decision module taken from the NOvA offline software. The results of these tests (a mean time of 98 ms per event on only 1/16 of the available processing power of a node, and overheads of about 2 ms per event) provide a proof of concept: NOvA-DDT is a viable strategy for data acquisition, event building, and trigger processing at the NOvA far detector.

  15. Data-driven simultaneous fault diagnosis for solid oxide fuel cell system using multi-label pattern identification

    Science.gov (United States)

    Li, Shuanghong; Cao, Hongliang; Yang, Yupu

    2018-02-01

    Fault diagnosis is a key process for the reliability and safety of solid oxide fuel cell (SOFC) systems. However, it is difficult to rapidly and accurately identify faults in complicated SOFC systems, especially when simultaneous faults appear. In this research, a data-driven multi-label (ML) pattern identification approach is proposed to address the simultaneous fault diagnosis of SOFC systems. The framework of the simultaneous-fault diagnosis primarily includes two components: feature extraction and an ML-SVM classifier. The approach can be trained to diagnose simultaneous SOFC faults, such as fuel leakage and air leakage at different positions in the SOFC system, using simple training data sets that contain only single faults, without demanding simultaneous-fault data. The experimental results show that the proposed framework can diagnose simultaneous SOFC system faults with high accuracy while requiring little training data and a low computational burden. In addition, Fault Inference Tree Analysis (FITA) is employed to identify the correlations among possible faults and their corresponding symptoms at the system component level.
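
    The core idea (train on single faults, then recognize their combination) can be sketched with scikit-learn's one-SVM-per-label wrapper. Everything below is illustrative: the two-dimensional "signatures", the inclusion of normal-operation samples, and the linear kernel are assumptions made for a readable toy, not the paper's feature extraction or ML-SVM formulation.

```python
import numpy as np
from sklearn.multioutput import MultiOutputClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Hypothetical 2-D fault signatures: feature 0 responds to fuel leakage,
# feature 1 to air leakage. Training covers normal operation and single faults only.
X_normal = rng.normal([0.0, 0.0], 0.1, size=(50, 2))
X_fuel   = rng.normal([1.0, 0.0], 0.1, size=(50, 2))
X_air    = rng.normal([0.0, 1.0], 0.1, size=(50, 2))
X = np.vstack([X_normal, X_fuel, X_air])
# One binary label per fault class (multi-label encoding)
Y = np.vstack([np.tile([0, 0], (50, 1)),
               np.tile([1, 0], (50, 1)),
               np.tile([0, 1], (50, 1))])

# One SVM per label approximates the ML-SVM classifier described in the abstract
clf = MultiOutputClassifier(SVC(kernel="linear")).fit(X, Y)

# A simultaneous fault exhibits both signatures even though such a pattern
# never appeared in training; both labels should fire.
print(clf.predict([[0.95, 1.05]]))   # expected: [[1 1]]
```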

  16. Elucidate Innovation Performance of Technology-driven Mergers and Acquisitions

    Energy Technology Data Exchange (ETDEWEB)

    Huang, L.; Wang, K.; Yu, H.; Shang, L.; Mitkova, L.

    2016-07-01

    The importance and value of mergers and acquisitions (M&As) have increased with the expectation of obtaining key technology capabilities and a rapid impact on innovation. This article develops an original analytical framework to elucidate the impact of the technology and product relatedness (similarity/complementarity) of technology-driven M&A partners on post-M&A innovation performance. We present results drawing on multiple case studies of Chinese high-tech firms from three industries. (Author)

  17. Performance Evaluation of an Object Management Policy Approach for P2P Networks

    Directory of Open Access Journals (Sweden)

    Dario Vieira

    2012-01-01

    Full Text Available The increasing popularity of network-based multimedia applications poses many challenges for content providers to supply efficient and scalable services. Peer-to-peer (P2P) systems have been shown to be a promising approach to provide large-scale video services over the Internet since, by nature, these systems show high scalability and robustness. In this paper, we propose and analyze an object management policy approach for video web caches in a P2P context, taking advantage of an object's metadata, for example, video popularity, and the object's encoding techniques, for example, scalable video coding (SVC). We carry out trace-driven simulations to evaluate the performance of our approach and compare it against traditional object management policy approaches. In addition, we study the impact of churn on our approach and on other object management policies that implement different caching strategies. A YouTube video collection containing logs of over 1.6 million videos was used in our experimental studies. The experimental results show that our proposed approach can improve the performance of the cache substantially. Moreover, we have found that neither simply enlarging peers' storage capacity nor a zero-replication strategy is an effective action for improving the performance of an object management policy.
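
    As a flavor of what an object management policy looks like in code, here is a small, self-contained Python toy: a cache that evicts the least popular object, with recency breaking ties. It is not the policy evaluated in the paper (which also exploits SVC layer structure); popularity counting and the eviction rule are simplifying assumptions.

```python
from collections import OrderedDict

class PopularityAwareCache:
    """Toy object-management policy: evict the least popular object; among
    equally popular objects, the least recently requested one goes first."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()      # object_id -> request count

    def request(self, obj_id):
        if obj_id in self.store:
            self.store[obj_id] += 1
            self.store.move_to_end(obj_id)   # track recency for tie-breaking
            return True                      # cache hit
        if len(self.store) >= self.capacity:
            victim = min(self.store, key=self.store.get)  # least popular
            del self.store[victim]
        self.store[obj_id] = 1
        return False                         # miss: fetch from peers/origin

# Replaying a (here trivial) request trace yields the hit ratio,
# the metric such trace-driven simulations typically compare.
cache = PopularityAwareCache(capacity=100)
trace = [1, 2, 1, 3, 1, 2]
hits = sum(cache.request(oid) for oid in trace)
print(f"hit ratio: {hits / len(trace):.2f}")
```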

  18. A Control Approach for Performance of Big Data Systems

    OpenAIRE

    Berekmeri , Mihaly; Serrano , Damián; Bouchenak , Sara; Marchand , Nicolas; Robu , Bogdan

    2014-01-01

    We are at the dawn of a huge data explosion; companies therefore have fast-growing amounts of data to process. For this purpose Google developed MapReduce, a parallel programming paradigm which is slowly becoming the de facto tool for Big Data analytics. Although its use is already wide-spread in the industry to some extent, ensuring performance constraints for such a complex system poses great challenges, and its management requires a high level of expertise. This paper...

  19. New data-driven estimation of terrestrial CO2 fluxes in Asia using a standardized database of eddy covariance measurements, remote sensing data, and support vector regression

    Science.gov (United States)

    Ichii, Kazuhito; Ueyama, Masahito; Kondo, Masayuki; Saigusa, Nobuko; Kim, Joon; Alberto, Ma. Carmelita; Ardö, Jonas; Euskirchen, Eugénie S.; Kang, Minseok; Hirano, Takashi; Joiner, Joanna; Kobayashi, Hideki; Marchesini, Luca Belelli; Merbold, Lutz; Miyata, Akira; Saitoh, Taku M.; Takagi, Kentaro; Varlagin, Andrej; Bret-Harte, M. Syndonia; Kitamura, Kenzo; Kosugi, Yoshiko; Kotani, Ayumi; Kumar, Kireet; Li, Sheng-Gong; Machimura, Takashi; Matsuura, Yojiro; Mizoguchi, Yasuko; Ohta, Takeshi; Mukherjee, Sandipan; Yanagi, Yuji; Yasuda, Yukio; Zhang, Yiping; Zhao, Fenghua

    2017-04-01

    The lack of a standardized database of eddy covariance observations has been an obstacle to data-driven estimation of terrestrial CO2 fluxes in Asia. In this study, we developed such a standardized database from 54 sites across various databases by applying consistent postprocessing for data-driven estimation of gross primary productivity (GPP) and net ecosystem CO2 exchange (NEE). Data-driven estimation was conducted using a machine learning algorithm, support vector regression (SVR), with remote sensing data for the 2000 to 2015 period. Site-level evaluation of the estimated CO2 fluxes shows that although performance varies across vegetation and climate classifications, 8-day GPP and NEE are reproduced (e.g., r2 = 0.73 and 0.42, respectively). Evaluation of spatially estimated GPP against Global Ozone Monitoring Experiment 2 sensor-based Sun-induced chlorophyll fluorescence shows that monthly GPP variations at subcontinental scale were reproduced by SVR (r2 = 1.00, 0.94, 0.91, and 0.89 for Siberia, East Asia, South Asia, and Southeast Asia, respectively). Evaluation of spatially estimated NEE against net atmosphere-land CO2 fluxes from the Greenhouse Gases Observing Satellite (GOSAT) Level 4A product shows that monthly variations were consistent in Siberia and East Asia, while inconsistency was found in South Asia and Southeast Asia. Furthermore, differences between the land CO2 fluxes from SVR-NEE and GOSAT Level 4A were partially explained by accounting for the differences in the definition of land CO2 fluxes. These data-driven estimates can provide a new opportunity to assess CO2 fluxes in Asia and to evaluate and constrain terrestrial ecosystem models.
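
    A minimal scikit-learn sketch of the upscaling step is shown below. The predictor names, synthetic data, and hyperparameters are assumptions for illustration; the study's actual feature set, tuning, and cross-validation protocol are more involved.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Hypothetical predictors per 8-day step: remote-sensing covariates (e.g. NDVI,
# LST, shortwave radiation) against tower GPP; names and shapes are illustrative.
rng = np.random.default_rng(0)
X = rng.random((500, 3))                                    # [ndvi, lst, radiation]
gpp = 5 * X[:, 0] + 2 * X[:, 2] + rng.normal(0, 0.3, 500)   # synthetic target

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
scores = cross_val_score(model, X, gpp, cv=5, scoring="r2")
print(f"cross-validated r2: {scores.mean():.2f}")

# Once trained on tower data, the model can be applied wall-to-wall to gridded
# remote-sensing inputs to produce upscaled flux maps.
model.fit(X, gpp)
```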

  20. A Model-Driven Approach for Telecommunications Network Services Definition

    Science.gov (United States)

    Chiprianov, Vanea; Kermarrec, Yvon; Alff, Patrick D.

    The present-day telecommunications market imposes a short concept-to-market time on service providers. To reduce it, we propose a computer-aided, model-driven, service-specific tool, with support for collaborative work and for checking properties on models. We started by defining a prototype of the meta-model (MM) of the service domain. Using this prototype, we defined a simple graphical modeling language specific to service designers. We are currently enlarging the MM of the domain using model transformations from Network Abstraction Layers (NALs). In the future, we will investigate approaches to ensure support for collaborative work and for checking properties on models.

  1. Automated Creation of Datamarts from a Clinical Data Warehouse, Driven by an Active Metadata Repository

    Science.gov (United States)

    Rogerson, Charles L.; Kohlmiller, Paul H.; Stutman, Harris

    1998-01-01

    A methodology and toolkit are described which enable the automated metadata-driven creation of datamarts from clinical data warehouses. The software uses schema-to-schema transformation driven by an active metadata repository. Tools for assessing datamart data quality are described, as well as methods for assessing the feasibility of implementing specific datamarts. A methodology for data remediation and the re-engineering of operational data capture is described.

  2. Comparing the Performance of NoSQL Approaches for Managing Archetype-Based Electronic Health Record Data.

    Directory of Open Access Journals (Sweden)

    Sergio Miranda Freire

    Full Text Available This study provides an experimental performance evaluation on population-based queries of NoSQL databases storing archetype-based Electronic Health Record (EHR) data. There are few published studies regarding the performance of persistence mechanisms for systems that use multilevel modelling approaches, especially when the focus is on population-based queries. A healthcare dataset with 4.2 million records stored in a relational database (MySQL) was used to generate XML and JSON documents based on the openEHR reference model. Six datasets with different sizes were created from these documents and imported into three single machine XML databases (BaseX, eXistdb and Berkeley DB XML) and into a distributed NoSQL database system based on the MapReduce approach, Couchbase, deployed in different cluster configurations of 1, 2, 4, 8 and 12 machines. Population-based queries were submitted to those databases and to the original relational database. Database size and query response times are presented. The XML databases were considerably slower and required much more space than Couchbase. Overall, Couchbase had better response times than MySQL, especially for larger datasets. However, Couchbase requires indexing for each differently formulated query and the indexing time increases with the size of the datasets. The performances of the clusters with 2, 4, 8 and 12 nodes were not better than the single node cluster in relation to the query response time, but the indexing time was reduced proportionally to the number of nodes. The tested XML databases had acceptable performance for openEHR-based data in some querying use cases and small datasets, but were generally much slower than Couchbase. Couchbase also outperformed the response times of the relational database, but required more disk space and had a much longer indexing time. Systems like Couchbase are thus interesting research targets for scalable storage and querying of archetype-based EHR data when

  3. Comparing the Performance of NoSQL Approaches for Managing Archetype-Based Electronic Health Record Data

    Science.gov (United States)

    Freire, Sergio Miranda; Teodoro, Douglas; Wei-Kleiner, Fang; Sundvall, Erik; Karlsson, Daniel; Lambrix, Patrick

    2016-01-01

    This study provides an experimental performance evaluation on population-based queries of NoSQL databases storing archetype-based Electronic Health Record (EHR) data. There are few published studies regarding the performance of persistence mechanisms for systems that use multilevel modelling approaches, especially when the focus is on population-based queries. A healthcare dataset with 4.2 million records stored in a relational database (MySQL) was used to generate XML and JSON documents based on the openEHR reference model. Six datasets with different sizes were created from these documents and imported into three single machine XML databases (BaseX, eXistdb and Berkeley DB XML) and into a distributed NoSQL database system based on the MapReduce approach, Couchbase, deployed in different cluster configurations of 1, 2, 4, 8 and 12 machines. Population-based queries were submitted to those databases and to the original relational database. Database size and query response times are presented. The XML databases were considerably slower and required much more space than Couchbase. Overall, Couchbase had better response times than MySQL, especially for larger datasets. However, Couchbase requires indexing for each differently formulated query and the indexing time increases with the size of the datasets. The performances of the clusters with 2, 4, 8 and 12 nodes were not better than the single node cluster in relation to the query response time, but the indexing time was reduced proportionally to the number of nodes. The tested XML databases had acceptable performance for openEHR-based data in some querying use cases and small datasets, but were generally much slower than Couchbase. Couchbase also outperformed the response times of the relational database, but required more disk space and had a much longer indexing time. Systems like Couchbase are thus interesting research targets for scalable storage and querying of archetype-based EHR data when population-based use

  5. Protein engineering of Bacillus acidopullulyticus pullulanase for enhanced thermostability using in silico data driven rational design methods.

    Science.gov (United States)

    Chen, Ana; Li, Yamei; Nie, Jianqi; McNeil, Brian; Jeffrey, Laura; Yang, Yankun; Bai, Zhonghu

    2015-10-01

    Thermostability is a requirement in the starch processing industry to maintain high catalytic activity of pullulanase at high temperatures. Four data-driven rational design methods (B-FITTER, proline theory, PoPMuSiC-2.1, and the sequence consensus approach) were adopted to identify key residues potentially linked to thermostability, and 39 residues of Bacillus acidopullulyticus pullulanase were chosen as mutagenesis targets. Single mutagenesis followed by combined mutagenesis resulted in the best mutant, E518I-S662R-Q706P, which exhibited an 11-fold half-life improvement at 60 °C and a 9.5 °C increase in Tm. The optimum temperature of the mutant increased from 60 to 65 °C. Fluorescence spectroscopy results demonstrated that the tertiary structure of the mutant enzyme was more compact than that of the wild-type (WT) enzyme. Structural change analysis revealed that the increase in thermostability was most probably caused by a combination of the lower stability free energy and higher hydrophobicity of E518I, the additional hydrogen bonds of S662R, and the higher rigidity of Q706P compared with the WT. The findings demonstrate the effectiveness of combining data-driven rational design approaches in engineering an industrial enzyme for improved thermostability.

  6. System driven technology selection for future European launch systems

    Science.gov (United States)

    Baiocco, P.; Ramusat, G.; Sirbi, A.; Bouilly, Th.; Lavelle, F.; Cardone, T.; Fischer, H.; Appel, S.

    2015-02-01

    In the framework of the next-generation launcher activity at ESA, a top-down approach and a bottom-up approach have been pursued for the identification of promising technologies and alternative conceptions of future European launch vehicles. The top-down approach consists in looking for system-driven design solutions, while the bottom-up approach features design solutions leading to substantial advantages for the system. The main investigations have focused on future launch vehicle technologies. Preliminary specifications have been used to allow sub-system designs to be assessed for their benefit to the overall launch system. Development cost, non-recurring and recurring cost, industrialization and operational aspects have been considered as competitiveness factors for the identification and down-selection of the most interesting technologies. The recurring cost per unit payload mass has been evaluated. The TRL/IRL has been assessed and a preliminary development plan has been traced for the most promising technologies. The potentially applicable launch systems are Ariane and VEGA evolutions. The main FLPP technologies aim at reduced overall structural mass, increased structural margins for robustness, metallic and composite containment of cryogenic hydrogen and oxygen propellants, propellant management subsystems, elements that significantly reduce fabrication and operational costs, avionics, pyrotechnics, etc., in order to derive performing upper and booster stages. Application of the system-driven approach allows the creation of technology demonstrators well matched in terms of need, demonstration objective, size and cost. This paper outlines the process of technology down-selection using a system-driven approach, the accomplishments achieved in the various technology fields so far, and the potential associated benefit in terms of competitiveness factors.

  7. Evaluation of Respondent-Driven Sampling

    Science.gov (United States)

    McCreesh, Nicky; Frost, Simon; Seeley, Janet; Katongole, Joseph; Tarsh, Matilda Ndagire; Ndunguse, Richard; Jichi, Fatima; Lunel, Natasha L; Maher, Dermot; Johnston, Lisa G; Sonnenberg, Pam; Copas, Andrew J; Hayes, Richard J; White, Richard G

    2012-01-01

    Background: Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex-workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total-population data. Methods: Total-population data on age, tribe, religion, socioeconomic status, sexual activity and HIV status were available on a population of 2402 male household-heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, employing current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample). Results: We recruited 927 household-heads. Full and small RDS samples were largely representative of the total population, but both samples under-represented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven-sampling statistical-inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven-sampling bootstrap 95% confidence intervals included the population proportion. Conclusions: Respondent-driven sampling produced a generally representative sample of this well-connected non-hidden population. However, current respondent-driven-sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience

  8. Evaluation of respondent-driven sampling.

    Science.gov (United States)

    McCreesh, Nicky; Frost, Simon D W; Seeley, Janet; Katongole, Joseph; Tarsh, Matilda N; Ndunguse, Richard; Jichi, Fatima; Lunel, Natasha L; Maher, Dermot; Johnston, Lisa G; Sonnenberg, Pam; Copas, Andrew J; Hayes, Richard J; White, Richard G

    2012-01-01

    Respondent-driven sampling is a novel variant of link-tracing sampling for estimating the characteristics of hard-to-reach groups, such as HIV prevalence in sex workers. Despite its use by leading health organizations, the performance of this method in realistic situations is still largely unknown. We evaluated respondent-driven sampling by comparing estimates from a respondent-driven sampling survey with total population data. Total population data on age, tribe, religion, socioeconomic status, sexual activity, and HIV status were available on a population of 2402 male household heads from an open cohort in rural Uganda. A respondent-driven sampling (RDS) survey was carried out in this population, using current methods of sampling (RDS sample) and statistical inference (RDS estimates). Analyses were carried out for the full RDS sample and then repeated for the first 250 recruits (small sample). We recruited 927 household heads. Full and small RDS samples were largely representative of the total population, but both samples underrepresented men who were younger, of higher socioeconomic status, and with unknown sexual activity and HIV status. Respondent-driven sampling statistical inference methods failed to reduce these biases. Only 31%-37% (depending on method and sample size) of RDS estimates were closer to the true population proportions than the RDS sample proportions. Only 50%-74% of respondent-driven sampling bootstrap 95% confidence intervals included the population proportion. Respondent-driven sampling produced a generally representative sample of this well-connected nonhidden population. However, current respondent-driven sampling inference methods failed to reduce bias when it occurred. Whether the data required to remove bias and measure precision can be collected in a respondent-driven sampling survey is unresolved. Respondent-driven sampling should be regarded as a (potentially superior) form of convenience sampling method, and caution is required

  9. DeDaL: Cytoscape 3 app for producing and morphing data-driven and structure-driven network layouts.

    Science.gov (United States)

    Czerwinska, Urszula; Calzone, Laurence; Barillot, Emmanuel; Zinovyev, Andrei

    2015-08-14

    Visualization and analysis of molecular profiling data together with biological networks are able to provide new mechanistic insights into biological functions. Currently, it is possible to visualize high-throughput data on top of pre-defined network layouts, but they are not always adapted to a given data analysis task. A network layout based simultaneously on the network structure and the associated multidimensional data might be advantageous for data visualization and analysis in some cases. We developed a Cytoscape app which allows constructing biological network layouts based on the data from molecular profiles imported as values of node attributes. DeDaL is a Cytoscape 3 app which uses linear and non-linear algorithms of dimension reduction to produce data-driven network layouts based on multidimensional data (typically gene expression). DeDaL implements several data pre-processing and layout post-processing steps, such as continuous morphing between two arbitrary network layouts and aligning one network layout with respect to another one by rotating and mirroring. The combination of all these functionalities facilitates the creation of insightful network layouts representing both structural network features and correlation patterns in multivariate data. We demonstrate the added value of applying DeDaL in several practical applications, including an example of a large protein-protein interaction network. DeDaL is a convenient tool for applying data dimensionality reduction methods and for designing insightful data displays based on data-driven layouts of biological networks, built within the Cytoscape environment. DeDaL is freely available for downloading at http://bioinfo-out.curie.fr/projects/dedal/.
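
    To illustrate the idea of a data-driven layout outside Cytoscape, the Python sketch below places nodes by a PCA projection of their attribute vectors and morphs toward a force-directed layout. DeDaL's actual feature set (e.g., elastic-map nonlinear projections, alignment by rotation/mirroring) is richer, and the random "expression profiles" here are placeholders.

```python
import networkx as nx
import numpy as np
from sklearn.decomposition import PCA

# Toy network with a per-node expression profile (node attribute vector)
G = nx.karate_club_graph()
rng = np.random.default_rng(0)
profiles = {n: rng.normal(size=10) for n in G.nodes}   # hypothetical expression data

# Data-driven layout: project each node's profile to 2-D (linear mode ~ PCA)
X = np.array([profiles[n] for n in G.nodes])
xy = PCA(n_components=2).fit_transform(X)
data_layout = {n: xy[i] for i, n in enumerate(G.nodes)}

# Structure-driven layout: classical force-directed positions
structure_layout = nx.spring_layout(G, seed=0)

# Morphing: linear interpolation between the two layouts (alpha in [0, 1])
def morph(alpha):
    return {n: (1 - alpha) * np.asarray(structure_layout[n]) + alpha * data_layout[n]
            for n in G.nodes}

halfway = morph(0.5)   # intermediate layout between structure and data
```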

  10. Data science approaches to pharmacogenetics.

    Science.gov (United States)

    Penrod, N M; Moore, J H

    2014-01-01

    Pharmacogenetic studies rely on applied statistics to evaluate genetic data describing natural variation in response to pharmacotherapeutics such as drugs and vaccines. In the beginning, these studies were based on candidate gene approaches that specifically focused on efficacy or adverse events correlated with variants of single genes. This hypothesis-driven method required the researcher to have a priori knowledge of which genes or gene sets to investigate. Following rational design, the focus of these studies has been on drug-metabolizing enzymes, drug transporters, and drug targets. As technology has progressed, these studies have transitioned to hypothesis-free explorations where markers across the entire genome can be measured in large-scale, population-based genome-wide association studies (GWAS). This enables the identification of novel genetic biomarkers and therapeutic targets and the analysis of gene-gene interactions, which may reveal molecular mechanisms of drug activities. Ultimately, the challenge is to utilize gene-drug associations to create dosing algorithms based on individual genotypes, which will guide physicians and ensure they prescribe the correct dose of the correct drug the first time, eliminating trial-and-error and adverse events. We review here basic concepts and applications of data science in the genetic analysis of pharmacologic outcomes.

  11. Data-driven haemodynamic response function extraction using Fourier-wavelet regularised deconvolution

    NARCIS (Netherlands)

    Wink, Alle Meije; Hoogduin, Hans; Roerdink, Jos B.T.M.

    2008-01-01

    Background: We present a simple, data-driven method to extract haemodynamic response functions (HRF) from functional magnetic resonance imaging (fMRI) time series, based on the Fourier-wavelet regularised deconvolution (ForWaRD) technique. HRF data are required for many fMRI applications, such as
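
    For readers unfamiliar with regularised deconvolution, the sketch below shows the Fourier half of a ForWaRD-style estimator in NumPy: a Tikhonov-regularised inverse filter recovering an HRF-shaped kernel from a noisy convolution. The stimulus train, HRF shape, and regularisation constant are made-up illustrations, and the subsequent wavelet-domain shrinkage step of ForWaRD is omitted.

```python
import numpy as np

# Recover an HRF-like kernel h from y = s * h + noise, given the stimulus s.
rng = np.random.default_rng(0)
n = 256
s = (rng.random(n) < 0.05).astype(float)             # sparse event train (stimulus)
t = np.arange(32)
h_true = (t / 6.0) ** 2 * np.exp(-t / 6.0)           # toy HRF shape
y = np.convolve(s, h_true)[:n] + rng.normal(0, 0.02, n)

S, Y = np.fft.rfft(s), np.fft.rfft(y)
lam = 1e-2                                           # regularisation strength (assumed)
H = Y * np.conj(S) / (np.abs(S) ** 2 + lam)          # Tikhonov-regularised inverse filter
h_est = np.fft.irfft(H, n)[:32]
# ForWaRD would additionally wavelet-denoise h_est before use.
```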

  12. Data-driven haemodynamic response function extraction using Fourier-wavelet regularised deconvolution

    NARCIS (Netherlands)

    Wink, Alle Meije; Hoogduin, Hans; Roerdink, Jos B.T.M.

    2010-01-01

    Background: We present a simple, data-driven method to extract haemodynamic response functions (HRF) from functional magnetic resonance imaging (fMRI) time series, based on the Fourier-wavelet regularised deconvolution (ForWaRD) technique. HRF data are required for many fMRI applications, such as

  13. Efficient Feature-Driven Visualization of Large-Scale Scientific Data

    Energy Technology Data Exchange (ETDEWEB)

    Lu, Aidong

    2012-12-12

    Very large, complex scientific data acquired in many research areas creates critical challenges for scientists to understand, analyze, and organize their data. The objective of this project is to expand the feature extraction and analysis capabilities to develop powerful and accurate visualization tools that can assist domain scientists with their requirements in multiple phases of scientific discovery. We have recently developed several feature-driven visualization methods for extracting different data characteristics of volumetric datasets. Our results verify the hypothesis in the proposal and will be used to develop additional prototype systems.

  14. A data-driven, mathematical model of mammalian cell cycle regulation.

    Directory of Open Access Journals (Sweden)

    Michael C Weis

    Full Text Available Few of the >150 published cell cycle modeling efforts use significant levels of data for tuning and validation. This reflects the difficulty of generating correlated quantitative data, and it points out a critical uncertainty in modeling efforts. To develop a data-driven model of cell cycle regulation, we used contiguous, dynamic measurements over two time scales (minutes and hours) calculated from static multiparametric cytometry data. The approach provided expression profiles of cyclin A2, cyclin B1, and phospho-S10-histone H3. The model was built by integrating and modifying two previously published models such that the model outputs for cyclins A and B fit the cyclin expression measurements and the activation of B cyclin/Cdk1 coincided with phosphorylation of histone H3. The model depends on Cdh1-regulated cyclin degradation during G1, regulation of B cyclin/Cdk1 activity by cyclin A/Cdk via Wee1, and transcriptional control of the mitotic cyclins that reflects some of the current literature. We introduced autocatalytic transcription of E2F, E2F-regulated transcription of cyclin B, Cdc20/Cdh1-mediated E2F degradation, enhanced transcription of mitotic cyclins during late S/early G2 phase, and sustained synthesis of cyclin B during mitosis. These features produced a model with good correlation between state variable output and real measurements. Since the method of data generation is extensible, this model can be continually modified based on new correlated, quantitative data.

  15. A data-driven wavelet-based approach for generating jumping loads

    Science.gov (United States)

    Chen, Jun; Li, Guo; Racic, Vitomir

    2018-06-01

    This paper suggests an approach to generating human jumping loads using the wavelet transform and a database of individual jumping force records. A total of 970 individual jumping force records at various frequencies were first collected in three experiments from 147 test subjects. For each record, every jumping pulse was extracted and decomposed into seven levels by the wavelet transform, and all the decomposition coefficients were stored in an information database. Probability distributions of the jumping cycle period, the contact ratio and the energy of the jumping pulse were statistically analyzed. Inspired by the theory of DNA recombination, an approach was developed that interchanges wavelet coefficients between different jumping pulses. To generate a jumping force time history with N pulses, wavelet coefficients are first selected randomly from the database at each level and used to reconstruct N pulses by the inverse wavelet transform. Jumping cycle periods and contact ratios are then generated randomly from their probability functions. These parameters are assigned to each of the N pulses, which are in turn scaled by amplitude factors βi to account for the energy relationship between successive pulses. The final jumping force time history is obtained by linking all N cycles end to end. This simulation approach preserves the non-stationary features of the jumping force in the time-frequency domain. Applications indicate that the approach can generate jumping force time histories for individual jumpers and can be extended to stochastic jumping loads due to groups and crowds.
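
    The recombination step can be prototyped with PyWavelets. In the sketch below, the pulse database, the db4 wavelet, the pulse length, and the amplitude-factor distribution are all assumptions for illustration; the paper additionally resamples each synthesized pulse to a randomly drawn cycle period and contact ratio, which is omitted here.

```python
import numpy as np
import pywt

rng = np.random.default_rng(0)

# Stand-in database: jumping pulses resampled to a common length (real records
# would come from the force-plate experiments).
t = np.linspace(0, np.pi, 1024)
pulses = [np.clip(np.sin(t) + rng.normal(0, 0.05, t.size), 0, None)
          for _ in range(970)]

# Seven-level wavelet decomposition of every pulse (db4 chosen arbitrarily)
bank = [pywt.wavedec(p, "db4", level=7) for p in pulses]

def synthesize_pulse():
    """'DNA recombination': pick each level's coefficients from a random pulse."""
    coeffs = [bank[rng.integers(len(bank))][lvl] for lvl in range(8)]
    return pywt.waverec(coeffs, "db4")

def jumping_history(n_pulses=8):
    """Chain N synthesized pulses end to end with amplitude factors beta_i."""
    cycles = [rng.normal(1.0, 0.1) * synthesize_pulse() for _ in range(n_pulses)]
    return np.concatenate(cycles)

force = jumping_history()
```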

  16. SIDEKICK: Genomic data driven analysis and decision-making framework

    Directory of Open Access Journals (Sweden)

    Yoon Kihoon

    2010-12-01

    Full Text Available Background: Scientists striving to unlock mysteries within complex biological systems face myriad barriers in effectively integrating available information to enhance their understanding. While experimental techniques and available data sources are rapidly evolving, useful information is dispersed across a variety of sources, and sources of the same information often do not use the same format or nomenclature. To harness these expanding resources, scientists need tools that bridge nomenclature differences and allow them to integrate, organize, and evaluate the quality of information without extensive computation. Results: Sidekick, a genomic data driven analysis and decision making framework, is a web-based tool that provides a user-friendly intuitive solution to the problem of information inaccessibility. Sidekick enables scientists without training in computation and data management to pursue answers to research questions like "What are the mechanisms for disease X" or "Does the set of genes associated with disease X also influence other diseases." Sidekick enables the process of combining heterogeneous data, finding and maintaining the most up-to-date data, evaluating data sources, quantifying confidence in results based on evidence, and managing the multi-step research tasks needed to answer these questions. We demonstrate Sidekick's effectiveness by showing how to accomplish a complex published analysis in a fraction of the original time with no computational effort using Sidekick. Conclusions: Sidekick is an easy-to-use web-based tool that organizes and facilitates complex genomic research, allowing scientists to explore genomic relationships and formulate hypotheses without computational effort. Possible analysis steps include gene list discovery, gene-pair list discovery, various enrichments for both types of lists, and convenient list manipulation. Further, Sidekick's ability to characterize pairs of genes offers new ways to

  17. Closed-loop suppression of chaos in nonlinear driven oscillators

    Science.gov (United States)

    Aguirre, L. A.; Billings, S. A.

    1995-05-01

    This paper discusses the suppression of chaos in nonlinear driven oscillators via the addition of a periodic perturbation. Given a system originally undergoing chaotic motions, it is desired that such a system be driven to some periodic orbit. This can be achieved by the addition of a weak periodic signal to the oscillator input. This is usually accomplished in open loop, but this procedure presents some difficulties which are discussed in the paper. To ensure that this is attained despite uncertainties and possible disturbances on the system, a procedure is suggested to perform control in closed loop. In addition, it is illustrated how a model, estimated from input/output data, can be used in the design. Numerical examples which use the Duffing-Ueda and modified van der Pol oscillators are included to illustrate some of the properties of the new approach.
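
    As a numerical illustration of the open-loop idea (the paper's closed-loop scheme additionally feeds back measurements, e.g. through a model identified from input/output data), the sketch below integrates a Duffing-Ueda oscillator with and without a weak added periodic signal. The parameter values and the perturbation amplitude and frequency are illustrative guesses, not taken from the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Duffing-Ueda oscillator: x'' + k*x' + x**3 = A*cos(t) + u(t), where u(t) is
# the weak periodic signal added to the drive to suppress chaos (open-loop form).
k, A = 0.1, 11.0           # a parameter region with chaotic motion (illustrative)
eps, omega_c = 0.5, 2.0    # perturbation amplitude and frequency (assumed)

def rhs(t, s, control):
    x, v = s
    u = eps * np.cos(omega_c * t) if control else 0.0
    return [v, -k * v - x**3 + A * np.cos(t) + u]

free = solve_ivp(rhs, (0, 400.0), [0.1, 0.0], args=(False,),
                 dense_output=True, rtol=1e-8)
ctrl = solve_ivp(rhs, (0, 400.0), [0.1, 0.0], args=(True,),
                 dense_output=True, rtol=1e-8)

# Stroboscopic section (one sample per drive period, transient skipped): a tight
# cluster of points for 'ctrl' versus a scattered cloud for 'free' indicates the
# perturbation has entrained the motion onto a periodic orbit.
strobe_t = np.arange(100, 400, 2 * np.pi)
section_free = free.sol(strobe_t)[0]
section_ctrl = ctrl.sol(strobe_t)[0]
```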

  18. Re-assessing Present Day Global Mass Transport and Glacial Isostatic Adjustment From a Data Driven Approach

    Science.gov (United States)

    Wu, X.; Jiang, Y.; Simonsen, S.; van den Broeke, M. R.; Ligtenberg, S.; Kuipers Munneke, P.; van der Wal, W.; Vermeersen, B. L. A.

    2017-12-01

    Determining present-day mass transport (PDMT) is complicated by the fact that most observations contain signals from both present-day ice melting and glacial isostatic adjustment (GIA). Despite decades of progress in geodynamic modeling and new observations, significant uncertainties remain in both. The key to separating present-day ice mass change from GIA signals is to include data with different physical characteristics. We designed an approach that separates PDMT and GIA signatures by estimating them simultaneously using globally distributed interdisciplinary data with distinct physical information and a dynamically constructed a priori GIA model. We conducted a high-resolution global reappraisal of present-day ice mass balance, with focus on Earth's polar regions and their contribution to global sea-level rise, using a combination of ICESat altimetry, GRACE gravity, surface geodetic velocity data, and an ocean bottom pressure model. Adding ice altimetry supplies the critically needed second data type over the interiors of ice-covered regions, enhancing the separation of PDMT and GIA signatures and yielding accuracies expected to be half an order of magnitude higher for GIA and, consequently, for ice mass balance estimates. The global-data-based approach can adequately address PDMT- and GIA-induced geocenter motion and long-wavelength signatures important for large areas such as Antarctica and for global mean sea level. In conjunction with the dense altimetry data, we solved for PDMT coefficients up to degree and order 180 by using a higher-resolution GRACE data set and a high-resolution a priori PDMT model that includes detailed geographic boundaries. The high-resolution approach solves the problem of multiple resolutions in the various data types, greatly reduces aliasing errors from low-degree truncation, and at the same time enhances the separation of signatures from adjacent regions such as Greenland and the Canadian Arctic territories.

  19. "Just Say It Like It Is!" Use of a Community-Based Participatory Approach to Develop a Technology-Driven Food Literacy Program for Adolescents.

    Science.gov (United States)

    Wickham, Catherine A; Carbone, Elena T

    2018-01-01

    FuelUp&Go! is a technology-driven food literacy program consisting of six in-person skill-building sessions as well as fitness trackers, text messages, and a companion website. A community-based participatory research approach was used with adolescents, who were recruited to participate in a Kid Council. Qualitative data were collected about the use of surveys, program activities, recipes, technology and text messages, and music and incentives. Changes suggested by Kid Council members informed the design and development of a pilot program. Participants were recruited for the pilot program and completed pre- and post-intervention surveys. The results indicated that food-related knowledge remained low overall but increased from baseline to follow-up. Attitudes toward vegetables and physical activity improved slightly. Self-reported participation in physical activity and consumption of sugar-added beverages moved in positive directions. These findings suggest that a community-based participatory research approach is effective for engaging adolescents in the development of a technology-driven food literacy program.

  20. Data Driven Modelling of the Dynamic Wake Between Two Wind Turbines

    DEFF Research Database (Denmark)

    Knudsen, Torben; Bak, Thomas

    2012-01-01

    Wind turbines in a wind farm influence each other through the wind flow. Downwind turbines are in the wake of upwind turbines, and the wind speed experienced at downwind turbines is hence a function of the wind speeds at upwind turbines, but also of the momentum extracted from the wind by the upwind turbine. This paper establishes flow models relating the wind speeds at turbines in a farm. So far, research in this area has been based mainly on first-principles static models, and the data-driven modelling done has not included the loading of the upwind turbine and its impact on the wind speed downwind. This paper is the first where modern commercial megawatt turbines are used for data-driven modelling that includes the upwind turbine loading, by changing the power reference. Obtaining the necessary data is difficult and data is therefore limited. A simple dynamic extension to the Jensen wake model is tested...

  1. User-driven sampling strategies in image exploitation

    Science.gov (United States)

    Harvey, Neal; Porter, Reid

    2013-12-01

    Visual analytics and interactive machine learning both try to leverage the complementary strengths of humans and machines to solve complex data exploitation tasks. These fields overlap most significantly when training is involved: the visualization or machine learning tool improves over time by exploiting observations of the human-computer interaction. This paper focuses on one aspect of the human-computer interaction that we call user-driven sampling strategies. Unlike relevance feedback and active learning sampling strategies, where the computer selects which data to label at each iteration, we investigate situations where the user selects which data is to be labeled at each iteration. User-driven sampling strategies can emerge in many visual analytics applications but they have not been fully developed in machine learning. User-driven sampling strategies suggest new theoretical and practical research questions for both visualization science and machine learning. In this paper we identify and quantify the potential benefits of these strategies in a practical image analysis application. We find user-driven sampling strategies can sometimes provide significant performance gains by steering tools towards local minima that have lower error than tools trained with all of the data. In preliminary experiments we find these performance gains are particularly pronounced when the user is experienced with the tool and application domain.

  2. Simulation on the Performance of a Driven Fan Made by Polyester/Epoxy interpenetrate polymer network (IPN)

    Science.gov (United States)

    Fahrul Hassan, Mohd; Jamri, Azmil; Nawawi, Azli; Zaini Yunos, Muhamad; Fauzi Ahmad, Md; Adzila, Sharifah; Nasrull Abdol Rahman, Mohd

    2017-08-01

    The main purpose of this study is to investigate the performance of a driven fan design made of a polyester/epoxy interpenetrating polymer network (IPN) material, specifically for use in a turbocharger compressor. Polyester/epoxy IPNs are polymer plastics that have been used as replacements for traditional polymers in a wide variety of applications because of their versatile conformations. Simulations of several parameters (air pressure, air velocity and air temperature) were carried out for driven fan designs in two different materials, an aluminum alloy (the existing driven fan design) and the polyester/epoxy IPN, using the SolidWorks Flow Simulation software. Results from both simulations were analyzed and compared: the two materials show similar performance in terms of air pressure and air velocity, owing to the identical geometry and dimensions, but the polyester/epoxy IPN produces a lower air temperature than the aluminum alloy. This study provides a preliminary result on the potential of polyester/epoxy IPN as a driven fan material. In the future, further studies will be conducted on detailed simulation and experimental analysis.

  3. A multi-source satellite data approach for modelling Lake Turkana water level: calibration and validation using satellite altimetry data

    Directory of Open Access Journals (Sweden)

    N. M. Velpuri

    2012-01-01

    Full Text Available Lake Turkana is one of the largest desert lakes in the world and is characterized by high degrees of inter- and intra-annual fluctuation. The hydrology and water balance of this lake have not been well understood due to its remote location and the unavailability of reliable ground truth datasets. Managing surface water resources is a great challenge in areas where in-situ data are either limited or unavailable. In this study, multi-source satellite-driven data such as satellite-based rainfall estimates, modelled runoff, evapotranspiration, and a digital elevation dataset were used to model Lake Turkana water levels from 1998 to 2009. Due to the unavailability of reliable lake level data, an approach is presented to calibrate and validate the water balance model of Lake Turkana using a composite lake level product of TOPEX/Poseidon, Jason-1, and ENVISAT satellite altimetry data. Model validation results showed that the satellite-driven water balance model can satisfactorily capture the patterns and seasonal variations of Lake Turkana's water level fluctuations, with a Pearson correlation coefficient of 0.90 and a Nash-Sutcliffe coefficient of efficiency (NSCE) of 0.80 during the validation period (2004-2009). Model error estimates were within 10% of the natural variability of the lake. Our analysis indicated that fluctuations in Lake Turkana water levels are mainly driven by lake inflows and over-the-lake evaporation; over-the-lake rainfall contributes only up to 30% of the lake's evaporative demand. During the modelling period, Lake Turkana showed seasonal variations of 1-2 m, and the lake level fluctuated by up to 4 m between 1998 and 2009. This study demonstrated the usefulness of satellite altimetry data for calibrating and validating the satellite-driven hydrological model for Lake Turkana without using any in-situ data. Furthermore, for Lake Turkana, we identified and outlined opportunities and challenges of using a calibrated
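
    The bookkeeping behind such a water balance model is simple to sketch. In the Python toy below, all forcings are synthetic stand-ins for the satellite-driven inputs, the lake area is held constant (in reality it varies with level via the elevation dataset), and the units, magnitudes, and initial level are illustrative only.

```python
import numpy as np

# Monthly water-balance bookkeeping for a closed-basin lake (illustrative units):
# modelled inflow Q (m^3/month), over-the-lake rainfall P (m) and evaporation E (m).
months = 144                        # 1998-2009
rng = np.random.default_rng(0)
Q = rng.gamma(2.0, 6e8, months)     # runoff into the lake (synthetic)
P = rng.gamma(2.0, 0.01, months)    # over-the-lake rainfall (synthetic)
E = np.full(months, 0.19)           # strong, nearly constant evaporative demand

area = 6.75e9                       # lake surface area in m^2 (held constant here)
level = np.empty(months)
h = 360.0                           # starting lake level, m a.s.l. (illustrative)
for t in range(months):
    h += Q[t] / area + P[t] - E[t]  # storage change expressed as a level change
    level[t] = h
# 'level' would then be compared against the TOPEX/Jason/ENVISAT altimetry
# composite to calibrate model parameters (e.g., a scaling factor on Q).
```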

  4. Task-Driven Optimization of Fluence Field and Regularization for Model-Based Iterative Reconstruction in Computed Tomography.

    Science.gov (United States)

    Gang, Grace J; Siewerdsen, Jeffrey H; Stayman, J Webster

    2017-12-01

    This paper presents a joint optimization of dynamic fluence field modulation (FFM) and regularization in quadratic penalized-likelihood reconstruction that maximizes a task-based imaging performance metric. We adopted a task-driven imaging framework for prospective design of the imaging parameters. A maxi-min objective function was adopted to maximize the minimum detectability index (d′) throughout the image. The optimization algorithm alternates between FFM (represented by low-dimensional basis functions) and local regularization (including the regularization strength and directional penalty weights). The task-driven approach was compared with three FFM strategies commonly proposed for FBP reconstruction (as well as a task-driven TCM strategy) for a discrimination task in an abdomen phantom. The task-driven FFM assigned more fluence to less attenuating anteroposterior views and yielded approximately constant fluence behind the object. The optimal regularization was almost uniform throughout the image. Furthermore, the task-driven FFM strategy redistributes fluence across detector elements in order to prescribe more fluence to the more attenuating central region of the phantom. Compared with all other strategies, the task-driven FFM strategy not only improved the minimum d′ by at least 17.8%, but also yielded higher d′ over a large area inside the object. The optimal FFM was highly dependent on the amount of regularization, indicating the importance of a joint optimization. Sample reconstructions of simulated data generally support the performance estimates based on computed d′. The improvements in detectability show the potential of the task-driven imaging framework to improve imaging performance at a fixed dose, or, equivalently, to provide a similar level of performance at reduced dose.

  5. On the data-driven inference of modulatory networks in climate science: an application to West African rainfall

    Science.gov (United States)

    González, D. L., II; Angus, M. P.; Tetteh, I. K.; Bello, G. A.; Padmanabhan, K.; Pendse, S. V.; Srinivas, S.; Yu, J.; Semazzi, F.; Kumar, V.; Samatova, N. F.

    2015-01-01

    Decades of hypothesis-driven and/or first-principles research have been applied towards the discovery and explanation of the mechanisms that drive climate phenomena, such as western African Sahel summer rainfall variability. Although connections between various climate factors have been theorized, not all of the key relationships are fully understood. We propose a data-driven approach to identify candidate players in this climate system, which can help explain underlying mechanisms and/or even suggest new relationships, in order to facilitate building a more comprehensive and predictive model of the modulatory relationships influencing a climate phenomenon of interest. We applied coupled heterogeneous association rule mining (CHARM), Lasso multivariate regression, and dynamic Bayesian networks to find relationships within this complex system, and explored means of obtaining a consensus result from the application of such varied methodologies. Using this fusion of approaches, we identified relationships among climate factors that modulate Sahel rainfall. These relationships fall into two categories: well-known associations from prior climate knowledge, such as the relationship with the El Niño-Southern Oscillation (ENSO), and putative links, such as the North Atlantic Oscillation, that invite further research.
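
    As a hedged illustration of one leg of this consensus approach, the sketch below applies scikit-learn's Lasso to select candidate modulators of a rainfall index from a matrix of climate indices; the synthetic data and index names are placeholders rather than the study's actual inputs:

    ```python
    import numpy as np
    from sklearn.linear_model import Lasso
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(0)
    n_years, indices = 60, ["ENSO", "NAO", "AMO", "IOD", "QBO"]
    X = rng.standard_normal((n_years, len(indices)))   # annual climate indices
    # Synthetic rainfall index driven by the first two factors plus noise
    rainfall = 0.8 * X[:, 0] - 0.4 * X[:, 1] + 0.3 * rng.standard_normal(n_years)

    X_std = StandardScaler().fit_transform(X)
    model = Lasso(alpha=0.1).fit(X_std, rainfall)

    # Non-zero coefficients flag candidate modulatory relationships
    for name, coef in zip(indices, model.coef_):
        if abs(coef) > 1e-6:
            print(f"{name}: {coef:+.2f}")
    ```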

  6. A Disciplined Architectural Approach to Scaling Data Analysis for Massive, Scientific Data

    Science.gov (United States)

    Crichton, D. J.; Braverman, A. J.; Cinquini, L.; Turmon, M.; Lee, H.; Law, E.

    2014-12-01

    Data collections across remote sensing and ground-based instruments in astronomy, Earth science, and planetary science are outpacing scientists' ability to analyze them. Furthermore, the distribution, structure, and heterogeneity of the measurements themselves pose challenges that limit the scalability of data analysis using traditional approaches. Methods for developing science data processing pipelines, distributing scientific datasets, and performing analysis will require innovative approaches that integrate cyber-infrastructure, algorithms, and data into more systematic approaches that can more efficiently compute and reduce data, particularly distributed data. This requires the integration of computer science, machine learning, statistics and domain expertise to identify scalable architectures for data analysis. The size of data returned from Earth science observing satellites and the magnitude of data from climate model output are predicted to grow into the tens of petabytes, challenging current data analysis paradigms. The same kind of growth is present in astronomy and planetary science data. One of the major challenges in data science and related disciplines is defining new approaches to scaling systems and analysis in order to increase scientific productivity and yield. Specific needs include: 1) identification of optimized system architectures for analyzing massive, distributed data sets; 2) algorithms for systematic analysis of massive data sets in distributed environments; and 3) the development of software infrastructures that are capable of performing massive, distributed data analysis across a comprehensive data science framework. NASA/JPL has begun an initiative in data science to address these challenges. Our goal is to evaluate how scientific productivity can be improved through optimized architectural topologies that identify how to deploy and manage the access, distribution, computation, and reduction of massive, distributed data, while

  7. Conducting requirements analyses for research using routinely collected health data: a model driven approach.

    Science.gov (United States)

    de Lusignan, Simon; Cashman, Josephine; Poh, Norman; Michalakidis, Georgios; Mason, Aaron; Desombre, Terry; Krause, Paul

    2012-01-01

    Medical research increasingly requires the linkage of data from different sources. Conducting a requirements analysis for a new application is an established part of software engineering, but is rarely reported in the biomedical literature; and no generic approaches have been published as to how to link heterogeneous health data. We conducted a literature review, followed by a consensus process, to define how requirements for research using multiple data sources might be modeled. We have developed a requirements analysis approach: i-ScheDULEs. The first components of the modeling process are indexing and creating a rich picture of the research study. Secondly, we developed a series of reference models of progressive complexity: data flow diagrams (DFD) to define data requirements; unified modeling language (UML) use case diagrams to capture study-specific and governance requirements; and finally, business process models, using business process modeling notation (BPMN). These requirements and their associated models should become part of research study protocols.

  8. Assumption-versus data-based approaches to summarizing species' ranges.

    Science.gov (United States)

    Peterson, A Townsend; Navarro-Sigüenza, Adolfo G; Gordillo, Alejandro

    2018-06-01

    For conservation decision making, species' geographic distributions are mapped using various approaches. Some such efforts have downscaled versions of coarse-resolution extent-of-occurrence maps to fine resolutions for conservation planning. We examined the quality of the extent-of-occurrence maps as range summaries and the utility of refining those maps into fine-resolution distributional hypotheses. Extent-of-occurrence maps tend to be overly simple, omit many known and well-documented populations, and likely frequently include many areas not holding populations. Refinement steps involve typological assumptions about habitat preferences and elevational ranges of species, which can introduce substantial error in estimates of species' true areas of distribution. However, no model-evaluation steps are taken to assess the predictive ability of these models, so model inaccuracies are not noticed. Whereas range summaries derived by these methods may be useful in coarse-grained, global-extent studies, their continued use in on-the-ground conservation applications at fine spatial resolutions is not advisable in light of reliance on assumptions, lack of real spatial resolution, and lack of testing. In contrast, data-driven techniques that integrate primary data on biodiversity occurrence with remotely sensed data that summarize environmental dimensions (i.e., ecological niche modeling or species distribution modeling) offer data-driven solutions based on a minimum of assumptions that can be evaluated and validated quantitatively to offer a well-founded, widely accepted method for summarizing species' distributional patterns for conservation applications. © 2016 Society for Conservation Biology.

  9. A Data-Driven Air Transportation Delay Propagation Model Using Epidemic Process Models

    Directory of Open Access Journals (Sweden)

    B. Baspinar

    2016-01-01

    Full Text Available In air transport network management, in addition to defining the performance behavior of the system's components, identification of their interaction dynamics is a delicate issue in both strategic and tactical decision-making processes, so as to decide which elements of the system are “controlled” and how. This paper introduces a novel delay propagation model utilizing an epidemic spreading process, which enables the definition of novel performance indicators and interaction rates for the elements of the air transportation network. In order to understand the behavior of delay propagation over the network at different levels, we constructed two different data-driven epidemic models approximating the dynamics of the system: (a) a flight-based epidemic model and (b) an airport-based epidemic model. The flight-based epidemic model, utilizing the SIS epidemic model, focuses on individual flights, where each flight can be in a susceptible or infected state. The airport-centric epidemic model, in addition to the flight-to-flight interactions, allows us to define the collective behavior of the airports, which are modeled as metapopulations. In the network model construction, we utilized historical flight-track data for Europe and performed analyses for certain days involving disturbances. Through this effort, we validated the proposed delay propagation models under disruptive events.
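
    A minimal sketch of the flight-based idea, assuming a toy interaction graph and illustrative infection/recovery rates (not the paper's calibrated values): each flight is susceptible (0) or delayed/infected (1), and delay spreads along graph edges in SIS fashion:

    ```python
    import numpy as np

    def sis_delay_step(state, adj, beta, mu, rng):
        """One SIS update: delayed flights (1) infect connected flights with rate
        beta and recover (absorb their delay) with rate mu."""
        infected = state == 1
        pressure = adj @ infected                      # number of delayed neighbours
        p_infect = 1.0 - (1.0 - beta) ** pressure      # at least one transmission succeeds
        new_state = state.copy()
        new_state[(state == 0) & (rng.random(len(state)) < p_infect)] = 1
        new_state[infected & (rng.random(len(state)) < mu)] = 0
        return new_state

    rng = np.random.default_rng(1)
    n = 200
    adj = (rng.random((n, n)) < 0.03).astype(int)      # toy flight-to-flight interaction graph
    np.fill_diagonal(adj, 0)
    state = np.zeros(n, dtype=int)
    state[rng.choice(n, 5, replace=False)] = 1         # initially disrupted flights

    for t in range(24):
        state = sis_delay_step(state, adj, beta=0.05, mu=0.3, rng=rng)
    print("delayed flights after 24 steps:", state.sum())
    ```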

  10. Value-Driven Population Health: An Emerging Focus for Improving Stakeholder Role Performance.

    Science.gov (United States)

    Allen, Harris; Burton, Wayne N; Fabius, Raymond

    2017-12-01

    Health and health care in the United States are being jeopardized by top-end spending whose share of the gross domestic product continues to increase even as aggregate health outcomes remain mediocre. This paper focuses on a new approach for improving stakeholder role performance in the marketplace, value-driven population health (VDPH℠). Devoted to maximizing the value of every dollar spent on population health, VDPH holds much promise for ameliorating this dilemma and exerting a constructive influence on the reshaping of the Affordable Care Act. This paper introduces VDPH and differentiates the science underlying it from the management that serves to make good on its potential. To highlight what VDPH brings to the table, comparisons are made with 3 like-minded approaches to health reform. Next, 2 areas are highlighted, workplace wellness and the quality and cost of health care, where, without necessarily being recognized as such, VDPH has gained real traction among 2 groups: leading employers and, more recently, leading providers. Key findings with respect to workplace wellness are assessed in terms of psychometric performance to evaluate workplace wellness and to point out how VDPH can help direct future employer initiatives toward firmer scientific footing. Then, insights gleaned from the employer experience are applied to illustrate how VDPH can help guide future provider efforts to build on the model developed. This paper concludes with a framework for the use of VDPH by each of 5 stakeholder groups. The discussion centers on how VDPH transcends and differentiates these groups. Implications for health reform in the recently altered political landscape are explored.

  11. Data-Driven Healthcare: Challenges and Opportunities for Interactive Visualization.

    Science.gov (United States)

    Gotz, David; Borland, David

    2016-01-01

    The healthcare industry's widespread digitization efforts are reshaping one of the largest sectors of the world's economy. This transformation is enabling systems that promise to use ever-improving data-driven evidence to help doctors make more precise diagnoses, institutions identify at-risk patients for intervention, clinicians develop more personalized treatment plans, and researchers better understand medical outcomes within complex patient populations. Given the scale and complexity of the data required to achieve these goals, advanced data visualization tools have the potential to play a critical role. This article reviews a number of visualization challenges unique to the healthcare discipline.

  12. A data driven approach for automating vehicle activated signs

    OpenAIRE

    Jomaa, Diala

    2016-01-01

    Vehicle activated signs (VAS) display a warning message when drivers exceed a particular threshold. VAS are often installed on local roads to display a warning message depending on the speed of the approaching vehicles. VAS are usually powered by electricity; however, battery and solar powered VAS are also commonplace. This thesis investigated the development of an automatic trigger speed for vehicle activated signs in order to influence driver behaviour, the effect of which has been measured in ...

  13. Analysis on the heating performance of a gas engine driven air to water heat pump based on a steady-state model

    International Nuclear Information System (INIS)

    Zhang, R.R.; Lu, X.S.; Li, S.Z.; Lin, W.S.; Gu, A.Z.

    2005-01-01

    In this study, the heating performance of a gas engine driven air-to-water heat pump was analyzed using a steady-state model. The thermodynamic model of the natural gas engine is identified from experimental data, and the compressor model is built from several empirical equations. The heat exchanger models are developed from heat balance theory. The system model is validated by comparing experimental and simulation data, which show good agreement. To understand the heating characteristics in detail, the performance of the system is analyzed over a wide range of operating conditions, and the effect of engine waste heat on the heating performance is discussed in particular. The results show that engine waste heat can provide about 1/3 of the total heating capacity in this gas engine driven air-to-water heat pump. The performance of the engine, heat pump and integral system is analyzed under variations of engine speed and ambient temperature. It is shown that engine speed has remarkable effects on both the engine and the heat pump, but ambient temperature has little influence on the engine's performance. The system and component performance in variable-speed operating conditions is also discussed at the end of the paper
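
    A toy steady-state energy balance, with assumed engine efficiency, waste heat recovery fraction, and heat pump COP (all illustrative, not the paper's identified values), shows how recovered engine heat can supply roughly one third of the total heating capacity:

    ```python
    # Toy steady-state energy balance for a gas engine driven heat pump (all kW).
    fuel_input = 100.0                         # fuel energy supplied to the engine
    eta_shaft = 0.30                           # engine shaft efficiency (assumed)
    waste_heat_recovered = 0.45 * fuel_input   # recoverable jacket + exhaust heat (assumed)

    shaft_power = eta_shaft * fuel_input
    cop_hp = 3.5                               # heat pump heating COP at given speed/ambient (assumed)
    q_condenser = cop_hp * shaft_power         # condenser heat delivered by the heat pump
    q_total = q_condenser + waste_heat_recovered

    print(f"waste heat share of total heating: {waste_heat_recovered / q_total:.0%}")
    print(f"primary energy ratio: {q_total / fuel_input:.2f}")
    ```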

  14. Automated quality control methods for sensor data: a novel observatory approach

    Directory of Open Access Journals (Sweden)

    J. R. Taylor

    2013-07-01

    Full Text Available National and international networks and observatories of terrestrial-based sensors are emerging rapidly. As such, there is demand for a standardized approach to data quality control, as well as interoperability of data among sensor networks. The National Ecological Observatory Network (NEON) has begun constructing its first terrestrial observing sites, with 60 locations expected to be distributed across the US by 2017. This will result in over 14 000 automated sensors recording more than 100 TB of data per year. These data are then used to create other datasets and subsequent "higher-level" data products. In anticipation of this challenge, an overall data quality assurance plan has been developed and the first suite of data quality control measures defined. This data-driven approach focuses on automated methods for defining a suite of plausibility test parameter thresholds. Specifically, these plausibility tests scrutinize the data range and variance of each measurement type by employing a suite of binary checks. The statistical basis for each of these tests is developed, and the methods for calculating test parameter thresholds are explored here. While these tests have been used elsewhere, we apply them in a novel approach by calculating their relevant test parameter thresholds. Finally, implementing automated quality control is demonstrated with preliminary data from a NEON prototype site.
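
    A minimal sketch of such data-driven plausibility testing, with hypothetical function names and an assumed three-sigma calibration rule: thresholds for a range check and a variance check are derived from historical data and applied as binary flags:

    ```python
    import numpy as np

    def calibrate_thresholds(history, k=3.0):
        """Derive plausibility-test parameters from historical observations:
        a data range check and a variance (sigma) check."""
        mean, sd = np.mean(history), np.std(history)
        return {"lo": mean - k * sd, "hi": mean + k * sd, "max_sd": k * sd}

    def plausibility_flags(window, thr):
        """Binary checks: 1 = plausible, 0 = flagged."""
        range_ok = int(thr["lo"] <= window[-1] <= thr["hi"])
        variance_ok = int(np.std(window) <= thr["max_sd"])
        return {"range": range_ok, "variance": variance_ok}

    # Hypothetical historical record, e.g. air temperature (degrees C)
    history = np.random.default_rng(2).normal(15.0, 2.0, 10_000)
    thr = calibrate_thresholds(history)
    print(plausibility_flags(np.array([15.2, 14.8, 35.0]), thr))  # spike trips both tests
    ```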

  15. Rainy Day: A Remote Sensing-Driven Extreme Rainfall Simulation Approach for Hazard Assessment

    Science.gov (United States)

    Wright, Daniel; Yatheendradas, Soni; Peters-Lidard, Christa; Kirschbaum, Dalia; Ayalew, Tibebu; Mantilla, Ricardo; Krajewski, Witold

    2015-04-01

    Progress on the assessment of rainfall-driven hazards such as floods and landslides has been hampered by the challenge of characterizing the frequency, intensity, and structure of extreme rainfall at the watershed or hillslope scale. Conventional approaches rely on simplifying assumptions and are strongly dependent on the location, the availability of long-term rain gage measurements, and the subjectivity of the analyst. Regional and global-scale rainfall remote sensing products provide an alternative, but are limited by relatively short (~15-year) observational records. To overcome this, we have coupled these remote sensing products with a space-time resampling framework known as stochastic storm transposition (SST). SST "lengthens" the rainfall record by resampling from a catalog of observed storms from a user-defined region, effectively recreating the regional extreme rainfall hydroclimate. This coupling has been codified in Rainy Day, a Python-based platform for quickly generating large numbers of probabilistic extreme rainfall "scenarios" at any point on the globe. Rainy Day is readily compatible with any gridded rainfall dataset. The user can optionally incorporate regional rain gage or weather radar measurements for bias correction using the Precipitation Uncertainties for Satellite Hydrology (PUSH) framework. Results from Rainy Day using the CMORPH satellite precipitation product are compared with local observations in two examples. The first example is peak discharge estimation in a medium-sized (~4000 square km) watershed in the central United States performed using CUENCAS, a parsimonious physically-based distributed hydrologic model. The second example is rainfall frequency analysis for Saint Lucia, a small volcanic island in the eastern Caribbean that is prone to landslides and flash floods. The distinct rainfall hydroclimates of the two example sites illustrate the flexibility of the approach and its usefulness for hazard analysis in data-poor regions.
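
    A heavily simplified sketch of the SST idea, with a synthetic storm catalog and wrap-around transposition standing in for the real resampling machinery: storms are resampled and shifted in space, and annual-maximum basin-average depths build an extended frequency estimate:

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    # Hypothetical storm catalog: each storm is a gridded rainfall field (mm) over a region
    catalog = [rng.gamma(2.0, 10.0, size=(50, 50)) for _ in range(30)]
    basin = np.zeros((50, 50), dtype=bool)
    basin[20:30, 20:30] = True                     # watershed footprint within the region

    def transpose(field, rng, max_shift=15):
        """Random spatial shift of a storm (toy wrap-around transposition)."""
        dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
        return np.roll(np.roll(field, dx, axis=0), dy, axis=1)

    # "Lengthen" the record: many synthetic years, each sampling k storms from the catalog
    annual_max = []
    for year in range(1000):
        k = rng.poisson(4)                         # storms per year in the transposition domain
        depths = [transpose(catalog[i], rng)[basin].mean()
                  for i in rng.integers(0, len(catalog), k)]
        annual_max.append(max(depths) if depths else 0.0)

    print("approx. 100-yr basin-average storm depth (mm):",
          round(float(np.quantile(annual_max, 1 - 1 / 100)), 1))
    ```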

  16. Electron versus proton accelerator driven sub-critical system performance using TRIGA reactors at power

    International Nuclear Information System (INIS)

    Carta, M.; Burgio, N.; D'Angelo, A.; Santagata, A.; Petrovich, C.; Schikorr, M.; Beller, D.; Felice, L. S.; Imel, G.; Salvatores, M.

    2006-01-01

    This paper provides a comparison of the performance of an electron accelerator-driven experiment, under discussion within the Reactor Accelerator Coupling Experiments (RACE) Project being conducted within the U.S. Dept. of Energy's Advanced Fuel Cycle Initiative (AFCI), and of the proton-driven experiment TRADE (TRIGA Accelerator Driven Experiment) originally planned at ENEA-Casaccia in Italy. Both experiments foresee coupling to sub-critical TRIGA core configurations and are aimed at investigating the relevant kinetic and dynamic accelerator-driven system (ADS) core behavior characteristics in the presence of thermal reactivity feedback effects. TRADE was based on the coupling of an upgraded proton cyclotron, producing neutrons via spallation reactions on a tantalum (Ta) target, with the core driven at a maximum power of around 200 kW. RACE is based on the coupling of an electron linac, producing neutrons via photoneutron reactions on a tungsten-copper (W-Cu) or uranium (U) target, with the core driven at a maximum power of around 50 kW. The paper focuses on analysis of the expected dynamic power response of the RACE core following reactivity and/or source transients. TRADE and RACE target-core power coupling coefficients are compared and discussed. (authors)

  17. A new approach to enhance the performance of decision tree for classifying gene expression data.

    Science.gov (United States)

    Hassan, Md; Kotagiri, Ramamohanarao

    2013-12-20

    Gene expression data classification is a challenging task due to the large dimensionality and very small number of samples. The decision tree is one of the popular machine learning approaches for addressing such classification problems. However, existing decision tree algorithms use a single gene feature at each node to split the data into child nodes, and hence might suffer from poor performance, especially when classifying gene expression datasets. By using a new decision tree algorithm in which each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree we use the area under the Receiver Operating Characteristics curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to the other existing decision trees in the literature. We experimentally compare the effect of our scheme against other well-known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree.
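
    A minimal sketch of the core idea, assuming synthetic data and logistic regression as one plausible way to choose the linear weights (the paper's exact weighting scheme may differ): two genes are combined into a derived composite feature, and AUC compares the composite against each single-gene split:

    ```python
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(4)
    n = 80
    y = rng.integers(0, 2, n)                                    # class labels
    X = rng.standard_normal((n, 2)) + np.outer(y, [0.8, -0.6])   # two selected genes

    # Single-gene splits: AUC of each gene used alone at a node
    for g in range(2):
        print(f"gene {g}: AUC = {roc_auc_score(y, X[:, g]):.2f}")

    # Derived composite feature: a linear combination of the selected genes,
    # fitted here with logistic regression as one choice of weights
    w = LogisticRegression().fit(X, y).coef_.ravel()
    composite = X @ w
    print(f"composite: AUC = {roc_auc_score(y, composite):.2f}")
    ```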

  18. Performance Analysis of Waste Heat Driven Pressurized Adsorption Chiller

    KAUST Repository

    LOH, Wai Soong

    2010-01-01

    This article presents the transient modeling and performance of waste heat driven pressurized adsorption chillers for refrigeration in subzero applications. This innovative adsorption chiller employs pitch-based activated carbon of type Maxsorb III (adsorbent) with refrigerant R134a as the adsorbent-adsorbate pair. It consists of an evaporator, a condenser and two adsorber/desorber beds, and it utilizes a low-grade heat source to power the batch-operated cycle. Heat source temperatures range from 55 to 90°C, whilst the cooling water temperature needed to reject heat is 30°C. A parametric analysis is presented in which the effects of inlet temperature, adsorption/desorption cycle time and switching time on the system performance are reported in terms of cooling capacity and coefficient of performance. © 2010 by JSME.

  19. A Novel Online Data-Driven Algorithm for Detecting UAV Navigation Sensor Faults.

    Science.gov (United States)

    Sun, Rui; Cheng, Qi; Wang, Guanyu; Ochieng, Washington Yotto

    2017-09-29

    The use of Unmanned Aerial Vehicles (UAVs) has increased significantly in recent years. On-board integrated navigation sensors are a key component of UAVs' flight control systems and are essential for flight safety. In order to ensure flight safety, timely and effective navigation sensor fault detection capability is required. In this paper, a novel data-driven Adaptive Neuron Fuzzy Inference System (ANFIS)-based approach is presented for the detection of on-board navigation sensor faults in UAVs. Contrary to classic UAV sensor fault detection algorithms, which are based on predefined or modelled faults, the proposed algorithm combines an online data training mechanism with an ANFIS-based decision system. The main advantages of this algorithm are that it allows real-time, model-free residual analysis from Kalman Filter (KF) estimates and uses the ANFIS to build a reliable fault detection system. In addition, it allows fast and accurate detection of faults, which makes it suitable for real-time applications. Experimental results have demonstrated the effectiveness of the proposed fault detection method in terms of accuracy and misdetection rate.
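
    The sketch below illustrates only the residual-analysis half of the approach, with a deliberately simplified scalar Kalman filter and a fixed threshold standing in for the ANFIS decision stage; all parameters and the injected fault are illustrative:

    ```python
    import numpy as np

    def kf_residuals(z, q=1e-4, r=0.04):
        """Scalar constant-position Kalman filter; returns innovation residuals."""
        x, p = z[0], 1.0
        res = []
        for zk in z[1:]:
            p = p + q                     # predict
            res.append(zk - x)            # innovation (measurement minus prediction)
            k = p / (p + r)               # update
            x = x + k * (zk - x)
            p = (1 - k) * p
        return np.array(res)

    rng = np.random.default_rng(5)
    z = np.cumsum(rng.normal(0, 0.01, 300)) + rng.normal(0, 0.2, 300)
    z[200:] += 1.5                        # injected step fault in the sensor output

    res = kf_residuals(z)
    threshold = 4 * np.std(res[:150])     # calibrated on fault-free data
    print("first alarm at sample:", int(np.argmax(np.abs(res) > threshold)) + 1)
    ```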

  20. A Novel Online Data-Driven Algorithm for Detecting UAV Navigation Sensor Faults

    Directory of Open Access Journals (Sweden)

    Rui Sun

    2017-09-01

    Full Text Available The use of Unmanned Aerial Vehicles (UAVs) has increased significantly in recent years. On-board integrated navigation sensors are a key component of UAVs' flight control systems and are essential for flight safety. In order to ensure flight safety, timely and effective navigation sensor fault detection capability is required. In this paper, a novel data-driven Adaptive Neuron Fuzzy Inference System (ANFIS)-based approach is presented for the detection of on-board navigation sensor faults in UAVs. Contrary to classic UAV sensor fault detection algorithms, which are based on predefined or modelled faults, the proposed algorithm combines an online data training mechanism with an ANFIS-based decision system. The main advantages of this algorithm are that it allows real-time, model-free residual analysis from Kalman Filter (KF) estimates and uses the ANFIS to build a reliable fault detection system. In addition, it allows fast and accurate detection of faults, which makes it suitable for real-time applications. Experimental results have demonstrated the effectiveness of the proposed fault detection method in terms of accuracy and misdetection rate.

  1. Information-Driven Inspections

    International Nuclear Information System (INIS)

    Laughter, Mark D.; Whitaker, J. Michael; Lockwood, Dunbar

    2010-01-01

    New uranium enrichment capacity is being built worldwide in response to perceived shortfalls in future supply. To meet increasing safeguards responsibilities with limited resources, the nonproliferation community is exploring next-generation concepts to increase the effectiveness and efficiency of safeguards, such as advanced technologies to enable unattended monitoring of nuclear material. These include attribute measurement technologies, data authentication tools, and transmission and security methods. However, there are several conceptual issues with how such data would be used to improve the ability of a safeguards inspectorate such as the International Atomic Energy Agency (IAEA) to reach better safeguards conclusions regarding the activities of a State. The IAEA is pursuing the implementation of information-driven safeguards, whereby all available sources of information are used to make the application of safeguards more effective and efficient. Data from continuous, unattended monitoring systems can be used to optimize on-site inspection scheduling and activities at declared facilities, resulting in fewer, better inspections. Such information-driven inspections are the logical evolution of inspection planning - making use of all available information to enhance scheduled and randomized inspections. Data collection and analysis approaches for unattended monitoring systems can be designed to protect sensitive information while enabling information-driven inspections. A number of such inspections within a predetermined range could reduce inspection frequency while providing an equal or greater level of deterrence against illicit activity, all while meeting operator and technology holder requirements and reducing inspector and operator burden. Three options for using unattended monitoring data to determine an information-driven inspection schedule are to (1) send all unattended monitoring data off-site, which will require advances in data analysis techniques to

  2. Helioseismic and neutrino data-driven reconstruction of solar properties

    Science.gov (United States)

    Song, Ningqiang; Gonzalez-Garcia, M. C.; Villante, Francesco L.; Vinyoles, Nuria; Serenelli, Aldo

    2018-06-01

    In this work, we use Bayesian inference to quantitatively reconstruct the solar properties most relevant to the solar composition problem using as inputs the information provided by helioseismic and solar neutrino data. In particular, we use a Gaussian process to model the functional shape of the opacity uncertainty to gain flexibility and become as free as possible from prejudice in this regard. With these tools we first readdress the statistical significance of the solar composition problem. Furthermore, starting from a composition unbiased set of standard solar models (SSMs) we are able to statistically select those with solar chemical composition and other solar inputs which better describe the helioseismic and neutrino observations. In particular, we are able to reconstruct the solar opacity profile in a data-driven fashion, independently of any reference opacity tables, obtaining a 4 per cent uncertainty at the base of the convective envelope and 0.8 per cent at the solar core. When systematic uncertainties are included, results are 7.5 per cent and 2 per cent, respectively. In addition, we find that the values of most of the other inputs of the SSMs required to better describe the helioseismic and neutrino data are in good agreement with those adopted as the standard priors, with the exception of the astrophysical factor S11 and the microscopic diffusion rates, for which data suggests a 1 per cent and 30 per cent reduction, respectively. As an output of the study we derive the corresponding data-driven predictions for the solar neutrino fluxes.
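
    A minimal sketch of the Gaussian-process ingredient, with an assumed squared-exponential kernel and illustrative amplitude and length scales (not the paper's choices): smooth fractional opacity perturbations are drawn from a GP prior rather than being forced into a parametric form:

    ```python
    import numpy as np

    def rbf_kernel(x, length=0.15, amp=0.03):
        """Squared-exponential covariance: smooth functions, ~3% prior amplitude."""
        d = x[:, None] - x[None, :]
        return amp**2 * np.exp(-0.5 * (d / length) ** 2)

    # Radius-like coordinate from the core (0) to the base of the convective envelope (1)
    x = np.linspace(0.0, 1.0, 100)
    K = rbf_kernel(x) + 1e-10 * np.eye(len(x))   # jitter for numerical stability

    # Draw prior samples of the fractional opacity perturbation delta-kappa(x)
    rng = np.random.default_rng(6)
    samples = rng.multivariate_normal(np.zeros(len(x)), K, size=3)
    print("prior spread at core vs envelope:", samples.std(axis=0)[[0, -1]].round(3))
    ```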

  3. NOvA Event Building, Buffering and Data-Driven Triggering From Within the DAQ System

    International Nuclear Information System (INIS)

    Fischler, M; Rechenmacher, R; Green, C; Kowalkowski, J; Norman, A; Paterno, M

    2012-01-01

    The NOvA experiment is a long-baseline neutrino experiment designed to make precision probes of the structure of neutrino mixing. The experiment features a unique deadtimeless data acquisition system that is capable of acquiring and building an event data stream from the continuous readout of the more than 360,000 far detector channels. In order to achieve its physics goals the experiment must be able to buffer, correlate and extract the data in this stream with the beam spills that occur at Fermilab. In addition the NOvA experiment seeks to enhance its data collection efficiencies for rare classes of event topologies that are valuable for calibration through the use of data-driven triggering. The NOvA-DDT is a prototype data-driven triggering system. NOvA-DDT has been developed using the Fermilab artdaq generic DAQ/Event-building toolkit. This toolkit provides the advantages of sharing online software infrastructure with other Intensity Frontier experiments, and of being able to use any offline analysis module, unchanged, as a component of the online triggering decisions. We have measured the performance and overhead of the NOvA-DDT framework using a Hough-transform-based trigger decision module developed for the NOvA detector to identify cosmic rays. The results of these tests, which were run on the NOvA prototype near detector, yielded a mean processing time of 98 ms per event, while consuming only 1/16th of the available processing capacity. These results provide a proof of concept that a NOvA-DDT-based processing system is a viable strategy for data acquisition and triggering for the NOvA far detector.

  4. Accelerator driven systems for energy production and waste incineration: Physics, design and related nuclear data

    International Nuclear Information System (INIS)

    Herman, M.; Stanculescu, A.; Paver, N.

    2003-01-01

    This volume contains the notes of lectures given at the workshops 'Hybrid Nuclear Systems for Energy Production, Utilisation of Actinides and Transmutation of Long-lived Radioactive Waste' and 'Nuclear Data for Science and Technology: Accelerator Driven Waste Incineration', held at the Abdus Salam ICTP in September 2001. The subject of the first workshop was focused on the so-called Accelerator Driven Systems, and covered the most important physics and technological aspects of this innovative field. The second workshop was devoted to an exhaustive survey on the acquisition, evaluation, retrieval and validation of the nuclear data relevant to the design of Accelerator Driven Systems

  5. Accelerator driven systems for energy production and waste incineration: Physics, design and related nuclear data

    Energy Technology Data Exchange (ETDEWEB)

    Herman, M; Stanculescu, A [International Atomic Energy Agency, Vienna (Austria); Paver, N [University of Trieste and INFN, Trieste (Italy)

    2003-06-15

    This volume contains the notes of lectures given at the workshops 'Hybrid Nuclear Systems for Energy Production, Utilisation of Actinides and Transmutation of Long-lived Radioactive Waste' and 'Nuclear Data for Science and Technology: Accelerator Driven Waste Incineration', held at the Abdus Salam ICTP in September 2001. The subject of the first workshop was focused on the so-called Accelerator Driven Systems, and covered the most important physics and technological aspects of this innovative field. The second workshop was devoted to an exhaustive survey on the acquisition, evaluation, retrieval and validation of the nuclear data relevant to the design of Accelerator Driven Systems.

  6. A novel SDN enabled hybrid oiptical packet/circuit switched data centre network - The LIGHTNESS approach

    NARCIS (Netherlands)

    Peng, S.; Simeonidou, D.; Zervas, G.; Nejabati, R.; Yan, Y; Shu, Yi; Spadaro, S.; Perelló, J.; Agraz, F.; Careglio, D.; Dorren, H.J.S.; Miao, W.; Calabretta, N.; Bernini, G.; Ciulli, N.; Sancho, J.C.; Iordache, S.; Becerra, Y.; Farreras, M.; Biancani, M.; Predieri, A.; Proietti, R.; Cao, Z.; Liu, L.; Yoo, S.J.B.

    2014-01-01

    Current over-provisioned and multi-tier data centre networks (DCN) deploy rigid control and management platforms, which are not able to accommodate the ever-growing workload driven by the increasing demand for high-performance data centre (DC) and cloud applications. In response to this, the EC FP7

  7. Educational Accountability: A Qualitatively Driven Mixed-Methods Approach

    Science.gov (United States)

    Hall, Jori N.; Ryan, Katherine E.

    2011-01-01

    This article discusses the importance of mixed-methods research, in particular the value of qualitatively driven mixed-methods research for quantitatively driven domains like educational accountability. The article demonstrates the merits of qualitative thinking by describing a mixed-methods study that focuses on a middle school's system of…

  8. Use case driven approach to develop simulation model for PCS of APR1400 simulator

    International Nuclear Information System (INIS)

    Dong Wook, Kim; Hong Soo, Kim; Hyeon Tae, Kang; Byung Hwan, Bae

    2006-01-01

    The full-scope simulator is being developed to evaluate specific design features and to support the iterative design and validation in the Man-Machine Interface System (MMIS) design of the Advanced Power Reactor (APR) 1400. The simulator consists of the process model, control logic model, and MMI for the APR1400, as well as the Power Control System (PCS). In this paper, a use case driven approach is proposed to develop a simulation model for the PCS. In this approach, a system is considered from the point of view of its users. The user's view of the system is based on interactions with the system and the resultant responses. In the use case driven approach, we initially consider the system as a black box and look at its interactions with the users. From these interactions, use cases of the system are identified. Then the system is modeled using these use cases as functions. Lower levels expand the functionalities of each of these use cases. Hence, starting from the topmost level view of the system, we proceed down to the lowest level (the internal view of the system). The model of the system thus developed is use case driven. This paper introduces the functionality of the PCS simulation model, including a requirements analysis based on use cases and the validation results from development of the PCS model. The use case based PCS simulation model will first be used during full-scope simulator development for a nuclear power plant and will be supplied to the Shin-Kori 3 and 4 plants. (authors)

  9. Neutron data for accelerator-driven transmutation technologies. Annual Report 2002/2003

    International Nuclear Information System (INIS)

    Blomgren, J.; Hildebrand, A.; Mermod, P.; Olsson, N.; Pomp, S.; Oesterlund, M.

    2003-08-01

    The project NATT, Neutron data for Accelerator-driven Transmutation Technology, is performed within the nuclear reactions group of the Department of Neutron Research, Uppsala University. The activities of the group are directed towards experimental studies of nuclear reaction probabilities of importance for various applications, like transmutation of nuclear waste, biomedical effects and electronics reliability. The experimental work is primarily undertaken at The Svedberg Laboratory (TSL) in Uppsala, where the group has previously developed two world-unique instruments, MEDLEY and SCANDAL. Highlights from the past year: Analysis and documentation have been finalized for previously performed measurements of elastic neutron scattering from carbon and lead at 96 MeV. The precision of the results surpasses all previous data by at least an order of magnitude. These measurements represent the highest energy in neutron scattering at which the ground state has been resolved. The results show that all previous theoretical work has underestimated the probability of neutron scattering at this energy by 0-30%. A new method for measurements of absolute probabilities of neutron-induced nuclear reactions using experimental techniques only has been developed. Previously, only two such methods were known. One student has reached his PhD exam. Two PhD students have been accepted. TSL has decided to build a new neutron beam facility with significantly improved performance for these and similar activities. A new instrument for measurements of inelastic neutron scattering has been built, tested and found to meet the specifications. This work has been performed in collaboration with two French research groups from Caen and Nantes. The instrument is intended to be used for a series of experiments during the coming years. Previous work by the group on nuclear data for assessment of electronics reliability has led to a new industry standard in the USA.

  10. Neutron data for accelerator-driven transmutation technologies. Annual Report 2002/2003

    Energy Technology Data Exchange (ETDEWEB)

    Blomgren, J.; Hildebrand, A.; Mermod, P.; Olsson, N.; Pomp, S.; Oesterlund, M. [Uppsala Univ. (Sweden). Dept. for Neutron Research

    2003-08-01

    The project NATT, Neutron data for Accelerator-driven Transmutation Technology, is performed within the nuclear reactions group of the Department of Neutron Research, Uppsala University. The activities of the group are directed towards experimental studies of nuclear reaction probabilities of importance for various applications, like transmutation of nuclear waste, biomedical effects and electronics reliability. The experimental work is primarily undertaken at The Svedberg Laboratory (TSL) in Uppsala, where the group has previously developed two world-unique instruments, MEDLEY and SCANDAL. Highlights from the past year: Analysis and documentation have been finalized for previously performed measurements of elastic neutron scattering from carbon and lead at 96 MeV. The precision of the results surpasses all previous data by at least an order of magnitude. These measurements represent the highest energy in neutron scattering at which the ground state has been resolved. The results show that all previous theoretical work has underestimated the probability of neutron scattering at this energy by 0-30%. A new method for measurements of absolute probabilities of neutron-induced nuclear reactions using experimental techniques only has been developed. Previously, only two such methods were known. One student has reached his PhD exam. Two PhD students have been accepted. TSL has decided to build a new neutron beam facility with significantly improved performance for these and similar activities. A new instrument for measurements of inelastic neutron scattering has been built, tested and found to meet the specifications. This work has been performed in collaboration with two French research groups from Caen and Nantes. The instrument is intended to be used for a series of experiments during the coming years. Previous work by the group on nuclear data for assessment of electronics reliability has led to a new industry standard in the USA.

  11. Reliability analysis - systematic approach based on limited data

    International Nuclear Information System (INIS)

    Bourne, A.J.

    1975-11-01

    The initial approaches required for reliability analysis are outlined. These approaches highlight the system boundaries, examine the conditions under which the system is required to operate, and define the overall performance requirements. The discussion is illustrated by a simple example of an automatic protective system for a nuclear reactor. It is then shown how the initial approach leads to a method of defining the system, establishing performance parameters of interest and determining the general form of reliability models to be used. The overall system model and the availability of reliability data at the system level are next examined. An iterative process is then described whereby the reliability model and data requirements are systematically refined at progressively lower hierarchic levels of the system. At each stage, the approach is illustrated with examples from the protective system previously described. The main advantages of the approach put forward are the systematic process of analysis, the concentration of assessment effort in the critical areas and the maximum use of limited reliability data. (author)

  12. Feasibility of a patient-driven approach to recruiting older adults, caregivers, and clinicians for provider–patient communication research

    Science.gov (United States)

    Lingler, Jennifer H.; Martire, Lynn M.; Hunsaker, Amanda E.; Greene, Michele G.; Dew, Mary Amanda; Schulz, Richard

    2009-01-01

    Purpose This report describes the implementation of a novel, patient-driven approach to recruitment for a study of interpersonal communication in a primary care setting involving persons with Alzheimer’s disease (AD), their family caregivers, and their primary care providers (PCPs). Data sources Patients and caregivers were centrally recruited from a university-based memory clinic, followed by the recruitment of patient’s individual PCPs. Recruitment tracking, naturalistic observation, and survey methods were used to evaluate recruitment success. Conclusions About half of the patients and caregivers (n = 54; 51%) and most of the PCPs (n = 31; 76%) who we approached agreed to an audiorecording of the patient’s next PCP visit. Characteristics of patient, caregiver, and PCP participants were compared to those of nonparticipants. Patient characteristics did not differ by participation status. Caregivers who volunteered for the study were more likely to be female and married than were those who declined to participate. Compared to nonparticipants, PCPs who agreed to the study were appraised slightly more favorably by patients’ caregivers on a measure of satisfaction with care on the day of the visit. The vast majority of participating PCPs (95%) reported that the study had little or no impact on the flow of routine clinical operations. Implications for research Findings support the feasibility of a patient-driven approach to recruitment for studies involving multiple linked participants. Our discussion highlights possible advantages of such an approach, including the potential to empower patient participants while achieving maximum variability within the pool of clinician participants. PMID:19594656

  13. Field Performance of Inverter-Driven Heat Pumps in Cold Climates

    Energy Technology Data Exchange (ETDEWEB)

    Williamson, James [Consortium of Advanced Residential Buildings, Norwalk, CT (United States); Aldrich, Robb [Consortium of Advanced Residential Buildings, Norwalk, CT (United States)

    2015-08-19

    Traditionally, air-source heat pumps (ASHPs) have been used more often in warmer climates; however, some new ASHPs are gaining ground in colder areas. These systems operate at subzero (Fahrenheit) temperatures and many do not include backup electric resistance elements. There are still uncertainties, however, about capacity and efficiency in cold weather. Also, questions such as “how cold is too cold?” do not have clear answers. These uncertainties could lead to skepticism among homeowners; poor energy savings estimates; suboptimal system selection by heating, ventilating, and air-conditioning contractors; and inconsistent energy modeling. In an effort to better understand and characterize the heating performance of these units in cold climates, the U.S. Department of Energy Building America team, Consortium for Advanced Residential Buildings (CARB), monitored seven inverter-driven, ductless ASHPs across the Northeast. Operating data were collected for three Mitsubishi FE18 units, three Mitsubishi FE12 units, and one Fujitsu 15RLS2 unit. The intent of this research was to assess heat output, electricity consumption, and coefficients of performance (COPs) at various temperatures and load conditions. This assessment was accomplished with long- and short-term tests that measured power consumption; supply, return, and outdoor air temperatures; and airflow through the indoor fan coil.

  14. Automatic sleep classification using a data-driven topic model reveals latent sleep states

    DEFF Research Database (Denmark)

    Koch, Henriette; Christensen, Julie Anja Engelhard; Frandsen, Rune

    2014-01-01

    Background: The golden standard for sleep classification uses manual scoring of polysomnography despite points of criticism such as oversimplification, low inter-rater reliability and the standard being designed on young and healthy subjects. New method: To meet the criticism and reveal the latent sleep states, this study developed a general and automatic sleep classifier using a data-driven approach. Spectral EEG and EOG measures and eye correlation in 1 s windows were calculated and each sleep epoch was expressed as a mixture of probabilities of latent sleep states by using the topic model Latent Dirichlet Allocation. Model application was tested on control subjects and patients with periodic leg movements (PLM) representing a non-neurodegenerative group, and patients with idiopathic REM sleep behavior disorder (iRBD) and Parkinson's Disease (PD) representing a neurodegenerative group.
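
    A minimal sketch of the topic-model step, assuming epochs have already been discretized into counts of feature "words" (the synthetic counts below are placeholders for the study's spectral EEG/EOG measures): scikit-learn's LatentDirichletAllocation expresses each epoch as a mixture over latent sleep states:

    ```python
    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation

    rng = np.random.default_rng(7)
    # Each row is one sleep epoch; columns are counts of discretized EEG/EOG
    # feature "words" (e.g., binned spectral power levels) over its 1 s windows
    n_epochs, vocab = 500, 40
    doc_word = rng.poisson(1.0, size=(n_epochs, vocab))

    lda = LatentDirichletAllocation(n_components=5, random_state=0)  # 5 latent sleep states
    theta = lda.fit_transform(doc_word)          # epoch-by-state probability mixtures
    theta = theta / theta.sum(axis=1, keepdims=True)

    print("epoch 0 state mixture:", theta[0].round(2))
    ```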

  15. Data-driven fault detection for industrial processes canonical correlation analysis and projection based methods

    CERN Document Server

    Chen, Zhiwen

    2017-01-01

    Zhiwen Chen aims to develop advanced fault detection (FD) methods for the monitoring of industrial processes. With ever-increasing demands on reliability and safety in industrial processes, fault detection has become an important issue. Although model-based fault detection theory has been well studied in the past decades, its applications are limited in large-scale industrial processes because it is difficult to build accurate models. Furthermore, motivated by the limitations of existing data-driven FD methods, novel canonical correlation analysis (CCA) and projection-based methods are proposed from the perspectives of process input and output data, less engineering effort and wide application scope. For performance evaluation of FD methods, a new index is also developed. Contents: A New Index for Performance Evaluation of FD Methods; CCA-based FD Method for the Monitoring of Stationary Processes; Projection-based FD Method for the Monitoring of Dynamic Processes; Benchmark Study and Real-Time Implementat...
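
    A minimal sketch of CCA-based fault detection under simplifying assumptions (a linear static process and synthetic data, standing in for the book's full treatment): a residual built from the mismatch of paired canonical variates is thresholded on fault-free data:

    ```python
    import numpy as np
    from sklearn.cross_decomposition import CCA

    rng = np.random.default_rng(8)
    n = 1000
    u = rng.standard_normal((n, 3))                     # process inputs
    y = u @ rng.standard_normal((3, 4)) * 0.5 + 0.1 * rng.standard_normal((n, 4))  # outputs

    cca = CCA(n_components=2).fit(u[:600], y[:600])     # train on fault-free data
    u_c, y_c = cca.transform(u, y)
    res = ((u_c - y_c) ** 2).sum(axis=1)                # residual: canonical-variate mismatch
    threshold = np.quantile(res[:600], 0.99)

    y_fault = y.copy()
    y_fault[800:, 0] += 2.0                             # injected sensor fault
    _, yf_c = cca.transform(u, y_fault)
    res_f = ((u_c - yf_c) ** 2).sum(axis=1)
    print("alarm rate after fault:", float((res_f[800:] > threshold).mean()))
    ```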

  16. A muscle-driven approach to restore stepping with an exoskeleton for individuals with paraplegia.

    Science.gov (United States)

    Chang, Sarah R; Nandor, Mark J; Li, Lu; Kobetic, Rudi; Foglyano, Kevin M; Schnellenberger, John R; Audu, Musa L; Pinault, Gilles; Quinn, Roger D; Triolo, Ronald J

    2017-05-30

    Functional neuromuscular stimulation, lower limb orthosis, powered lower limb exoskeleton, and hybrid neuroprosthesis (HNP) technologies can restore stepping in individuals with paraplegia due to spinal cord injury (SCI). However, a self-contained muscle-driven controllable exoskeleton approach based on an implanted neural stimulator to restore walking has not previously been demonstrated; such an approach could potentially enable system use outside the laboratory and be viable for long-term use or clinical testing. In this work, we designed and evaluated an untethered muscle-driven controllable exoskeleton to restore stepping in three individuals with paralysis from SCI. The self-contained HNP combined neural stimulation, to activate the paralyzed muscles and generate joint torques for limb movements, with a controllable lower limb exoskeleton to stabilize and support the user. An onboard controller processed exoskeleton sensor signals, determined appropriate exoskeletal constraints and stimulation commands for a finite state machine (FSM), and transmitted data over Bluetooth to an off-board computer for real-time monitoring and data recording. The FSM coordinated stimulation and exoskeletal constraints to enable functions, selected with a wireless finger-switch user interface, for standing up, standing, stepping, or sitting down. In the stepping function, the FSM used a sensor-based gait event detector to determine transitions between the gait phases of double stance, early swing, late swing, and weight acceptance. The HNP restored stepping in three individuals with motor complete paralysis due to SCI. The controller appropriately coordinated stimulation and exoskeletal constraints using the sensor-based FSM for subjects with different stimulation systems. The average ranges of motion at the hip and knee joints during walking were 8.5°-20.8° and 14.0°-43.6°, respectively. Walking speeds varied from 0.03 to 0.06 m/s, and cadences from 10 to 20 steps/min. A self-contained muscle-driven
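
    A minimal sketch of such a finite state machine, cycling through the four gait phases named in the abstract; the event names and the commented stimulation/constraint hooks are hypothetical placeholders for the sensor-based gait event detector:

    ```python
    from dataclasses import dataclass

    # Gait phases from the abstract; transitions fire on sensor-derived events.
    TRANSITIONS = {
        ("DOUBLE_STANCE", "toe_off"): "EARLY_SWING",
        ("EARLY_SWING", "peak_hip_flexion"): "LATE_SWING",
        ("LATE_SWING", "heel_strike"): "WEIGHT_ACCEPTANCE",
        ("WEIGHT_ACCEPTANCE", "load_transferred"): "DOUBLE_STANCE",
    }

    @dataclass
    class GaitFSM:
        state: str = "DOUBLE_STANCE"

        def step(self, event: str) -> str:
            """Advance on a recognized gait event; stimulation and exoskeletal
            constraints would be commanded on each transition."""
            nxt = TRANSITIONS.get((self.state, event))
            if nxt is not None:
                # e.g., set_stimulation_pattern(nxt); set_exo_constraints(nxt)  (hypothetical)
                self.state = nxt
            return self.state

    fsm = GaitFSM()
    for ev in ["toe_off", "peak_hip_flexion", "heel_strike", "load_transferred"]:
        print(ev, "->", fsm.step(ev))
    ```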

  17. Tracer kinetic model-driven registration for dynamic contrast-enhanced MRI time-series data.

    Science.gov (United States)

    Buonaccorsi, Giovanni A; O'Connor, James P B; Caunce, Angela; Roberts, Caleb; Cheung, Sue; Watson, Yvonne; Davies, Karen; Hope, Lynn; Jackson, Alan; Jayson, Gordon C; Parker, Geoffrey J M

    2007-11-01

    Dynamic contrast-enhanced MRI (DCE-MRI) time series data are subject to unavoidable physiological motion during acquisition (e.g., due to breathing) and this motion causes significant errors when fitting tracer kinetic models to the data, particularly with voxel-by-voxel fitting approaches. Motion correction is problematic, as contrast enhancement introduces new features into postcontrast images and conventional registration similarity measures cannot fully account for the increased image information content. A methodology is presented for tracer kinetic model-driven registration that addresses these problems by explicitly including a model of contrast enhancement in the registration process. The iterative registration procedure is focused on a tumor volume of interest (VOI), employing a three-dimensional (3D) translational transformation that follows only tumor motion. The implementation accurately removes motion corruption in a DCE-MRI software phantom and it is able to reduce model fitting errors and improve localization in 3D parameter maps in patient data sets that were selected for significant motion problems. Sufficient improvement was observed in the modeling results to salvage clinical trial DCE-MRI data sets that would otherwise have to be rejected due to motion corruption. Copyright 2007 Wiley-Liss, Inc.

  18. Automatic translation of MPI source into a latency-tolerant, data-driven form

    International Nuclear Information System (INIS)

    Nguyen, Tan; Cicotti, Pietro; Bylaska, Eric; Quinlan, Dan; Baden, Scott

    2017-01-01

    Hiding communication behind useful computation is an important performance programming technique but remains an inscrutable programming exercise even for the expert. We present Bamboo, a code transformation framework that can realize communication overlap in applications written in MPI without the need to intrusively modify the source code. We reformulate MPI source into a task dependency graph representation, which partially orders the tasks, enabling the program to execute in a data-driven fashion under the control of an external runtime system. Experimental results demonstrate that Bamboo significantly reduces communication delays while requiring only modest amounts of programmer annotation for a variety of applications and platforms, including those employing co-processors and accelerators. Moreover, Bamboo’s performance meets or exceeds that of labor-intensive hand coding. As a result, the translator is more than a means of hiding communication costs automatically; it demonstrates the utility of semantic level optimization against a well-known library.
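
    A toy sketch of the data-driven execution model (not Bamboo's actual runtime): tasks from a hypothetical halo-exchange kernel fire as soon as their inputs are ready, so computation independent of the communication can overlap it:

    ```python
    from collections import deque

    # Toy data-driven executor: a task runs once all of its dependencies have run,
    # mimicking how a task dependency graph overlaps communication and compute.
    deps = {
        "recv_halo": [],
        "compute_interior": [],           # independent of the halo -> can overlap the receive
        "compute_boundary": ["recv_halo"],
        "send_result": ["compute_interior", "compute_boundary"],
    }

    ready = deque(t for t, d in deps.items() if not d)
    remaining = {t: set(d) for t, d in deps.items() if d}
    while ready:
        task = ready.popleft()
        print("run", task)
        for t, d in list(remaining.items()):
            d.discard(task)
            if not d:
                ready.append(t)
                del remaining[t]
    ```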

  19. Service and Data Driven Multi Business Model Platform in a World of Persuasive Technologies

    DEFF Research Database (Denmark)

    Andersen, Troels Christian; Bjerrum, Torben Cæsar Bisgaard

    2016-01-01

    companies in establishing a service organization that delivers, creates and captures value through service and data driven business models by utilizing their network, resources and customers and/or users. Furthermore, based on literature and collaboration with the case company, the suggestion of a new... framework provides the necessary construction of how the manufacturing companies can evolve their current business to provide multi service and data driven business models, using the same resources, networks and customers.

  20. Data-Driven Robust RVFLNs Modeling of a Blast Furnace Iron-Making Process Using Cauchy Distribution Weighted M-Estimation

    Energy Technology Data Exchange (ETDEWEB)

    Zhou, Ping; Lv, Youbin; Wang, Hong; Chai, Tianyou

    2017-09-01

    Optimal operation of a practical blast furnace (BF) ironmaking process depends largely on a good measurement of molten iron quality (MIQ) indices. However, measuring the MIQ online is not feasible using the available techniques. In this paper, a novel data-driven robust modeling approach is proposed for online estimation of MIQ using improved random vector functional-link networks (RVFLNs). Since the output weights of traditional RVFLNs are obtained by the least squares approach, a robustness problem may occur when the training dataset is contaminated with outliers. This affects the modeling accuracy of RVFLNs. To solve this problem, a Cauchy distribution weighted M-estimation based robust RVFLNs is proposed. Since the weights of different outlier data are properly determined by the Cauchy distribution, their corresponding contributions to modeling can be properly distinguished. Thus robust and better modeling results can be achieved. Moreover, given that the BF is a complex nonlinear system with numerous coupling variables, data-driven canonical correlation analysis is employed to identify the most influential components from the multitudinous factors that affect the MIQ indices, to reduce the model dimension. Finally, experiments using industrial data and comparative studies have demonstrated that the obtained model achieves better modeling and estimation accuracy and stronger robustness than other modeling methods.
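
    A minimal sketch of the robust estimation idea, with the random-feature details and tuning constants assumed rather than taken from the paper: output weights are fitted by iteratively reweighted least squares with Cauchy-distribution weights so outliers contribute less:

    ```python
    import numpy as np

    rng = np.random.default_rng(9)
    X = rng.uniform(-1, 1, (200, 3))
    y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(200)
    y[::25] += 5.0                                   # contaminate training data with outliers

    # RVFLN hidden layer: random, fixed input weights; only output weights are learned
    W, b = rng.standard_normal((3, 50)), rng.standard_normal(50)
    H = np.tanh(X @ W + b)
    H = np.hstack([H, X])                            # direct input-output links

    beta = np.linalg.lstsq(H, y, rcond=None)[0]      # ordinary least squares start
    for _ in range(10):                              # Cauchy-weighted IRLS
        r = y - H @ beta
        c = 2.385 * np.median(np.abs(r)) / 0.6745    # robust scale estimate
        w = 1.0 / (1.0 + (r / c) ** 2)               # Cauchy distribution weights
        Hw = H * w[:, None]                          # weighted normal equations
        beta = np.linalg.solve(H.T @ Hw + 1e-6 * np.eye(H.shape[1]), Hw.T @ y)

    print("robust residual MAD:", float(np.median(np.abs(y - H @ beta))))
    ```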

  1. Approaching human performance: the functionality-driven Awiwi robot hand

    CERN Document Server

    Grebenstein, Markus

    2014-01-01

    Humanoid robotics has made remarkable progress since the dawn of robotics. So why don't we have humanoid robot assistants in day-to-day life yet? This book analyzes the keys to building a successful humanoid robot for field robotics, where collisions become an unavoidable part of the game. The author argues that the design goal should be real anthropomorphism, as opposed to mere human-like appearance. He deduces three major characteristics to aim for when designing a humanoid robot, particularly robot hands: robustness against impacts, fast dynamics, and human-like grasping and manipulation performance. Instead of blindly copying human anatomy, this book opts for a holistic design methodology. It analyzes human hands and existing robot hands to elucidate the important functionalities that are the building blocks toward these necessary characteristics. They are the keys to designing an anthropomorphic robot hand, as illustrated in the high performance anthropomorphic Awiwi Hand presented in this book.  ...

  2. Objective, Quantitative, Data-Driven Assessment of Chemical Probes.

    Science.gov (United States)

    Antolin, Albert A; Tym, Joseph E; Komianou, Angeliki; Collins, Ian; Workman, Paul; Al-Lazikani, Bissan

    2018-02-15

    Chemical probes are essential tools for understanding biological systems and for target validation, yet selecting probes for biomedical research is rarely based on objective assessment of all potential compounds. Here, we describe the Probe Miner: Chemical Probes Objective Assessment resource, capitalizing on the plethora of public medicinal chemistry data to empower quantitative, objective, data-driven evaluation of chemical probes. We assess >1.8 million compounds for their suitability as chemical tools against 2,220 human targets and dissect the biases and limitations encountered. Probe Miner represents a valuable resource to aid the identification of potential chemical probes, particularly when used alongside expert curation. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.

  3. Privacy in Sensor-Driven Human Data Collection: A Guide for Practitioners

    OpenAIRE

    Stopczynski, Arkadiusz; Pietri, Riccardo; Pentland, Alex; Lazer, David; Lehmann, Sune

    2014-01-01

    In recent years, the amount of information collected about human beings has increased dramatically. This development has been partially driven by individuals posting and storing data about themselves and friends using online social networks or collecting their data for self-tracking purposes (quantified-self movement). Across the sciences, researchers conduct studies collecting data with an unprecedented resolution and scale. Using computational power combined with mathematical models, such r...

  4. Data Driven Performance Evaluation of Wireless Sensor Networks

    Directory of Open Access Journals (Sweden)

    Antonio A. F. Loureiro

    2010-03-01

    Wireless Sensor Networks are presented as devices for signal sampling and reconstruction. Within this framework, the qualitative and quantitative influence of (i) signal granularity, (ii) spatial distribution of sensors, (iii) sensor clustering, and (iv) the signal reconstruction procedure are assessed. This is done by defining an error metric and performing a Monte Carlo experiment. It is shown that all these factors have significant impact on the quality of the reconstructed signal. The extent of such impact is quantitatively assessed.

  5. No Evidence That Gratitude Enhances Neural Performance Monitoring or Conflict-Driven Control.

    Science.gov (United States)

    Saunders, Blair; He, Frank F H; Inzlicht, Michael

    2015-01-01

    It has recently been suggested that gratitude can benefit self-regulation by reducing impulsivity during economic decision making. We tested whether comparable benefits of gratitude are observed for neural performance monitoring and conflict-driven self-control. In a pre-post design, 61 participants were randomly assigned to either a gratitude or happiness condition, and then performed a pre-induction flanker task. Subsequently, participants recalled an autobiographical event where they had felt grateful or happy, followed by a post-induction flanker task. Despite closely following existing protocols, participants in the gratitude condition did not report elevated gratefulness compared to the happy group. In regard to self-control, we found no association between gratitude (operationalized by experimental condition or as a continuous predictor) and any control metric, including flanker interference, post-error adjustments, or neural monitoring (the error-related negativity, ERN). Thus, while gratitude might increase economic patience, such benefits may not generalize to conflict-driven control processes.

  6. Data driven profiting from your most important business asset

    CERN Document Server

    Redman, Thomas C

    2008-01-01

    Your company's data has the potential to add enormous value to every facet of the organization -- from marketing and new product development to strategy to financial management. Yet if your company is like most, it's not using its data to create strategic advantage. Data sits around unused -- or incorrect data fouls up operations and decision making. In Data Driven, Thomas Redman, the "Data Doc," shows how to leverage and deploy data to sharpen your company's competitive edge and enhance its profitability. The author reveals: the special properties that make data such a powerful asset; the hidden costs of flawed, outdated, or otherwise poor-quality data; how to improve data quality for competitive advantage; strategies for exploiting your data to make better business decisions; the many ways to bring data to market; and ideas for dealing with political struggles over data and concerns about privacy rights. Your company's data is a key business asset, and you need to manage it aggressively and professi...

  7. More powerful significant testing for time course gene expression data using functional principal component analysis approaches.

    Science.gov (United States)

    Wu, Shuang; Wu, Hulin

    2013-01-16

    One of the fundamental problems in time course gene expression data analysis is to identify genes associated with a biological process or a particular stimulus of interest, like a treatment or virus infection. Most of the existing methods for this problem are designed for data with longitudinal replicates. But in reality, many time course gene expression experiments have no replicates or only a small number of independent replicates. We focus on the case without replicates and propose a new method for identifying differentially expressed genes by incorporating functional principal component analysis (FPCA) into a hypothesis testing framework. The data-driven eigenfunctions allow a flexible and parsimonious representation of time course gene expression trajectories, leaving more degrees of freedom for the inference compared to using a prespecified basis. Moreover, the information of all genes is borrowed for individual gene inferences. The proposed approach turns out to be more powerful in identifying time course differentially expressed genes compared to the existing methods. The improved performance is demonstrated through simulation studies and a real data application to the Saccharomyces cerevisiae cell cycle data.
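
    A minimal sketch of the idea of data-driven eigenfunctions, using an SVD of centered trajectories as a discrete stand-in for FPCA; the simulated gene-by-time matrix and the 90% variance cutoff are illustrative assumptions.

    ```python
    # Sketch: data-driven eigenfunctions for time-course trajectories via SVD.
    import numpy as np

    rng = np.random.default_rng(1)
    genes, times = 500, 12
    Y = rng.normal(size=(genes, times))                # hypothetical trajectories

    Yc = Y - Y.mean(axis=0)                            # center across genes
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    k = int(np.searchsorted(np.cumsum(s**2) / np.sum(s**2), 0.9)) + 1
    eigenfunctions = Vt[:k]                            # flexible, data-driven basis
    scores = Yc @ Vt[:k].T                             # per-gene FPC scores
    # A per-gene test statistic can then be built from these scores, leaving
    # more degrees of freedom than a prespecified basis would.
    ```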

  8. Level-set simulations of buoyancy-driven motion of single and multiple bubbles

    International Nuclear Information System (INIS)

    Balcázar, Néstor; Lehmkuhl, Oriol; Jofre, Lluís; Oliva, Assensi

    2015-01-01

    Highlights: • A conservative level-set method is validated and verified. • An extensive study of buoyancy-driven motion of single bubbles is performed. • The interactions of two spherical and ellipsoidal bubbles are studied. • The interaction of multiple bubbles is simulated in a vertical channel. Abstract: This paper presents a numerical study of buoyancy-driven motion of single and multiple bubbles by means of the conservative level-set method. First, an extensive study of the hydrodynamics of single bubbles rising in a quiescent liquid is performed, including their shape, terminal velocity, drag coefficients and wake patterns. These results are validated against experimental and numerical data well established in the scientific literature. Then, a further study on the interaction of two spherical and ellipsoidal bubbles is performed for different orientation angles. Finally, the interaction of multiple bubbles is explored in a periodic vertical channel. The results show that the conservative level-set approach can be used for accurate modelling of bubble dynamics. Moreover, it is demonstrated that the present method is numerically stable for a wide range of Morton and Reynolds numbers.

  9. Dynamic Service Selection in Workflows Using Performance Data

    Directory of Open Access Journals (Sweden)

    David W. Walker

    2007-01-01

    An approach to dynamic workflow management and optimisation using near-realtime performance data is presented. Strategies are discussed for choosing an optimal service (based on user-specified criteria) from several semantically equivalent Web services. Such an approach may involve finding "similar" services, by first pruning the set of discovered services based on service metadata, and subsequently selecting an optimal service based on performance data. The current implementation of the prototype workflow framework is described, and demonstrated with a simple workflow. Performance results are presented that show the performance benefits of dynamic service selection. A statistical analysis based on the first order statistic is used to investigate the likely improvement in service response time arising from dynamic service selection.
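
    A minimal sketch of dynamic selection from recorded response times, with a Monte Carlo estimate of the first order statistic (the minimum over candidate services); the service names and timings are hypothetical.

    ```python
    # Sketch: pick the service with the best recent response times, and
    # estimate the expected minimum response time across candidates.
    import random

    history = {                      # recent response times in seconds (hypothetical)
        "svcA": [1.9, 2.3, 2.1],
        "svcB": [1.2, 1.6, 1.4],
        "svcC": [2.8, 2.5, 3.0],
    }

    def select_service(history):
        # Rank semantically equivalent services by mean observed response time.
        return min(history, key=lambda s: sum(history[s]) / len(history[s]))

    def expected_min_response(history, n=10000):
        # First order statistic: expected minimum over the candidates, i.e.
        # the response time an ideal dynamic selector would see on average.
        draws = [min(random.choice(history[s]) for s in history) for _ in range(n)]
        return sum(draws) / n

    print(select_service(history), expected_min_response(history))
    ```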

  10. A laser driven source of spin polarized atomic hydrogen and deuterium

    International Nuclear Information System (INIS)

    Poelker, M.; Coulter, K.P.; Holt, R.J.; Jones, C.E.; Kowalczyk, R.S.; Young, L.; Toporkov, D.

    1993-01-01

    Recent results from a laser-driven source of polarized hydrogen (H) and deuterium (D) are presented. The performance of the source is described as a function of atomic flow rate and magnetic field. The data suggest that because atomic densities in the source are high, the system can approach spin-temperature equilibrium even though applied magnetic fields are much larger than the critical field of the atoms. The authors also observe that potassium contamination in the source emittance can be reduced to a negligible amount using a Teflon-lined transport tube.

  11. Data-free and data-driven spectral perturbations for RANS UQ

    Science.gov (United States)

    Edeling, Wouter; Mishra, Aashwin; Iaccarino, Gianluca

    2017-11-01

    Despite recent developments in high-fidelity turbulent flow simulations, RANS modeling is still widely used in industry due to its inherently low cost. Since accuracy is a concern in RANS modeling, model-form UQ is an essential tool for assessing the impact of this uncertainty on quantities of interest. Applying a spectral decomposition to the modeled Reynolds-Stress Tensor (RST) allows for the introduction of decoupled perturbations into the baseline intensity (kinetic energy), shape (eigenvalues), and orientation (eigenvectors). This constitutes a natural methodology for evaluating the model-form uncertainty associated with different aspects of RST modeling. In a predictive setting, one frequently encounters an absence of any relevant reference data. To make data-free predictions with quantified uncertainty we employ physical bounds to define a priori the maximum spectral perturbations. When propagated, these perturbations yield intervals of engineering utility. High-fidelity data open up the possibility of inferring a distribution of uncertainty by means of various data-driven machine-learning techniques. We will demonstrate our framework on a number of flow problems where RANS models are prone to failure. This research was partially supported by the Defense Advanced Research Projects Agency under the Enabling Quantification of Uncertainty in Physical Systems (EQUiPS) project (technical monitor: Dr Fariba Fahroo), and the DOE PSAAP-II program.
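
    A minimal sketch of the eigenvalue-perturbation step, assuming a hypothetical anisotropy tensor, the one-component limiting state as the perturbation target, and an illustrative perturbation strength.

    ```python
    # Sketch: spectral perturbation of a modeled Reynolds-stress anisotropy
    # tensor toward a physical limiting state. All values are illustrative.
    import numpy as np

    a = np.array([[ 0.10, 0.02, 0.00],        # hypothetical anisotropy tensor
                  [ 0.02,-0.05, 0.01],        # (symmetric, trace-free)
                  [ 0.00, 0.01,-0.05]])

    vals, vecs = np.linalg.eigh(a)             # shape (eigenvalues) + orientation

    limit_1c = np.array([-1/3, -1/3, 2/3])     # one-component limiting state
    delta = 0.3                                # perturbation strength in [0, 1]
    vals_pert = (1 - delta) * vals + delta * limit_1c

    a_pert = vecs @ np.diag(vals_pert) @ vecs.T   # recomposed, perturbed tensor
    ```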

  12. Event-Driven Technology to Generate Relevant Collections of Near-Realtime Data

    Science.gov (United States)

    Graves, S. J.; Keiser, K.; Nair, U. S.; Beck, J. M.; Ebersole, S.

    2017-12-01

    Getting the right data when it is needed continues to be a challenge for researchers and decision makers. Event-Driven Data Delivery (ED3), funded by the NASA Applied Science program, is a technology that allows researchers and decision makers to pre-plan what data, information and processes they need to have collected or executed in response to future events. The Information Technology and Systems Center at the University of Alabama in Huntsville (UAH) has developed the ED3 framework in collaboration with atmospheric scientists at UAH, scientists at the Geological Survey of Alabama, and other federal, state and local stakeholders to meet the data preparedness needs for research, decisions and situational awareness. The ED3 framework exposes an API that supports the addition of loosely-coupled, distributed event handlers and data processes. This approach allows the easy addition of new events and data processes so the system can scale to support virtually any type of event or data process. Using ED3's underlying services, applications have been developed that monitor for alerts of registered event types and automatically trigger subscriptions that match new events, providing users with a living "album" of results that can continue to be curated as more information for an event becomes available. This capability allows users to improve their capacity for the collection, creation and use of data and real-time processes (data access, model execution, product generation, sensor tasking, social media filtering, etc.) in response to disaster and other events by preparing in advance for the data and information needs of future events. This presentation will provide an update on the ED3 developments and deployments, and further explain the applicability of utilizing near-realtime data in hazards research, response and situational awareness.

  13. Data-Driven Model Reduction and Transfer Operator Approximation

    Science.gov (United States)

    Klus, Stefan; Nüske, Feliks; Koltai, Péter; Wu, Hao; Kevrekidis, Ioannis; Schütte, Christof; Noé, Frank

    2018-06-01

    In this review paper, we will present different data-driven dimension reduction techniques for dynamical systems that are based on transfer operator theory as well as methods to approximate transfer operators and their eigenvalues, eigenfunctions, and eigenmodes. The goal is to point out similarities and differences between methods developed independently by the dynamical systems, fluid dynamics, and molecular dynamics communities such as time-lagged independent component analysis, dynamic mode decomposition, and their respective generalizations. As a result, extensions and best practices developed for one particular method can be carried over to other related methods.
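
    A minimal sketch of one of the reviewed methods, dynamic mode decomposition, on simulated snapshot data; it is not drawn from the paper itself.

    ```python
    # Sketch: dynamic mode decomposition (DMD) as a finite-dimensional
    # approximation of the transfer operator. Snapshots are simulated.
    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=(5, 100))              # hypothetical snapshot sequence
    Xp = np.roll(X, -1, axis=1)[:, :-1]        # time-shifted snapshots
    X = X[:, :-1]

    A = Xp @ np.linalg.pinv(X)                 # best-fit linear operator
    eigvals, modes = np.linalg.eig(A)          # DMD eigenvalues and modes
    # Eigenvalues near the unit circle indicate slow, coherent dynamics --
    # the analogue of dominant transfer-operator eigenfunctions.
    ```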

  14. A data driven nonlinear stochastic model for blood glucose dynamics.

    Science.gov (United States)

    Zhang, Yan; Holt, Tim A; Khovanova, Natalia

    2016-03-01

    The development of adequate mathematical models for blood glucose dynamics may improve early diagnosis and control of diabetes mellitus (DM). We have developed a stochastic nonlinear second order differential equation to describe the response of blood glucose concentration to food intake using continuous glucose monitoring (CGM) data. A variational Bayesian learning scheme was applied to define the number and values of the system's parameters by iterative optimisation of free energy. The model has the minimal order and number of parameters to successfully describe blood glucose dynamics in people with and without DM. The model accounts for the nonlinearity and stochasticity of the underlying glucose-insulin dynamic process. Being data-driven, it takes full advantage of available CGM data and, at the same time, reflects the intrinsic characteristics of the glucose-insulin system without detailed knowledge of the physiological mechanisms. We have shown that the dynamics of some postprandial blood glucose excursions can be described by a reduced (linear) model, previously seen in the literature. A comprehensive analysis demonstrates that deterministic system parameters belong to different ranges for diabetes and controls. Implications for clinical practice are discussed. This is the first study introducing a continuous data-driven nonlinear stochastic model capable of describing both DM and non-DM profiles. Copyright © 2015 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
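
    A minimal sketch of simulating a second-order stochastic differential equation of the kind described, via an Euler-Maruyama scheme; the damped-oscillator form, parameter values and meal input are illustrative assumptions, not the paper's fitted model.

    ```python
    # Sketch: x'' + 2*zeta*w*x' + w**2*x = k*u(t) + noise, integrated with
    # Euler-Maruyama. x is the deviation of glucose from its basal level.
    import numpy as np

    zeta, w, k, sigma = 0.5, 0.02, 0.01, 0.05   # hypothetical parameters
    dt, n = 1.0, 600                             # 1-minute steps, 10 hours
    u = np.zeros(n); u[30:60] = 1.0              # meal "input" pulse

    rng = np.random.default_rng(3)
    x, v = np.zeros(n), np.zeros(n)
    for t in range(n - 1):
        dW = rng.normal(0.0, np.sqrt(dt))        # Wiener increment
        x[t+1] = x[t] + v[t] * dt
        v[t+1] = v[t] + (-2*zeta*w*v[t] - w**2*x[t] + k*u[t]) * dt + sigma*w*dW
    ```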

  15. Meeting performance goals by the use of experience data

    International Nuclear Information System (INIS)

    Salmon, M.W.; Kennedy, R.P.

    1993-01-01

    DOE Order 5480.28 requires that structures, systems and components (SSCs) be designed and constructed to withstand the effects of natural phenomena hazards. For SSCs to be acceptable, it must be demonstrated that there is a sufficiently low probability of failure of those SSCs consistent with established performance goals. For new design, NPH loads are taken from probabilistic hazard assessments and coupled with response and evaluation methods to control the levels of conservatism required to achieve performance goals. For components qualified by test, performance goals are achieved by specifying a test response spectrum that envelops a required response spectrum coupled with minimal acceptance standards. DOE Standard 1020-92 adapts both of these approaches to ensure that the required performance goals are met for new installations. For existing installations these approaches are generally not applicable. There is a need for a simple approach for use in verifying the performance of existing equipment subject to seismic hazards. The USNRC has adopted such an approach for the resolution of USI A-46 in the Generic Implementation Procedure (GIP). A simple set of screening rules, keyed to a generic bounding spectrum, forms the basis of the USNRC approach. A similar approach is being adopted for use in the DOE. The DOE approach, however, must also ensure that appropriate performance goals are met when the general screens are met. This paper summarizes research to date on the topic of meeting performance goals by the use of experience data. The paper presents a review of the background material, a summary of the requirements for existing components, a summary of the approach used in establishing the performance goals associated with experience data approaches, and a summary of results to date. Simplified criteria are proposed.

  16. Generalized approach to bilateral control for EMG driven exoskeleton

    Directory of Open Access Journals (Sweden)

    Gradetsky Valery

    2017-01-01

    The paper discusses a generalized approach to bilateral control for EMG-driven exoskeleton systems. In this paper we consider a semi-automatic mechatronic system that is controlled via human muscle activity (EMG level). The problem is to understand how the movement of the exoskeleton affects the control. The considered system can be described in terms of bilateral control. This means the existence of force feedback from the object via the exoskeleton links and drives to the operator. The simulation of the considered model was carried out in MATLAB Simulink. A mathematical model of the bilateral system with exoskeleton and operator was developed. Transient functions for different dynamic parameters were obtained. It was shown that force feedback is essential for the R&D of such systems.

  17. Data Science and its Relationship to Big Data and Data-Driven Decision Making.

    Science.gov (United States)

    Provost, Foster; Fawcett, Tom

    2013-03-01

    Companies have realized they need to hire data scientists, academic institutions are scrambling to put together data-science programs, and publications are touting data science as a hot, even "sexy", career choice. However, there is confusion about what exactly data science is, and this confusion could lead to disillusionment as the concept diffuses into meaningless buzz. In this article, we argue that there are good reasons why it has been hard to pin down exactly what data science is. One reason is that data science is intricately intertwined with other important concepts also of growing importance, such as big data and data-driven decision making. Another reason is the natural tendency to associate what a practitioner does with the definition of the practitioner's field; this can result in overlooking the fundamentals of the field. We believe that trying to define the boundaries of data science precisely is not of the utmost importance. We can debate the boundaries of the field in an academic setting, but in order for data science to serve business effectively, it is important (i) to understand its relationships to other important related concepts, and (ii) to begin to identify the fundamental principles underlying data science. Once we embrace (ii), we can much better understand and explain exactly what data science has to offer. Furthermore, only once we embrace (ii) should we be comfortable calling it data science. In this article, we present a perspective that addresses all these concepts. We close by offering, as examples, a partial list of fundamental principles underlying data science.

  18. The test of data driven TDC application in high energy physics experiment

    International Nuclear Information System (INIS)

    Liu Shubin; Guo Jianhua; Zhang Yanli; Zhao Long; An Qi

    2006-01-01

    In the high energy physics domain there is a trend toward using integrated, high resolution, multi-hit time-to-digital converters (TDCs) for time measurement, of which the data driven TDC is an important direction. Studying how to test a high performance TDC's characteristics, and how to improve them, helps in selecting the proper TDC. The authors have studied the testing of a new high resolution TDC, which is planned for use in the third modification project of the Beijing Spectrometer (BESIII). This paper introduces the test platform built for the TDC, and the methods by which it was tested for nonlinearity, resolution, double pulse resolution, etc. The paper also gives the test results and introduces the compensation method used to achieve a very high resolution (24.4 ps). (authors)

  19. Development of a Data-Driven Predictive Model of Supply Air Temperature in an Air-Handling Unit for Conserving Energy

    Directory of Open Access Journals (Sweden)

    Goopyo Hong

    2018-02-01

    The purpose of this study was to develop a data-driven predictive model that can predict the supply air temperature (SAT) in an air-handling unit (AHU) by using a neural network. A case study was selected, and AHU operational data from December 2015 to November 2016 were collected. A data-driven predictive model was generated through an evolving process that consisted of an initial model, an optimal model, and an adaptive model. In order to develop the optimal model, input variables, the number of neurons and hidden layers, and the period of the training data set were considered. Since AHU data change over time, an adaptive model, which has the ability to actively cope with constantly changing data, was developed. This adaptive model determined the model with the lowest mean square error (MSE) among the 91 candidate models, which had two hidden layers, and set up a 12-hour test set at every prediction. The adaptive model used recently collected data as training data and utilized the sliding window technique rather than the accumulative data method. Furthermore, additional testing was performed to validate the adaptive model using AHU data from another building. The final adaptive model predicts SAT to a root mean square error (RMSE) of less than 0.6 °C.
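
    A minimal sketch of the sliding-window retraining idea using scikit-learn; the synthetic data, window lengths and network shape are illustrative assumptions rather than the configuration described above.

    ```python
    # Sketch: retrain on a recent window, predict the next test window,
    # then slide forward -- rather than accumulating all past data.
    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(4)
    X = rng.normal(size=(2000, 5))               # hypothetical AHU sensor inputs
    y = X @ np.array([0.5, -0.2, 0.1, 0.3, 0.0]) + rng.normal(0, 0.1, 2000)

    train_win, test_win = 288, 144               # e.g. 24 h train / 12 h test
    preds, actual = [], []
    for start in range(0, len(X) - train_win - test_win, test_win):
        tr = slice(start, start + train_win)
        te = slice(start + train_win, start + train_win + test_win)
        model = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=500,
                             random_state=0).fit(X[tr], y[tr])
        preds.append(model.predict(X[te])); actual.append(y[te])

    rmse = np.sqrt(np.mean((np.concatenate(preds) - np.concatenate(actual))**2))
    ```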

  20. A multi-source satellite data approach for modelling Lake Turkana water level: Calibration and validation using satellite altimetry data

    Science.gov (United States)

    Velpuri, N.M.; Senay, G.B.; Asante, K.O.

    2012-01-01

    Lake Turkana is one of the largest desert lakes in the world and is characterized by high degrees of inter- and intra-annual fluctuations. The hydrology and water balance of this lake have not been well understood due to its remote location and the unavailability of reliable ground truth datasets. Managing surface water resources is a great challenge in areas where in-situ data are either limited or unavailable. In this study, multi-source satellite-driven data such as satellite-based rainfall estimates, modelled runoff, evapotranspiration, and a digital elevation dataset were used to model Lake Turkana water levels from 1998 to 2009. Due to the unavailability of reliable lake level data, an approach is presented to calibrate and validate the water balance model of Lake Turkana using a composite lake level product of TOPEX/Poseidon, Jason-1, and ENVISAT satellite altimetry data. Model validation results showed that the satellite-driven water balance model can satisfactorily capture the patterns and seasonal variations of the Lake Turkana water level fluctuations with a Pearson's correlation coefficient of 0.90 and a Nash-Sutcliffe Coefficient of Efficiency (NSCE) of 0.80 during the validation period (2004-2009). Model error estimates were within 10% of the natural variability of the lake. Our analysis indicated that fluctuations in Lake Turkana water levels are mainly driven by lake inflows and over-the-lake evaporation. Over-the-lake rainfall contributes only up to 30% of the lake's evaporative demand. During the modelling time period, Lake Turkana showed seasonal variations of 1-2 m. The lake level fluctuated in a range of up to 4 m between the years 1998-2009. This study demonstrated the usefulness of satellite altimetry data to calibrate and validate the satellite-driven hydrological model for Lake Turkana without using any in-situ data. Furthermore, for Lake Turkana, we identified and outlined opportunities and challenges of using a calibrated satellite-driven water balance

  1. Facilitating Data Driven Business Model Innovation - A Case study

    DEFF Research Database (Denmark)

    Bjerrum, Torben Cæsar Bisgaard; Andersen, Troels Christian; Aagaard, Annabeth

    2016-01-01

    The businesses' interdisciplinary capabilities come into play in the BMI process, where knowledge from the facilitation strategy and knowledge from phases of the BMI process need to be present to create new knowledge, hence new BMs and innovations. Depending on the environment and shareholders, this also exposes......This paper aims to understand the barriers that businesses meet in understanding their current business models (BM) and in their attempt at innovating new data driven business models (DDBM) using data. The interdisciplinary challenge of knowledge exchange occurring outside and/or inside businesses......, that gathers knowledge, is of great importance. The SMEs have little, if any, experience with data handling, data analytics, and working with structured Business Model Innovation (BMI) that relates to both new and conventional products, processes and services. This new frontier of data and BMI will have...

  2. Fault Detection for Nonlinear Process With Deterministic Disturbances: A Just-In-Time Learning Based Data Driven Method.

    Science.gov (United States)

    Yin, Shen; Gao, Huijun; Qiu, Jianbin; Kaynak, Okyay

    2017-11-01

    Data-driven fault detection plays an important role in industrial systems due to its applicability in case of unknown physical models. In fault detection, disturbances must be taken into account as an inherent characteristic of processes. Nevertheless, fault detection for nonlinear processes with deterministic disturbances still receives little attention, especially in the data-driven field. To solve this problem, a just-in-time learning-based data-driven (JITL-DD) fault detection method for nonlinear processes with deterministic disturbances is proposed in this paper. JITL-DD employs a JITL scheme for process description with local model structures to cope with process dynamics and nonlinearity. The proposed method provides a data-driven fault detection solution for nonlinear processes with deterministic disturbances, and offers inherent online adaptation and high fault detection accuracy. Two nonlinear systems, i.e., a numerical example and a sewage treatment process benchmark, are employed to show the effectiveness of the proposed method.
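
    A minimal sketch of the just-in-time learning idea: fit a local model to the nearest historical neighbours of each query and flag large residuals; the data, neighbourhood size and threshold are illustrative.

    ```python
    # Sketch: JITL-style residual check for fault detection with a local
    # affine model fit around each query point.
    import numpy as np

    def jitl_residual(X_hist, y_hist, x_q, y_q, k=20):
        d = np.linalg.norm(X_hist - x_q, axis=1)
        idx = np.argsort(d)[:k]                         # k nearest neighbours
        Xk = np.hstack([X_hist[idx], np.ones((k, 1))])  # local affine model
        theta = np.linalg.lstsq(Xk, y_hist[idx], rcond=None)[0]
        y_pred = np.append(x_q, 1.0) @ theta
        return abs(y_q - y_pred)

    rng = np.random.default_rng(5)
    X_hist = rng.normal(size=(500, 3)); y_hist = X_hist.sum(axis=1)
    x_q = rng.normal(size=3)
    fault = jitl_residual(X_hist, y_hist, x_q, y_q=10.0) > 3.0   # large residual flags a fault
    ```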

  3. IMPROVING BANDWIDTH OF FLIPPED VOLTAGE FOLLOWER USING GATE-BODY DRIVEN TECHNIQUE

    Directory of Open Access Journals (Sweden)

    VANDANA NIRANJAN

    2017-01-01

    In this paper, a new approach to enhance the bandwidth of the flipped voltage follower is explored. The proposed approach is based on the gate-body driven technique. This technique boosts the transconductance of a MOS transistor, as both the gate and body/bulk terminals are tied together and used as the signal input. This novel technique appears to be a good solution for merging the advantages of the gate-driven and bulk-driven techniques while suppressing their disadvantages. The gate-body driven technique utilizes the body effect to enable low voltage, low power operation and improves the overall performance of the flipped voltage follower, providing it with low output impedance, high input impedance and a bandwidth extension ratio of 2.614. The most attractive feature is that the bandwidth enhancement has been achieved without the use of any passive component or extra circuitry. Simulations in the PSpice environment for 180 nm CMOS technology verified the predicted theoretical results. The improved flipped voltage follower is particularly interesting for high frequency, low noise signal processing applications.

  4. Data-Driven Optimization of Incentive-based Demand Response System with Uncertain Responses of Customers

    Directory of Open Access Journals (Sweden)

    Jimyung Kang

    2017-10-01

    Demand response is nowadays considered as another type of generator, beyond just a simple peak reduction mechanism. A demand response service provider (DRSP) can, through its subcontracts with many energy customers, virtually generate electricity with actual load reduction. However, in this type of virtual generator, the amount of load reduction includes inevitable uncertainty, because it comes from a very large number of independent energy customers. While they may reduce energy today, they might not tomorrow. In this circumstance, a DRSP must choose a proper set of these uncertain customers to achieve the exact preferred amount of load curtailment. In this paper, the customer selection problem for a service provider that faces uncertain responses of customers is defined and solved. The uncertainty of energy reduction is fully considered in the formulation with data-driven probability distribution modeling and a stochastic programming technique. The proposed optimization method, which utilizes only the observed load data, provides a realistic and applicable solution to a demand response system. The performance of the proposed optimization is verified with real demand response event data in Korea, and the results show increased and stabilized performance from the service provider's perspective.
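
    A minimal sketch of the customer-selection idea: estimate each customer's response probability from past events, then pick a subset whose sampled total curtailment meets the target with high reliability. All numbers are hypothetical, and the greedy search merely stands in for the paper's stochastic program.

    ```python
    # Sketch: scenario-sampling selection of uncertain demand-response customers.
    import numpy as np

    rng = np.random.default_rng(6)
    kw = np.array([50, 80, 30, 120, 60])        # contracted reductions (kW)
    p = np.array([0.9, 0.6, 0.8, 0.5, 0.7])     # empirical response rates
    target = 150.0

    def prob_meets_target(chosen, n=20000):
        draws = (rng.random((n, len(chosen))) < p[chosen]).astype(float)
        return np.mean(draws @ kw[chosen] >= target)

    chosen = []                                  # greedy: add whoever helps most
    while not chosen or prob_meets_target(np.array(chosen)) < 0.95:
        rest = [i for i in range(len(kw)) if i not in chosen]
        if not rest:
            break
        best = max(rest, key=lambda i: prob_meets_target(np.array(chosen + [i])))
        chosen.append(best)
    ```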

  5. Open-source chemogenomic data-driven algorithms for predicting drug-target interactions.

    Science.gov (United States)

    Hao, Ming; Bryant, Stephen H; Wang, Yanli

    2018-02-06

    While novel technologies such as high-throughput screening have advanced together with significant investment by pharmaceutical companies during the past decades, the success rate for drug development has not yet improved, prompting researchers to look for new strategies of drug discovery. Drug repositioning is a potential approach to solve this dilemma. However, experimental identification and validation of potential drug targets encoded by the human genome is both costly and time-consuming. Therefore, effective computational approaches have been proposed to facilitate drug repositioning, and these have proved successful in drug discovery. Doubtlessly, the availability of open-accessible data from basic chemical biology research and the success of human genome sequencing are crucial to developing effective in silico drug repositioning methods that allow the identification of potential targets for existing drugs. In this work, we review several chemogenomic data-driven computational algorithms with publicly accessible source codes for predicting drug-target interactions (DTIs). We organize these algorithms by model properties and model evolutionary relationships. We re-implemented five representative algorithms in the R programming language and compared them by means of mean percentile ranking, a new recall-based evaluation metric in the DTI prediction research field. We anticipate that this review will be objective and helpful to researchers who would like to further improve existing algorithms or need to choose appropriate algorithms to infer potential DTIs in their projects. The source codes for DTI predictions are available at: https://github.com/minghao2016/chemogenomicAlg4DTIpred. Published by Oxford University Press 2018. This work is written by US Government employees and is in the public domain in the US.
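
    A minimal sketch of the mean percentile ranking metric mentioned above; the score matrix and known interaction pairs are simulated.

    ```python
    # Sketch: mean percentile ranking (MPR) -- for each known drug-target
    # pair, where does the true target rank among all targets? 0 is best.
    import numpy as np

    def mean_percentile_rank(scores, interactions):
        ranks = []
        for d, t in interactions:
            order = np.argsort(-scores[d])               # best-scored target first
            r = int(np.where(order == t)[0][0])
            ranks.append(r / (scores.shape[1] - 1))      # normalize to [0, 1]
        return float(np.mean(ranks))

    rng = np.random.default_rng(7)
    S = rng.random((10, 100))                            # hypothetical drug x target scores
    print(mean_percentile_rank(S, [(0, 3), (2, 50), (5, 99)]))
    ```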

  6. Framework for developing hybrid process-driven, artificial neural network and regression models for salinity prediction in river systems

    Science.gov (United States)

    Hunter, Jason M.; Maier, Holger R.; Gibbs, Matthew S.; Foale, Eloise R.; Grosvenor, Naomi A.; Harders, Nathan P.; Kikuchi-Miller, Tahali C.

    2018-05-01

    Salinity modelling in river systems is complicated by a number of processes, including in-stream salt transport and various mechanisms of saline accession that vary dynamically as a function of water level and flow, often at different temporal scales. Traditionally, salinity models in rivers have either been process- or data-driven. The primary problem with process-based models is that in many instances, not all of the underlying processes are fully understood or able to be represented mathematically. There are also often insufficient historical data to support model development. The major limitation of data-driven models, such as artificial neural networks (ANNs) in comparison, is that they provide limited system understanding and are generally not able to be used to inform management decisions targeting specific processes, as different processes are generally modelled implicitly. In order to overcome these limitations, a generic framework for developing hybrid process and data-driven models of salinity in river systems is introduced and applied in this paper. As part of the approach, the most suitable sub-models are developed for each sub-process affecting salinity at the location of interest based on consideration of model purpose, the degree of process understanding and data availability, which are then combined to form the hybrid model. The approach is applied to a 46 km reach of the Murray River in South Australia, which is affected by high levels of salinity. In this reach, the major processes affecting salinity include in-stream salt transport, accession of saline groundwater along the length of the reach and the flushing of three waterbodies in the floodplain during overbank flows of various magnitudes. Based on trade-offs between the degree of process understanding and data availability, a process-driven model is developed for in-stream salt transport, an ANN model is used to model saline groundwater accession and three linear regression models are used

  7. Noise-driven phenomena in hysteretic systems

    CERN Document Server

    Dimian, Mihai

    2014-01-01

    Noise-Driven Phenomena in Hysteretic Systems provides a general approach to nonlinear systems with hysteresis driven by noisy inputs, which leads to a unitary framework for the analysis of various stochastic aspects of hysteresis. This book includes integral, differential and algebraic models that are used to describe scalar and vector hysteretic nonlinearities originating from various areas of science and engineering. The universality of the authors' approach is also reflected by the diversity of the models used to portray the input noise, from the classical Gaussian white noise to its impulsive forms, often encountered in economics and biological systems, and pink noise, ubiquitous in multi-stable electronic systems. The book is accompanied by HysterSoft©, a robust simulation environment designed to perform complex hysteresis modeling, which can be used by the reader to reproduce many of the results presented in the book as well as to research both disruptive and constructive effects of noise in hysteret...

  8. User-driven Cloud Implementation of environmental models and data for all

    Science.gov (United States)

    Gurney, R. J.; Percy, B. J.; Elkhatib, Y.; Blair, G. S.

    2014-12-01

    Environmental data and models come from disparate sources over a variety of geographical and temporal scales with different resolutions and data standards, often including terabytes of data and model simulations. Unfortunately, these data and models tend to remain solely within the custody of the private and public organisations which create the data, and the scientists who build models and generate results. Although many models and datasets are theoretically available to others, the lack of ease of access tends to keep them out of reach of many. We have developed an intuitive web-based tool that utilises environmental models and datasets located in a cloud to produce results that are appropriate to the user. Storyboards showing the interfaces and visualisations have been created for each of several exemplars. A library of virtual machine images has been prepared to serve these exemplars. Each virtual machine image has been tailored to run computer models appropriate to the end user. Two approaches have been used; first as RESTful web services conforming to the Open Geospatial Consortium (OGC) Web Processing Service (WPS) interface standard using the Python-based PyWPS; second, a MySQL database interrogated using PHP code. In all cases, the web client sends the server an HTTP GET request to execute the process with a number of parameter values and, once execution terminates, an XML or JSON response is sent back and parsed at the client side to extract the results. All web services are stateless, i.e. application state is not maintained by the server, reducing its operational overheads and simplifying infrastructure management tasks such as load balancing and failure recovery. A hybrid cloud solution has been used with models and data sited on both private and public clouds. The storyboards have been transformed into intuitive web interfaces at the client side using HTML, CSS and JavaScript, utilising plug-ins such as jQuery and Flot (for graphics), and Google Maps
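
    A minimal sketch of the stateless request/response pattern described above; the endpoint, process identifier and parameters are all hypothetical.

    ```python
    # Sketch: invoke a WPS-style process with an HTTP GET and parse the reply.
    import requests

    resp = requests.get(
        "https://example.org/wps",                  # hypothetical endpoint
        params={
            "service": "WPS",
            "request": "Execute",
            "identifier": "run_hydrology_model",    # hypothetical process name
            "datainputs": "basin=thames;start=2014-01-01",
        },
        timeout=60,
    )
    resp.raise_for_status()
    result = resp.json()   # assumes this server replies with JSON rather than XML
    print(result)
    ```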

  9. Toward Data-Driven Design of Educational Courses: A Feasibility Study

    Science.gov (United States)

    Agrawal, Rakesh; Golshan, Behzad; Papalexakis, Evangelos

    2016-01-01

    A study plan is the choice of concepts and the organization and sequencing of the concepts to be covered in an educational course. While a good study plan is essential for the success of any course offering, the design of study plans currently remains largely a manual task. We present a novel data-driven method, which given a list of concepts can…

  10. A K-means multivariate approach for clustering independent components from magnetoencephalographic data.

    Science.gov (United States)

    Spadone, Sara; de Pasquale, Francesco; Mantini, Dante; Della Penna, Stefania

    2012-09-01

    Independent component analysis (ICA) is typically applied to functional magnetic resonance imaging, electroencephalographic and magnetoencephalographic (MEG) data due to its data-driven nature. In these applications, ICA needs to be extended from single to multi-session and multi-subject studies for interpreting and assigning a statistical significance at the group level. Here a novel strategy for analyzing MEG independent components (ICs) is presented: the Multivariate Algorithm for Grouping MEG Independent Components, K-means based (MAGMICK). The proposed approach is able to capture spatio-temporal dynamics of brain activity in MEG studies by running ICA at the subject level and then clustering the ICs across sessions and subjects. Distinctive features of MAGMICK are: i) the implementation of an efficient set of "MEG fingerprints" designed to summarize properties of MEG ICs as they are built on spatial, temporal and spectral parameters; ii) the implementation of a modified version of the standard K-means procedure to improve its data-driven character. This algorithm groups the obtained ICs automatically, estimating the number of clusters through an adaptive weighting of the parameters and a constraint on the ICs' independence, i.e. components coming from the same session (at subject level) or subject (at group level) cannot be grouped together. The performance of MAGMICK is illustrated by analyzing two sets of MEG data obtained during a finger tapping task and median nerve stimulation. The results demonstrate that the method can extract consistent patterns of spatial topography and spectral properties across sessions and subjects that are in good agreement with the literature. In addition, these results are compared to those from a modified version of the affinity propagation clustering method. The comparison, evaluated in terms of different clustering validity indices, shows that our methodology often outperforms the clustering algorithm. Eventually, these results are

  11. Regulatory approach to enhanced human performance during accidents

    International Nuclear Information System (INIS)

    Palla, R.L. Jr.

    1990-01-01

    It has become increasingly clear in recent years that the risk associated with nuclear power is driven by human performance. Although human errors have contributed heavily to the two core-melt events that have occurred at power reactors, effective performance during an event can also prevent a degraded situation from progressing to a more serious accident, as in the loss-of-feedwater event at Davis-Besse. Sensitivity studies in which human error rates for various categories of errors in a probabilistic risk assessment (PRA) were varied confirm the importance of human performance. Moreover, these studies suggest that actions taken during an accident are at least as important as errors that occur prior to an initiating event. A program that will lead to enhanced accident management capabilities in the nuclear industry is being developed by the US Nuclear Regulatory Commission (NRC) and industry and is a key element in NRC's integration plan for closure of severe-accident issues. The focus of the accident management (AM) program is on human performance during accidents, with emphasis on in-plant response. The AM program extends the defense-in-depth principle to plant operating staff. The goal is to take advantage of existing plant equipment and operator skills and creativity to find ways to terminate accidents that are beyond the design basis. The purpose of this paper is to describe the NRC's objectives and approach in AM as well as to discuss several human performance issues that are central to AM

  12. Model-driven design using IEC 61499 a synchronous approach for embedded and automation systems

    CERN Document Server

    Yoong, Li Hsien; Bhatti, Zeeshan E; Kuo, Matthew M Y

    2015-01-01

    This book describes a novel approach for the design of embedded systems and industrial automation systems, using a unified model-driven approach that is applicable in both domains.  The authors illustrate their methodology, using the IEC 61499 standard as the main vehicle for specification, verification, static timing analysis and automated code synthesis.  The well-known synchronous approach is used as the main vehicle for defining an unambiguous semantics that ensures determinism and deadlock freedom. The proposed approach also ensures very efficient implementations either on small-scale embedded devices or on industry-scale programmable automation controllers (PACs). It can be used for both centralized and distributed implementations. Significantly, the proposed approach can be used without the need for any run-time support. This approach, for the first time, blurs the gap between embedded systems and automation systems and can be applied in wide-ranging applications in automotive, robotics, and industri...

  13. Cognitive Effects of Mindfulness Training: Results of a Pilot Study Based on a Theory Driven Approach

    OpenAIRE

    Wimmer, Lena; Bellingrath, Silja; von Stockhausen, Lisa

    2016-01-01

    The present paper reports a pilot study which tested cognitive effects of mindfulness practice in a theory-driven approach. Thirty-four fifth graders received either a mindfulness training which was based on the mindfulness-based stress reduction approach (experimental group), a concentration training (active control group), or no treatment (passive control group). Based on the operational definition of mindfulness by Bishop et al. (2004), effects on sustained attention, cognitive flexibility...

  14. Extension of a data-driven gating technique to 3D, whole body PET studies

    International Nuclear Information System (INIS)

    Schleyer, Paul J; O'Doherty, Michael J; Marsden, Paul K

    2011-01-01

    Respiratory gating can be used to separate a PET acquisition into a series of near motion-free bins. This is typically done using additional gating hardware; however, software-based methods can derive the respiratory signal from the acquired data itself. The aim of this work was to extend a data-driven respiratory gating method to acquire gated, 3D, whole body PET images of clinical patients. The existing method, previously demonstrated with 2D, single bed-position data, uses a spectral analysis to find regions in raw PET data which are subject to respiratory motion. The change in counts over time within these regions is then used to estimate the respiratory signal of the patient. In this work, the gating method was adapted to only accept lines of response from a reduced set of axial angles, and the respiratory frequency derived from the lung bed position was used to help identify the respiratory frequency in all other bed positions. As the respiratory signal does not identify the direction of motion, a registration-based technique was developed to align the direction for all bed positions. Data from 11 clinical FDG PET patients were acquired, and an optical respiratory monitor was used to provide a hardware-based signal for comparison. All data were gated using both the data-driven and hardware methods, and reconstructed. The centre of mass of manually defined regions on gated images was calculated, and the overall displacement was defined as the change in the centre of mass between the first and last gates. The mean displacement was 10.3 mm for the data-driven gated images and 9.1 mm for the hardware gated images. No significant difference was found between the two gating methods when comparing the displacement values. The adapted data-driven gating method was demonstrated to successfully produce respiratory gated, 3D, whole body, clinical PET acquisitions.
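
    A minimal sketch of the spectral step of such data-driven gating: bin the counts over time for a candidate region, then locate the dominant frequency in a plausible respiratory band. The count trace here is simulated.

    ```python
    # Sketch: estimate a respiratory frequency from a region's count trace.
    import numpy as np

    fs = 2.0                                     # count-trace sampling rate (Hz)
    t = np.arange(0, 300, 1 / fs)
    rng = np.random.default_rng(8)
    counts = 1000 + 50*np.sin(2*np.pi*0.25*t) + rng.normal(0, 20, t.size)

    spec = np.abs(np.fft.rfft(counts - counts.mean()))**2
    freqs = np.fft.rfftfreq(counts.size, d=1/fs)
    band = (freqs > 0.1) & (freqs < 0.5)         # plausible respiratory range
    f_resp = freqs[band][np.argmax(spec[band])]
    print(f"estimated respiratory frequency: {f_resp:.2f} Hz")
    ```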

  15. An architecture for a continuous, user-driven, and data-driven application of clinical guidelines and its evaluation.

    Science.gov (United States)

    Shalom, Erez; Shahar, Yuval; Lunenfeld, Eitan

    2016-02-01

    Design, implement, and evaluate a new architecture for realistic continuous guideline (GL)-based decision support, based on a series of requirements that we have identified, such as support for continuous care, for multiple task types, and for data-driven and user-driven modes. We designed and implemented a new continuous GL-based support architecture, PICARD, which accesses a temporal reasoning engine, and provides several different types of application interfaces. We present the new architecture in detail in the current paper. To evaluate the architecture, we first performed a technical evaluation of the PICARD architecture, using 19 simulated scenarios in the preeclampsia/toxemia domain. We then performed a functional evaluation with the help of two domain experts, by generating patient records that simulate 60 decision points from six clinical guideline-based scenarios, lasting from two days to four weeks. Finally, 36 clinicians made manual decisions in half of the scenarios, and had access to the automated GL-based support in the other half. The measures used in all three experiments were correctness and completeness of the decisions relative to the GL. Mean correctness and completeness in the technical evaluation were 1±0.0 and 0.96±0.03 respectively. The functional evaluation produced only several minor comments from the two experts, mostly regarding the output's style; otherwise the system's recommendations were validated. In the clinically oriented evaluation, the 36 clinicians applied manually approximately 41% of the GL's recommended actions. Completeness increased to approximately 93% when using PICARD. Manual correctness was approximately 94.5%, and remained similar when using PICARD; but while 68% of the manual decisions included correct but redundant actions, only 3% of the actions included in decisions made when using PICARD were redundant. The PICARD architecture is technically feasible and is functionally valid, and addresses the realistic

  16. Applying dynamic data collection to improve dry electrode system performance for a P300-based brain-computer interface

    Science.gov (United States)

    Clements, J. M.; Sellers, E. W.; Ryan, D. B.; Caves, K.; Collins, L. M.; Throckmorton, C. S.

    2016-12-01

    Objective. Dry electrodes have an advantage over gel-based ‘wet’ electrodes by providing quicker set-up time for electroencephalography recording; however, the potentially poorer contact can result in noisier recordings. We examine the impact that this may have on brain-computer interface communication and potential approaches for mitigation. Approach. We present a performance comparison of wet and dry electrodes for use with the P300 speller system in both healthy participants and participants with communication disabilities (ALS and PLS), and investigate the potential for a data-driven dynamic data collection algorithm to compensate for the lower signal-to-noise ratio (SNR) in dry systems. Main results. Performance results from sixteen healthy participants obtained in the standard static data collection environment demonstrate a substantial loss in accuracy with the dry system. Using a dynamic stopping algorithm, performance may have been improved by collecting more data in the dry system for ten healthy participants and eight participants with communication disabilities; however, the algorithm did not fully compensate for the lower SNR of the dry system. An analysis of the wet and dry system recordings revealed that delta and theta frequency band power (0.1-4 Hz and 4-8 Hz, respectively) are consistently higher in dry system recordings across participants, indicating that transient and drift artifacts may be an issue for dry systems. Significance. Using dry electrodes is desirable for reduced set-up time; however, this study demonstrates that online performance is significantly poorer than for wet electrodes for users with and without disabilities. We test a new application of dynamic stopping algorithms to compensate for poorer SNR. Dynamic stopping improved dry system performance; however, further signal processing efforts are likely necessary for full mitigation.
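
    A minimal sketch of the band-power comparison described above, using Welch's method from SciPy; the EEG trace is simulated and the band edges follow the ranges quoted in the abstract.

    ```python
    # Sketch: delta (0.1-4 Hz) and theta (4-8 Hz) band power from a PSD.
    import numpy as np
    from scipy.signal import welch

    fs = 256                                     # sampling rate (Hz)
    rng = np.random.default_rng(9)
    eeg = rng.normal(size=fs * 60)               # one minute of simulated EEG

    f, psd = welch(eeg, fs=fs, nperseg=4 * fs)

    def band_power(f, psd, lo, hi):
        sel = (f >= lo) & (f < hi)
        return np.trapz(psd[sel], f[sel])        # integrate the PSD over the band

    print("delta:", band_power(f, psd, 0.1, 4), "theta:", band_power(f, psd, 4, 8))
    ```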

  17. A Knowledge Model Sharing Based Approach to Privacy-Preserving Data Mining

    OpenAIRE

    Hongwei Tian; Weining Zhang; Shouhuai Xu; Patrick Sharkey

    2012-01-01

    Privacy-preserving data mining (PPDM) is an important problem and is currently studied in three approaches: the cryptographic approach, the data publishing, and the model publishing. However, each of these approaches has some problems. The cryptographic approach does not protect privacy of learned knowledge models and may have performance and scalability issues. The data publishing, although is popular, may suffer from too much utility loss for certain types of data mining applications. The m...

  18. Ability Grouping and Differentiated Instruction in an Era of Data-Driven Decision Making

    Science.gov (United States)

    Park, Vicki; Datnow, Amanda

    2017-01-01

    Despite data-driven decision making being a ubiquitous part of policy and school reform efforts, little is known about how teachers use data for instructional decision making. Drawing on data from a qualitative case study of four elementary schools, we examine the logic and patterns of teacher decision making about differentiation and ability…

  19. A Survey on Economic-driven Evaluations of Information Technology

    NARCIS (Netherlands)

    Mutschler, B.B.; Zarvic, N.; Reichert, M.U.

    2007-01-01

    The economic-driven evaluation of information technology (IT) has become an important instrument in the management of IT projects. Numerous approaches have been developed to quantify the costs of an IT investment and its assumed profit, to evaluate its impact on business process performance, and to

  20. Are Improvements in Measured Performance Driven by Better Treatment or "Denominator Management"?

    Science.gov (United States)

    Harris, Alex H S; Chen, Cheng; Rubinsky, Anna D; Hoggatt, Katherine J; Neuman, Matthew; Vanneman, Megan E

    2016-04-01

    Process measures of healthcare quality are usually formulated as the number of patients who receive evidence-based treatment (numerator) divided by the number of patients in the target population (denominator). When the systems being evaluated can influence which patients are included in the denominator, it is reasonable to wonder if improvements in measured quality are driven by expanding numerators or contracting denominators. In 2003, the US Department of Veterans Affairs (VA) based executive compensation in part on performance on a substance use disorder (SUD) continuity-of-care quality measure. The first goal of this study was to evaluate if implementing the measure in this way resulted in expected improvements in measured performance. The second goal was to examine if the proportion of patients with SUD who qualified for the denominator contracted after the quality measure was implemented, and to describe the facility-level variation in and correlates of denominator contraction or expansion. Using 40 quarters of data straddling the implementation of the performance measure, an interrupted time series design was used to evaluate changes in two outcomes. All veterans with an SUD diagnosis in all VA facilities from fiscal year 2000 to 2009 were included. The two outcomes were 1) measured performance (patients retained/patients qualified) and 2) denominator prevalence (patients qualified/patients with SUD program contact). Measured performance improved over time (P ...). ... denominator management, and also the exploration of "shadow measures" to monitor and reduce undesirable denominator management.