WorldWideScience

Sample records for ensemble prediction system

  1. Reliability of windstorm predictions in the ECMWF ensemble prediction system

    Science.gov (United States)

    Becker, Nico; Ulbrich, Uwe

    2016-04-01

    Windstorms caused by extratropical cyclones are among the most dangerous natural hazards in the European region. Therefore, reliable predictions of such storm events are needed. Case studies have shown that ensemble prediction systems (EPS) are able to provide useful information about windstorms between two and five days prior to the event. In this work, ensemble predictions with the European Centre for Medium-Range Weather Forecasts (ECMWF) EPS are evaluated over a four-year period. Within the 50 ensemble members, which are initialized every 12 hours and are run for 10 days, windstorms are identified and tracked in time and space. By using a clustering approach, different predictions of the same storm are identified in the different ensemble members and compared to reanalysis data. The occurrence probability of the predicted storms is estimated by fitting a bivariate normal distribution to the storm track positions. Our results show, for example, that predicted storm clusters with occurrence probabilities of more than 50% have a matching observed storm in 80% of all cases at a lead time of two days. The predicted occurrence probabilities are reliable up to a lead time of 3 days. At longer lead times the occurrence probabilities are overestimated by the EPS.
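
    The fitting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the storm-centre positions of one cluster are available as (longitude, latitude) pairs; the sample coordinates and the function names are hypothetical, and the ellipse check is only one possible way to relate a point to the fitted distribution, not the occurrence-probability estimate used by the authors.

```python
import numpy as np

def fit_bivariate_normal(positions):
    """Return the sample mean (2,) and covariance (2, 2) of (lon, lat) points."""
    pts = np.asarray(positions, dtype=float)
    mean = pts.mean(axis=0)
    cov = np.cov(pts, rowvar=False)  # unbiased estimate (ddof=1)
    return mean, cov

def ellipse_probability(point, mean, cov):
    """Probability mass of the fitted Gaussian inside the ellipse through `point`.

    In two dimensions the squared Mahalanobis distance d2 follows a chi-square
    law with 2 degrees of freedom, whose CDF is 1 - exp(-d2 / 2).
    """
    diff = np.asarray(point, dtype=float) - mean
    d2 = float(diff @ np.linalg.solve(cov, diff))
    return 1.0 - np.exp(-d2 / 2.0)

# Hypothetical storm-centre positions (degrees east, degrees north),
# one per ensemble member in the same cluster.
member_positions = [(4.0, 54.0), (5.0, 55.0), (6.0, 54.5), (5.5, 56.0), (4.5, 55.5)]
mean, cov = fit_bivariate_normal(member_positions)
far_point = ellipse_probability((20.0, 70.0), mean, cov)
```

    A point close to the fitted mean yields a value near 0, while a point far outside the cluster yields a value near 1.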

  2. Developing an Ensemble Prediction System based on COSMO-DE

    Science.gov (United States)

    Theis, S.; Gebhardt, C.; Buchhold, M.; Ben Bouallègue, Z.; Ohl, R.; Paulat, M.; Peralta, C.

    2010-09-01

    The numerical weather prediction model COSMO-DE is a configuration of the COSMO model with a horizontal grid size of 2.8 km. It has been running operationally at DWD since 2007; it covers the area of Germany and produces forecasts with a lead time of 0-21 hours. The model COSMO-DE is convection-permitting, which means that it does without a parametrisation of deep convection and simulates deep convection explicitly. One aim is an improved forecast of convective heavy rain events. Convection-permitting models are in operational use at several weather services, but currently not in ensemble mode. It is expected that an ensemble system could reveal the advantages of a convection-permitting model even better. The probabilistic approach is necessary because the explicit simulation of convective processes for more than a few hours can no longer be viewed as a deterministic forecast. This is due to the chaotic behaviour and short life cycle of the processes which are now simulated explicitly. In the framework of the project COSMO-DE-EPS, DWD is developing and implementing an ensemble prediction system (EPS) for the model COSMO-DE. The project COSMO-DE-EPS comprises the generation of ensemble members, as well as the verification and visualization of the ensemble forecasts and also statistical postprocessing. A pre-operational mode of the EPS with 20 ensemble members is foreseen to start in 2010. Operational use is envisaged to start in 2012, after an upgrade to 40 members and inclusion of statistical postprocessing. The presentation introduces the project COSMO-DE-EPS and describes the design of the ensemble as it is planned for the pre-operational mode. In particular, the currently implemented method for the generation of ensemble members will be explained and discussed. The method includes variations of initial conditions, lateral boundary conditions, and model physics. At present, pragmatic methods are applied which resemble the basic ideas of a multi-model approach

  3. A short-range ensemble prediction system for southern Africa

    CSIR Research Space (South Africa)

    Park, R

    2012-10-01

    R. Park, W. A. Landman and F. Engelbrecht (CSIR, PO Box 395, Pretoria, South Africa, 0001). Introduction: This research has been conducted in order to develop a short-range ensemble...

  4. An evaluation of the Canadian global meteorological ensemble prediction system for short-term hydrological forecasting

    Directory of Open Access Journals (Sweden)

    F. Anctil

    2009-11-01

    Hydrological forecasting consists in the assessment of future streamflow. Current deterministic forecasts do not give any information concerning the uncertainty, which might be limiting in a decision-making process. Ensemble forecasts are expected to fill this gap.

    In July 2007, the Meteorological Service of Canada improved its ensemble prediction system, which has been operational since 1998. It uses the GEM model to generate a 20-member ensemble on a 100 km grid at mid-latitudes. This improved system is used for the first time for hydrological ensemble predictions. Five watersheds in Quebec (Canada) are studied: Chaudière, Châteauguay, Du Nord, Kénogami and Du Lièvre. An interesting 17-day rainfall event in October 2007 has been selected. Forecasts are produced at a 3 h time step for a 3-day forecast horizon. The deterministic forecast is also available and is compared with the ensemble ones. In order to correct the bias of the ensemble, an updating procedure has been applied to the output data. Results showed that ensemble forecasts are more skilful than the deterministic ones, as measured by the Continuous Ranked Probability Score (CRPS), especially for 72 h forecasts. However, the hydrological ensemble forecasts are underdispersed, a situation that improves as the prediction horizon lengthens. We conjecture that this is due in part to the fact that uncertainty in the initial conditions of the hydrological model is not taken into account.
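
    The CRPS used in this abstract can be estimated directly from ensemble members. The sketch below uses the standard sample form, the mean absolute error of the members minus half the mean absolute difference between member pairs; the function name and the example values are assumptions, and the abstract does not state which estimator the authors used.

```python
import numpy as np

def crps_ensemble(members, observation):
    """Sample CRPS of one ensemble forecast against one scalar observation."""
    x = np.asarray(members, dtype=float)
    # Mean distance between the members and the observation.
    term1 = np.mean(np.abs(x - observation))
    # Half the mean distance between all member pairs (rewards sharpness).
    term2 = 0.5 * np.mean(np.abs(x[:, None] - x[None, :]))
    return float(term1 - term2)

# Hypothetical streamflow values in m^3/s.
deterministic_score = crps_ensemble([3.0], 5.0)   # reduces to the absolute error
ensemble_score = crps_ensemble([1.0, 3.0], 2.0)
```

    With a single member the score reduces to the absolute error, which is what allows deterministic and ensemble forecasts to be compared on the same scale.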

  5. A MITgcm/DART ensemble analysis and prediction system with application to the Gulf of Mexico

    KAUST Repository

    Hoteit, Ibrahim

    2013-09-01

    This paper describes the development of an advanced ensemble Kalman filter (EnKF)-based ocean data assimilation system for prediction of the evolution of the loop current in the Gulf of Mexico (GoM). The system integrates the Data Assimilation Research Testbed (DART) assimilation package with the Massachusetts Institute of Technology ocean general circulation model (MITgcm). The MITgcm/DART system supports the assimilation of a wide range of ocean observations and uses an ensemble approach to solve the nonlinear assimilation problems. The GoM prediction system was implemented with an eddy-resolving 1/10th degree configuration of the MITgcm. Assimilation experiments were performed over a 6-month period between May and October during a strong loop current event in 1999. The model was sequentially constrained with weekly satellite sea surface temperature and altimetry data. Experimental results suggest that the ensemble-based assimilation system shows high predictive skill in the GoM, with estimated ensemble spread mainly concentrated around the front of the loop current. Further analysis of the system estimates demonstrates that the ensemble assimilation accurately reproduces the observed features without imposing any negative impact on the dynamical balance of the system. Results from sensitivity experiments with respect to the ensemble filter parameters are also presented and discussed. © 2013 Elsevier B.V.
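
    For readers unfamiliar with the EnKF, a minimal sketch of one analysis step follows. It uses the perturbed-observation form with a linear observation operator and a diagonal observation error covariance; all names, shapes and numbers are hypothetical, and the sketch is not claimed to match the filter variant or configuration used in the MITgcm/DART system.

```python
import numpy as np

def enkf_update(ensemble, H, y, obs_var, rng):
    """One perturbed-observation EnKF analysis step.

    ensemble: (n_state, n_members) prior states
    H:        (n_obs, n_state) linear observation operator
    y:        (n_obs,) observations
    obs_var:  scalar observation error variance
    """
    n_obs, n_mem = H.shape[0], ensemble.shape[1]
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    HX = H @ ensemble
    HA = HX - HX.mean(axis=1, keepdims=True)
    # Sample covariances estimated from the ensemble.
    P_yy = HA @ HA.T / (n_mem - 1) + obs_var * np.eye(n_obs)
    P_xy = anomalies @ HA.T / (n_mem - 1)
    K = P_xy @ np.linalg.inv(P_yy)  # Kalman gain
    # Each member assimilates its own noisy copy of the observations.
    y_pert = y[:, None] + rng.normal(0.0, np.sqrt(obs_var), size=(n_obs, n_mem))
    return ensemble + K @ (y_pert - HX)

rng = np.random.default_rng(0)
prior = rng.normal(0.0, 1.0, size=(2, 200))   # two state variables, 200 members
H = np.array([[1.0, 0.0]])                    # only the first variable is observed
posterior = enkf_update(prior, H, np.array([2.0]), obs_var=0.01, rng=rng)
```

    In this toy setup the observed variable is pulled towards the observation and its ensemble spread shrinks, which is the behaviour the assimilation relies on.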

  6. The Development of Storm Surge Ensemble Prediction System and Case Study of Typhoon Meranti in 2016

    Science.gov (United States)

    Tsai, Y. L.; Wu, T. R.; Terng, C. T.; Chu, C. H.

    2017-12-01

    Taiwan, which is located in a zone of potentially severe storm generation, is under the threat of storm surge and associated inundation. The use of ensemble prediction can help forecasters understand the characteristics of storm surge under uncertainty in track and intensity. In addition, it can support deterministic forecasting. In this study, the kernel of the ensemble prediction system is based on COMCOT-SURGE (COrnell Multi-grid COupled Tsunami Model - Storm Surge). COMCOT-SURGE solves nonlinear shallow water equations in open-ocean and coastal regions with a nested-grid scheme and adopts a wet-dry-cell treatment to calculate the potential inundation area. In order to consider tide-surge interaction, the global TPXO 7.1 tide model provides the tidal boundary conditions. After a series of validations and case studies, COMCOT-SURGE has become an official operational system of the Central Weather Bureau (CWB) in Taiwan. In this study, the strongest typhoon in 2016, Typhoon Meranti, is chosen as a case study. We adopt twenty ensemble members from the CWB WRF Ensemble Prediction System (CWB WEPS), which differ in their microphysics, boundary layer, cumulus, and surface parameters. In the box-and-whisker results, the maximum observed storm surges lay between the first and third quartiles at more than 70% of gauge locations, e.g. Toucheng, Chengkung, and Jiangjyun. In conclusion, ensemble prediction can effectively help forecasters to predict storm surge, especially under uncertainty in storm track and intensity.
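
    The box-and-whisker check described in this abstract amounts to counting the gauges at which the observed peak surge falls between the first and third quartiles of the ensemble. A minimal sketch, assuming the peak surges are stored as a members-by-gauges array (the function name and the numbers are hypothetical):

```python
import numpy as np

def fraction_within_iqr(ensemble_peaks, observed_peaks):
    """Share of gauges whose observed peak lies inside the ensemble interquartile range.

    ensemble_peaks: (n_members, n_gauges) predicted peak surge per member
    observed_peaks: (n_gauges,) observed peak surge
    """
    ens = np.asarray(ensemble_peaks, dtype=float)
    obs = np.asarray(observed_peaks, dtype=float)
    q1, q3 = np.percentile(ens, [25, 75], axis=0)
    return float(np.mean((obs >= q1) & (obs <= q3)))

# Hypothetical peak surges (metres) for five members at two gauges.
peaks = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
coverage = fraction_within_iqr(peaks, [2.0, 10.0])
```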

  7. A polynomial chaos ensemble hydrologic prediction system for efficient parameter inference and robust uncertainty assessment

    Science.gov (United States)

    Wang, S.; Huang, G. H.; Baetz, B. W.; Huang, W.

    2015-11-01

    This paper presents a polynomial chaos ensemble hydrologic prediction system (PCEHPS) for an efficient and robust uncertainty assessment of model parameters and predictions, in which possibilistic reasoning is infused into probabilistic parameter inference with simultaneous consideration of randomness and fuzziness. The PCEHPS is developed through a two-stage factorial polynomial chaos expansion (PCE) framework, which consists of an ensemble of PCEs to approximate the behavior of the hydrologic model, significantly speeding up the exhaustive sampling of the parameter space. Multiple hypothesis testing is then conducted to construct an ensemble of reduced-dimensionality PCEs with only the most influential terms, which is meaningful for achieving uncertainty reduction and further acceleration of parameter inference. The PCEHPS is applied to the Xiangxi River watershed in China to demonstrate its validity and applicability. A detailed comparison between the HYMOD hydrologic model, the ensemble of PCEs, and the ensemble of reduced PCEs is performed in terms of accuracy and efficiency. Results reveal temporal and spatial variations in parameter sensitivities due to the dynamic behavior of hydrologic systems, and the effects (magnitude and direction) of parametric interactions depending on different hydrological metrics. The case study demonstrates that the PCEHPS is capable not only of capturing both expert knowledge and probabilistic information in the calibration process, but also of running more than 10 times faster than the hydrologic model without compromising the predictive accuracy.

  8. A short-range multi-model ensemble weather prediction system for South Africa

    CSIR Research Space (South Africa)

    Landman, S

    2010-09-01

    ... prediction system (EPS) at the South African Weather Service (SAWS) are examined. The ensemble consists of different forecasts from the 12-km LAM of the UK Met Office Unified Model (UM) and the Conformal-Cubic Atmospheric Model (CCAM) covering the South...

  9. Relative effects of statistical preprocessing and postprocessing on a regional hydrological ensemble prediction system

    Science.gov (United States)

    Sharma, Sanjib; Siddique, Ridwan; Reed, Seann; Ahnert, Peter; Mendoza, Pablo; Mejia, Alfonso

    2018-03-01

    The relative roles of statistical weather preprocessing and streamflow postprocessing in hydrological ensemble forecasting at short- to medium-range forecast lead times (days 1-7) are investigated. For this purpose, a regional hydrologic ensemble prediction system (RHEPS) is developed and implemented. The RHEPS comprises the following components: (i) hydrometeorological observations (multisensor precipitation estimates, gridded surface temperature, and gauged streamflow); (ii) weather ensemble forecasts (precipitation and near-surface temperature) from the National Centers for Environmental Prediction 11-member Global Ensemble Forecast System Reforecast version 2 (GEFSRv2); (iii) NOAA's Hydrology Laboratory-Research Distributed Hydrologic Model (HL-RDHM); (iv) heteroscedastic censored logistic regression (HCLR) as the statistical preprocessor; (v) two statistical postprocessors, an autoregressive model with a single exogenous variable (ARX(1,1)) and quantile regression (QR); and (vi) a comprehensive verification strategy. To implement the RHEPS, 1- to 7-day weather forecasts from the GEFSRv2 are used to force HL-RDHM and generate raw ensemble streamflow forecasts. Forecasting experiments are conducted in four nested basins in the US Middle Atlantic region, ranging in size from 381 to 12 362 km². Results show that the HCLR preprocessed ensemble precipitation forecasts have greater skill than the raw forecasts. These improvements are more noticeable in the warm season at the longer lead times (> 3 days). Both postprocessors, ARX(1,1) and QR, show gains in skill relative to the raw ensemble streamflow forecasts, particularly in the cool season, but QR outperforms ARX(1,1). The scenarios that implement preprocessing and postprocessing separately tend to perform similarly, although the postprocessing-alone scenario is often more effective. The scenario involving both preprocessing and postprocessing consistently outperforms the other scenarios. In some cases

  10. Towards an Australian ensemble streamflow forecasting system for flood prediction and water management

    Science.gov (United States)

    Bennett, J.; David, R. E.; Wang, Q.; Li, M.; Shrestha, D. L.

    2016-12-01

    Flood forecasting in Australia has historically relied on deterministic forecasting models run only when floods are imminent, with considerable forecaster input and interpretation. These now co-exist with a continually available 7-day streamflow forecasting service (also deterministic) aimed at operational water management applications such as environmental flow releases. The 7-day service is not optimised for flood prediction. We describe progress on developing a system for ensemble streamflow forecasting that is suitable for both flood prediction and water management applications. Precipitation uncertainty is handled through post-processing of Numerical Weather Prediction (NWP) output with a Bayesian rainfall post-processor (RPP). The RPP corrects biases, downscales NWP output, and produces reliable ensemble spread. Ensemble precipitation forecasts are used to force a semi-distributed conceptual rainfall-runoff model. Uncertainty in precipitation forecasts is insufficient to reliably describe streamflow forecast uncertainty, particularly at shorter lead-times. We characterise hydrological prediction uncertainty separately with a 4-stage error model. The error model relies on data transformation to ensure residuals are homoscedastic and symmetrically distributed. To ensure streamflow forecasts are accurate and reliable, the residuals are modelled using a mixture-Gaussian distribution with distinct parameters for the rising and falling limbs of the forecast hydrograph. In a case study of the Murray River in south-eastern Australia, we show ensemble predictions of floods generally have lower errors than deterministic forecasting methods. We also discuss some of the challenges in operationalising short-term ensemble streamflow forecasts in Australia, including meeting the needs for accurate predictions across all flow ranges and comparing forecasts generated by event and continuous hydrological models.

  11. NCAR's Experimental Real-time Convection-allowing Ensemble Prediction System

    Science.gov (United States)

    Schwartz, C. S.; Romine, G. S.; Sobash, R.; Fossell, K.

    2016-12-01

    Since April 2015, the National Center for Atmospheric Research's (NCAR's) Mesoscale and Microscale Meteorology (MMM) Laboratory, in collaboration with NCAR's Computational Information Systems Laboratory (CISL), has been producing daily, real-time, 10-member, 48-hr ensemble forecasts with 3-km horizontal grid spacing over the conterminous United States (http://ensemble.ucar.edu). These computationally-intensive, next-generation forecasts are produced on the Yellowstone supercomputer, have been embraced by both amateur and professional weather forecasters, are widely used by NCAR and university researchers, and receive considerable attention on social media. Initial conditions are supplied by NCAR's Data Assimilation Research Testbed (DART) software and the forecast model is NCAR's Weather Research and Forecasting (WRF) model; both WRF and DART are community tools. This presentation will focus on cutting-edge research results leveraging the ensemble dataset, including winter weather predictability, severe weather forecasting, and power outage modeling. Additionally, the unique design of the real-time analysis and forecast system and computational challenges and solutions will be described.

  12. Evaluation of the NMC regional ensemble prediction system during the Beijing 2008 Olympic Games

    Science.gov (United States)

    Li, Xiaoli; Tian, Hua; Deng, Guo

    2011-10-01

    Based on the B08RDP (Beijing 2008 Olympic Games Mesoscale Ensemble Prediction Research and Development Project) that was launched by the World Weather Research Programme (WWRP) in 2004, a regional ensemble prediction system (REPS) at a 15-km horizontal resolution was developed at the National Meteorological Center (NMC) of the China Meteorological Administration (CMA). Supplementing the forecasters' subjective confirmation of the promising performance of the REPS during the 2008 Beijing Olympic Games (BOG), this paper focuses on the objective verification of the REPS for precipitation forecasts during the BOG period. Using a set of advanced probabilistic verification scores, the value of the REPS compared to the quasi-operational global ensemble prediction system (GEPS) is assessed for a 36-day period (21 July-24 August 2008). The evaluation here involves different aspects of the REPS and GEPS, including their general forecast skills, specific attributes (reliability and resolution), and related economic values. The results indicate that the REPS generally performs significantly better for the short-range precipitation forecasts than the GEPS, and for light to heavy rainfall events, the REPS provides more skillful forecasts for accumulated 6- and 24-h precipitation. By further examining the performance of the REPS through the attribute-focused measures, it is found that the advantages of the REPS over the GEPS come from better reliability (smaller biases and better dispersion) and increased resolution. Also, evaluation of a decision-making score reveals that a much larger group of users benefits from using the REPS forecasts than from using the single model (the control run) forecasts, especially for the heavy rainfall events.
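
    The attributes named in this abstract, reliability and resolution, are commonly obtained from the decomposition of the Brier score into reliability, resolution and uncertainty terms. The sketch below groups cases by distinct forecast probability, for which the identity Brier = reliability - resolution + uncertainty holds exactly; the function name and the sample data are hypothetical, and the abstract does not specify which scores the authors computed.

```python
import numpy as np

def brier_decomposition(prob, outcome):
    """Return (brier, reliability, resolution, uncertainty) for binary events.

    prob:    forecast probabilities, e.g. the fraction of members above a threshold
    outcome: 1 if the event occurred, 0 otherwise
    """
    p = np.asarray(prob, dtype=float)
    o = np.asarray(outcome, dtype=float)
    base_rate = o.mean()
    reliability = 0.0
    resolution = 0.0
    for value in np.unique(p):
        mask = p == value
        obs_freq = o[mask].mean()       # observed frequency for this forecast value
        weight = mask.sum() / p.size
        reliability += weight * (value - obs_freq) ** 2
        resolution += weight * (obs_freq - base_rate) ** 2
    uncertainty = base_rate * (1.0 - base_rate)
    brier = float(np.mean((p - o) ** 2))
    return brier, float(reliability), float(resolution), float(uncertainty)

# Hypothetical rainfall-exceedance forecasts and outcomes.
bs, rel, res, unc = brier_decomposition([0.0, 0.0, 0.5, 0.5, 1.0, 1.0],
                                        [0, 0, 0, 1, 1, 1])
```

    A smaller reliability term and a larger resolution term both lower the Brier score, which is the sense in which an ensemble can gain from "better reliability and increased resolution".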

  13. The state of the art of flood forecasting - Hydrological Ensemble Prediction Systems

    Science.gov (United States)

    Thielen-Del Pozo, J.; Pappenberger, F.; Salamon, P.; Bogner, K.; Burek, P.; de Roo, A.

    2010-09-01

    Flood forecasting systems form a key part of 'preparedness' strategies for disastrous floods and provide hydrological services, civil protection authorities and the public with information on upcoming events. Provided the warning lead time is sufficiently long, adequate preparatory actions can be taken to efficiently reduce the impacts of the flooding. Because of the specific characteristics of each catchment, varying data availability and end-user demands, the design of the best flood forecasting system may differ from catchment to catchment. However, despite the differences in concept and data needs, there is one underlying issue that spans all systems. There has been a growing awareness and acceptance that uncertainty is a fundamental issue of flood forecasting and needs to be dealt with at the different spatial and temporal scales as well as the different stages of the flood generating processes. Today, operational flood forecasting centres are increasingly changing from single deterministic forecasts to probabilistic forecasts with various representations of the different contributions of uncertainty. The move towards these so-called Hydrological Ensemble Prediction Systems (HEPS) in flood forecasting represents the state of the art in forecasting science, following on the success of the use of ensembles for weather forecasting (Buizza et al., 2005) and paralleling the move towards ensemble forecasting in other related disciplines such as climate change predictions. The use of HEPS has been internationally fostered by initiatives such as "The Hydrologic Ensemble Prediction Experiment" (HEPEX), created with the aim to investigate how best to produce, communicate and use hydrologic ensemble forecasts in hydrological short-, medium- and long-term prediction of hydrological processes. The advantages of quantifying the different contributions of uncertainty as well as the overall uncertainty to obtain reliable and useful flood forecasts also for extreme events

  14. Prediction and Monitoring of Monsoon Intraseasonal Oscillations over Indian Monsoon Region in an Ensemble Prediction System using CFSv2

    Science.gov (United States)

    Borah, Nabanita; Sukumarpillai, Abhilash; Sahai, Atul Kumar; Chattopadhyay, Rajib; Joseph, Susmitha; De, Soumyendu; Nath Goswami, Bhupendra; Kumar, Arun

    2014-05-01

    An ensemble prediction system (EPS) is devised for the extended range prediction (ERP) of monsoon intraseasonal oscillations (MISO) of the Indian summer monsoon (ISM) using the NCEP Climate Forecast System model version 2 at T126 horizontal resolution. The EPS is formulated by producing 11-member ensembles through the perturbation of atmospheric initial conditions. The hindcast experiments were conducted at 5-day intervals for a 45-day lead time, starting from 16th May to 28th September during 2001-2012. The general simulation of ISM characteristics and the ERP skill of the proposed EPS at pentad mean scale are evaluated in the present study. Though the EPS underestimates both the mean and variability of ISM rainfall, it simulates the northward propagation of MISO reasonably well. It is found that the signal-to-noise ratio becomes unity by about 18 days and the predictability error saturates by about 25 days. Though useful deterministic forecasts could be generated up to 2nd pentad lead, significant correlations are observed even up to 4th pentad lead. The skill in predicting large-scale MISO, which is assessed by comparing the predicted and observed MISO indices, is found to be ~17 days. It is noted that the prediction skill of actual rainfall is closely related to the prediction of the amplitude of large-scale MISO as well as the initial conditions related to the different phases of MISO. Categorical prediction skill reveals that break phases are predicted most skillfully, followed by active and then normal phases. The categorical probability skill scores suggest that useful probabilistic forecasts could be generated even up to 4th pentad lead.

  15. A MITgcm/DART ensemble analysis and prediction system with application to the Gulf of Mexico

    KAUST Repository

    Hoteit, Ibrahim; Hoar, Timothy J.; Gopalakrishnan, Ganesh; Collins, Nancy S.; Anderson, Jeffrey L.; Cornuelle, Bruce D.; Köhl, Armin; Heimbach, Patrick

    2013-01-01

    Research Testbed (DART) assimilation package with the Massachusetts Institute of Technology ocean general circulation model (MITgcm). The MITgcm/DART system supports the assimilation of a wide range of ocean observations and uses an ensemble approach

  16. Prediction of Protein Hotspots from Whole Protein Sequences by a Random Projection Ensemble System

    Directory of Open Access Journals (Sweden)

    Jinjian Jiang

    2017-07-01

    Hotspot residues are important in the determination of protein-protein interactions, and they always perform specific functions in biological processes. Hotspot residues are commonly determined by alanine scanning mutagenesis experiments, which are costly and time consuming. To address this issue, computational methods have been developed. Most of them are structure based, i.e., they use the information of solved protein structures. However, the number of solved protein structures is far smaller than the number of sequences. Moreover, almost all of the predictors identify hotspots from the interfaces of protein complexes, and seldom from whole protein sequences. Therefore, determining hotspots from whole protein sequences using sequence information alone is an urgent need. To address the issue of hotspot prediction from whole protein sequences, we propose an ensemble system with random projections using statistical physicochemical properties of amino acids. First, an encoding scheme involving sequence profiles of residues and physicochemical properties from the AAindex1 dataset is developed. Then, the random projection technique is adopted to project the encoded instances into a reduced space. Next, several better random projections are obtained by training an IBk classifier on the training dataset, and these are applied to the test dataset. The ensemble of random projection classifiers is thus obtained. Experimental results show that although the performance of our method is not yet good enough for real applications of hotspots, it is very promising in the determination of hotspot residues from whole sequences.
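
    The general idea of a random projection ensemble can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: a plain nearest-neighbour vote stands in for the IBk classifier (Weka's k-nearest-neighbour learner), the step that selects the better projections is omitted, and the feature vectors are synthetic rather than AAindex1 encodings. All function names are hypothetical.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k=1):
    """Binary k-nearest-neighbour prediction with Euclidean distance."""
    dist = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return (train_y[nearest].mean(axis=1) >= 0.5).astype(int)

def random_projection_ensemble(train_x, train_y, test_x,
                               n_projections=11, dim=3, seed=0):
    """Majority vote over classifiers trained in different random subspaces."""
    rng = np.random.default_rng(seed)
    votes = []
    for _ in range(n_projections):
        # Gaussian random matrix mapping the features to `dim` dimensions.
        R = rng.normal(size=(train_x.shape[1], dim)) / np.sqrt(dim)
        votes.append(knn_predict(train_x @ R, train_y, test_x @ R))
    return (np.mean(votes, axis=0) >= 0.5).astype(int)

# Synthetic, well-separated two-class data in 8 dimensions.
data_rng = np.random.default_rng(1)
train_x = np.vstack([data_rng.normal(0.0, 1.0, (20, 8)),
                     data_rng.normal(100.0, 1.0, (20, 8))])
train_y = np.array([0] * 20 + [1] * 20)
test_x = np.vstack([np.zeros((1, 8)), np.full((1, 8), 100.0)])
pred = random_projection_ensemble(train_x, train_y, test_x)
```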

  17. Ensemble method for dengue prediction.

    Science.gov (United States)

    Buczak, Anna L; Baugher, Benjamin; Moniz, Linda J; Bagley, Thomas; Babin, Steven M; Guven, Erhan

    2018-01-01

    In the 2015 NOAA Dengue Challenge, participants made three dengue target predictions for two locations (Iquitos, Peru, and San Juan, Puerto Rico) during four dengue seasons: 1) peak height (i.e., maximum weekly number of cases during a transmission season); 2) peak week (i.e., week in which the maximum weekly number of cases occurred); and 3) total number of cases reported during a transmission season. A dengue transmission season is the 12-month period commencing with the location-specific, historical week with the lowest number of cases. At the beginning of the Dengue Challenge, participants were provided with the same input data for developing the models, with the prediction testing data provided at a later date. Our approach used ensemble models created by combining three disparate types of component models: 1) two-dimensional Method of Analogues models incorporating both dengue and climate data; 2) additive seasonal Holt-Winters models with and without wavelet smoothing; and 3) simple historical models. Of the individual component models created, those with the best performance on the prior four years of data were incorporated into the ensemble models. There were separate ensembles for predicting each of the three targets at each of the two locations. Our ensemble models scored higher for peak height and total dengue case counts reported in a transmission season for Iquitos than all other models submitted to the Dengue Challenge. However, the ensemble models did not do nearly as well when predicting the peak week. The Dengue Challenge organizers scored the dengue predictions of the Challenge participant groups. Our ensemble approach was the best in predicting the total number of dengue cases reported for transmission season and peak height for Iquitos, Peru.
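
    One of the component model types named in this abstract, additive seasonal Holt-Winters, can be written in a few lines. The sketch below is a textbook form with a simple initialisation from the first season and a zero initial trend; the smoothing constants, the initialisation and the toy series are assumptions, and the wavelet-smoothing variant mentioned by the authors is not shown.

```python
def holt_winters_additive(series, period, alpha, beta, gamma, horizon):
    """Forecast `horizon` steps ahead with additive trend and seasonality."""
    x = [float(v) for v in series]
    # Initialise from the first season: level is its mean, trend is zero.
    level = sum(x[:period]) / period
    trend = 0.0
    seasonal = [v - level for v in x[:period]]
    for t in range(period, len(x)):
        s_old = seasonal[t % period]
        new_level = alpha * (x[t] - s_old) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (x[t] - new_level) + (1 - gamma) * s_old
        level = new_level
    n = len(x)
    return [level + h * trend + seasonal[(n + h - 1) % period]
            for h in range(1, horizon + 1)]

# Toy weekly case counts repeating with a period of four.
history = [1, 2, 3, 4] * 3
forecast = holt_winters_additive(history, period=4, alpha=0.3, beta=0.1,
                                 gamma=0.2, horizon=4)
```

    On a perfectly periodic series such as this one the forecast reproduces the seasonal pattern, which makes it a convenient sanity check before applying the model to real counts.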

  18. Ensemble method for dengue prediction.

    Directory of Open Access Journals (Sweden)

    Anna L Buczak

    In the 2015 NOAA Dengue Challenge, participants made three dengue target predictions for two locations (Iquitos, Peru, and San Juan, Puerto Rico) during four dengue seasons: 1) peak height (i.e., maximum weekly number of cases during a transmission season); 2) peak week (i.e., week in which the maximum weekly number of cases occurred); and 3) total number of cases reported during a transmission season. A dengue transmission season is the 12-month period commencing with the location-specific, historical week with the lowest number of cases. At the beginning of the Dengue Challenge, participants were provided with the same input data for developing the models, with the prediction testing data provided at a later date. Our approach used ensemble models created by combining three disparate types of component models: 1) two-dimensional Method of Analogues models incorporating both dengue and climate data; 2) additive seasonal Holt-Winters models with and without wavelet smoothing; and 3) simple historical models. Of the individual component models created, those with the best performance on the prior four years of data were incorporated into the ensemble models. There were separate ensembles for predicting each of the three targets at each of the two locations. Our ensemble models scored higher for peak height and total dengue case counts reported in a transmission season for Iquitos than all other models submitted to the Dengue Challenge. However, the ensemble models did not do nearly as well when predicting the peak week. The Dengue Challenge organizers scored the dengue predictions of the Challenge participant groups. Our ensemble approach was the best in predicting the total number of dengue cases reported for transmission season and peak height for Iquitos, Peru.

  19. Verification of an ensemble prediction system for storm surge forecast in the Adriatic Sea

    Science.gov (United States)

    Mel, Riccardo; Lionello, Piero

    2014-12-01

    In the Adriatic Sea, storm surges present a significant threat to Venice and to the flat coastal areas of the northern coast of the basin. Sea level forecast is of paramount importance for the management of daily activities and for operating the movable barriers that are presently being built for the protection of the city. In this paper, an EPS (ensemble prediction system) for operational forecasting of storm surge in the northern Adriatic Sea is presented and applied to a 3-month-long period (October-December 2010). The sea level EPS is based on the HYPSE (hydrostatic Padua Sea elevation) model, which is a standard single-layer nonlinear shallow water model, whose forcings (mean sea level pressure and surface wind fields) are provided by the ensemble members of the ECMWF (European Center for Medium-Range Weather Forecasts) EPS. Results are verified against observations at five tide gauges located along the Croatian and Italian coasts of the Adriatic Sea. Forecast uncertainty increases with the predicted value of the storm surge and with the forecast lead time. The EMF (ensemble mean forecast) provided by the EPS has a rms (root mean square) error lower than the DF (deterministic forecast), especially for short (up to 3 days) lead times. Uncertainty for short lead times of the forecast and for small storm surges is mainly caused by uncertainty of the initial condition of the hydrodynamical model. Uncertainty for large lead times and large storm surges is mainly caused by uncertainty in the meteorological forcings. The EPS spread increases with the rms error of the forecast. For large lead times the EPS spread and the forecast error substantially coincide. However, the EPS spread in this study, which does not account for uncertainty in the initial condition, underestimates the error during the early part of the forecast and for small storm surge values. On the contrary, it overestimates the rms error for large surge values. The PF (probability forecast) of the EPS

  20. Assessing uncertainties in flood forecasts for decision making: prototype of an operational flood management system integrating ensemble predictions

    Directory of Open Access Journals (Sweden)

    J. Dietrich

    2009-08-01

    Ensemble forecasts aim at framing the uncertainties of the potential future development of the hydro-meteorological situation. A probabilistic evaluation can be used to communicate forecast uncertainty to decision makers. Here an operational system for ensemble-based flood forecasting is presented, which combines forecasts from the European COSMO-LEPS, SRNWP-PEPS and COSMO-DE prediction systems. A multi-model lagged average super-ensemble is generated by recombining members from different runs of these meteorological forecast systems. A subset of the super-ensemble is selected based on a priori model weights, which are obtained from ensemble calibration. Flood forecasts are simulated by the conceptual rainfall-runoff model ArcEGMO. Parameter uncertainty of the model is represented by a parameter ensemble, which is generated a priori from a comprehensive uncertainty analysis during model calibration. The use of a computationally efficient hydrological model within a flood management system allows us to compute the hydro-meteorological model chain for all members of the sub-ensemble. The model chain is not re-computed before new ensemble forecasts are available, but the probabilistic assessment of the output is updated when new information from deterministic short range forecasts or from assimilation of measured data becomes available. For hydraulic modelling, with the desired result of a probabilistic inundation map with high spatial resolution, a replacement model can help to overcome computational limitations. A prototype of the developed framework has been applied for a case study in the Mulde river basin. However, these techniques, in particular the probabilistic assessment and the derivation of decision rules, are still in their infancy. Further research is necessary and promising.

  1. The Hydrologic Ensemble Prediction Experiment (HEPEX)

    Science.gov (United States)

    Wood, A. W.; Thielen, J.; Pappenberger, F.; Schaake, J. C.; Hartman, R. K.

    2012-12-01

    The Hydrologic Ensemble Prediction Experiment was established in March, 2004, at a workshop hosted by the European Center for Medium Range Weather Forecasting (ECMWF). With support from the US National Weather Service (NWS) and the European Commission (EC), the HEPEX goal was to bring the international hydrological and meteorological communities together to advance the understanding and adoption of hydrological ensemble forecasts for decision support in emergency management and water resources sectors. The strategy to meet this goal includes meetings that connect the user, forecast producer and research communities to exchange ideas, data and methods; the coordination of experiments to address specific challenges; and the formation of testbeds to facilitate shared experimentation. HEPEX has organized about a dozen international workshops, as well as sessions at scientific meetings (including AMS, AGU and EGU) and special issues of scientific journals where workshop results have been published. Today, the HEPEX mission is to demonstrate the added value of hydrological ensemble prediction systems (HEPS) for emergency management and water resources sectors to make decisions that have important consequences for economy, public health, safety, and the environment. HEPEX is now organised around six major themes that represent core elements of a hydrologic ensemble prediction enterprise: input and pre-processing, ensemble techniques, data assimilation, post-processing, verification, and communication and use in decision making. This poster presents an overview of recent and planned HEPEX activities, highlighting case studies that exemplify the focus and objectives of HEPEX.

  2. A comparison between the ECMWF and COSMO Ensemble Prediction Systems applied to short-term wind power forecasting on real data

    DEFF Research Database (Denmark)

    Alessandrini, S.; Sperati, S.; Pinson, Pierre

    2013-01-01

    together with a single forecast power value for each future time horizon. A comparison between two different ensemble forecasting models, ECMWF EPS (Ensemble Prediction System in use at the European Centre for Medium-Range Weather Forecasts) and COSMO-LEPS (Limited-area Ensemble Prediction System developed...... ahead forecast horizon. A statistical calibration of the ensemble wind speed members based on the use of past wind speed measurements is explained. The two models are compared using common verification indices and diagrams. The higher horizontal resolution model (COSMO-LEPS) shows slightly better...

  3. DEWEPS - Development and Evaluation of new Wind forecasting tools with an Ensemble Prediction System

    Energy Technology Data Exchange (ETDEWEB)

    Moehrlen, C.; Joergensen, Jess

    2012-02-15

    There is an ongoing trend of increased privatization in the handling of renewable energy. This trend is required to ensure an efficient energy system, where improvements that make economic sense are prioritised. The reason why centralized forecasting can be a challenge in that matter is that the TSOs tend to optimize on physical error rather than cost. Consequently, the market is likely to speculate against the TSO, which in turn increases the cost of balancing. A privatized pool of wind and/or solar power is more difficult to speculate against, because the optimization criterion is unpredictable due to subjective risk considerations that may be taken into account at any time. Although there is an additional level of costs for the trading of the private volume, it can be argued that competition will accelerate efficiency from an economic perspective. The amount of power put into the market will become less predictable when the wind power spot market bid takes place on the basis of a risk consideration in addition to the forecast information itself. The scope of this project is to contribute to more efficient wind power integration targeted both to centralised and decentralised cost-efficient IT solutions, which will complement each other in market-based energy systems. The DEWEPS project resulted in an extension of the number of Ensemble forecasts, an incremental trade strategy for balancing unpredictable power production, and an IT platform for efficient handling of power generation units. Together, these three elements contribute to less need for reserves, more capacity in the market, and thus more competition. (LN)

  4. A Hierarchical Method for Transient Stability Prediction of Power Systems Using the Confidence of a SVM-Based Ensemble Classifier

    Directory of Open Access Journals (Sweden)

    Yanzhen Zhou

    2016-09-01

    Machine learning techniques have been widely used in transient stability prediction of power systems. When using the post-fault dynamic responses, it is difficult to draw a definite conclusion about how long the duration of response data used should be in order to balance accuracy and speed. In addition, previous studies have not considered the confidence level. To solve these problems, a hierarchical method for transient stability prediction based on the confidence of an ensemble classifier using multiple support vector machines (SVMs) is proposed. Firstly, multiple datasets are generated by bootstrap sampling, then features are randomly selected to compress the datasets. Secondly, the confidence indices are defined and multiple SVMs are built based on these generated datasets. By synthesizing the probabilistic outputs of multiple SVMs, the prediction results and confidence of the ensemble classifier are obtained. Finally, different ensemble classifiers with different response times are built to construct different layers of the proposed hierarchical scheme. The simulation results show that the proposed hierarchical method can balance the accuracy and rapidity of the transient stability prediction. Moreover, the hierarchical method can reduce the misjudgments of unstable instances and cooperate with the time domain simulation to ensure the security and stability of power systems.
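
    The layered decision logic described in this abstract can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the confidence index (the distance of the mean probability from 0.5, scaled to the range 0 to 1), the threshold value, and all function names are assumptions, and the per-classifier probabilities are supplied by hand instead of coming from trained SVMs.

```python
from statistics import mean


def ensemble_decision(probs):
    """Combine per-classifier probabilities of instability.

    Returns (label, confidence). The confidence index |2 * p_mean - 1|
    is an assumed definition chosen for illustration.
    """
    p_mean = mean(probs)
    label = "unstable" if p_mean >= 0.5 else "stable"
    confidence = abs(2.0 * p_mean - 1.0)
    return label, confidence


def hierarchical_predict(layers, threshold=0.6):
    """Walk through layers ordered by increasing response time.

    `layers` holds one list of probabilities per layer. The first layer
    whose confidence reaches `threshold` (hypothetical value) decides;
    if none does, the case is left undetermined, e.g. for time domain
    simulation.
    """
    for index, probs in enumerate(layers):
        label, confidence = ensemble_decision(probs)
        if confidence >= threshold:
            return label, confidence, index
    return "undetermined", 0.0, None


# Hypothetical probabilistic outputs of three classifiers per layer.
layers = [
    [0.55, 0.40, 0.62],  # short response window: ambiguous
    [0.90, 0.85, 0.95],  # longer response window: confident
]
result = hierarchical_predict(layers)
```

    In this toy case the first layer is too uncertain to decide, so the decision falls to the second layer, which trades a longer response time for higher confidence.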

  5. HBC-Evo: predicting human breast cancer by exploiting amino acid sequence-based feature spaces and evolutionary ensemble system.

    Science.gov (United States)

    Majid, Abdul; Ali, Safdar

    2015-01-01

    We developed a genetic programming (GP)-based evolutionary ensemble system for the early diagnosis, prognosis and prediction of human breast cancer. This system has effectively exploited the diversity in feature and decision spaces. First, individual learners are trained in different feature spaces using physicochemical properties of protein amino acids. Their predictions are then stacked to develop the best solution during the GP evolution process. Finally, results for the HBC-Evo system are obtained with an optimal threshold, which is computed using particle swarm optimization. Our novel approach has demonstrated promising results compared to state-of-the-art approaches.

  6. Various multistage ensembles for prediction of heating energy consumption

    Directory of Open Access Journals (Sweden)

    Radisa Jovanovic

    2015-04-01

    Feedforward neural network models are created to predict the daily heating energy consumption of the NTNU university campus Gloshaugen, using actual measured data for training and testing. A neural network ensemble is proposed to improve prediction accuracy. Previously trained feedforward neural networks are first separated into clusters using the k-means algorithm, and the best network of each cluster is then chosen as a member of the ensemble. Two conventional averaging methods for obtaining the ensemble output are applied: simple and weighted. To achieve better prediction results, a multistage ensemble is investigated. As a second level, an adaptive neuro-fuzzy inference system with various clustering and membership functions is used to aggregate the selected ensemble members. A feedforward neural network in the second stage is also analyzed. It is shown that an ensemble of neural networks can predict heating energy consumption with better accuracy than the best trained single neural network, while the best results are achieved with the multistage ensemble.
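
    The two conventional averaging schemes mentioned here can be illustrated with a short sketch. The abstract does not state how the weights are derived, so inverse validation error is assumed; the member predictions and error values are invented for the example.

```python
def simple_average(predictions):
    """Unweighted mean of the member predictions for one day."""
    return sum(predictions) / len(predictions)


def weighted_average(predictions, validation_errors):
    """Weight each member by the inverse of its validation error.

    Inverse-error weighting is one common choice and is an assumption
    here, not necessarily the scheme used in the study.
    """
    weights = [1.0 / e for e in validation_errors]
    total = sum(weights)
    return sum(w * p for w, p in zip(weights, predictions)) / total


# Hypothetical predictions of daily heating energy from three networks.
member_predictions = [100.0, 110.0, 130.0]
member_errors = [1.0, 2.0, 4.0]  # hypothetical validation errors

simple = simple_average(member_predictions)
weighted = weighted_average(member_predictions, member_errors)
```

    With these numbers the weighted output sits closer to the prediction of the member with the smallest validation error than the simple mean does.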

  7. Simplifying a hydrological ensemble prediction system with a backward greedy selection of members – Part 1: Optimization criteria

    Directory of Open Access Journals (Sweden)

    D. Brochero

    2011-11-01

    Hydrological Ensemble Prediction Systems (HEPS), obtained by forcing rainfall-runoff models with Meteorological Ensemble Prediction Systems (MEPS), have been recognized as useful approaches to quantify uncertainties of hydrological forecasting systems. This task is complex both in terms of the coupling of information and computational time, which may create an operational barrier. The main objective of the current work is to assess the degree of simplification (reduction of the number of hydrological members) that can be achieved with a HEPS configured using 16 lumped hydrological models driven by the 50 weather ensemble forecasts from the European Centre for Medium-range Weather Forecasts (ECMWF). Here, Backward Greedy Selection (BGS) is proposed to assess the weight that each model must represent within a subset that offers similar or better performance than a reference set of 800 hydrological members. These hydrological models' weights represent the participation of each hydrological model within a simplified HEPS which would issue real-time forecasts in a relatively short computational time. The methodology uses a variation of the k-fold cross-validation, allowing an optimal use of the information, and employs a multi-criterion framework that represents the combination of resolution, reliability, consistency, and diversity. Results show that the degree of reduction of members can be established in terms of maximum number of members required (complexity of the HEPS) or the maximization of the relationship between the different scores (performance).
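
    A backward greedy selection of ensemble members can be sketched as below. The sketch assumes a single criterion, the mean squared error of the ensemble-mean forecast, in place of the multi-criterion framework and cross-validation described in the abstract; the function names and data are hypothetical.

```python
def mse_of_mean(members, observations):
    """Mean squared error of the ensemble-mean forecast."""
    n_steps = len(observations)
    total = 0.0
    for t in range(n_steps):
        ens_mean = sum(m[t] for m in members) / len(members)
        total += (ens_mean - observations[t]) ** 2
    return total / n_steps


def backward_greedy_selection(members, observations, target_size):
    """Start from the full set and repeatedly drop the member whose
    removal leaves the best-scoring (lowest MSE) subset."""
    selected = list(range(len(members)))
    while len(selected) > target_size:
        best_score, to_drop = None, None
        for candidate in selected:
            remaining = [members[i] for i in selected if i != candidate]
            score = mse_of_mean(remaining, observations)
            if best_score is None or score < best_score:
                best_score, to_drop = score, candidate
        selected.remove(to_drop)
    return selected


# Hypothetical streamflow observations and four member forecasts.
observations = [1.0, 2.0, 3.0]
members = [
    [1.0, 2.0, 3.0],
    [1.2, 2.1, 2.9],
    [3.0, 4.0, 5.0],  # strongly biased member
    [0.8, 1.9, 3.1],
]
selected = backward_greedy_selection(members, observations, target_size=2)
```

    The biased member is dropped first. The second removal shows a property of scoring the subset rather than individual members: the two remaining members are kept because their errors cancel in the ensemble mean.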

  8. Ensemble Classifiers for Predicting HIV-1 Resistance from Three Rule-Based Genotypic Resistance Interpretation Systems.

    Science.gov (United States)

    Raposo, Letícia M; Nobre, Flavio F

    2017-08-30

    Resistance to antiretrovirals (ARVs) is a major problem faced by HIV-infected individuals. Different rule-based algorithms were developed to infer HIV-1 susceptibility to antiretrovirals from genotypic data. However, there is discordance between them, resulting in difficulties for clinical decisions about which treatment to use. Here, we developed ensemble classifiers integrating three interpretation algorithms: Agence Nationale de Recherche sur le SIDA (ANRS), Rega, and the genotypic resistance interpretation system from Stanford HIV Drug Resistance Database (HIVdb). Three approaches were applied to develop a classifier with a single resistance profile: stacked generalization, a simple plurality vote scheme and the selection of the interpretation system with the best performance. The strategies were compared with the Friedman's test and the performance of the classifiers was evaluated using the F-measure, sensitivity and specificity values. We found that the three strategies had similar performances for the selected antiretrovirals. For some cases, the stacking technique with naïve Bayes as the learning algorithm showed a statistically superior F-measure. This study demonstrates that ensemble classifiers can be an alternative tool for clinical decision-making since they provide a single resistance profile from the most commonly used resistance interpretation systems.
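
    Of the three strategies, the simple plurality vote is the easiest to illustrate. The sketch below assumes each interpretation system returns one of the calls S (susceptible), I (intermediate) or R (resistant) for a drug; the tie-breaking rule is an assumption, since the abstract does not describe one.

```python
from collections import Counter


def plurality_vote(calls):
    """Return the most frequent resistance call among the systems.

    Ties are broken in favour of the more conservative call
    (R over I over S); this tie rule is an assumption.
    """
    severity = {"S": 0, "I": 1, "R": 2}
    counts = Counter(calls)
    return max(counts, key=lambda c: (counts[c], severity[c]))


# Hypothetical calls for one drug from ANRS, Rega and HIVdb.
majority = plurality_vote(["R", "R", "S"])
three_way_tie = plurality_vote(["S", "I", "R"])
```

    The result is a single resistance profile per drug even when the three systems disagree.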

  9. Development of the Ensemble Navy Aerosol Analysis Prediction System (ENAAPS) and its application of the Data Assimilation Research Testbed (DART) in support of aerosol forecasting

    Directory of Open Access Journals (Sweden)

    J. I. Rubin

    2016-03-01

    An ensemble-based forecast and data assimilation system has been developed for use in Navy aerosol forecasting. The system makes use of an ensemble of the Navy Aerosol Analysis Prediction System (ENAAPS) at 1 × 1°, combined with an ensemble adjustment Kalman filter from NCAR's Data Assimilation Research Testbed (DART). The base ENAAPS-DART system discussed in this work utilizes the Navy Operational Global Analysis Prediction System (NOGAPS) meteorological ensemble to drive offline NAAPS simulations coupled with the DART ensemble Kalman filter architecture to assimilate bias-corrected MODIS aerosol optical thickness (AOT) retrievals. This work outlines the optimization of the 20-member ensemble system, including consideration of meteorology and source-perturbed ensemble members as well as covariance inflation. Additional tests with 80 meteorological and source members were also performed. An important finding of this work is that an adaptive covariance inflation method, which has not been previously tested for aerosol applications, was found to perform better than a temporally and spatially constant covariance inflation. Problems were identified with the constant inflation in regions with limited observational coverage. The second major finding of this work is that combined meteorology and aerosol source ensembles are superior to either in isolation and that both are necessary to produce a robust system with sufficient spread in the ensemble members as well as realistic correlation fields for spreading observational information. The inclusion of aerosol source ensembles improves correlation fields for large aerosol source regions, such as smoke and dust in Africa, by statistically separating freshly emitted from transported aerosol species. However, the source ensembles have limited efficacy during long-range transport. Conversely, the meteorological ensemble generates sufficient spread at the synoptic scale to enable observational impact

  10. Seasonal prediction of East Asian summer rainfall using a multi-model ensemble system

    Science.gov (United States)

    Ahn, Joong-Bae; Lee, Doo-Young; Yoo, Jin‑Ho

    2015-04-01

    Using the retrospective forecasts of seven state-of-the-art coupled models and their multi-model ensemble (MME) for boreal summers, the prediction skills of climate models in the western tropical Pacific (WTP) and East Asian region are assessed. The prediction of summer rainfall anomalies in East Asia is difficult, while the WTP has a strong correlation between model prediction and observation. We focus on developing a new approach to further enhance the seasonal prediction skill for summer rainfall in East Asia and investigate the influence of convective activity in the WTP on East Asian summer rainfall. By analyzing the characteristics of the WTP convection, two distinct patterns associated with El Niño-Southern Oscillation developing and decaying modes are identified. Based on the multiple linear regression method, the East Asia Rainfall Index (EARI) is developed by using the interannual variability of the normalized Maritime continent-WTP Indices (MPIs), as potentially useful predictors for rainfall prediction over East Asia, obtained from the above two main patterns. For East Asian summer rainfall, the EARI has superior performance to the East Asia summer monsoon index or each MPI. Therefore, the regressed rainfall from EARI also shows a strong relationship with the observed East Asian summer rainfall pattern. In addition, we evaluate the prediction skill of the East Asia reconstructed rainfall obtained by hybrid dynamical-statistical approach using the cross-validated EARI from the individual models and their MME. The results show that the rainfalls reconstructed from simulations capture the general features of observed precipitation in East Asia quite well. This study convincingly demonstrates that rainfall prediction skill is considerably improved by using a hybrid dynamical-statistical approach compared to the dynamical forecast alone. Acknowledgements This work was carried out with the support of Rural Development Administration Cooperative Research

  11. Decadal climate predictions improved by ocean ensemble dispersion filtering

    Science.gov (United States)

    Kadow, C.; Illing, S.; Kröner, I.; Ulbrich, U.; Cubasch, U.

    2017-06-01

    Decadal predictions by Earth system models aim to capture the state and phase of the climate several years in advance. Atmosphere-ocean interaction plays an important role for such climate forecasts. While short-term weather forecasts represent an initial value problem and long-term climate projections represent a boundary condition problem, the decadal climate prediction falls in-between these two time scales. In recent years, more precise initialization techniques of coupled Earth system models and increased ensemble sizes have improved decadal predictions. However, climate models in general start losing the initialized signal and its predictive skill from one forecast year to the next. Here we show that the climate prediction skill of an Earth system model can be improved by a shift of the ocean state toward the ensemble mean of its individual members at seasonal intervals. We found that this procedure, called ensemble dispersion filter, yields more accurate results than the standard decadal prediction. Global mean and regional temperature, precipitation, and winter cyclone predictions show an increased skill up to 5 years ahead. Furthermore, the novel technique outperforms predictions with larger ensembles and higher resolution. Our results demonstrate how decadal climate predictions benefit from ocean ensemble dispersion filtering toward the ensemble mean. Plain Language Summary: Decadal predictions aim to predict the climate several years in advance. Atmosphere-ocean interaction plays an important role for such climate forecasts. The ocean memory due to its heat capacity holds big potential skill. In recent years, more precise initialization techniques of coupled Earth system models (incl. atmosphere and ocean) have improved decadal predictions. Ensembles are another important aspect. Applying slightly perturbed predictions to trigger the famous butterfly effect results in an ensemble. Instead of evaluating one prediction, but the whole ensemble with its

  12. Verification of Global Radiation Forecasts from the Ensemble Prediction System at DMI

    DEFF Research Database (Denmark)

    Lundholm, Sisse Camilla

    To comply with an increasing demand for sustainable energy sources, a solar heating unit is being developed at the Technical University of Denmark. To make optimal use, environmentally and economically, this heating unit is equipped with an intelligent control system using forecasts of the heat...... consumption of the house and the amount of available solar energy. In order to make the most of this solar heating unit, accurate forecasts of the available solar radiation are essential. However, because of its sensitivity to local meteorological conditions, the solar radiation received at the surface...... of the Earth can be highly fluctuating and challenging to forecast accurately. To comply with the accuracy requirements to forecasts of both global, direct, and diffuse radiation, the uncertainty of these forecasts is of interest. Forecast uncertainties can become accessible by running an ensemble of forecasts...

  13. Distinguishing high and low flow domains in urban drainage systems 2 days ahead using numerical weather prediction ensembles

    Science.gov (United States)

    Courdent, Vianney; Grum, Morten; Mikkelsen, Peter Steen

    2018-01-01

    Precipitation constitutes a major contribution to the flow in urban storm- and wastewater systems. Forecasts of the anticipated runoff flows, created from radar extrapolation and/or numerical weather predictions, can potentially be used to optimize operation in both wet and dry weather periods. However, flow forecasts are inevitably uncertain and their use will ultimately require a trade-off between the value of knowing what will happen in the future and the probability and consequence of being wrong. In this study we examine how ensemble forecasts from the HIRLAM-DMI-S05 numerical weather prediction (NWP) model subject to three different ensemble post-processing approaches can be used to forecast flow exceedance in a combined sewer for a wide range of ratios between the probability of detection (POD) and the probability of false detection (POFD). We use a hydrological rainfall-runoff model to transform the forecasted rainfall into forecasted flow series and evaluate three different approaches to establishing the relative operating characteristics (ROC) diagram of the forecast, which is a plot of POD against POFD for each fraction of concordant ensemble members and can be used to select the weight of evidence that matches the desired trade-off between POD and POFD. In the first approach, the rainfall input to the model is calculated for each of 25 ensemble members as a weighted average of rainfall from the NWP cells over the catchment where the weights are proportional to the areal intersection between the catchment and the NWP cells. In the second approach, a total of 2825 flow ensembles are generated using rainfall input from the neighbouring NWP cells up to approximately 6 cells in all directions from the catchment. In the third approach, the first approach is extended spatially by successively increasing the area covered and for each spatial increase and each time step selecting only the cell with the highest intensity resulting in a total of 175 ensemble
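
    The ROC construction described in this abstract, one POD/POFD pair per fraction of concordant ensemble members, can be sketched as follows. The input layout (a count of members forecasting exceedance at each time step) and the function name are assumptions; the sketch also assumes the record contains at least one event and one non-event, otherwise the ratios are undefined.

```python
def roc_points(member_exceedances, observed, n_members):
    """POD and POFD for each minimum number of concordant members.

    member_exceedances[t] is the number of members forecasting flow
    exceedance at time step t; observed[t] is True if it occurred.
    Returns a list of (fraction_of_members, POD, POFD) tuples.
    """
    points = []
    for k in range(1, n_members + 1):
        hits = misses = false_alarms = correct_negatives = 0
        for count, event in zip(member_exceedances, observed):
            warned = count >= k  # at least k members agree
            if warned and event:
                hits += 1
            elif event:
                misses += 1
            elif warned:
                false_alarms += 1
            else:
                correct_negatives += 1
        pod = hits / (hits + misses)
        pofd = false_alarms / (false_alarms + correct_negatives)
        points.append((k / n_members, pod, pofd))
    return points


# Hypothetical 4-member ensemble over six time steps.
counts = [4, 3, 1, 0, 2, 0]
observed = [True, True, False, False, True, False]
points = roc_points(counts, observed, n_members=4)
```

    Requiring more members to agree lowers both POD and POFD, which is the trade-off the ROC diagram makes visible.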

  14. Can-Evo-Ens: Classifier stacking based evolutionary ensemble system for prediction of human breast cancer using amino acid sequences.

    Science.gov (United States)

    Ali, Safdar; Majid, Abdul

    2015-04-01

    The diagnosis of human breast cancer is an intricate process and specific indicators may produce negative results. In order to avoid misleading results, an accurate and reliable diagnostic system for breast cancer is indispensable. Recently, several interesting machine-learning (ML) approaches have been proposed for prediction of breast cancer. To this end, we developed a novel classifier stacking based evolutionary ensemble system "Can-Evo-Ens" for predicting amino acid sequences associated with breast cancer. In this paper, first, we selected four diverse types of ML algorithms, Naïve Bayes, K-Nearest Neighbor, Support Vector Machines, and Random Forest, as base-level classifiers. These classifiers are trained individually in different feature spaces using physicochemical properties of amino acids. In order to exploit the decision spaces, the preliminary predictions of base-level classifiers are stacked. Genetic programming (GP) is then employed to develop a meta-classifier that optimally combines the predictions of the base classifiers. The most suitable threshold value of the best-evolved predictor is computed using the Particle Swarm Optimization technique. Our experiments have demonstrated the robustness of the Can-Evo-Ens system on an independent validation dataset. The proposed system has achieved the highest value of Area Under Curve (AUC) of ROC Curve of 99.95% for cancer prediction. The comparative results revealed that the proposed approach is better than individual ML approaches and conventional ensemble approaches of AdaBoostM1, Bagging, GentleBoost, and Random Subspace. It is expected that the proposed novel system would have a major impact on the fields of Biomedical, Genomics, Proteomics, Bioinformatics, and Drug Development. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. The Operational Hydro-meteorological Ensemble Prediction System at Meteo-France and its representation interface for the French Service for Flood Prediction (SCHAPI) : description and undergoing developments.

    Science.gov (United States)

    Rousset-Regimbeau, F.; Martin, E.; Thirel, G.; Habets, F.; Coustau, M.; Roquelaure, S.; De Saint Aubin, C.; Ardilouze, C.

    2012-04-01

    The coupled physically-based hydro-meteorological model SAFRAN-ISBA-MODCOU (SIM) has been developed at Meteo-France for many years. This fully distributed catchment model has been used in a pre-operational mode since 2005 for producing mid-range ensemble streamflow forecasts based on the 51-member 10-day ECMWF EPS. Improvements have been made during the past few years. First, a statistical adaptation has been performed to improve the meteorological ensemble predictions from the ECMWF. It has been developed over a 3-year archive, and assessed over a 1-year period. Its impact on the performance of the streamflow forecasts has been calculated over 8 months of predictions. Then, a past discharges assimilation system has been implemented in order to improve the initial states of these ensemble streamflow forecasts. It has been developed in the framework of a PhD thesis, and it is now evaluated in real-time conditions. Moreover, an improvement of the physics of the ISBA model (the exponential profile of the hydraulic conductivity in the soil) was implemented. Finally, this system provides ensemble 10-day streamflow prediction to the French National Service for Flood Prediction (SCHAPI). A collaboration between Meteo-France and SCHAPI led to the development of a new website. This website shows the streamflow predictions for about 200 selected river stations over France (selected regarding their interest for flood warning), as well as alerts for high flows (two levels of high flows corresponding to the levels of risk of the French flood warning system). It aims at providing the French hydrological forecasters with a real-time tool for mid-range flood awareness.

  16. Seasonal-to-decadal predictions with the ensemble Kalman filter and the Norwegian Earth System Model: a twin experiment

    Directory of Open Access Journals (Sweden)

    Francois Counillon

    2014-03-01

    Here, we firstly demonstrate the potential of an advanced flow dependent data assimilation method for performing seasonal-to-decadal prediction and secondly, reassess the use of sea surface temperature (SST) for initialisation of these forecasts. We use the Norwegian Climate Prediction Model (NorCPM), which is based on the Norwegian Earth System Model (NorESM) and uses the deterministic ensemble Kalman filter to assimilate observations. NorESM is a fully coupled system based on the Community Earth System Model version 1, which includes an ocean, an atmosphere, a sea ice and a land model. A numerically efficient coarse resolution version of NorESM is used. We employ a twin experiment methodology to provide an upper estimate of predictability in our model framework (i.e. without considering model bias) of NorCPM that assimilates synthetic monthly SST data (EnKF-SST). The accuracy of EnKF-SST is compared to an unconstrained ensemble run (FREE) and ensemble predictions made with near perfect (i.e. microscopic SST perturbation) initial conditions (PERFECT). We perform 10 cycles, each consisting of a 10-yr assimilation phase, followed by a 10-yr prediction. The results indicate that EnKF-SST improves sea level, ice concentration, 2 m atmospheric temperature, precipitation and 3-D hydrography compared to FREE. Improvements for the hydrography are largest near the surface and are retained for longer periods at depth. Benefits in salinity are retained for longer periods compared to temperature. Near-surface improvements are largest in the tropics, while improvements at intermediate depths are found in regions of large-scale currents, regions of deep convection, and at the Mediterranean Sea outflow. However, the benefits are often small compared to PERFECT, in particular, at depth suggesting that more observations should be assimilated in addition to SST. The EnKF-SST system is also tested for standard ocean circulation indices and demonstrates decadal

  17. Multi-Model Ensemble Wake Vortex Prediction

    Science.gov (United States)

    Koerner, Stephan; Holzaepfel, Frank; Ahmad, Nash'at N.

    2015-01-01

    Several multi-model ensemble methods are investigated for predicting wake vortex transport and decay. This study is a joint effort between National Aeronautics and Space Administration and Deutsches Zentrum fuer Luft- und Raumfahrt to develop a multi-model ensemble capability using their wake models. An overview of different multi-model ensemble methods and their feasibility for wake applications is presented. The methods include Reliability Ensemble Averaging, Bayesian Model Averaging, and Monte Carlo Simulations. The methodologies are evaluated using data from wake vortex field experiments.

  18. Numerical climate modeling and verification of selected areas for heat waves of Pakistan using ensemble prediction system

    International Nuclear Information System (INIS)

    Amna, S; Samreen, N; Khalid, B; Shamim, A

    2013-01-01

    Depending upon the topography, there is an extreme variation in the temperature of Pakistan. Heat waves are weather-related events that have a significant impact on humans, including all socioeconomic activities and health issues, and this impact changes according to the climatic conditions of the area. Forecasting climate is of prime importance for being aware of future climatic changes, in order to mitigate them. The study used the Ensemble Prediction System (EPS) for the purpose of modeling seasonal weather hind-cast of three selected areas i.e., Islamabad, Jhelum and Muzaffarabad. This research was purposely carried out in order to suggest the most suitable climate model for Pakistan. Real time and simulated data of five General Circulation Models i.e., ECMWF, ERA-40, MPI, Meteo France and UKMO for selected areas was acquired from Pakistan Meteorological Department. Data incorporated constituted the statistical temperature records of 32 years for the months of June, July and August. This study was based on EPS to calculate probabilistic forecasts produced by single ensembles. Verification was carried out to assess the quality of the forecast by using the standard probabilistic measures of Brier Score, Brier Skill Score, Cross Validation and Relative Operating Characteristic curve. The results showed ECMWF to be the most suitable model for Islamabad and Jhelum, and Meteo France for Muzaffarabad. Other models have significant results by omitting particular initial conditions.
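
    Two of the verification measures named here, the Brier Score and the Brier Skill Score, can be computed as in the sketch below. It uses the standard definitions with sample climatology as the reference forecast, which is an assumption because the abstract does not state the reference used; the forecast probabilities and outcomes are invented.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probability and the
    binary outcome (1 if the event occurred, else 0)."""
    n = len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / n


def brier_skill_score(probabilities, outcomes):
    """Skill relative to a constant forecast of the sample climatology.

    Undefined when every outcome is identical (reference score of 0).
    """
    climatology = sum(outcomes) / len(outcomes)
    reference = brier_score([climatology] * len(outcomes), outcomes)
    return 1.0 - brier_score(probabilities, outcomes) / reference


# Hypothetical heat-wave probabilities and observed occurrences.
forecast_probs = [0.9, 0.1, 0.8, 0.2]
occurred = [1, 0, 1, 0]

bs = brier_score(forecast_probs, occurred)
bss = brier_skill_score(forecast_probs, occurred)
```

    A Brier Score of 0 is a perfect forecast, and a positive Brier Skill Score indicates an improvement over the climatological reference.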

  19. Ensemble prediction of air quality using the WRF/CMAQ model system for health effect studies in China

    Science.gov (United States)

    Hu, Jianlin; Li, Xun; Huang, Lin; Ying, Qi; Zhang, Qiang; Zhao, Bin; Wang, Shuxiao; Zhang, Hongliang

    2017-11-01

    Accurate exposure estimates are required for health effect analyses of severe air pollution in China. Chemical transport models (CTMs) are widely used to provide spatial distribution, chemical composition, particle size fractions, and source origins of air pollutants. The accuracy of air quality predictions in China is greatly affected by the uncertainties of emission inventories. The Community Multiscale Air Quality (CMAQ) model with meteorological inputs from the Weather Research and Forecasting (WRF) model was used in this study to simulate air pollutants in China in 2013. Four simulations were conducted with four different anthropogenic emission inventories, including the Multi-resolution Emission Inventory for China (MEIC), the Emission Inventory for China by School of Environment at Tsinghua University (SOE), the Emissions Database for Global Atmospheric Research (EDGAR), and the Regional Emission inventory in Asia version 2 (REAS2). Model performance of each simulation was evaluated against available observation data from 422 sites in 60 cities across China. Model predictions of O3 and PM2.5 generally meet the model performance criteria, but performance differences exist in different regions, for different pollutants, and among inventories. Ensemble predictions were calculated by linearly combining the results from different inventories to minimize the sum of the squared errors between the ensemble results and the observations in all cities. The ensemble concentrations show improved agreement with observations in most cities. The mean fractional bias (MFB) and mean fractional errors (MFEs) of the ensemble annual PM2.5 in the 60 cities are -0.11 and 0.24, respectively, which are better than the MFB (-0.25 to -0.16) and MFE (0.26-0.31) of individual simulations. The ensemble annual daily maximum 1 h O3 (O3-1h) concentrations are also improved, with mean normalized bias (MNB) of 0.03 and mean normalized errors (MNE) of 0.14, compared to MNB of 0.06-0.19 and
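
    The linear combination described in this abstract amounts to an ordinary least-squares fit of weights across cities. The sketch below is one possible reading, not the authors' implementation: it assumes unconstrained weights with no intercept and linearly independent simulations, and all function names are hypothetical. The MFB and MFE helpers use the usual definitions, normalizing by the mean of prediction and observation.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ensemble_weights(predictions, observations):
    """Weights minimizing the sum of squared errors over all cities.

    predictions[k][c] is the value from simulation k at city c.
    Solves the normal equations (A^T A) w = A^T b.
    """
    k = len(predictions)
    ata = [[sum(p * q for p, q in zip(predictions[i], predictions[j]))
            for j in range(k)] for i in range(k)]
    atb = [sum(p * o for p, o in zip(predictions[i], observations))
           for i in range(k)]
    return solve_linear_system(ata, atb)


def combine(predictions, weights):
    """Weighted sum of the simulations at each city."""
    return [sum(w * p for w, p in zip(weights, column))
            for column in zip(*predictions)]


def mean_fractional_bias(pred, obs):
    return sum(2.0 * (p - o) / (p + o) for p, o in zip(pred, obs)) / len(obs)


def mean_fractional_error(pred, obs):
    return sum(2.0 * abs(p - o) / (p + o) for p, o in zip(pred, obs)) / len(obs)
```

    Fitting and scoring on the same cities, as done here, gives an optimistic estimate of skill; a held-out set of cities would be a stricter check.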

  20. Ensemble prediction of air quality using the WRF/CMAQ model system for health effect studies in China

    Directory of Open Access Journals (Sweden)

    J. Hu

    2017-11-01

    Accurate exposure estimates are required for health effect analyses of severe air pollution in China. Chemical transport models (CTMs) are widely used to provide spatial distribution, chemical composition, particle size fractions, and source origins of air pollutants. The accuracy of air quality predictions in China is greatly affected by the uncertainties of emission inventories. The Community Multiscale Air Quality (CMAQ) model with meteorological inputs from the Weather Research and Forecasting (WRF) model was used in this study to simulate air pollutants in China in 2013. Four simulations were conducted with four different anthropogenic emission inventories, including the Multi-resolution Emission Inventory for China (MEIC), the Emission Inventory for China by School of Environment at Tsinghua University (SOE), the Emissions Database for Global Atmospheric Research (EDGAR), and the Regional Emission inventory in Asia version 2 (REAS2). Model performance of each simulation was evaluated against available observation data from 422 sites in 60 cities across China. Model predictions of O3 and PM2.5 generally meet the model performance criteria, but performance differences exist in different regions, for different pollutants, and among inventories. Ensemble predictions were calculated by linearly combining the results from different inventories to minimize the sum of the squared errors between the ensemble results and the observations in all cities. The ensemble concentrations show improved agreement with observations in most cities. The mean fractional bias (MFB) and mean fractional errors (MFEs) of the ensemble annual PM2.5 in the 60 cities are −0.11 and 0.24, respectively, which are better than the MFB (−0.25 to −0.16) and MFE (0.26–0.31) of individual simulations. The ensemble annual daily maximum 1 h O3 (O3-1h) concentrations are also improved, with mean normalized bias (MNB) of 0.03 and mean normalized errors (MNE) of 0.14, compared to MNB

  1. The use of different ensemble forecasting systems for wind power prediction on a real case in the South of Italy

    DEFF Research Database (Denmark)

    Alessandrini, Stefano; Sperati, Simone; Pinson, Pierre

    2012-01-01

    Short-term forecasting applied to wind energy is becoming increasingly important due to the constant growth of this renewable source, whose uncertainty requires a constant effort to meet the needs of the national electrical systems and their operators. In this regard, the probabilistic approach...... calibration performed on the wind speed EPS members allows an improvement from an over-confident situation observable from the rank histograms (in which the measurements fell almost always outside the bounds of the probability distribution) to a consistent ensemble spread. After that it is possible to convert...... the data to wind energy: the spread calculated on wind power can then be used as an accuracy predictor due to its level of correlation with the deterministic WPF error. In this presentation we investigate the performance for both wind power and accuracy prediction of the new EPS used at the ECMWF, whose...

  2. Distinguishing high and low flow domains in urban drainage systems 2 days ahead using numerical weather prediction ensembles

    DEFF Research Database (Denmark)

    Courdent, Vianney Augustin Thomas; Grum, Morten; Mikkelsen, Peter Steen

    2018-01-01

    Precipitation constitutes a major contribution to the flow in urban storm- and wastewater systems. Forecasts of the anticipated runoff flows, created from radar extrapolation and/or numerical weather predictions, can potentially be used to optimize operation in both wet and dry weather periods...... to transform the forecasted rainfall into forecasted flow series and evaluate three different approaches to establishing the relative operating characteristics (ROC) diagram of the forecast, which is a plot of POD against POFD for each fraction of concordant ensemble members and can be used to select...... itself from earlier research in being the first application to urban hydrology, with fast runoff and small catchments that are highly sensitive to local extremes. Furthermore, no earlier reference has been found on the highly efficient third approach using only neighbouring cells with the highest threat...
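
    The ROC points mentioned in this abstract, POD against POFD for each fraction of concordant ensemble members, can be sketched as follows. This is an illustrative reading with hypothetical names, not one of the three approaches evaluated in the study. It assumes a warning is issued when at least k of the members forecast the event, and that the sample contains both observed events and non-events.

```python
def roc_points(member_forecasts, observed):
    """Return (fraction, POFD, POD) for each required fraction of members.

    member_forecasts[t] is a list of booleans, one per ensemble member,
    stating whether that member forecasts the event in case t.
    observed[t] states whether the event actually occurred in case t.
    """
    n_members = len(member_forecasts[0])
    points = []
    for k in range(1, n_members + 1):
        hits = misses = false_alarms = correct_negatives = 0
        for members, event in zip(member_forecasts, observed):
            warn = sum(members) >= k  # at least k concordant members
            if warn and event:
                hits += 1
            elif event:
                misses += 1
            elif warn:
                false_alarms += 1
            else:
                correct_negatives += 1
        pod = hits / (hits + misses)
        pofd = false_alarms / (false_alarms + correct_negatives)
        points.append((k / n_members, pofd, pod))
    return points
```

    Requiring more concordant members lowers both POD and POFD, so the choice of fraction trades missed events against false alarms.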

  3. An assessment of the ECMWF tropical cyclone ensemble forecasting system and its use for insurance loss predictions

    Science.gov (United States)

    Aemisegger, F.; Martius, O.; Wüest, M.

    2010-09-01

    Tropical cyclones (TC) are amongst the most impressive and destructive weather systems of Earth's atmosphere. The costs related to such intense natural disasters have been rising in recent years and may potentially continue to increase in the near future due to changes in magnitude, timing, duration or location of tropical storms. This is a challenging situation for numerical weather prediction, which should provide a decision basis for short term protective measures through high quality medium range forecasts on the one hand. On the other hand, the insurance system bears great responsibility in elaborating proactive plans in order to face these extreme events that individuals cannot manage independently. Real-time prediction and early warning systems are needed in the insurance sector in order to face an imminent hazard and minimise losses. Early loss estimates are important in order to allocate capital and to communicate to investors. The ECMWF TC identification algorithm delivers information on the track and intensity of storms based on the ensemble forecasting system. This provides a physically based framework to assess the uncertainty in the forecast of a specific event. The performance of the ECMWF TC ensemble forecasts is evaluated in terms of cyclone intensity and location in this study and the value of such a physically-based quantification of uncertainty in the meteorological forecast for the estimation of insurance losses is assessed. An evaluation of track and intensity forecasts of hurricanes in the North Atlantic during the years 2005 to 2009 is carried out. Various effects are studied like the differences in forecasts over land or sea, as well as links between storm intensity and forecast error statistics. The value of the ECMWF TC forecasting system for the global re-insurer Swiss Re was assessed by performing insurance loss predictions using their in-house loss model for several case studies of particularly devastating events. 
The generally known

  4. Wind Power Prediction using Ensembles

    DEFF Research Database (Denmark)

    Giebel, Gregor; Badger, Jake; Landberg, Lars

    2005-01-01

    offshore wind farm and the whole Jutland/Funen area. The utilities used these forecasts for maintenance planning, fuel consumption estimates and over-the-weekend trading on the Leipzig power exchange. Other notable scientific results include the better accuracy of forecasts made up from a simple...... superposition of two NWP providers (in our case, DMI and DWD), an investigation of the merits of a parameterisation of the turbulent kinetic energy within the delivered wind speed forecasts, and the finding that a “naïve” downscaling of each of the coarse ECMWF ensemble members with higher resolution HIRLAM did...

  5. A new ensemble model for short term wind power prediction

    DEFF Research Database (Denmark)

    Madsen, Henrik; Albu, Razvan-Daniel; Felea, Ioan

    2012-01-01

    In this study, a non-linear ensemble system is used to develop a new model for predicting wind speed on short time scales. Short-term wind power prediction has become an extremely important field of research for the energy sector. Regardless of the recent advancements in the research...... of prediction models, it was observed that different models have different capabilities and that no single model is suitable under all situations. The idea behind EPS (ensemble prediction systems) is to take advantage of the unique features of each subsystem to capture diverse patterns that exist in the dataset...

  6. Analysis of the regional MiKlip decadal prediction system over Europe: skill, added value of regionalization, and ensemble size dependency

    Science.gov (United States)

    Reyers, Mark; Moemken, Julia; Pinto, Joaquim; Feldmann, Hendrik; Kottmeier, Christoph; MiKlip Module-C Team

    2017-04-01

    Decadal climate predictions can provide a useful basis for decision making support systems for the public and private sectors. Several generations of decadal hindcasts and predictions have been generated throughout the German research program MiKlip. Together with the global climate predictions computed with MPI-ESM, the regional climate model (RCM) COSMO-CLM is used for regional downscaling by MiKlip Module-C. The RCMs provide climate information on spatial and temporal scales closer to the needs of potential users. In this study, two downscaled hindcast generations are analysed (named b0 and b1). The respective global generations are both initialized by nudging them towards different reanalysis anomaly fields. An ensemble of five starting years (1961, 1971, 1981, 1991, and 2001), each comprising ten ensemble members, is used for both generations in order to quantify the regional decadal prediction skill for precipitation and near-surface temperature and wind speed over Europe. All datasets (including hindcasts, observations, reanalysis, and historical MPI-ESM runs) are pre-processed in an analogous manner by (i) removing the long-term trend and (ii) re-gridding to a common grid. Our analysis shows that there is potential for skillful decadal predictions over Europe in the regional MiKlip ensemble, but the skill is not systematic and depends on the PRUDENCE region and the variable. Further, the differences between the two hindcast generations are mostly small. As we used detrended time series, the predictive skill found in our study can probably be attributed to reasonable predictions of anomalies which are associated with the natural climate variability. In a sensitivity study, it is shown that the results may strongly change when the long-term trend is kept in the datasets, as here the skill of predicting the long-term trend (e.g. for temperature) also plays a major role.
The regionalization of the global ensemble provides an added value for decadal predictions for

  7. Three-model ensemble wind prediction in southern Italy

    Science.gov (United States)

    Torcasio, Rosa Claudia; Federico, Stefano; Calidonna, Claudia Roberta; Avolio, Elenio; Drofa, Oxana; Landi, Tony Christian; Malguzzi, Piero; Buzzi, Andrea; Bonasoni, Paolo

    2016-03-01

    Quality of wind prediction is of great importance since a good wind forecast allows the prediction of available wind power, improving the penetration of renewable energies into the energy market. Here, a 1-year (1 December 2012 to 30 November 2013) three-model ensemble (TME) experiment for wind prediction is considered. The models employed, run operationally at National Research Council - Institute of Atmospheric Sciences and Climate (CNR-ISAC), are RAMS (Regional Atmospheric Modelling System), BOLAM (BOlogna Limited Area Model), and MOLOCH (MOdello LOCale in H coordinates). The area considered for the study is southern Italy and the measurements used for the forecast verification are those of the GTS (Global Telecommunication System). Comparison with observations is made every 3 h up to 48 h of forecast lead time. Results show that the three-model ensemble outperforms the forecast of each individual model. The RMSE improvement compared to the best model is between 22 and 30 %, depending on the season. It is also shown that the three-model ensemble outperforms the IFS (Integrated Forecasting System) of the ECMWF (European Centre for Medium-Range Weather Forecast) for the surface wind forecasts. Notably, the three-model ensemble forecast performs better than each unbiased model, showing the added value of the ensemble technique. Finally, the sensitivity of the three-model ensemble RMSE to the length of the training period is analysed.
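
    The abstract does not state how the three models are combined. As a rough sketch under stated assumptions, the code below removes each model's mean bias over a training period and then takes an equal-weight average. Both choices are assumptions made for illustration, and the function names are hypothetical.

```python
import math


def rmse(pred, obs):
    """Root-mean-square error of a forecast series."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def mean_bias(pred, obs):
    """Average forecast-minus-observation difference over a training period."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def multi_model_ensemble(train_preds, train_obs, test_preds):
    """Bias-correct each model on the training period, then average.

    train_preds[k] and test_preds[k] are the forecast series of model k.
    Assumption: equal weights; the original study may weight differently.
    """
    biases = [mean_bias(p, train_obs) for p in train_preds]
    unbiased = [[v - b for v in series]
                for series, b in zip(test_preds, biases)]
    return [sum(values) / len(values) for values in zip(*unbiased)]


def rmse_improvement(ensemble_rmse, best_model_rmse):
    """Relative RMSE reduction of the ensemble against the best single model."""
    return (best_model_rmse - ensemble_rmse) / best_model_rmse
```

    Varying the length of the training series passed to `multi_model_ensemble` is one way to probe the sensitivity to the training period that the abstract mentions.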

  8. A sequence-based dynamic ensemble learning system for protein ligand-binding site prediction

    KAUST Repository

    Chen, Peng

    2015-12-03

    Background: Proteins have the fundamental ability to selectively bind to other molecules and perform specific functions through such interactions, such as protein-ligand binding. Accurate prediction of protein residues that physically bind to ligands is important for drug design and protein docking studies. Most of the successful protein-ligand binding predictions were based on known structures. However, structural information is not largely available in practice due to the huge gap between the number of known protein sequences and that of experimentally solved structures

  9. A sequence-based dynamic ensemble learning system for protein ligand-binding site prediction

    KAUST Repository

    Chen, Peng; Hu, ShanShan; Zhang, Jun; Gao, Xin; Li, Jinyan; Xia, Junfeng; Wang, Bing

    2015-01-01

    Background: Proteins have the fundamental ability to selectively bind to other molecules and perform specific functions through such interactions, such as protein-ligand binding. Accurate prediction of protein residues that physically bind to ligands is important for drug design and protein docking studies. Most of the successful protein-ligand binding predictions were based on known structures. However, structural information is not largely available in practice due to the huge gap between the number of known protein sequences and that of experimentally solved structures

  10. A Hybrid Computer-aided-diagnosis System for Prediction of Breast Cancer Recurrence (HPBCR) Using Optimized Ensemble Learning

    Directory of Open Access Journals (Sweden)

    Mohammad R. Mohebian

    Cancer is a collection of diseases that involves growing abnormal cells with the potential to invade or spread to other parts of the body. Breast cancer is the second leading cause of cancer death among women. A method for 5-year breast cancer recurrence prediction is presented in this manuscript. Clinicopathologic characteristics of 579 breast cancer patients (recurrence prevalence of 19.3%) were analyzed and discriminative features were selected using statistical feature selection methods. They were further refined by Particle Swarm Optimization (PSO) as the inputs of the classification system with ensemble learning (Bagged Decision Tree: BDT). The proper combination of selected categorical features and also the weight (importance) of the selected interval-measurement-scale features were identified by the PSO algorithm. The performance of HPBCR (hybrid predictor of breast cancer recurrence) was assessed using the holdout and 4-fold cross-validation. Three other classifiers, namely support vector machines, DT, and multilayer perceptron neural network, were used for comparison. The selected features were diagnosis age, tumor size, lymph node involvement ratio, number of involved axillary lymph nodes, progesterone receptor expression, having hormone therapy and type of surgery. The minimum sensitivity, specificity, precision and accuracy of HPBCR were 77%, 93%, 95% and 85%, respectively in the entire cross-validation folds and the hold-out test fold. HPBCR outperformed the other tested classifiers. It showed excellent agreement with the gold standard (i.e. the oncologist opinion after blood tumor marker and imaging tests, and tissue biopsy). This algorithm is thus a promising online tool for the prediction of breast cancer recurrence. Keywords: Breast cancer, Cancer recurrence, Computer-assisted diagnosis, Machine learning, Prognosis

  11. A Hybrid Computer-aided-diagnosis System for Prediction of Breast Cancer Recurrence (HPBCR) Using Optimized Ensemble Learning.

    Science.gov (United States)

    Mohebian, Mohammad R; Marateb, Hamid R; Mansourian, Marjan; Mañanas, Miguel Angel; Mokarian, Fariborz

    2017-01-01

    Cancer is a collection of diseases that involves growing abnormal cells with the potential to invade or spread to other parts of the body. Breast cancer is the second leading cause of cancer death among women. A method for 5-year breast cancer recurrence prediction is presented in this manuscript. Clinicopathologic characteristics of 579 breast cancer patients (recurrence prevalence of 19.3%) were analyzed and discriminative features were selected using statistical feature selection methods. They were further refined by Particle Swarm Optimization (PSO) as the inputs of the classification system with ensemble learning (Bagged Decision Tree: BDT). The proper combination of selected categorical features and also the weight (importance) of the selected interval-measurement-scale features were identified by the PSO algorithm. The performance of HPBCR (hybrid predictor of breast cancer recurrence) was assessed using the holdout and 4-fold cross-validation. Three other classifiers, namely support vector machines, DT, and multilayer perceptron neural network, were used for comparison. The selected features were diagnosis age, tumor size, lymph node involvement ratio, number of involved axillary lymph nodes, progesterone receptor expression, having hormone therapy and type of surgery. The minimum sensitivity, specificity, precision and accuracy of HPBCR were 77%, 93%, 95% and 85%, respectively in the entire cross-validation folds and the hold-out test fold. HPBCR outperformed the other tested classifiers. It showed excellent agreement with the gold standard (i.e. the oncologist opinion after blood tumor marker and imaging tests, and tissue biopsy). This algorithm is thus a promising online tool for the prediction of breast cancer recurrence.
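
    The four performance measures reported here follow directly from the confusion matrix, and the quoted minima are taken across folds. The sketch below shows the arithmetic only. The function names are hypothetical, labels are assumed to be 1 for recurrence and 0 otherwise, and every fold is assumed to contain both classes and at least one positive prediction.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, precision and accuracy for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def minimum_over_folds(fold_metrics):
    """Worst value of each metric across a list of per-fold metric dicts."""
    return {name: min(m[name] for m in fold_metrics)
            for name in fold_metrics[0]}
```

    Reporting the minimum rather than the mean across folds gives a conservative view of performance.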

  12. The Hydrologic Ensemble Prediction Experiment (HEPEX)

    Science.gov (United States)

    Wood, Andy; Wetterhall, Fredrik; Ramos, Maria-Helena

    2015-04-01

    The Hydrologic Ensemble Prediction Experiment was established in March 2004, at a workshop hosted by the European Centre for Medium-Range Weather Forecasts (ECMWF), and co-sponsored by the US National Weather Service (NWS) and the European Commission (EC). The HEPEX goal was to bring the international hydrological and meteorological communities together to advance the understanding and adoption of hydrological ensemble forecasts for decision support. HEPEX pursues this goal through research efforts and practical implementations involving six core elements of a hydrologic ensemble prediction enterprise: input and pre-processing, ensemble techniques, data assimilation, post-processing, verification, and communication and use in decision making. HEPEX has grown through meetings that connect the user, forecast producer and research communities to exchange ideas, data and methods; the coordination of experiments to address specific challenges; and the formation of testbeds to facilitate shared experimentation. In the last decade, HEPEX has organized over a dozen international workshops, as well as sessions at scientific meetings (including AMS, AGU and EGU) and special issues of scientific journals where workshop results have been published. Through these interactions and an active online blog (www.hepex.org), HEPEX has built a strong and active community of nearly 400 researchers and practitioners around the world. This poster presents an overview of recent and planned HEPEX activities, highlighting case studies that exemplify the focus and objectives of HEPEX.

  13. On the proper use of Ensembles for Predictive Uncertainty assessment

    Science.gov (United States)

    Todini, Ezio; Coccia, Gabriele; Ortiz, Enrique

    2015-04-01

    uncertainty of the ensemble mean and that of the ensemble spread. The results of this new approach are illustrated by using data and forecasts from an operational real time flood forecasting. Coccia, G. and Todini, E. 2011. Recent developments in predictive uncertainty assessment based on the Model Conditional Processor approach. Hydrology and Earth System Sciences, 15, 3253-3274. doi:10.5194/hess-15-3253-2011. Krzysztofowicz, R. 1999. Bayesian theory of probabilistic forecasting via deterministic hydrologic model, Water Resour. Res., 35, 2739-2750. Raftery, A. E., T. Gneiting, F. Balabdaoui, and M. Polakowski, 2005. Using Bayesian model averaging to calibrate forecast ensembles, Mon. Weather Rev., 133, 1155-1174. Reggiani, P., Renner, M., Weerts, A., and van Gelder, P., 2009. Uncertainty assessment via Bayesian revision of ensemble streamflow predictions in the operational river Rhine forecasting system, Water Resour. Res., 45, W02428, doi:10.1029/2007WR006758. Todini, E. 2004. Role and treatment of uncertainty in real-time flood forecasting. Hydrological Processes 18(14), 2743-2746. Todini, E. 2008. A model conditional processor to assess predictive uncertainty in flood forecasting. Intl. J. River Basin Management, 6(2): 123-137.

  14. Fingerprint prediction using classifier ensembles

    CSIR Research Space (South Africa)

    Molale, P

    2011-11-01

    ); logistic discrimination (LgD), k-nearest neighbour (k-NN), artificial neural network (ANN), association rules (AR), decision tree (DT), naive Bayes classifier (NBC) and the support vector machine (SVM). The performance of several multiple classifier systems...

  15. Development of a European Ensemble System for Seasonal Prediction: Application to crop yield

    Science.gov (United States)

    Terres, J. M.; Cantelaube, P.

    2003-04-01

    Western European agriculture is highly intensive and the weather is the main source of uncertainty for crop yield assessment and for crop management. In the current system, at the time when a crop yield forecast is issued, the weather conditions leading up to harvest time are unknown and are therefore a major source of uncertainty. The use of seasonal weather forecasts would bring additional information for the remaining crop season and would be of valuable benefit for improving the management of agricultural markets and environmentally sustainable farm practices. An innovative method for supplying seasonal forecast information to crop simulation models has been developed in the frame of the EU-funded research project DEMETER. It consists of running a crop model on each individual member of the seasonal hindcasts to derive a probability distribution of crop yield. Preliminary results for the cumulative probability function of wheat yield provide information on both the yield anomaly and the reliability of the forecast. Based on the spread of the probability distribution, the end-user can directly quantify the benefits and risks of taking weather-sensitive decisions.
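
    The member-by-member approach described in this abstract can be sketched as follows: a crop model is run once per ensemble member, and the resulting yields form an empirical distribution. The crop model below is a hypothetical placeholder with made-up coefficients, not the model used in DEMETER, and each member is reduced to a single seasonal rainfall total purely for illustration.

```python
def toy_crop_model(seasonal_rain_mm):
    """Hypothetical stand-in for a crop simulation model (yield in t/ha)."""
    return max(0.0, 2.0 + 0.01 * seasonal_rain_mm)


def yield_distribution(members, crop_model):
    """Run the crop model on every ensemble member and sort the yields."""
    return sorted(crop_model(member) for member in members)


def cumulative_probability(yields, value):
    """Empirical probability that the yield does not exceed `value`."""
    return sum(y <= value for y in yields) / len(yields)


def spread(yields):
    """Range between the lowest and highest simulated yield."""
    return yields[-1] - yields[0]
```

    For example, `cumulative_probability(yields, threshold)` gives the share of members whose simulated yield falls at or below a threshold an end-user cares about.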

  16. The influence of the new ECMWF Ensemble Prediction System resolution on wind power forecast accuracy and uncertainty estimation

    DEFF Research Database (Denmark)

    Alessandrini, S.; Pinson, Pierre; Sperati, S.

    2011-01-01

    The importance of wind power forecasting (WPF) is nowadays commonly recognized because it represents a useful tool to reduce problems of grid integration and to facilitate energy trading. While prediction accuracy is fundamental to these purposes, it has also become clear...... by a recalibration procedure that allowed obtaining a more uniform distribution among the 51 intervals, making the ensemble spread large enough to include the observations. After that it was observed that the EPS power spread seemed to have enough correlation with the error calculated on the deterministic forecast...
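
    The distribution among intervals mentioned here is what a rank histogram measures: an ensemble of N members defines N + 1 intervals, so 51 intervals would correspond to 50 members. The sketch below counts how often the observation falls in each interval. It is a minimal illustration with a hypothetical function name, and ties between members and observations are not treated specially.

```python
def rank_histogram(ensembles, observations):
    """Count the rank of each observation within its forecast ensemble.

    ensembles[t] holds the member values for case t. The returned list has
    one more entry than there are members. A roughly flat histogram suggests
    a consistent spread; a U shape suggests an over-confident ensemble.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        rank = sum(m < obs for m in members)  # members below the observation
        counts[rank] += 1
    return counts
```

    Observations piling up in the first and last intervals correspond to measurements that fall outside the ensemble range, which is the situation recalibration aims to correct.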

  17. Towards a GME ensemble forecasting system: Ensemble initialization using the breeding technique

    Directory of Open Access Journals (Sweden)

    Jan D. Keller

    2008-12-01

    The quantitative forecast of precipitation requires a probabilistic background particularly with regard to forecast lead times of more than 3 days. As only ensemble simulations can provide useful information of the underlying probability density function, we built a new ensemble forecasting system (GME-EFS) based on the GME model of the German Meteorological Service (DWD). For the generation of appropriate initial ensemble perturbations we chose the breeding technique developed by Toth and Kalnay (1993, 1997), which develops perturbations by estimating the regions of largest model error induced uncertainty. This method is applied and tested in the framework of quasi-operational forecasts for a three month period in 2007. The performance of the resulting ensemble forecasts is compared to the operational ensemble prediction systems ECMWF EPS and NCEP GFS by means of ensemble spread of free atmosphere parameters (geopotential and temperature) and ensemble skill of precipitation forecasting. This comparison indicates that the GME ensemble forecasting system (GME-EFS) provides reasonable forecasts with spread skill score comparable to that of the NCEP GFS. An analysis with the continuous ranked probability score exhibits a lack of resolution for the GME forecasts compared to the operational ensembles. However, with significant enhancements during the 3 month test period, the first results of our work with the GME-EFS indicate possibilities for further development as well as the potential for later operational usage.
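
    The breeding cycle can be illustrated on a toy system. The sketch below runs a control and a perturbed integration of the Lorenz-63 equations, then repeatedly rescales their difference to a fixed amplitude and adds it back to the control. The Lorenz-63 model, the amplitude and the cycle length are assumptions chosen for illustration and have nothing to do with the GME-EFS configuration. In an operational system the control would be replaced by a new analysis at each cycle.

```python
import math


def lorenz63_tendency(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def _shift(state, tendency, factor):
    return tuple(s + factor * t for s, t in zip(state, tendency))


def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step of the toy model."""
    k1 = lorenz63_tendency(state)
    k2 = lorenz63_tendency(_shift(state, k1, dt / 2))
    k3 = lorenz63_tendency(_shift(state, k2, dt / 2))
    k4 = lorenz63_tendency(_shift(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, n_steps, dt=0.01):
    for _ in range(n_steps):
        state = rk4_step(state, dt)
    return state


def norm(vector):
    return math.sqrt(sum(c * c for c in vector))


def breed(control, perturbation, amplitude=0.1, cycles=20, steps_per_cycle=8):
    """Return the final control state, the bred vector and growth factors."""
    scale = amplitude / norm(perturbation)
    perturbation = tuple(scale * p for p in perturbation)
    growth = []
    for _ in range(cycles):
        perturbed = tuple(c + p for c, p in zip(control, perturbation))
        control = integrate(control, steps_per_cycle)
        perturbed = integrate(perturbed, steps_per_cycle)
        difference = tuple(a - b for a, b in zip(perturbed, control))
        size = norm(difference)
        growth.append(size / amplitude)  # growth factor over this cycle
        # Rescale the grown difference back to the initial amplitude.
        perturbation = tuple(amplitude * d / size for d in difference)
    return control, perturbation, growth
```

    After several cycles the rescaled difference tends to align with the fastest-growing directions of the flow, which is why bred vectors are used as initial ensemble perturbations.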

  18. A hydro-meteorological ensemble prediction system for real-time flood forecasting purposes in the Milano area

    Science.gov (United States)

    Ravazzani, Giovanni; Amengual, Arnau; Ceppi, Alessandro; Romero, Romualdo; Homar, Victor; Mancini, Marco

    2015-04-01

    Analysis of forecasting strategies that can provide a tangible basis for flood early warning procedures and mitigation measures over the Western Mediterranean region is one of the fundamental motivations of the European HyMeX programme. Here, we examine a set of hydro-meteorological episodes that affected the Milano urban area, for which the complex flood protection system of the city did not completely succeed against the flash floods that occurred. Indeed, flood damages have exponentially increased in the area during the last 60 years, due to industrial and urban developments. Thus, the improvement of the Milano flood control system needs a synergism between structural and non-structural approaches. The flood forecasting system tested in this work comprises the Flash-flood Event-based Spatially distributed rainfall-runoff Transformation, including Water Balance (FEST-WB) and the Weather Research and Forecasting (WRF) models, in order to provide a hydrological ensemble prediction system (HEPS). Deterministic and probabilistic quantitative precipitation forecasts (QPFs) have been provided by the WRF model in a set of 48-hour experiments. The HEPS has been generated by combining different physical parameterizations (i.e. cloud microphysics, moist convection and boundary-layer schemes) of the WRF model in order to better encompass the atmospheric processes leading to high precipitation amounts. We have been able to test the value of a probabilistic versus a deterministic framework when driving Quantitative Discharge Forecasts (QDFs). Results highlight (i) the benefits of using a high-resolution HEPS in conveying uncertainties for this complex orographic area and (ii) a better simulation of most of the extreme precipitation events, potentially enabling valuable probabilistic QDFs. Hence, the HEPS copes with the significant deficiencies found in the deterministic QPFs. These shortcomings would prevent correct forecasting of the location and timing of high precipitation rates and

  19. Finding diversity for building one-day ahead Hydrological Ensemble Prediction System based on artificial neural network stacks

    Science.gov (United States)

    Brochero, Darwin; Anctil, Francois; Gagné, Christian; López, Karol

    2013-04-01

    In this study, we addressed the application of Artificial Neural Networks (ANN) in the context of Hydrological Ensemble Prediction Systems (HEPS). Such systems have become popular in the past years as a tool to include the forecast uncertainty in the decision making process. HEPS considers fundamentally the uncertainty cascade model [4] for uncertainty representation. Analogously, the machine learning community has proposed models of multiple classifier systems that take into account the variability in datasets, input space, model structures, and parametric configuration [3]. This approach is based primarily on the well-known "no free lunch theorem" [1]. Consequently, we propose a framework based on two separate but complementary topics: data stratification and input variable selection (IVS). Thus, we promote an ANN prediction stack in which each predictor is trained based on input spaces defined by the IVS application on different stratified sub-samples. All this, added to the inherent variability of classical ANN optimization, leads us to our ultimate goal: diversity in the prediction, defined as the complementarity of the individual predictors. The stratification application on the 12 basins used in this study, which originate from the second and third workshop of the MOPEX project [2], shows that the informativeness of the data is far more important than the quantity used for ANN training. Additionally, the input space variability leads to ANN stacks that outperform an ANN stack model trained with 100% of the available information but with a random selection of dataset used in the early stopping method (scenario R100P). The results show that from a deterministic view, the main advantage focuses on the efficient selection of the training information, which is an equally important concept for the calibration of conceptual hydrological models. On the other hand, the diversity achieved is reflected in a substantial improvement in the scores that define the

  20. An operational hydrological ensemble prediction system for the city of Zurich (Switzerland): skill, case studies and scenarios

    Directory of Open Access Journals (Sweden)

    N. Addor

    2011-07-01

    The Sihl River flows through Zurich, Switzerland's most populated city, for which it represents the largest flood threat. To anticipate extreme discharge events and provide decision support in case of flood risk, a hydrometeorological ensemble prediction system (HEPS) was launched operationally in 2008. This model chain relies on limited-area atmospheric forecasts provided by the deterministic model COSMO-7 and the probabilistic model COSMO-LEPS. These atmospheric forecasts are used to force a semi-distributed hydrological model (PREVAH), coupled to a hydraulic model (FLORIS). The resulting hydrological forecasts are eventually communicated to the stakeholders involved in the Sihl discharge management. This fully operational setting provides a real framework with which to compare the potential of deterministic and probabilistic discharge forecasts for flood mitigation.

    To study the suitability of HEPS for small-scale basins and to quantify the added value conveyed by the probability information, a reforecast was made for the period June 2007 to December 2009 for the Sihl catchment (336 km²). Several metrics support the conclusion that the performance gain can be up to 2 days of lead time for the catchment considered. Brier skill scores show that overall COSMO-LEPS-based hydrological forecasts outperform their COSMO-7-based counterparts for all the lead times and event intensities considered. The small size of the Sihl catchment does not prevent skillful discharge forecasts, but makes them particularly dependent on correct precipitation forecasts, as shown by comparisons with a reference run driven by observed meteorological parameters. Our evaluation stresses that the capacity of the model to provide confident and reliable mid-term probability forecasts for high discharges is limited. The two most intense events of the study period are investigated utilising a novel graphical representation of probability forecasts, and are used

  1. Global Ensemble Forecast System (GEFS) [1 Deg.]

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Global Ensemble Forecast System (GEFS) is a weather forecast model made up of 21 separate forecasts, or ensemble members. The National Centers for Environmental...

  2. Demonstrating the value of larger ensembles in forecasting physical systems

    Directory of Open Access Journals (Sweden)

    Reason L. Machete

    2016-12-01

    Ensemble simulation propagates a collection of initial states forward in time in a Monte Carlo fashion. Depending on the fidelity of the model and the properties of the initial ensemble, the goal of ensemble simulation can range from merely quantifying variations in the sensitivity of the model all the way to providing actionable probability forecasts of the future. Whatever the goal is, success depends on the properties of the ensemble, and there is a longstanding discussion in meteorology as to the size of initial condition ensemble most appropriate for Numerical Weather Prediction. In terms of resource allocation: how is one to divide finite computing resources between model complexity, ensemble size, data assimilation and other components of the forecast system? One wishes to avoid undersampling information available from the model's dynamics, yet one also wishes to use the highest fidelity model available. Arguably, a higher fidelity model can better exploit a larger ensemble; nevertheless it is often suggested that a relatively small ensemble, say ~16 members, is sufficient and that larger ensembles are not an effective investment of resources. This claim is shown to be dubious when the goal is probabilistic forecasting, even in settings where the forecast model is informative but imperfect. Probability forecasts for a ‘simple’ physical system are evaluated at different lead times; ensembles of up to 256 members are considered. The pure density estimation context (where ensemble members are drawn from the same underlying distribution as the target) differs from the forecasting context, where one is given a high fidelity (but imperfect) model. In the forecasting context, the information provided by additional members depends also on the fidelity of the model, the ensemble formation scheme (data assimilation), the ensemble interpretation and the nature of the observational noise. The effect of increasing the ensemble size is quantified by

  3. An operational ensemble prediction system for catchment rainfall over eastern Africa spanning multiple temporal and spatial scales

    Science.gov (United States)

    Riddle, E. E.; Hopson, T. M.; Gebremichael, M.; Boehnert, J.; Broman, D.; Sampson, K. M.; Rostkier-Edelstein, D.; Collins, D. C.; Harshadeep, N. R.; Burke, E.; Havens, K.

    2017-12-01

    While it is not yet certain how precipitation patterns will change over Africa in the future, it is clear that effectively managing the available water resources is going to be crucial in order to mitigate the effects of water shortages and floods that are likely to occur in a changing climate. One component of effective water management is the availability of state-of-the-art and easy-to-use rainfall forecasts across multiple spatial and temporal scales. We present a web-based system for displaying and disseminating ensemble forecast and observed precipitation data over central and eastern Africa. The system provides multi-model rainfall forecasts integrated to relevant hydrological catchments for timescales ranging from one day to three months. A zoom-in feature is available to access high resolution forecasts for small-scale catchments. Time series plots and data downloads with forecasts, recent rainfall observations and climatological data are available by clicking on individual catchments. The forecasts are calibrated using a quantile regression technique and an optimal multi-model forecast is provided at each timescale. The forecast skill at the various spatial and temporal scales will be discussed, as will current applications of this tool for managing water resources in Sudan and optimizing hydropower operations in Ethiopia and Tanzania.

  4. Urban runoff forecasting with ensemble weather predictions

    DEFF Research Database (Denmark)

    Pedersen, Jonas Wied; Courdent, Vianney Augustin Thomas; Vezzaro, Luca

    This research shows how ensemble weather forecasts can be used to generate urban runoff forecasts up to 53 hours into the future. The results highlight systematic differences between ensemble members that need to be accounted for when these forecasts are used in practice.

  5. Singular vectors, predictability and ensemble forecasting for weather and climate

    International Nuclear Information System (INIS)

    Palmer, T N; Zanna, Laure

    2013-01-01

    The local instabilities of a nonlinear dynamical system can be characterized by the leading singular vectors of its linearized operator. The leading singular vectors are perturbations with the greatest linear growth and are therefore key in assessing the system’s predictability. In this paper, the analysis of singular vectors for the predictability of weather and climate and ensemble forecasting is discussed. An overview of the role of singular vectors in informing about the error growth rate in numerical models of the atmosphere is given. This is followed by their use in the initialization of ensemble weather forecasts. Singular vectors for the ocean and coupled ocean–atmosphere system in order to understand the predictability of climate phenomena such as ENSO and meridional overturning circulation are reviewed and their potential use to initialize seasonal and decadal forecasts is considered. As stochastic parameterizations are being implemented, some speculations are made about the future of singular vectors for the predictability of weather and climate for theoretical applications and at the operational level. This article is part of a special issue of Journal of Physics A: Mathematical and Theoretical devoted to ‘Lyapunov analysis: from dynamical systems theory to applications’. (review)
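
    The statement that leading singular vectors are the perturbations with the greatest linear growth can be illustrated with a minimal sketch. The 2x2 matrix `M` below is a hypothetical linear propagator chosen for illustration (it is not taken from the paper), and power iteration on M^T M is just one simple way to obtain the leading right singular vector.

```python
import math


def leading_singular_vector(M, iters=200):
    """Power iteration on M^T M; returns (sigma, v) with |v| = 1."""
    rows, cols = len(M), len(M[0])
    v = [1.0 / math.sqrt(cols)] * cols
    for _ in range(iters):
        # w = M v, then u = M^T w, i.e. one application of M^T M
        w = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        u = [sum(M[i][j] * w[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in u))
        v = [x / norm for x in u]
    # The growth factor of the unit perturbation v is |M v|
    Mv = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in Mv))
    return sigma, v


# Hypothetical non-normal propagator: both eigenvalues (1.0 and 0.5) are
# non-growing, yet some perturbations are strongly amplified.
M = [[1.0, 5.0],
     [0.0, 0.5]]

sigma, v = leading_singular_vector(M)
print("largest growth factor:", round(sigma, 3))
print("fastest-growing unit perturbation:", [round(x, 3) for x in v])
```

    In this toy case no eigenvector grows, but the leading singular vector is amplified roughly fivefold, which is why singular vectors rather than eigenvectors are the natural tool for finite-time error growth.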

  6. Ensemble ecosystem modeling for predicting ecosystem response to predator reintroduction.

    Science.gov (United States)

    Baker, Christopher M; Gordon, Ascelin; Bode, Michael

    2017-04-01

    Introducing a new or extirpated species to an ecosystem is risky, and managers need quantitative methods that can predict the consequences for the recipient ecosystem. Proponents of keystone predator reintroductions commonly argue that the presence of the predator will restore ecosystem function, but this has not always been the case, and mathematical modeling has an important role to play in predicting how reintroductions will likely play out. We devised an ensemble modeling method that integrates species interaction networks and dynamic community simulations and used it to describe the range of plausible consequences of 2 keystone-predator reintroductions: wolves (Canis lupus) to Yellowstone National Park and dingoes (Canis dingo) to a national park in Australia. Although previous methods for predicting ecosystem responses to such interventions focused on predicting changes around a given equilibrium, we used Lotka-Volterra equations to predict changing abundances through time. We applied our method to interaction networks for wolves in Yellowstone National Park and for dingoes in Australia. Our model replicated the observed dynamics in Yellowstone National Park and produced a larger range of potential outcomes for the dingo network. However, we also found that changes in small vertebrates or invertebrates gave a good indication about the potential future state of the system. Our method allowed us to predict when the systems were far from equilibrium. Our results showed that the method can also be used to predict which species may increase or decrease following a reintroduction and can identify species that are important to monitor (i.e., species whose changes in abundance give extra insight into broad changes in the system). Ensemble ecosystem modeling can also be applied to assess the ecosystem-wide implications of other types of interventions including assisted migration, biocontrol, and invasive species eradication. © 2016 Society for Conservation Biology.
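
    The idea of combining an interaction network with Lotka-Volterra simulations over an ensemble of parameter sets can be sketched as follows. Everything specific here is an assumption made for illustration: the three-species network, the uniform parameter ranges, the forward-Euler integrator and the function names are hypothetical and are not the authors' implementation.

```python
import random

# Hypothetical sign structure: SIGNS[i][j] is the sign of the effect of
# species j on species i (0 = apex predator, 1 = mesopredator, 2 = prey).
SIGNS = [
    [-1, +1, 0],
    [-1, -1, +1],
    [0, -1, -1],
]
GROWTH_SIGNS = [-1, -1, +1]  # predators decline without food, prey grows


def sample_parameters(rng):
    """Draw one parameter set that respects the sign structure."""
    r = [s * rng.uniform(0.1, 1.0) for s in GROWTH_SIGNS]
    a = [[s * rng.uniform(0.1, 1.0) for s in row] for row in SIGNS]
    return r, a


def simulate(r, a, n0, dt=0.01, steps=2000):
    """Forward-Euler integration of dN_i/dt = N_i * (r_i + sum_j a_ij N_j)."""
    n = list(n0)
    k = len(n)
    for _ in range(steps):
        rates = [r[i] + sum(a[i][j] * n[j] for j in range(k)) for i in range(k)]
        n = [max(0.0, n[i] + dt * n[i] * rates[i]) for i in range(k)]
    return n


def run_ensemble(n_members=50, seed=0):
    """Simulate every ensemble member from the same initial abundances."""
    rng = random.Random(seed)
    n0 = [0.1, 1.0, 1.0]  # small founding population of the apex predator
    finals = [simulate(*sample_parameters(rng), n0) for _ in range(n_members)]
    return n0, finals


n0, finals = run_ensemble()
for i, name in enumerate(["apex predator", "mesopredator", "prey"]):
    frac_up = sum(f[i] > n0[i] for f in finals) / len(finals)
    print(f"{name}: increased in {frac_up:.0%} of ensemble members")
```

    Reporting the fraction of members in which each species increases is one simple way to summarise the range of plausible outcomes; a fuller treatment would also filter parameter sets for feasibility and stability.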

  7. Using synchronization in multi-model ensembles to improve prediction

    Science.gov (United States)

    Hiemstra, P.; Selten, F.

    2012-04-01

    In recent decades, many climate models have been developed to understand and predict the behavior of the Earth's climate system. Although these models are all based on the same basic physical principles, they still show different behavior. This is for example caused by the choice of how to parametrize sub-grid scale processes. One method to combine these imperfect models is to run a multi-model ensemble. The models are given identical initial conditions and are integrated forward in time. A multi-model estimate can for example be a weighted mean of the ensemble members. We propose to go a step further, and try to obtain synchronization between the imperfect models by connecting the multi-model ensemble and exchanging information. The combined multi-model ensemble is also known as a supermodel. The supermodel has learned from observations how to optimally exchange information between the ensemble members. In this study we focused on the density and formulation of the connections within the supermodel. The main question was whether we could obtain synchronization between two climate models when connecting only a subset of their state spaces. Limiting the connected subspace has two advantages: 1) it limits the transfer of data (bytes) between the ensemble members, which can be a limiting factor in large scale climate models, and 2) learning the optimal connection strategy from observations is easier. To answer the research question, we connected two identical quasi-geostrophic (QG) atmospheric models to each other, where the models have different initial conditions. The QG model is a qualitatively realistic simulation of the winter flow on the Northern hemisphere, has three layers and uses a spectral implementation. We connected the models in the original spherical harmonic state space, and in linear combinations of these spherical harmonics, i.e. Empirical Orthogonal Functions (EOFs). We show that when connecting through spherical harmonics, we only need to connect 28% of

  8. River Flow Prediction Using the Nearest Neighbor Probabilistic Ensemble Method

    Directory of Open Access Journals (Sweden)

    H. Sanikhani

    2016-02-01

    Introduction: In recent years, researchers have become interested in probabilistic forecasting of hydrologic variables such as river flow. A probabilistic approach aims at quantifying the prediction reliability through a probability distribution function or a prediction interval for the unknown future value. The evaluation of the uncertainty associated with the forecast is seen as fundamental information, not only to correctly assess the prediction, but also to compare forecasts from different methods and to evaluate actions and decisions conditionally on the expected values. Several probabilistic approaches have been proposed in the literature, including (1) methods that use resampling techniques to assess parameter and model uncertainty, such as the Metropolis algorithm or the Generalized Likelihood Uncertainty Estimation (GLUE) methodology for an application to runoff prediction, (2) methods based on processing the forecast errors of past data to produce the probability distributions of future values and (3) methods that evaluate how the uncertainty propagates from the rainfall forecast to the river discharge prediction, such as the Bayesian forecasting system. Materials and Methods: In this study, two different probabilistic methods are used for river flow prediction. Then the uncertainty related to the forecast is quantified. One approach is based on linear predictors and the other on nearest neighbors. The nonlinear probabilistic ensemble can be used for nonlinear time series analysis using locally linear predictors, while NNPE utilizes a method adapted for one-step-ahead nearest neighbor methods. In this regard, daily river discharge data (twelve years) from the Dizaj and Mashin stations, on the Baranduz-Chay basin in West Azerbaijan province and the Zard River basin in Khouzestan province, respectively, were used. The first six years of data were used for fitting the model, the next three years for calibration, and the remaining three years for testing the models

  9. SVM and SVM Ensembles in Breast Cancer Prediction.

    Science.gov (United States)

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.

  10. SVM and SVM Ensembles in Breast Cancer Prediction.

    Directory of Open Access Journals (Sweden)

    Min-Wei Huang

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers.

  11. Can decadal climate predictions be improved by ocean ensemble dispersion filtering?

    Science.gov (United States)

    Kadow, C.; Illing, S.; Kröner, I.; Ulbrich, U.; Cubasch, U.

    2017-12-01

    Decadal predictions by Earth system models aim to capture the state and phase of the climate several years in advance. Atmosphere-ocean interaction plays an important role for such climate forecasts. While short-term weather forecasts represent an initial value problem and long-term climate projections represent a boundary condition problem, the decadal climate prediction falls in between these two time scales. The ocean memory due to its heat capacity holds big potential skill on the decadal scale. In recent years, more precise initialization techniques of coupled Earth system models (incl. atmosphere and ocean) have improved decadal predictions. Ensembles are another important aspect. Applying slightly perturbed predictions results in an ensemble. Using and evaluating the whole ensemble or its ensemble average, instead of a single prediction, improves a prediction system. However, climate models in general start losing the initialized signal and its predictive skill from one forecast year to the next. Here we show that the climate prediction skill of an Earth system model can be improved by a shift of the ocean state toward the ensemble mean of its individual members at seasonal intervals. We found that this procedure, called ensemble dispersion filter, results in more accurate results than the standard decadal prediction. Global mean and regional temperature, precipitation, and winter cyclone predictions show an increased skill up to 5 years ahead. Furthermore, the novel technique outperforms predictions with larger ensembles and higher resolution. Our results demonstrate how decadal climate predictions benefit from ocean ensemble dispersion filtering toward the ensemble mean. This study is part of MiKlip (fona-miklip.de) - a major project on decadal climate prediction in Germany. We focus on the Max-Planck-Institute Earth System Model using the low-resolution version (MPI-ESM-LR) and MiKlip's basic initialization strategy as in 2017 published decadal climate forecast: http

  12. Ocean Predictability and Uncertainty Forecasts Using Local Ensemble Transfer Kalman Filter (LETKF)

    Science.gov (United States)

    Wei, M.; Hogan, P. J.; Rowley, C. D.; Smedstad, O. M.; Wallcraft, A. J.; Penny, S. G.

    2017-12-01

    Ocean predictability and uncertainty are studied with an ensemble system that has been developed based on the US Navy's operational HYCOM using the Local Ensemble Transfer Kalman Filter (LETKF) technology. One of the advantages of this method is that the best possible initial analysis states for the HYCOM forecasts are provided by the LETKF, which assimilates operational observations using an ensemble method. The background covariance during this assimilation process is implicitly supplied by the ensemble, avoiding the difficult task of developing tangent linear and adjoint models out of HYCOM with the complicated hybrid isopycnal vertical coordinate for 4D-VAR. The flow-dependent background covariance from the ensemble will be an indispensable part of the next generation hybrid 4D-Var/ensemble data assimilation system. The predictability and uncertainty for the ocean forecasts are studied initially for the Gulf of Mexico. The results are compared with another ensemble system using the Ensemble Transfer (ET) method, which has been used in the Navy's operational center. The advantages and disadvantages are discussed.

  13. Skill forecasting from different wind power ensemble prediction methods

    International Nuclear Information System (INIS)

    Pinson, Pierre; Nielsen, Henrik A; Madsen, Henrik; Kariniotakis, George

    2007-01-01

    This paper presents an investigation of alternative approaches to providing uncertainty estimates associated with point predictions of wind generation. Focus is given to skill forecasts in the form of prediction risk indices, aiming at giving a comprehensive signal on the expected level of forecast uncertainty. Ensemble predictions of wind generation are used as input. A proposal for the definition of prediction risk indices is given. Such skill forecasts are based on the dispersion of ensemble members for a single prediction horizon, or over a set of successive look-ahead times. It is shown on the test case of a Danish offshore wind farm how prediction risk indices may be related to several levels of forecast uncertainty (and energy imbalances). Wind power ensemble predictions are derived from the transformation of ECMWF and NCEP ensembles of meteorological variables to power, as well as by a lagged average approach alternative. The ability of risk indices calculated from the various types of ensemble forecasts to resolve among situations with different levels of uncertainty is discussed.
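
    A dispersion-based index of the kind described above could look roughly like the sketch below. The abstract does not give the exact definition, so the use of the standard deviation as the dispersion measure, the plain average over look-ahead times, the function names and the sample numbers are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def spread_per_horizon(ensemble):
    """Standard deviation across members at each look-ahead time.

    `ensemble` is a list of members; each member is a list of normalised
    power forecasts, one value per look-ahead time.
    """
    return [pstdev(values) for values in zip(*ensemble)]


def risk_index(ensemble, start=0, stop=None):
    """Average ensemble spread over the look-ahead times [start, stop)."""
    return mean(spread_per_horizon(ensemble)[start:stop])


# Hypothetical 3-member ensemble over 3 look-ahead times (fraction of capacity)
ensemble = [
    [0.40, 0.50, 0.60],
    [0.40, 0.45, 0.30],
    [0.40, 0.55, 0.90],
]

spreads = spread_per_horizon(ensemble)
print("spread per look-ahead time:", [round(s, 3) for s in spreads])
print("index for a single horizon:", round(risk_index(ensemble, 2, 3), 3))
print("index over all horizons:", round(risk_index(ensemble), 3))
```

    A larger index signals that members disagree more, which is the situation in which a larger point-forecast error would be expected.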

  14. An ensemble classifier to predict track geometry degradation

    International Nuclear Information System (INIS)

    Cárdenas-Gallo, Iván; Sarmiento, Carlos A.; Morales, Gilberto A.; Bolivar, Manuel A.; Akhavan-Tabatabaei, Raha

    2017-01-01

    Railway operations are inherently complex and a source of several problems. In particular, track geometry defects are one of the leading causes of train accidents in the United States. This paper presents a solution approach which entails the construction of an ensemble classifier to forecast the degradation of track geometry. Our classifier is constructed by solving the problem from three different perspectives: deterioration, regression and classification. We considered a different model from each perspective and our results show that using an ensemble method improves the predictive performance. - Highlights: • We present an ensemble classifier to forecast the degradation of track geometry. • Our classifier considers three perspectives: deterioration, regression and classification. • We construct and test three models and our results show that using an ensemble method improves the predictive performance.

  15. Benchmarking ensemble streamflow prediction skill in the UK

    Science.gov (United States)

    Harrigan, Shaun; Prudhomme, Christel; Parry, Simon; Smith, Katie; Tanguy, Maliko

    2018-03-01

    ; correlation between catchment base flow index (BFI) and ESP skill was very strong (Spearman's rank correlation coefficient = 0.90 at 1-month lead time). This was in contrast to the more highly responsive catchments in the north and west which were generally not skilful at seasonal lead times. Overall, this work provides scientific justification for when and where use of such a relatively simple forecasting approach is appropriate in the UK. This study, furthermore, creates a low cost benchmark against which potential skill improvements from more sophisticated hydro-meteorological ensemble prediction systems can be judged.

  16. Skill of Global Raw and Postprocessed Ensemble Predictions of Rainfall over Northern Tropical Africa

    Science.gov (United States)

    Vogel, Peter; Knippertz, Peter; Fink, Andreas H.; Schlueter, Andreas; Gneiting, Tilmann

    2018-04-01

    Accumulated precipitation forecasts are of high socioeconomic importance for agriculturally dominated societies in northern tropical Africa. In this study, we analyze the performance of nine operational global ensemble prediction systems (EPSs) relative to climatology-based forecasts for 1 to 5-day accumulated precipitation based on the monsoon seasons 2007-2014 for three regions within northern tropical Africa. To assess the full potential of raw ensemble forecasts across spatial scales, we apply state-of-the-art statistical postprocessing methods in the form of Bayesian Model Averaging (BMA) and Ensemble Model Output Statistics (EMOS), and verify against station and spatially aggregated, satellite-based gridded observations. Raw ensemble forecasts are uncalibrated, unreliable, and underperform relative to climatology, independently of region, accumulation time, monsoon season, and ensemble. Differences between raw ensemble and climatological forecasts are large, and partly stem from poor prediction for low precipitation amounts. BMA and EMOS postprocessed forecasts are calibrated, reliable, and strongly improve on the raw ensembles, but - somewhat disappointingly - typically do not outperform climatology. Most EPSs exhibit slight improvements over the period 2007-2014, but overall have little added value compared to climatology. We suspect that the parametrization of convection is a potential cause for the sobering lack of ensemble forecast skill in a region dominated by mesoscale convective systems.
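
    Comparing an ensemble against climatology amounts to computing a skill score relative to a reference forecast. The abstract does not name the score used, so the sketch below assumes the continuous ranked probability score (CRPS) in its standard ensemble form; the rainfall numbers and function names are hypothetical.

```python
def ensemble_crps(members, obs):
    """CRPS of one ensemble forecast: E|X - y| - 0.5 * E|X - X'|."""
    m = len(members)
    term1 = sum(abs(x - obs) for x in members) / m
    term2 = sum(abs(x - y) for x in members for y in members) / (2 * m * m)
    return term1 - term2


def mean_crps(forecasts, observations):
    """Average CRPS over a set of forecast cases."""
    scores = [ensemble_crps(f, o) for f, o in zip(forecasts, observations)]
    return sum(scores) / len(scores)


def skill_score(forecasts, reference, observations):
    """1 is perfect, 0 matches the reference, negative is worse than it."""
    return 1.0 - mean_crps(forecasts, observations) / mean_crps(reference, observations)


# Hypothetical 3-day example, accumulated rainfall in mm
observations = [0.0, 12.0, 3.0]
climatology = [[0.0, 0.0, 2.0, 5.0, 20.0]] * 3  # same reference ensemble each day
raw_ensemble = [
    [4.0, 6.0, 9.0, 7.0, 5.0],
    [1.0, 2.0, 0.0, 3.0, 1.0],
    [8.0, 9.0, 7.0, 10.0, 6.0],
]

print("CRPS skill vs climatology:",
      round(skill_score(raw_ensemble, climatology, observations), 2))
```

    A negative value in such a comparison would correspond to the situation the abstract describes, where the raw ensemble underperforms the climatological forecast.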

  17. Ensemble-based Regional Climate Prediction: Political Impacts

    Science.gov (United States)

    Miguel, E.; Dykema, J.; Satyanath, S.; Anderson, J. G.

    2008-12-01

    Accurate forecasts of regional climate, including temperature and precipitation, have significant implications for human activities, not just economically but socially. Sub Saharan Africa is a region that has displayed an exceptional propensity for devastating civil wars. Recent research in political economy has revealed a strong statistical relationship between year to year fluctuations in precipitation and civil conflict in this region in the 1980s and 1990s. To investigate how climate change may modify the regional risk of civil conflict in the future requires a probabilistic regional forecast that explicitly accounts for the community's uncertainty in the evolution of rainfall under anthropogenic forcing. We approach the regional climate prediction aspect of this question through the application of a recently demonstrated method called generalized scalar prediction (Leroy et al. 2009), which predicts arbitrary scalar quantities of the climate system. This prediction method can predict change in any variable or linear combination of variables of the climate system averaged over a wide range of spatial scales, from regional to hemispheric to global. Generalized scalar prediction utilizes an ensemble of model predictions to represent the community's uncertainty range in climate modeling in combination with a timeseries of any type of observational data that exhibits sensitivity to the scalar of interest. It is not necessary to prioritize models in deriving the final prediction. We present the results of the application of generalized scalar prediction for regional forecasts of temperature and precipitation in Sub Saharan Africa. We utilize the climate predictions along with the established statistical relationship between year-to-year rainfall variability and civil conflict in Sub Saharan Africa to investigate the potential impact of climate change on civil conflict within that region.

  18. Simplifying a hydrological ensemble prediction system with a backward greedy selection of members – Part 2: Generalization in time and space

    Directory of Open Access Journals (Sweden)

    D. Brochero

    2011-11-01

    An uncertainty cascade model applied to stream flow forecasting seeks to evaluate the different sources of uncertainty of the complex rainfall-runoff process. The current trend focuses on the combination of Meteorological Ensemble Prediction Systems (MEPS) and hydrological model(s). However, the number of members of such a HEPS may rapidly increase to a level that may not be operationally sustainable. This paper evaluates the generalization ability of a simplification scheme of an 800-member HEPS formed by the combination of 16 lumped rainfall-runoff models with the 50 perturbed members from the European Centre for Medium-range Weather Forecasts (ECMWF) EPS. Tests are made at two levels. At the local level, the transferability of the 9th day hydrological member selection for the other 8 forecast horizons exhibits an 82% success rate. The other evaluation is made at the regional or cluster level: the transferability from one catchment to another from within a cluster of watersheds also leads to a good performance (85% success rate), especially for forecast time horizons above 3 days and when the basins that formed the cluster themselves presented a good performance on an individual basis. Diversity, defined as hydrological model complementarity addressing different aspects of a forecast, was identified as the critical factor for proper selection applications.
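
    Backward greedy selection, as named in the title, starts from the full ensemble and repeatedly drops the member whose removal gives the best score for the remaining set. The sketch below assumes a deliberately simple criterion, the mean absolute error of the ensemble mean; the paper's actual selection criterion is not stated in the abstract, and the function names and data are hypothetical.

```python
def ensemble_mean_mae(members, obs):
    """Mean absolute error of the ensemble-mean forecast."""
    n = len(members)
    means = [sum(column) / n for column in zip(*members)]
    return sum(abs(m - o) for m, o in zip(means, obs)) / len(obs)


def backward_greedy_select(members, obs, keep):
    """Return the indices of the `keep` members left after greedy removal."""
    selected = list(range(len(members)))
    while len(selected) > keep:
        best_score, to_drop = None, None
        for idx in selected:
            trial = [members[i] for i in selected if i != idx]
            score = ensemble_mean_mae(trial, obs)
            # Keep track of the removal that leaves the best-scoring subset
            if best_score is None or score < best_score:
                best_score, to_drop = score, idx
        selected.remove(to_drop)
    return selected


# Hypothetical streamflow forecasts from 4 members at 3 time steps
obs = [1.0, 2.0, 3.0]
members = [
    [1.1, 2.1, 3.1],
    [0.9, 1.9, 2.9],
    [3.0, 4.0, 5.0],  # strongly biased member
    [1.0, 2.2, 2.8],
]

selected = backward_greedy_select(members, obs, keep=2)
print("members kept:", selected)
```

    Each pass evaluates every remaining member once, so reducing n members to k costs on the order of n^2 score evaluations, which is what makes a greedy search practical for an 800-member system where exhaustive subset search is not.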

  19. Predictability over the North Atlantic ocean in hindcast ensembles of MPI-ESM initialized by EnKF and three nudging systems

    Science.gov (United States)

    Brune, Sebastian; Pohlmann, Holger; Düsterhus, Andre; Kröger, Jürgen; Müller, Wolfgang; Baehr, Johanna

    2016-04-01

    We investigate hindcast skill for surface air temperature and upper ocean heat content (0-700m) in the North Atlantic for yearly mean values from 1960 to 2014 in four prediction systems based on the global coupled Max Planck Institute for Meteorology Earth System Model (MPI-ESM). We find that in the North Atlantic and within the four prediction systems under consideration only the EnKF initialized hindcasts reproduce the variability of the reference data well both in terms of anomaly correlation and representation of the probability density function. The systems under consideration only differ in how they incorporate surface and sub-surface oceanic temperatures and salinities during assimilation: ensemble Kalman Filter (EnKF), anomaly nudging of ORA reanalysis (BS-1), full field nudging of ORA and GECCO reanalysis, respectively (PT-ORA, PT-GEC). We assess the hindcast skill of each prediction system with reference to HadCRUT4 near surface air temperature data (Morice et al. 2012) and NOAA OC5 upper ocean heat content data (Levitus et al. 2012) using anomaly correlation (ACC) and by analysing the interquartile range (IQR) of the probability density function (PDF). Firstly, we calculate hindcast skill in terms of ACC and IQR against reference data over the whole time period. Here, the hindcast skills of EnKF and BS-1 are better for both ACC and IQR in lead years 2 to 5 when compared to PT-ORA and PT-GEC, whose hindcast skill drops off after lead year 1. Secondly, the PDF of the reference data is not uniformly distributed over time. We therefore calculate ACC and IQR for a 20 year moving window. We find hindcast skill in terms of ACC for EnKF and BS-1 in the 1960s and from the 1990s onwards, up to eight lead years in advance, with almost no skill for the time period in between. In contrast, there is no skill for PT-ORA and PT-GEC in any period after lead year one. The IQR of reference data is best captured by the EnKF, in the 1960s and 1990s up to lead year

  20. Skill forecasting from ensemble predictions of wind power

    DEFF Research Database (Denmark)

    Pinson, Pierre; Nielsen, Henrik Aalborg; Madsen, Henrik

    2009-01-01

    Optimal management and trading of wind generation calls for providing uncertainty estimates along with the commonly provided short-term wind power point predictions. Alternative approaches for the use of probabilistic forecasting are introduced. More precisely, focus is given to prediction...... risk indices aiming to give a comprehensive signal on the expected level of forecast uncertainty. Ensemble predictions of wind generation are used as input. A proposal for the definition of prediction risk indices is given. Such skill forecasts are based on the spread of ensemble forecasts (i.e. a set...... of alternative scenarios for the coming period) for a single prediction horizon or over a look-ahead period. It is shown on the test case of a Danish offshore wind farm how these prediction risk indices may be related to several levels of forecast uncertainty (and potential energy imbalances). Wind power...
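    The abstract above describes risk indices built from the spread of ensemble forecasts, either at a single horizon or over a look-ahead period, but does not give the exact definition. The following is only a minimal sketch under the assumption that spread is measured as the standard deviation of the ensemble members; the function names are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def ensemble_spread(members):
    """Spread of the ensemble members at one prediction horizon.

    Assumption: spread is the population standard deviation of the members.
    """
    return pstdev(members)


def lookahead_spread(members_by_horizon):
    """Average spread over a look-ahead period.

    members_by_horizon holds one list of member values per horizon.
    """
    return mean(ensemble_spread(m) for m in members_by_horizon)
```

    A larger value would signal a wider set of alternative scenarios and hence, under this assumption, a higher expected level of forecast uncertainty.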

  1. Explosive Ordnance Disposal (EOD) Ensembles: Biophysical Characteristics and Predicted Work Times With and Without Chemical Protection and Active Cooling Systems

    Science.gov (United States)

    2015-04-29

    Integrated groin protector (IGP), and Boot Protector); GORE lined leather combat boots; and NOMEX® gloves with Velcro ; and EOD9 full face helmet... effective heat removal or cooling capacity of the active cooling system could not be obtained on the manikin, reasonable estimates can be used to...Price MJ, & Oldroyd M. The effect of heat acclimation on thermal strain during explosives ordnance disposal (EOD) related activity in moderate and

  2. Ensemble Streamflow Prediction in Korea: Past and Future 5 Years

    Science.gov (United States)

    Jeong, D.; Kim, Y.; Lee, J.

    2005-05-01

    The Ensemble Streamflow Prediction (ESP) approach was first introduced in 2000 by the Hydrology Research Group (HRG) at Seoul National University as an alternative probabilistic forecasting technique for improving the 'Water Supply Outlook' that is issued every month by the Ministry of Construction and Transportation in Korea. That study motivated the Korea Water Resources Corporation (KOWACO) to establish their seasonal probabilistic forecasting system for the 5 major river basins using the ESP approach. In cooperation with the HRG, the KOWACO developed monthly optimal multi-reservoir operating systems for the Geum river basin in 2004, which coupled the ESP forecasts with an optimization model using sampling stochastic dynamic programming (SSDP). User interfaces for both ESP and SSDP have also been designed to make the developed computer systems more practical. Further projects to develop ESP systems for the other 3 major river basins (i.e. the Nakdong, Han and Seomjin river basins) were also completed by the HRG and KOWACO at the end of December 2004. Therefore, the ESP system has become the most important mid- and long-term streamflow forecast technique in Korea. In addition to the practical aspects, recent research experience with ESP has raised questions about ways of improving the accuracy of ESP in Korea. Jeong and Kim (2002) performed an error analysis on its resulting probabilistic forecasts and found that the modeling error is dominant in the dry season, while the meteorological error is dominant in the flood season. To address the first issue, Kim et al. (2004) tested various combinations and/or combining techniques and showed that the ESP probabilistic accuracy could be improved considerably during the dry season when the hydrologic models were combined and/or corrected. In addition, an attempt was also made to improve the ESP accuracy for the flood season using climate forecast information. This ongoing project handles three types of climate

  3. Development of a regional ensemble prediction method for probabilistic weather prediction

    International Nuclear Information System (INIS)

    Nohara, Daisuke; Tamura, Hidetoshi; Hirakuchi, Hiromaru

    2015-01-01

    A regional ensemble prediction method has been developed to provide probabilistic weather prediction using a numerical weather prediction model. To obtain perturbations consistent with the synoptic weather pattern, both the initial and the lateral boundary perturbations were taken from differences between the control and the ensemble members of the Japan Meteorological Agency (JMA)'s operational one-week ensemble forecast. The method provides multiple ensemble members with a horizontal resolution of 15 km for 48-hour forecasts, based on a downscaling of the JMA's operational global forecast combined with the perturbations. The ensemble prediction was examined for the heavy snowfall event in the Kanto area on January 14, 2013. The results showed that the predictions represent different features of the high-resolution spatiotemporal distribution of precipitation, affected by the intensity and location of the extra-tropical cyclone in each ensemble member. Although the ensemble prediction has model bias in the mean values and variances of some variables, such as wind speed and solar radiation, it has the potential to add probabilistic information to a deterministic prediction. (author)
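    The perturbation step described above (member minus control, added to a base state) can be illustrated with a toy calculation. This is only a sketch on flat lists of numbers, assuming the perturbations are added directly to the downscaled base field; the function name and the data layout are hypothetical, and a real system would operate on gridded model fields.

```python
def perturbed_states(base, control, members):
    """Add each member-minus-control perturbation to the base state.

    base, control and every entry of members are equal-length lists of
    values at the same grid points (a hypothetical, simplified layout).
    """
    return [
        [b + (m - c) for b, m, c in zip(base, member, control)]
        for member in members
    ]
```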

  4. Managing uncertainty in metabolic network structure and improving predictions using EnsembleFBA.

    Directory of Open Access Journals (Sweden)

    Matthew B Biggs

    2017-03-01

    Full Text Available Genome-scale metabolic network reconstructions (GENREs) are repositories of knowledge about the metabolic processes that occur in an organism. GENREs have been used to discover and interpret metabolic functions, and to engineer novel network structures. A major barrier preventing more widespread use of GENREs, particularly to study non-model organisms, is the extensive time required to produce a high-quality GENRE. Many automated approaches have been developed which reduce this time requirement, but automatically-reconstructed draft GENREs still require curation before useful predictions can be made. We present a novel approach to the analysis of GENREs which improves the predictive capabilities of draft GENREs by representing many alternative network structures, all equally consistent with available data, and generating predictions from this ensemble. This ensemble approach is compatible with many reconstruction methods. We refer to this new approach as Ensemble Flux Balance Analysis (EnsembleFBA). We validate EnsembleFBA by predicting growth and gene essentiality in the model organism Pseudomonas aeruginosa UCBPP-PA14. We demonstrate how EnsembleFBA can be included in a systems biology workflow by predicting essential genes in six Streptococcus species and mapping the essential genes to small molecule ligands from DrugBank. We found that some metabolic subsystems contributed disproportionately to the set of predicted essential reactions in a way that was unique to each Streptococcus species, leading to species-specific outcomes from small molecule interactions. Through our analyses of P. aeruginosa and six Streptococci, we show that ensembles increase the quality of predictions without drastically increasing reconstruction time, thus making GENRE approaches more practical for applications which require predictions for many non-model organisms. All of our functions and accompanying example code are available in an open online repository.

  5. Ensemble prediction of floods – catchment non-linearity and forecast probabilities

    Directory of Open Access Journals (Sweden)

    C. Reszler

    2007-07-01

    Full Text Available Quantifying the uncertainty of flood forecasts by ensemble methods is becoming increasingly important for operational purposes. The aim of this paper is to examine how the ensemble distribution of precipitation forecasts propagates in the catchment system, and to interpret the flood forecast probabilities relative to the forecast errors. We use the 622 km² Kamp catchment in Austria as an example where a comprehensive data set, including a 500 yr and a 1000 yr flood, is available. A spatially-distributed continuous rainfall-runoff model is used along with ensemble and deterministic precipitation forecasts that combine rain gauge data, radar data and the forecast fields of the ALADIN and ECMWF numerical weather prediction models. The analyses indicate that, for long lead times, the variability of the precipitation ensemble is amplified as it propagates through the catchment system as a result of non-linear catchment response. In contrast, for lead times shorter than the catchment lag time (e.g. 12 h and less), the variability of the precipitation ensemble is decreased as the forecasts are mainly controlled by observed upstream runoff and observed precipitation. Assuming that all ensemble members are equally likely, the statistical analyses for five flood events at the Kamp showed that the ensemble spread of the flood forecasts is always narrower than the distribution of the forecast errors. This is because the ensemble forecasts focus on the uncertainty in forecast precipitation as the dominant source of uncertainty, and other sources of uncertainty are not accounted for. However, a number of analyses, including Relative Operating Characteristic diagrams, indicate that the ensemble spread is a useful indicator to assess potential forecast errors for lead times larger than 12 h.

  6. Ensemble of classifiers based network intrusion detection system performance bound

    CSIR Research Space (South Africa)

    Mkuzangwe, Nenekazi NP

    2017-11-01

    Full Text Available This paper provides a performance bound of a network intrusion detection system (NIDS) that uses an ensemble of classifiers. Currently researchers rely on implementing the ensemble of classifiers based NIDS before they can determine the performance...

  7. Global Ensemble Forecast System (GEFS) [2.5 Deg.

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Global Ensemble Forecast System (GEFS) is a weather forecast model made up of 21 separate forecasts, or ensemble members. The National Centers for Environmental...

  8. Ensemble learned vaccination uptake prediction using web search queries

    DEFF Research Database (Denmark)

    Hansen, Niels Dalum; Lioma, Christina; Mølbak, Kåre

    2016-01-01

    We present a method that uses ensemble learning to combine clinical and web-mined time-series data in order to predict future vaccination uptake. The clinical data is official vaccination registries, and the web data is query frequencies collected from Google Trends. Experiments with official...... vaccine records show that our method predicts vaccination uptake effectively (4.7 Root Mean Squared Error). Whereas performance is best when combining clinical and web data, using solely web data yields comparative performance. To our knowledge, this is the first study to predict vaccination uptake...

  9. Limited-area short-range ensemble predictions targeted for heavy rain in Europe

    Directory of Open Access Journals (Sweden)

    K. Sattler

    2005-01-01

    Full Text Available Inherent uncertainties in short-range quantitative precipitation forecasts (QPF) from the high-resolution, limited-area numerical weather prediction model DMI-HIRLAM (LAM) are addressed using two different approaches to creating a small ensemble of LAM simulations, with focus on prediction of extreme rainfall events over European river basins. The first ensemble type is designed to represent uncertainty in the atmospheric state of the initial condition and at the lateral LAM boundaries. The global ensemble prediction system (EPS) from ECMWF serves as host model to the LAM and provides the state perturbations, from which a small set of significant members is selected. The significance is estimated on the basis of accumulated precipitation over a target area of interest, which contains the river basin(s) under consideration. The selected members provide the initial and boundary data for the ensemble integration in the LAM. A second ensemble approach tries to address a portion of the model-inherent uncertainty responsible for errors in the forecasted precipitation field by utilising different parameterisation schemes for condensation and convection in the LAM. Three periods around historical heavy rain events that caused or contributed to disastrous river flooding in Europe are used to study the performance of the LAM ensemble designs. The three cases exhibit different dynamic and synoptic characteristics and provide an indication of the ensemble qualities in different weather situations. Precipitation analyses from the Deutscher Wetterdienst (DWD) are used as the verifying reference and a comparison of daily rainfall amounts is referred to the respective river basins of the historical cases.

  10. Ensemble-based prediction of RNA secondary structures.

    Science.gov (United States)

    Aghaeepour, Nima; Hoos, Holger H

    2013-04-24

    Accurate structure prediction methods play an important role for the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that has been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective control of the trade-off between

  11. Predicting artificially drained areas by means of selective model ensemble

    DEFF Research Database (Denmark)

    Møller, Anders Bjørn; Beucher, Amélie; Iversen, Bo Vangsø

    . The approaches employed include decision trees, discriminant analysis, regression models, neural networks and support vector machines amongst others. Several models are trained with each method, using variously the original soil covariates and principal components of the covariates. With a large ensemble...... out since the mid-19th century, and it has been estimated that half of the cultivated area is artificially drained (Olesen, 2009). A number of machine learning approaches can be used to predict artificially drained areas in geographic space. However, instead of choosing the most accurate model....... The study aims firstly to train a large number of models to predict the extent of artificially drained areas using various machine learning approaches. Secondly, the study will develop a method for selecting the models, which give a good prediction of artificially drained areas, when used in conjunction...

  12. Ensemble atmospheric dispersion calculations for decision support systems

    International Nuclear Information System (INIS)

    Borysiewicz, M.; Potempski, S.; Galkowski, A.; Zelazny, R.

    2003-01-01

    This document describes two approaches to long-range atmospheric dispersion of pollutants based on the ensemble concept. In the first part of the report some experiences related to the exercises undertaken under the ENSEMBLE project of the European Union are presented. The second part is devoted to the implementation of the mesoscale numerical prediction model RAMS and the atmospheric dispersion model HYPACT on a Beowulf cluster and their usage for ensemble forecasting and long-range atmospheric ensemble dispersion calculations based on available meteorological data from NCEO, NOAA (USA). (author)

  13. Simultaneous calibration of ensemble river flow predictions over an entire range of lead times

    Science.gov (United States)

    Hemri, S.; Fundel, F.; Zappa, M.

    2013-10-01

    Probabilistic estimates of future water levels and river discharge are usually simulated with hydrologic models using ensemble weather forecasts as main inputs. As hydrologic models are imperfect and the meteorological ensembles tend to be biased and underdispersed, the ensemble forecasts for river runoff typically are biased and underdispersed, too. Thus, in order to achieve both reliable and sharp predictions statistical postprocessing is required. In this work Bayesian model averaging (BMA) is applied to statistically postprocess ensemble runoff raw forecasts for a catchment in Switzerland, at lead times ranging from 1 to 240 h. The raw forecasts have been obtained using deterministic and ensemble forcing meteorological models with different forecast lead time ranges. First, BMA is applied based on mixtures of univariate normal distributions, subject to the assumption of independence between distinct lead times. Then, the independence assumption is relaxed in order to estimate multivariate runoff forecasts over the entire range of lead times simultaneously, based on a BMA version that uses multivariate normal distributions. Since river runoff is a highly skewed variable, Box-Cox transformations are applied in order to achieve approximate normality. Both univariate and multivariate BMA approaches are able to generate well calibrated probabilistic forecasts that are considerably sharper than climatological forecasts. Additionally, multivariate BMA provides a promising approach for incorporating temporal dependencies into the postprocessed forecasts. Its major advantage against univariate BMA is an increase in reliability when the forecast system is changing due to model availability.
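    The univariate case described above combines a Box-Cox transformation with a BMA mixture of normal distributions. The sketch below shows only that combination in its simplest form; the assumption of one common standard deviation for all members, and all function and parameter names, are illustrative and not taken from the paper, and the weights and variance would in practice have to be estimated from training data.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std dev sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def bma_density(z, member_means, weights, sigma):
    """BMA predictive density in the transformed space.

    member_means are the (transformed, bias-corrected) member forecasts,
    weights are non-negative and sum to one, sigma is shared (assumption).
    """
    return sum(w * normal_pdf(z, m, sigma) for w, m in zip(weights, member_means))
```

    Note that this evaluates the density of the transformed runoff; mapping it back to the original scale would additionally require the Jacobian of the transformation.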

  14. CarcinoPred-EL: Novel models for predicting the carcinogenicity of chemicals using molecular fingerprints and ensemble learning methods.

    Science.gov (United States)

    Zhang, Li; Ai, Haixin; Chen, Wen; Yin, Zimo; Hu, Huan; Zhu, Junfeng; Zhao, Jian; Zhao, Qi; Liu, Hongsheng

    2017-05-18

    Carcinogenicity refers to a highly toxic end point of certain chemicals, and has become an important issue in the drug development process. In this study, three novel ensemble classification models, namely Ensemble SVM, Ensemble RF, and Ensemble XGBoost, were developed to predict carcinogenicity of chemicals using seven types of molecular fingerprints and three machine learning methods based on a dataset containing 1003 diverse compounds with rat carcinogenicity. Among these three models, Ensemble XGBoost is found to be the best, giving an average accuracy of 70.1 ± 2.9%, sensitivity of 67.0 ± 5.0%, and specificity of 73.1 ± 4.4% in five-fold cross-validation and an accuracy of 70.0%, sensitivity of 65.2%, and specificity of 76.5% in external validation. In comparison with some recent methods, the ensemble models outperform some machine learning-based approaches and yield equal accuracy and higher specificity but lower sensitivity than rule-based expert systems. It is also found that the ensemble models could be further improved if more data were available. As an application, the ensemble models are employed to discover potential carcinogens in the DrugBank database. The results indicate that the proposed models are helpful in predicting the carcinogenicity of chemicals. A web server called CarcinoPred-EL has been built for these models ( http://ccsipb.lnu.edu.cn/toxicity/CarcinoPred-EL/ ).
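    For readers comparing the accuracy, sensitivity and specificity figures quoted above, the three measures follow from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and unrelated to the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts.

    tp/fn: positives predicted correctly/incorrectly,
    tn/fp: negatives predicted correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```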

  15. Spam comments prediction using stacking with ensemble learning

    Science.gov (United States)

    Mehmood, Arif; On, Byung-Won; Lee, Ingyu; Ashraf, Imran; Choi, Gyu Sang

    2018-01-01

    Illusive comments on products or services mislead people in decision making. The current methodologies to predict deceptive comments are concerned with feature design for a single training model. Indigenous features are able to show some linguistic phenomena but can hardly reveal the latent semantic meaning of the comments. We propose a prediction model on general features of documents using stacking with ensemble learning. Term Frequency/Inverse Document Frequency (TF/IDF) features are inputs to a stacking of Random Forest and Gradient Boosted Trees, and the outputs of the base learners are encapsulated with a decision tree for the final training of the model. The results show that our approach gives an accuracy of 92.19%, which outperforms the state-of-the-art method.

  16. Learning to REDUCE: A Reduced Electricity Consumption Prediction Ensemble

    Energy Technology Data Exchange (ETDEWEB)

    Aman, Saima; Chelmis, Charalampos; Prasanna, Viktor

    2016-02-12

    Utilities use Demand Response (DR) to balance supply and demand in the electric grid by involving customers in efforts to reduce electricity consumption during peak periods. To implement and adapt DR under dynamically changing conditions of the grid, reliable prediction of reduced consumption is critical. However, despite the wealth of research on electricity consumption prediction and DR being long in practice, the problem of reduced consumption prediction remains largely unaddressed. In this paper, we identify unique computational challenges associated with the prediction of reduced consumption and contrast this to that of normal consumption and DR baseline prediction. We propose a novel ensemble model that leverages different sequences of daily electricity consumption on DR event days as well as contextual attributes for reduced consumption prediction. We demonstrate the success of our model on a large, real-world, high resolution dataset from a university microgrid comprising over 950 DR events across a diverse set of 32 buildings. Our model achieves an average error of 13.5%, an 8.8% improvement over the baseline. Our work is particularly relevant for buildings where electricity consumption is not tied to strict schedules. Our results and insights should prove useful to the researchers and practitioners working in the sustainable energy domain.

  17. Ensemble Prediction Model with Expert Selection for Electricity Price Forecasting

    Directory of Open Access Journals (Sweden)

    Bijay Neupane

    2017-01-01

    Full Text Available Forecasting of electricity prices is important in deregulated electricity markets for all of the stakeholders: energy wholesalers, traders, retailers and consumers. Electricity price forecasting is an inherently difficult problem due to its special characteristic of dynamicity and non-stationarity. In this paper, we present a robust price forecasting mechanism that shows resilience towards the aggregate demand response effect and provides highly accurate forecasted electricity prices to the stakeholders in a dynamic environment. We employ an ensemble prediction model in which a group of different algorithms participates in forecasting 1-h ahead the price for each hour of a day. We propose two different strategies, namely, the Fixed Weight Method (FWM) and the Varying Weight Method (VWM), for selecting each hour’s expert algorithm from the set of participating algorithms. In addition, we utilize a carefully engineered set of features selected from a pool of features extracted from the past electricity price data, weather data and calendar data. The proposed ensemble model offers better results than the Autoregressive Integrated Moving Average (ARIMA) method, the Pattern Sequence-based Forecasting (PSF) method and our previous work using Artificial Neural Networks (ANN) alone on the datasets for New York, Australian and Spanish electricity markets.

  18. Modeling Dynamic Systems with Efficient Ensembles of Process-Based Models.

    Directory of Open Access Journals (Sweden)

    Nikola Simidjievski

    Full Text Available Ensembles are a well established machine learning paradigm, leading to accurate and robust models, predominantly applied to predictive modeling tasks. Ensemble models comprise a finite set of diverse predictive models whose combined output is expected to yield an improved predictive performance as compared to an individual model. In this paper, we propose a new method for learning ensembles of process-based models of dynamic systems. The process-based modeling paradigm employs domain-specific knowledge to automatically learn models of dynamic systems from time-series observational data. Previous work has shown that ensembles based on sampling observational data (i.e., bagging and boosting) significantly improve predictive performance of process-based models. However, this improvement comes at the cost of a substantial increase of the computational time needed for learning. To address this problem, the paper proposes a method that aims at efficiently learning ensembles of process-based models, while maintaining their accurate long-term predictive performance. This is achieved by constructing ensembles with sampling domain-specific knowledge instead of sampling data. We apply the proposed method to and evaluate its performance on a set of problems of automated predictive modeling in three lake ecosystems using a library of process-based knowledge for modeling population dynamics. The experimental results identify the optimal design decisions regarding the learning algorithm. The results also show that the proposed ensembles yield significantly more accurate predictions of population dynamics as compared to individual process-based models. Finally, while their predictive performance is comparable to the one of ensembles obtained with the state-of-the-art methods of bagging and boosting, they are substantially more efficient.

  19. Skill prediction of local weather forecasts based on the ECMWF ensemble

    Directory of Open Access Journals (Sweden)

    C. Ziehmann

    2001-01-01

    Full Text Available Ensemble Prediction has become an essential part of numerical weather forecasting. In this paper we investigate the ability of ensemble forecasts to provide an a priori estimate of the expected forecast skill. Several quantities derived from the local ensemble distribution are investigated for a two year data set of European Centre for Medium-Range Weather Forecasts (ECMWF) temperature and wind speed ensemble forecasts at 30 German stations. The results indicate that the population of the ensemble mode provides useful information for the uncertainty in temperature forecasts. The ensemble entropy is a similarly good measure. This is not true for the spread if it is simply calculated as the variance of the ensemble members with respect to the ensemble mean. The number of clusters in the C regions is almost unrelated to the local skill. For wind forecasts, the results are less promising.
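    The quantities named above (population of the ensemble mode, ensemble entropy, and spread as variance about the ensemble mean) can be illustrated on a list of member values. This is a sketch only: the abstract does not state how the local distribution is discretised, so the fixed-width binning and the `width` parameter below are assumptions, and all function names are hypothetical.

```python
import math
from collections import Counter
from statistics import pvariance


def bin_counts(members, width):
    """Count ensemble members per bin of the given width (assumed binning)."""
    return Counter(math.floor(m / width) for m in members)


def mode_population(members, width):
    """Number of members falling into the most populated bin."""
    return max(bin_counts(members, width).values())


def ensemble_entropy(members, width):
    """Shannon entropy (natural log) of the binned member distribution."""
    n = len(members)
    return -sum((c / n) * math.log(c / n) for c in bin_counts(members, width).values())


def ensemble_variance(members):
    """Spread as the variance of the members about the ensemble mean."""
    return pvariance(members)
```

    Under this binning, a highly populated mode and a low entropy both indicate that the members agree closely.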

  20. Ensemble Methods

    Science.gov (United States)

    Re, Matteo; Valentini, Giorgio

    2012-03-01

    Ensemble methods are statistical and computational learning procedures reminiscent of the human social learning behavior of seeking several opinions before making any crucial decision. The idea of combining the opinions of different "experts" to obtain an overall “ensemble” decision is rooted in our culture at least from the classical age of ancient Greece, and it has been formalized during the Enlightenment with the Condorcet Jury Theorem [45], which proved that the judgment of a committee is superior to those of individuals, provided the individuals have reasonable competence. Ensembles are sets of learning machines that combine in some way their decisions, or their learning algorithms, or different views of data, or other specific characteristics to obtain more reliable and more accurate predictions in supervised and unsupervised learning problems [48,116]. A simple example is represented by the majority vote ensemble, by which the decisions of different learning machines are combined, and the class that receives the majority of “votes” (i.e., the class predicted by the majority of the learning machines) is the class predicted by the overall ensemble [158]. In the literature, a plethora of terms other than ensembles has been used, such as fusion, combination, aggregation, and committee, to indicate sets of learning machines that work together to solve a machine learning problem [19,40,56,66,99,108,123], but in this chapter we maintain the term ensemble in its widest meaning, in order to include the whole range of combination methods. Nowadays, ensemble methods represent one of the main current research lines in machine learning [48,116], and the interest of the research community on ensemble methods is witnessed by conferences and workshops specifically devoted to ensembles, first of all the multiple classifier systems (MCS) conference organized by Roli, Kittler, Windeatt, and other researchers of this area [14,62,85,149,173]. Several theories have been
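    The majority vote ensemble and the Condorcet Jury Theorem mentioned in the record above lend themselves to a small numerical illustration. The sketch below assumes independent voters with an identical probability of being correct and an odd committee size, which are the usual textbook conditions of the theorem; the function names are hypothetical.

```python
from collections import Counter
from math import comb


def majority_vote(predictions):
    """Class label predicted by the largest number of learners.

    Ties are not handled specially in this sketch.
    """
    return Counter(predictions).most_common(1)[0][0]


def majority_correct_probability(n, p):
    """Probability that a majority of n independent voters is correct.

    Each voter is correct with probability p; n is assumed to be odd.
    """
    k_min = n // 2 + 1
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(k_min, n + 1))
```

    For example, three voters who are each correct with probability 0.6 form a committee that is correct with probability 3 x 0.36 x 0.4 + 0.216 = 0.648, which is higher than any single voter.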

  1. Enhancing Predictive Accuracy of Cardiac Autonomic Neuropathy Using Blood Biochemistry Features and Iterative Multitier Ensembles.

    Science.gov (United States)

    Abawajy, Jemal; Kelarev, Andrei; Chowdhury, Morshed U; Jelinek, Herbert F

    2016-01-01

    Blood biochemistry attributes form an important class of tests, routinely collected several times per year for many patients with diabetes. The objective of this study is to investigate the role of blood biochemistry for improving the predictive accuracy of the diagnosis of cardiac autonomic neuropathy (CAN) progression. Blood biochemistry contributes to CAN, and so it is a causative factor that can provide additional power for the diagnosis of CAN especially in the absence of a complete set of Ewing tests. We introduce automated iterative multitier ensembles (AIME) and investigate their performance in comparison to base classifiers and standard ensemble classifiers for blood biochemistry attributes. AIME incorporate diverse ensembles into several tiers simultaneously and combine them into one automatically generated integrated system so that one ensemble acts as an integral part of another ensemble. We carried out extensive experimental analysis using large datasets from the diabetes screening research initiative (DiScRi) project. The results of our experiments show that several blood biochemistry attributes can be used to supplement the Ewing battery for the detection of CAN in situations where one or more of the Ewing tests cannot be completed because of the individual difficulties faced by each patient in performing the tests. The results show that AIME provide higher accuracy as a multitier CAN classification paradigm. The best predictive accuracy of 99.57% has been obtained by the AIME combining decorate on top tier with bagging on middle tier based on random forest. Practitioners can use these findings to increase the accuracy of CAN diagnosis.

  2. Predicting Power Outages Using Multi-Model Ensemble Forecasts

    Science.gov (United States)

    Cerrai, D.; Anagnostou, E. N.; Yang, J.; Astitha, M.

    2017-12-01

    Power outages affect millions of people in the United States every year, harming the economy and disrupting everyday life. An Outage Prediction Model (OPM) has been developed at the University of Connecticut to help utilities restore power quickly and limit the adverse consequences of outages for the population. The OPM, operational since 2015, combines several non-parametric machine learning (ML) models that use historical weather storm simulations and high-resolution weather forecasts, satellite remote sensing data, and infrastructure and land cover data to predict the number and spatial distribution of power outages. A new methodology, developed to improve the outage model performance by combining weather- and soil-related variables from three different weather models (WRF 3.7, WRF 3.8 and RAMS/ICLAMS), will be presented in this study. First, we will present a performance evaluation of each model variable, comparing historical weather analyses with station data or reanalysis over the entire storm data set. Then, each variable of the new outage model version is extracted from the best performing weather model for that variable, and sensitivity tests are performed to find the most efficient variable combination for outage prediction. Although the final variable combination is drawn from different weather models, this ensemble, based on multi-weather forcing and multiple statistical models, outperforms the currently operational OPM version, which is based on a single weather forcing (WRF 3.7), because each model component is the closest to the actual atmospheric state.

  3. Ensemble Bayesian forecasting system Part I: Theory and algorithms

    Science.gov (United States)

    Herr, Henry D.; Krzysztofowicz, Roman

    2015-05-01

    The ensemble Bayesian forecasting system (EBFS), whose theory was published in 2001, is developed for the purpose of quantifying the total uncertainty about a discrete-time, continuous-state, non-stationary stochastic process such as a time series of stages, discharges, or volumes at a river gauge. The EBFS is built of three components: an input ensemble forecaster (IEF), which simulates the uncertainty associated with random inputs; a deterministic hydrologic model (of any complexity), which simulates physical processes within a river basin; and a hydrologic uncertainty processor (HUP), which simulates the hydrologic uncertainty (an aggregate of all uncertainties except input). It works as a Monte Carlo simulator: an ensemble of time series of inputs (e.g., precipitation amounts) generated by the IEF is transformed deterministically through a hydrologic model into an ensemble of time series of outputs, which is next transformed stochastically by the HUP into an ensemble of time series of predictands (e.g., river stages). Previous research indicated that in order to attain an acceptable sampling error, the ensemble size must be on the order of hundreds (for probabilistic river stage forecasts and probabilistic flood forecasts) or even thousands (for probabilistic stage transition forecasts). The computing time needed to run the hydrologic model this many times renders the straightforward simulations operationally infeasible. This motivates the development of the ensemble Bayesian forecasting system with randomization (EBFSR), which takes full advantage of the analytic meta-Gaussian HUP and generates multiple ensemble members after each run of the hydrologic model; this auxiliary randomization reduces the required size of the meteorological input ensemble and makes it operationally feasible to generate a Bayesian ensemble forecast of large size. Such a forecast quantifies the total uncertainty, is well calibrated against the prior (climatic) distribution of

  4. Competitive Learning Neural Network Ensemble Weighted by Predicted Performance

    Science.gov (United States)

    Ye, Qiang

    2010-01-01

    Ensemble approaches have been shown to enhance classification by combining the outputs from a set of voting classifiers. Diversity in error patterns among base classifiers promotes ensemble performance. Multi-task learning is an important characteristic for Neural Network classifiers. Introducing a secondary output unit that receives different…

  5. Time-dependent generalized Gibbs ensembles in open quantum systems

    Science.gov (United States)

    Lange, Florian; Lenarčič, Zala; Rosch, Achim

    2018-04-01

    Generalized Gibbs ensembles have been used as powerful tools to describe the steady state of integrable many-particle quantum systems after a sudden change of the Hamiltonian. Here, we demonstrate numerically that they can be used for a much broader class of problems. We consider integrable systems in the presence of weak perturbations which both break integrability and drive the system to a state far from equilibrium. Under these conditions, we show that the steady state and the time evolution on long timescales can be accurately described by a (truncated) generalized Gibbs ensemble with time-dependent Lagrange parameters, determined from simple rate equations. We compare the numerically exact time evolutions of density matrices for small systems with a theory based on block-diagonal density matrices (diagonal ensemble) and a time-dependent generalized Gibbs ensemble containing only a small number of approximately conserved quantities, using the one-dimensional Heisenberg model with perturbations described by Lindblad operators as an example.

  6. Long-range hydrometeorological ensemble predictions of drought parameters

    Science.gov (United States)

    Fundel, F.; Jörg-Hess, S.; Zappa, M.

    2012-06-01

    Low streamflow as a consequence of a drought event affects numerous aspects of life. Economic sectors that may be impacted by drought include power production, agriculture, tourism and water quality management. Numerical models have increasingly been used to forecast low flow and have become the focus of recent research. Here, we consider daily ensemble runoff forecasts for the river Thur, which has its source in the Swiss Alps. We focus on the low-flow indices duration, severity and magnitude, with a forecast lead-time of one month, to assess their potential usefulness for predictions. The ECMWF VarEPS 5 member reforecast, which covers 18 yr, is used as forcing for the hydrological model PREVAH. A thorough verification shows that, compared to peak flow, probabilistic low-flow forecasts are skillful for longer lead-times, so low-flow index forecasts could also be beneficially included in a decision-making process. The results suggest that monthly runoff forecasts are useful for assessing the risk of hydrological droughts.

  7. Ensemble models on palaeoclimate to predict India's groundwater challenge

    Directory of Open Access Journals (Sweden)

    Partha Sarathi Datta

    2013-09-01

    In many parts of the world, the freshwater crisis is largely due to increasing water consumption and pollution by a rapidly growing population and aspirations for economic development, but it is usually ascribed to the climate. However, limited understanding and knowledge gaps in the factors controlling climate, together with uncertainties in the climate models, make it difficult to assess the probable impacts on water availability in tropical regions. In this context, a review of ensemble models on δ18O and δD in rainfall and groundwater, 3H- and 14C- ages of groundwater and 14C- age of lake sediments helped to reconstruct palaeoclimate and long-term recharge in North-west India, and to predict the future groundwater challenge. The annual mean temperature trend indicates both warming and cooling in different parts of India in the past and during 1901–2010. Neither the GCMs (Global Climate Models) nor the observational record indicates any significant change/increase in temperature and rainfall over the last century, and climate change during the last 1200 yrs BP. In much of the North-West region, deep groundwater renewal occurred from past humid climate, and shallow groundwater renewal from limited modern recharge over the past decades. To make water management more responsive to climate change, the gaps in the science of climate change need to be bridged.

  8. Prediction of drug synergy in cancer using ensemble-based machine learning techniques

    Science.gov (United States)

    Singh, Harpreet; Rana, Prashant Singh; Singh, Urvinder

    2018-04-01

    Drug synergy prediction plays a significant role in the medical field for inhibiting specific cancer agents. It can be developed as a pre-processing tool for therapeutic successes. Different drug-drug interactions can be examined using the drug synergy score. This requires efficient regression-based machine learning approaches to minimize the prediction errors. Numerous machine learning techniques such as neural networks, support vector machines, random forests, LASSO, Elastic Nets, etc., have been used in the past to meet this requirement. However, these techniques individually do not provide significant accuracy in drug synergy score. Therefore, the primary objective of this paper is to design a neuro-fuzzy-based ensembling approach. To achieve this, nine well-known machine learning techniques have been implemented on the drug synergy data. Based on the accuracy of each model, the four techniques with the highest accuracy are selected to develop an ensemble-based machine learning model. These models are Random forest, Fuzzy Rules Using Genetic Cooperative-Competitive Learning method (GFS.GCCL), Adaptive-Network-Based Fuzzy Inference System (ANFIS) and Dynamic Evolving Neural-Fuzzy Inference System method (DENFIS). Ensembling is achieved by evaluating the biased weighted aggregation (i.e. adding more weight to the model with a higher prediction score) of the data predicted by the selected models. The proposed and existing machine learning techniques have been evaluated on drug synergy score data. The comparative analysis reveals that the proposed method outperforms the others in terms of accuracy, root mean square error and coefficient of correlation.

  9. Identifying and Assessing Gaps in Subseasonal to Seasonal Prediction Skill using the North American Multi-model Ensemble

    Science.gov (United States)

    Pegion, K.; DelSole, T. M.; Becker, E.; Cicerone, T.

    2016-12-01

    Predictability represents the upper limit of prediction skill if we had an infinite member ensemble and a perfect model. It is an intrinsic limit of the climate system associated with the chaotic nature of the atmosphere. Producing a forecast system that can make predictions very near to this limit is the ultimate goal of forecast system development. Estimates of predictability together with calculations of current prediction skill are often used to define the gaps in our prediction capabilities on subseasonal to seasonal timescales and to inform the scientific issues that must be addressed to build the next forecast system. Quantification of the predictability is also important for providing a scientific basis for relaying to stakeholders what kind of climate information can be provided to inform decision-making and what kind of information is not possible given the intrinsic predictability of the climate system. One challenge with predictability estimates is that different prediction systems can give different estimates of the upper limit of skill. How do we know which estimate of predictability is most representative of the true predictability of the climate system? Previous studies have used the spread-error relationship and the autocorrelation to evaluate the fidelity of the signal and noise estimates. Using a multi-model ensemble prediction system, we can quantify whether these metrics accurately indicate an individual model's ability to properly estimate the signal, noise, and predictability. We use this information to identify the best estimates of predictability for 2-meter temperature, precipitation, and sea surface temperature from the North American Multi-model Ensemble and compare with current skill to indicate the regions with potential for improving skill.

  10. A Diagnostics Tool to detect ensemble forecast system anomaly and guide operational decisions

    Science.gov (United States)

    Park, G. H.; Srivastava, A.; Shrestha, E.; Thiemann, M.; Day, G. N.; Draijer, S.

    2017-12-01

    The hydrologic community is moving toward using ensemble forecasts to take uncertainty into account during the decision-making process. The New York City Department of Environmental Protection (DEP) implements several types of ensemble forecasts in their decision-making process: ensemble products for a statistical model (Hirsch and enhanced Hirsch); the National Weather Service (NWS) Advanced Hydrologic Prediction Service (AHPS) forecasts based on the classical Ensemble Streamflow Prediction (ESP) technique; and the new NWS Hydrologic Ensemble Forecasting Service (HEFS) forecasts. To remove structural error and apply the forecasts to additional forecast points, the DEP post-processes both the AHPS and the HEFS forecasts. These ensemble forecasts provide mass quantities of complex data, and drawing conclusions from these forecasts is time-consuming and difficult. The complexity of these forecasts also makes it difficult to identify system failures resulting from poor data, missing forecasts, and server breakdowns. To address these issues, we developed a diagnostic tool that summarizes ensemble forecasts and provides additional information such as historical forecast statistics, forecast skill, and model forcing statistics. This additional information highlights the key information that enables operators to evaluate the forecast in real-time, dynamically interact with the data, and review additional statistics, if needed, to make better decisions. We used Bokeh, a Python interactive visualization library, and a multi-database management system to create this interactive tool. This tool compiles and stores data into HTML pages that allow operators to readily analyze the data with built-in user interaction features. This paper will present a brief description of the ensemble forecasts, forecast verification results, and the intended applications for the diagnostic tool.

  11. Ensemble of data-driven prognostic algorithms for robust prediction of remaining useful life

    International Nuclear Information System (INIS)

    Hu Chao; Youn, Byeng D.; Wang Pingfeng; Taek Yoon, Joung

    2012-01-01

    Prognostics aims at determining whether a failure of an engineered system (e.g., a nuclear power plant) is impending and estimating the remaining useful life (RUL) before the failure occurs. The traditional data-driven prognostic approach is to construct multiple candidate algorithms using a training data set, evaluate their respective performance using a testing data set, and select the one with the best performance while discarding all the others. This approach has three shortcomings: (i) the selected standalone algorithm may not be robust; (ii) it wastes the resources for constructing the algorithms that are discarded; (iii) it requires the testing data in addition to the training data. To overcome these drawbacks, this paper proposes an ensemble data-driven prognostic approach which combines multiple member algorithms with a weighted-sum formulation. Three weighting schemes, namely the accuracy-based weighting, diversity-based weighting and optimization-based weighting, are proposed to determine the weights of member algorithms. The k-fold cross validation (CV) is employed to estimate the prediction error required by the weighting schemes. The results obtained from three case studies suggest that the ensemble approach with any weighting scheme gives more accurate RUL predictions compared to any sole algorithm when member algorithms producing diverse RUL predictions have comparable prediction accuracy and that the optimization-based weighting scheme gives the best overall performance among the three weighting schemes.
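
    The weighted-sum combination described in this abstract can be sketched in Python as follows. The abstract does not give the weighting formulas, so the rule used here (weights inversely proportional to each member's cross-validation error, normalised to sum to one) is only an assumed, illustrative form of accuracy-based weighting; the function names and numbers are hypothetical.

```python
def accuracy_based_weights(cv_errors):
    """Turn per-member cross-validation errors into ensemble weights.

    Assumed rule for illustration: a member's weight is proportional to the
    inverse of its CV error, and the weights are normalised to sum to 1.
    """
    if any(e <= 0 for e in cv_errors):
        raise ValueError("CV errors must be positive")
    inverse = [1.0 / e for e in cv_errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_sum_prediction(member_predictions, weights):
    """Combine member RUL predictions with a weighted sum."""
    if len(member_predictions) != len(weights):
        raise ValueError("need exactly one weight per member prediction")
    return sum(w * p for w, p in zip(weights, member_predictions))


if __name__ == "__main__":
    # Hypothetical k-fold CV errors of three member algorithms.
    weights = accuracy_based_weights([1.0, 2.0, 4.0])
    # Hypothetical RUL predictions (in cycles) from the same three members.
    print(weights, weighted_sum_prediction([100.0, 120.0, 90.0], weights))
```

    With this rule, a member with half the CV error of another receives twice its weight; the diversity-based and optimization-based schemes named in the abstract would replace only the first function.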

  12. Quantum Control of Open Systems and Dense Atomic Ensembles

    Science.gov (United States)

    DiLoreto, Christopher

    . This effect motivates the need for using multi-directional basis sets in theoretical analysis of dense quantum systems. My results demonstrate the shortcomings of short-pulse techniques used in many recent studies. Based on my numerical studies, I hypothesize that the dense ensemble can be modelled by an effective single quantum system that has a decoherence rate that changes over time. My effective single particle model provides a way in which computational time can be reduced, and also a model in which the underlying physical processes involved in the system's evolution are much easier to understand. I then use this model to provide an elegant theoretical explanation for an unusual experimental result called "transverse optical magnetism". My effective single particle model's predictions match very well with experimental data.

  13. Probabilistic Predictions of PM2.5 Using a Novel Ensemble Design for the NAQFC

    Science.gov (United States)

    Kumar, R.; Lee, J. A.; Delle Monache, L.; Alessandrini, S.; Lee, P.

    2017-12-01

    Poor air quality (AQ) in the U.S. is estimated to cause about 60,000 premature deaths with costs of 100B-150B annually. To reduce such losses, the National AQ Forecasting Capability (NAQFC) at the National Oceanic and Atmospheric Administration (NOAA) produces forecasts of ozone, particulate matter less than 2.5 μm in diameter (PM2.5), and other pollutants so that advance notice and warning can be issued to help individuals and communities limit the exposure and reduce air pollution-caused health problems. The current NAQFC, based on the U.S. Environmental Protection Agency Community Multi-scale AQ (CMAQ) modeling system, provides only deterministic AQ forecasts and does not quantify the uncertainty associated with the predictions, which could be large due to the chaotic nature of the atmosphere and nonlinearity in atmospheric chemistry. This project aims to take NAQFC a step further in the direction of probabilistic AQ prediction by exploring and quantifying the potential value of ensemble predictions of PM2.5, and perturbing three key aspects of PM2.5 modeling: the meteorology, emissions, and CMAQ secondary organic aerosol formulation. This presentation focuses on the impact of meteorological variability, which is represented by three members of NOAA's Short-Range Ensemble Forecast (SREF) system that were down-selected by hierarchical cluster analysis. These three SREF members provide the physics configurations and initial/boundary conditions for the Weather Research and Forecasting (WRF) model runs that generate required output variables for driving CMAQ that are missing in operational SREF output. We conducted WRF runs for Jan, Apr, Jul, and Oct 2016 to capture seasonal changes in meteorology. Estimated emissions of trace gases and aerosols via the Sparse Matrix Operator Kernel (SMOKE) system were developed using the WRF output. WRF and SMOKE output drive a 3-member CMAQ mini-ensemble of once-daily, 48-h PM2.5 forecasts for the same four months. The CMAQ mini-ensemble

  14. Skill of real-time operational forecasts with the APCC multi-model ensemble prediction system during the period 2008-2015

    Science.gov (United States)

    Min, Young-Mi; Kryjov, Vladimir N.; Oh, Sang Myeong; Lee, Hyun-Ju

    2017-12-01

    This paper assesses the real-time 1-month lead forecasts of 3-month (seasonal) mean temperature and precipitation on a monthly basis issued by the Asia-Pacific Economic Cooperation Climate Center (APCC) for 2008-2015 (8 years, 96 forecasts). It shows the current level of the APCC operational multi-model prediction system performance. The skill of the APCC forecasts depends strongly on season and region: it is higher for the tropics and boreal winter than for the extratropics and boreal summer, due to direct effects and remote teleconnections from boundary forcings. There is a negative relationship between the forecast skill and its interseasonal variability for both variables, and the forecast skill for precipitation is more seasonally and regionally dependent than that for temperature. The APCC operational probabilistic forecasts during this period show a cold bias (underforecasting of above-normal temperature and overforecasting of below-normal temperature), underestimating a long-term warming trend. A wet bias is evident for precipitation, particularly in the extratropical regions. The skill of both temperature and precipitation forecasts strongly depends upon the ENSO strength. In particular, the highest forecast skill, noted in the 2015/2016 boreal winter, is associated with the strong forcing of an extreme El Nino event. Meanwhile, the relatively low skill is associated with the transition and/or continuous ENSO-neutral phases of 2012-2014. As a result, the skill of the real-time forecast for the boreal winter season is higher than that of the hindcast. However, on average, the level of forecast skill during the period 2008-2015 is similar to that of the hindcast.

  15. Sea surface temperature predictions using a multi-ocean analysis ensemble scheme

    Science.gov (United States)

    Zhang, Ying; Zhu, Jieshun; Li, Zhongxian; Chen, Haishan; Zeng, Gang

    2017-08-01

    This study examined the global sea surface temperature (SST) predictions by a so-called multiple-ocean analysis ensemble (MAE) initialization method which was applied in the National Centers for Environmental Prediction (NCEP) Climate Forecast System Version 2 (CFSv2). Different from most operational climate prediction practices which are initialized by a specific ocean analysis system, the MAE method is based on multiple ocean analyses. In the paper, the MAE method was first justified by analyzing the ocean temperature variability in four ocean analyses which all are/were applied for operational climate predictions either at the European Centre for Medium-range Weather Forecasts or at NCEP. It was found that these systems exhibit substantial uncertainties in estimating the ocean states, especially at the deep layers. Further, a set of MAE hindcasts was conducted based on the four ocean analyses with CFSv2, starting from each April during 1982-2007. The MAE hindcasts were verified against a subset of hindcasts from the NCEP CFS Reanalysis and Reforecast (CFSRR) Project. Comparisons suggested that MAE shows better SST predictions than CFSRR over most regions where ocean dynamics plays a vital role in SST evolutions, such as the El Niño and Atlantic Niño regions. Furthermore, significant improvements were also found in summer precipitation predictions over the equatorial eastern Pacific and Atlantic oceans, for which the local SST prediction improvements should be responsible. The prediction improvements by MAE imply a problem for most current climate predictions which are based on a specific ocean analysis system. That is, their predictions would drift towards states biased by errors inherent in their ocean initialization system, and thus have large prediction errors. In contrast, MAE arguably has an advantage by sampling such structural uncertainties, and could efficiently cancel these errors out in their predictions.

  16. Assessing probabilistic predictions of ENSO phase and intensity from the North American Multimodel Ensemble

    Science.gov (United States)

    Tippett, Michael K.; Ranganathan, Meghana; L'Heureux, Michelle; Barnston, Anthony G.; DelSole, Timothy

    2017-05-01

    Here we examine the skill of three, five, and seven-category monthly ENSO probability forecasts (1982-2015) from single and multi-model ensemble integrations of the North American Multimodel Ensemble (NMME) project. Three-category forecasts are typical and provide probabilities for the ENSO phase (El Niño, La Niña or neutral). Additional forecast categories indicate the likelihood of ENSO conditions being weak, moderate or strong. The level of skill observed for differing numbers of forecast categories can help to determine the appropriate degree of forecast precision. However, the dependence of the skill score itself on the number of forecast categories must be taken into account. For reliable forecasts with same quality, the ranked probability skill score (RPSS) is fairly insensitive to the number of categories, while the logarithmic skill score (LSS) is an information measure and increases as categories are added. The ignorance skill score decreases to zero as forecast categories are added, regardless of skill level. For all models, forecast formats and skill scores, the northern spring predictability barrier explains much of the dependence of skill on target month and forecast lead. RPSS values for monthly ENSO forecasts show little dependence on the number of categories. However, the LSS of multimodel ensemble forecasts with five and seven categories show statistically significant advantages over the three-category forecasts for the targets and leads that are least affected by the spring predictability barrier. These findings indicate that current prediction systems are capable of providing more detailed probabilistic forecasts of ENSO phase and amplitude than are typically provided.
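
    As a rough illustration of the ranked probability skill score (RPSS) discussed in this abstract, the Python sketch below uses the common textbook definition: the ranked probability score (RPS) is the sum of squared differences between cumulative forecast and cumulative observed probabilities, and RPSS is one minus the ratio of forecast RPS to reference RPS. The function names, the equal-odds reference and the example numbers are assumptions for illustration, not details from the paper.

```python
from itertools import accumulate


def ranked_probability_score(forecast_probs, observed_category):
    """RPS of one categorical forecast (unnormalised form).

    forecast_probs: probabilities for ordered categories, summing to 1.
    observed_category: zero-based index of the category that occurred.
    """
    cum_forecast = list(accumulate(forecast_probs))
    # The cumulative observation is 0 below the observed category and 1 from it on.
    cum_observed = [1.0 if k >= observed_category else 0.0
                    for k in range(len(forecast_probs))]
    return sum((f - o) ** 2 for f, o in zip(cum_forecast, cum_observed))


def ranked_probability_skill_score(forecasts, observations, reference_probs):
    """RPSS of a forecast set relative to a fixed reference forecast."""
    rps_forecast = sum(ranked_probability_score(f, o)
                       for f, o in zip(forecasts, observations))
    rps_reference = sum(ranked_probability_score(reference_probs, o)
                        for o in observations)
    return 1.0 - rps_forecast / rps_reference


if __name__ == "__main__":
    # Three ordered categories, e.g. La Niña, neutral, El Niño (indices 0, 1, 2).
    forecasts = [[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]
    observations = [2, 0]
    equal_odds = [1 / 3, 1 / 3, 1 / 3]
    print(ranked_probability_skill_score(forecasts, observations, equal_odds))
```

    A positive RPSS means the forecasts beat the reference, zero means no improvement, and one corresponds to perfect categorical forecasts.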

  17. Adiabatic passage and ensemble control of quantum systems

    International Nuclear Information System (INIS)

    Leghtas, Z; Sarlette, A; Rouchon, P

    2011-01-01

    This paper considers population transfer between eigenstates of a finite quantum ladder controlled by a classical electric field. Using an appropriate change of variables, we show that this setting can be set in the framework of adiabatic passage, which is known to facilitate ensemble control of quantum systems. Building on this insight, we present a mathematical proof of robustness for a control protocol (chirped pulse) practised by experimentalists to drive an ensemble of quantum systems from the ground state to the most excited state. We then propose new adiabatic control protocols using a single chirped and amplitude-shaped pulse, to robustly perform any permutation of eigenstate populations, on an ensemble of systems with unknown coupling strengths. These adiabatic control protocols are illustrated by simulations on a four-level ladder.

  18. Dynamical predictive power of the generalized Gibbs ensemble revealed in a second quench.

    Science.gov (United States)

    Zhang, J M; Cui, F C; Hu, Jiangping

    2012-04-01

    We show that a quenched and relaxed completely integrable system is hardly distinguishable from the corresponding generalized Gibbs ensemble in a dynamical sense. To be specific, the response of the quenched and relaxed system to a second quench can be accurately reproduced by using the generalized Gibbs ensemble as a substitute. Remarkably, as demonstrated with the transverse Ising model and the hard-core bosons in one dimension, not only the steady values but even the transient, relaxation dynamics of the physical variables can be accurately reproduced by using the generalized Gibbs ensemble as a pseudoinitial state. This result is an important complement to the previously established result that a quenched and relaxed system is hardly distinguishable from the generalized Gibbs ensemble in a static sense. The relevance of the generalized Gibbs ensemble in the nonequilibrium dynamics of completely integrable systems is then greatly strengthened.

  19. Ensemble system for Part-of-Speech tagging

    OpenAIRE

    Dell'Orletta, Felice

    2009-01-01

    The paper contains a description of the Felice-POS-Tagger and of its performance in Evalita 2009. Felice-POS-Tagger is an ensemble system that combines six different POS taggers. When evaluated on the official test set, the ensemble system outperforms each of the single tagger components and achieves the highest accuracy score in Evalita 2009 POS Closed Task. It is shown first that the errors made by the different taggers are complementary, and then how to use this complementary behavior to the...

  20. Multiple-Swarm Ensembles: Improving the Predictive Power and Robustness of Predictive Models and Its Use in Computational Biology.

    Science.gov (United States)

    Alves, Pedro; Liu, Shuang; Wang, Daifeng; Gerstein, Mark

    2018-01-01

    Machine learning is an integral part of computational biology, and has already shown its use in various applications, such as prognostic tests. In the last few years in the non-biological machine learning community, ensembling techniques have shown their power in data mining competitions such as the Netflix challenge; however, such methods have not found wide use in computational biology. In this work, we endeavor to show how ensembling techniques can be applied to practical problems, including problems in the field of bioinformatics, and how they often outperform other machine learning techniques in both predictive power and robustness. Furthermore, we develop a methodology of ensembling, Multi-Swarm Ensemble (MSWE) by using multiple particle swarm optimizations and demonstrate its ability to further enhance the performance of ensembles.

  1. HEPS4Power - Extended-range Hydrometeorological Ensemble Predictions for Improved Hydropower Operations and Revenues

    Science.gov (United States)

    Bogner, Konrad; Monhart, Samuel; Liniger, Mark; Spririg, Christoph; Jordan, Fred; Zappa, Massimiliano

    2015-04-01

    In recent years, great progress has been achieved in the operational prediction of floods and hydrological drought with up to ten days lead time. Both the public and the private sectors currently use probabilistic runoff forecasts to monitor water resources and take action when critical conditions are expected. The use of extended-range predictions with lead times exceeding 10 days is not yet established. The hydropower sector in particular might benefit greatly from using hydrometeorological forecasts for the next 15 to 60 days in order to optimize the operations and the revenues from their watersheds, dams, captions, turbines and pumps. The new Swiss Competence Centers in Energy Research (SCCER) aim at boosting research related to energy issues in Switzerland. The objective of HEPS4POWER is to demonstrate that operational extended-range hydrometeorological forecasts have the potential to become very valuable tools for fine-tuning the production of energy from hydropower systems. The project team covers a specific system-oriented value chain starting from the collection and forecast of meteorological data (MeteoSwiss), leading to the operational application of state-of-the-art hydrological models (WSL) and terminating with the experience in data presentation and power production forecasts for end-users (e-dric.ch). The first task of HEPS4POWER will be the downscaling and post-processing of ensemble extended-range meteorological forecasts (EPS). The goal is to provide well-tailored probabilistic forecasts that are statistically reliable and localized at catchment or even station level. The hydrology-related task will consist in feeding the post-processed meteorological forecasts into a HEPS using a multi-model approach by implementing models with different complexity. Also in the case of the hydrological ensemble predictions, post-processing techniques need to be tested in order to improve the quality of the

  2. Adaboost Ensemble with Simple Genetic Algorithm for Student Prediction Mode

    OpenAIRE

    Ahmed Sharaf ElDen; Malaka A. Moustafa; Hany M. Harb; Abdel H. Emara

    2013-01-01

    Predicting the student performance is a great concern to the higher education managements. This prediction helps to identify and to improve students' performance. Several factors may improve this performance. In the present study, we employ the data mining processes, particularly classification, to enhance the quality of the higher educational system. Recently, a new direction is used for the improvement of the classification accuracy by combining classifiers. In this paper, we design and evaluate a f...

  3. Early hospital mortality prediction of intensive care unit patients using an ensemble learning approach.

    Science.gov (United States)

    Awad, Aya; Bader-El-Den, Mohamed; McNicholas, James; Briggs, Jim

    2017-12-01

    Mortality prediction of hospitalized patients is an important problem. Over the past few decades, several severity scoring systems and machine learning mortality prediction models have been developed for predicting hospital mortality. By contrast, early mortality prediction for intensive care unit patients remains an open challenge. Most research has focused on severity of illness scoring systems or data mining (DM) models designed for risk estimation at least 24 or 48 h after ICU admission. This study highlights the main data challenges in early mortality prediction in ICU patients and introduces a new machine learning based framework for Early Mortality Prediction for Intensive Care Unit patients (EMPICU). The proposed method is evaluated on the Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database. Mortality prediction models are developed for patients aged 16 or above in Medical ICU (MICU), Surgical ICU (SICU) or Cardiac Surgery Recovery Unit (CSRU). We employ the ensemble learning Random Forest (RF), the predictive Decision Trees (DT), the probabilistic Naive Bayes (NB) and the rule-based Projective Adaptive Resonance Theory (PART) models. The primary outcome was hospital mortality. The explanatory variables included demographic, physiological, vital signs and laboratory test variables. Performance measures were calculated using cross-validated area under the receiver operating characteristic curve (AUROC) to minimize bias. A total of 11,722 patients with single ICU stays are considered. The proposed EMPICU framework outperformed standard scoring systems (SOFA, SAPS-I, APACHE-II, NEWS and qSOFA) in terms of AUROC and time (i.e. at 6 h compared to 48 h or more after admission). The results show that although there are many values missing in the first few hours of ICU admission

  4. An ensemble approach to the evolution of complex systems

    Indian Academy of Sciences (India)

    2014-03-15

    Mar 15, 2014 ... [Arpağ G and Erzan A 2014 An ensemble approach to the evolution of complex systems. J. Biosci. ... almost nothing about all the different ways in which your ...... energy cost to the organism of the maintenance, replication,.

  5. Improving Robustness of Hydrologic Ensemble Predictions Through Probabilistic Pre- and Post-Processing in Sequential Data Assimilation

    Science.gov (United States)

    Wang, S.; Ancell, B. C.; Huang, G. H.; Baetz, B. W.

    2018-03-01

    Data assimilation using the ensemble Kalman filter (EnKF) has been increasingly recognized as a promising tool for probabilistic hydrologic predictions. However, little effort has been made to conduct the pre- and post-processing of assimilation experiments, posing a significant challenge in achieving the best performance of hydrologic predictions. This paper presents a unified data assimilation framework for improving the robustness of hydrologic ensemble predictions. Statistical pre-processing of assimilation experiments is conducted through the factorial design and analysis to identify the best EnKF settings with maximized performance. After the data assimilation operation, statistical post-processing analysis is also performed through the factorial polynomial chaos expansion to efficiently address uncertainties in hydrologic predictions, as well as to explicitly reveal potential interactions among model parameters and their contributions to the predictive accuracy. In addition, the Gaussian anamorphosis is used to establish a seamless bridge between data assimilation and uncertainty quantification of hydrologic predictions. Both synthetic and real data assimilation experiments are carried out to demonstrate feasibility and applicability of the proposed methodology in the Guadalupe River basin, Texas. Results suggest that statistical pre- and post-processing of data assimilation experiments provide meaningful insights into the dynamic behavior of hydrologic systems and enhance robustness of hydrologic ensemble predictions.

  6. Ensemble Methods in Data Mining Improving Accuracy Through Combining Predictions

    CERN Document Server

    Seni, Giovanni

    2010-01-01

    This book is aimed at novice and advanced analytic researchers and practitioners -- especially in Engineering, Statistics, and Computer Science. Those with little exposure to ensembles will learn why and how to employ this breakthrough method, and advanced practitioners will gain insight into building even more powerful models. Throughout, snippets of code in R are provided to illustrate the algorithms described and to encourage the reader to try the techniques. The authors are industry experts in data mining and machine learning who are also adjunct professors and popular speakers. Although e

  7. Genetic algorithm based adaptive neural network ensemble and its application in predicting carbon flux

    Science.gov (United States)

    Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.

    2007-01-01

    To improve the accuracy in prediction, Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. © 2007 IEEE.

  8. The North American Multi-Model Ensemble (NMME): Phase-1 Seasonal to Interannual Prediction, Phase-2 Toward Developing Intra-Seasonal Prediction

    Science.gov (United States)

    Kirtman, Ben P.; Min, Dughong; Infanti, Johnna M.; Kinter, James L., III; Paolino, Daniel A.; Zhang, Qin; van den Dool, Huug; Saha, Suranjana; Mendez, Malaquias Pena; Becker, Emily

    2013-01-01

    The recent US National Academies report "Assessment of Intraseasonal to Interannual Climate Prediction and Predictability" was unequivocal in recommending the need for the development of a North American Multi-Model Ensemble (NMME) operational predictive capability. Indeed, this effort is required to meet the specific tailored regional prediction and decision support needs of a large community of climate information users. The multi-model ensemble approach has proven extremely effective at quantifying prediction uncertainty due to uncertainty in model formulation, and has proven to produce better prediction quality (on average) than any single model ensemble. This multi-model approach is the basis for several international collaborative prediction research efforts and an operational European system, and there are numerous examples of how this multi-model ensemble approach yields superior forecasts compared to any single model. Based on two NOAA Climate Test Bed (CTB) NMME workshops (February 18, and April 8, 2011) a collaborative and coordinated implementation strategy for an NMME prediction system has been developed and is currently delivering real-time seasonal-to-interannual predictions on the NOAA Climate Prediction Center (CPC) operational schedule. The hindcast and real-time prediction data are readily available (e.g., http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME/) and in graphical format from CPC (http://origin.cpc.ncep.noaa.gov/products/people/wd51yf/NMME/index.html). Moreover, the NMME forecasts are already being used as guidance for operational forecasters. This paper describes the new NMME effort, presents an overview of the multi-model forecast quality, and the complementary skill associated with individual models.

  9. Security Enrichment in Intrusion Detection System Using Classifier Ensemble

    Directory of Open Access Journals (Sweden)

    Uma R. Salunkhe

    2017-01-01

    In the era of the Internet, with an increasing number of people as its end users, a large number of attack categories are introduced daily. Hence, effective detection of various attacks with the help of Intrusion Detection Systems is an emerging trend in research these days. Existing studies show the effectiveness of machine learning approaches in handling Intrusion Detection Systems. In this work, we aim to enhance the detection rate of an Intrusion Detection System (IDS) by using machine learning techniques. We propose a novel classifier-ensemble-based IDS constructed using a hybrid approach that combines data-level and feature-level approaches. Classifier ensembles combine the opinions of different experts and improve the intrusion detection rate. Experimental results show the improved detection rates of our system compared to a reference technique.

  10. Ensemble annealing of complex physical systems

    OpenAIRE

    Habeck, Michael

    2015-01-01

    Algorithms for simulating complex physical systems or solving difficult optimization problems often resort to an annealing process. Rather than simulating the system at the temperature of interest, an annealing algorithm starts at a temperature that is high enough to ensure ergodicity and gradually decreases it until the destination temperature is reached. This idea is used in popular algorithms such as parallel tempering and simulated annealing. A general problem with annealing methods is th...

  11. A short-term ensemble wind speed forecasting system for wind power applications

    Science.gov (United States)

    Baidya Roy, S.; Traiteur, J. J.; Callicutt, D.; Smith, M.

    2011-12-01

    This study develops an adaptive, blended forecasting system to provide accurate wind speed forecasts 1 hour ahead of time for wind power applications. The system consists of an ensemble of 21 forecasts with different configurations of the Weather Research and Forecasting Single Column Model (WRFSCM) and a persistence model. The ensemble is calibrated against observations for a 2 month period (June-July, 2008) at a potential wind farm site in Illinois using the Bayesian Model Averaging (BMA) technique. The forecasting system is evaluated against observations for August 2008 at the same site. The calibrated ensemble forecasts significantly outperform the forecasts from the uncalibrated ensemble while significantly reducing forecast uncertainty under all environmental stability conditions. The system also generates significantly better forecasts than persistence, autoregressive (AR) and autoregressive moving average (ARMA) models during the morning transition and the diurnal convective regimes. This forecasting system is computationally more efficient than traditional numerical weather prediction models and can generate a calibrated forecast, including model runs and calibration, in approximately 1 minute. Currently, hour-ahead wind speed forecasts are almost exclusively produced using statistical models. However, numerical models have several distinct advantages over statistical models including the potential to provide turbulence forecasts. Hence, there is an urgent need to explore the role of numerical models in short-term wind speed forecasting. This work is a step in that direction and is likely to trigger a debate within the wind speed forecasting community.
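
    As a rough illustration of the Bayesian Model Averaging step mentioned in the abstract above, the sketch below combines member forecasts into a weighted predictive mean and variance. It assumes the BMA weights and per-member error variances have already been fitted on a training window; the function name, the Gaussian-mixture form and all numbers are illustrative assumptions, not values from the study.

```python
def bma_mean_and_variance(forecasts, weights, member_variances):
    """Predictive mean and variance of a BMA mixture of Gaussians.

    forecasts        -- member point forecasts (e.g. wind speed in m/s)
    weights          -- BMA weights, assumed already fitted, summing to 1
    member_variances -- error variance assigned to each member
    """
    if not (len(forecasts) == len(weights) == len(member_variances)):
        raise ValueError("all inputs must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")

    mean = sum(w * f for w, f in zip(weights, forecasts))
    # Total variance = between-member spread + weighted within-member variance.
    between = sum(w * (f - mean) ** 2 for w, f in zip(weights, forecasts))
    within = sum(w * v for w, v in zip(weights, member_variances))
    return mean, between + within


# Hypothetical three-member example (values invented for illustration).
mean, variance = bma_mean_and_variance(
    forecasts=[6.0, 8.0, 7.0],
    weights=[0.5, 0.25, 0.25],
    member_variances=[1.0, 1.0, 2.0],
)
```

    The variance splits into a between-member term and a within-member term, which is one way to see how calibration can change the stated forecast uncertainty without changing the member forecasts themselves.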

  12. A novel least squares support vector machine ensemble model for NOx emission prediction of a coal-fired boiler

    International Nuclear Information System (INIS)

    Lv, You; Liu, Jizhen; Yang, Tingting; Zeng, Deliang

    2013-01-01

    Real operation data of power plants are inclined to be concentrated in some local areas because of the operators’ habits and control system design. In this paper, a novel least squares support vector machine (LSSVM)-based ensemble learning paradigm is proposed to predict NOx emission of a coal-fired boiler using real operation data. In view of the plant data characteristics, a soft fuzzy c-means cluster algorithm is proposed to decompose the original data and guarantee the diversity of individual learners. Subsequently the base LSSVM is trained in each individual subset to solve the subtask. Finally, partial least squares (PLS) is applied as the combination strategy to eliminate the collinear and redundant information of the base learners. Considering that the fuzzy membership also has an effect on the ensemble output, the membership degree is added as one of the variables of the combiner. The single LSSVM and other ensemble models using different decomposition and combination strategies are also established to make a comparison. The result shows that the new soft FCM-LSSVM-PLS ensemble method can predict NOx emission accurately. In addition, because of the divide-and-conquer framework, the total time consumed in searching the parameters and training also decreases markedly. - Highlights: • A novel LSSVM ensemble model to predict NOx emissions is presented. • LSSVM is used as the base learner and PLS is employed as the combiner. • The model is applied to process data from a 660 MW coal-fired boiler. • The generalization ability of the model is enhanced. • The time consumed in training and searching the parameters decreases sharply

  13. SVM and SVM Ensembles in Breast Cancer Prediction

    OpenAIRE

    Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong

    2017-01-01

    Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction per...

  14. Revisiting the synoptic-scale predictability of severe European winter storms using ECMWF ensemble reforecasts

    Directory of Open Access Journals (Sweden)

    F. Pantillon

    2017-10-01

    New insights into the synoptic-scale predictability of 25 severe European winter storms of the 1995–2015 period are obtained using the homogeneous ensemble reforecast dataset from the European Centre for Medium-Range Weather Forecasts. The predictability of the storms is assessed with different metrics including (a) the track and intensity to investigate the storms' dynamics and (b) the Storm Severity Index to estimate the impact of the associated wind gusts. The storms are well predicted by the whole ensemble up to 2–4 days ahead. At longer lead times, the number of members predicting the observed storms decreases and the ensemble average is not clearly defined for the track and intensity. The Extreme Forecast Index and Shift of Tails are therefore computed from the deviation of the ensemble from the model climate. Based on these indices, the model has some skill in forecasting the area covered by extreme wind gusts up to 10 days, which indicates a clear potential for early warnings. However, large variability is found between the individual storms. The poor predictability of outliers appears related to their physical characteristics such as explosive intensification or small size. Longer datasets with more cases would be needed to further substantiate these points.

  15. Modelling machine ensembles with discrete event dynamical system theory

    Science.gov (United States)

    Hunter, Dan

    1990-01-01

    Discrete Event Dynamical System (DEDS) theory can be utilized as a control strategy for future complex machine ensembles that will be required for in-space construction. The control strategy involves orchestrating a set of interactive submachines to perform a set of tasks for a given set of constraints such as minimum time, minimum energy, or maximum machine utilization. Machine ensembles can be hierarchically modeled as a global model that combines the operations of the individual submachines. These submachines are represented in the global model as local models. Local models, from the perspective of DEDS theory, are described by the following: a set of system and transition states, an event alphabet that portrays actions that take a submachine from one state to another, an initial system state, a partial function that maps the current state and event alphabet to the next state, and the time required for the event to occur. Each submachine in the machine ensemble is represented by a unique local model. The global model combines the local models such that the local models can operate in parallel under the additional logistic and physical constraints due to submachine interactions. The global model is constructed from the states, events, event functions, and timing requirements of the local models. Supervisory control can be implemented in the global model by various methods such as task scheduling (open-loop control) or implementing a feedback DEDS controller (closed-loop control).
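
    The local-model ingredients listed in the abstract above (states, event alphabet, initial state, partial transition function, event durations) can be sketched as a small timed state machine. The class name, the robot-arm example and its durations are hypothetical and only meant to make the definition concrete; they are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class LocalModel:
    """One submachine: states, events, initial state, partial transition map."""
    states: set
    events: set
    initial_state: str
    # Partial function: only permitted (state, event) pairs appear as keys.
    # Each key maps to (next_state, time required for the event).
    transitions: dict = field(default_factory=dict)

    def run(self, event_sequence):
        """Apply events in order; return the final state and total elapsed time."""
        state, elapsed = self.initial_state, 0.0
        for event in event_sequence:
            if (state, event) not in self.transitions:
                raise ValueError(f"event {event!r} is not defined in state {state!r}")
            state, duration = self.transitions[(state, event)]
            elapsed += duration
        return state, elapsed


# Hypothetical robot-arm submachine; names and durations are invented.
arm = LocalModel(
    states={"idle", "holding", "placed"},
    events={"grasp", "place", "reset"},
    initial_state="idle",
    transitions={
        ("idle", "grasp"): ("holding", 2.0),
        ("holding", "place"): ("placed", 3.5),
        ("placed", "reset"): ("idle", 1.0),
    },
)
final_state, total_time = arm.run(["grasp", "place", "reset"])
```

    A global model in this picture would hold several such objects and decide which events may fire in parallel; that scheduling layer is left out here.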

  16. Neural Network Ensemble Based Approach for 2D-Interval Prediction of Solar Photovoltaic Power

    Directory of Open Access Journals (Sweden)

    Mashud Rana

    2016-10-01

    Solar energy generated from photovoltaic (PV) systems is one of the most promising types of renewable energy. However, it is highly variable as it depends on the solar irradiance and other meteorological factors. This variability creates difficulties for the large-scale integration of PV power in the electricity grid and requires accurate forecasting of the electricity generated by PV systems. In this paper we consider 2D-interval forecasts, where the goal is to predict summary statistics for the distribution of the PV power values in a future time interval. 2D-interval forecasts have been recently introduced, and they are more suitable than point forecasts for applications where the predicted variable has a high variability. We propose a method called NNE2D that combines variable selection based on mutual information and an ensemble of neural networks, to compute 2D-interval forecasts, where the two interval boundaries are expressed in terms of percentiles. NNE2D was evaluated for univariate prediction of Australian solar PV power data for two years. The results show that it is a promising method, outperforming persistence baselines and other methods used for comparison in terms of accuracy and coverage probability.
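
    Coverage probability, one of the evaluation criteria named in the abstract above, can be read as the fraction of observed values that fall inside the forecast interval. The sketch below shows that calculation under this common definition; the function name and the PV readings are invented for illustration, and the paper's exact metric may differ.

```python
def interval_coverage(actual_values, lower, upper):
    """Fraction of actual values that fall inside [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if not actual_values:
        raise ValueError("actual_values must not be empty")
    inside = sum(1 for v in actual_values if lower <= v <= upper)
    return inside / len(actual_values)


# Hypothetical PV power readings (kW) within one future time interval, and a
# hypothetical forecast of the interval's lower and upper percentile bounds.
observed = [1.2, 1.8, 2.4, 2.9, 3.1, 3.6, 4.0, 4.4, 5.1, 6.3]
coverage = interval_coverage(observed, lower=1.5, upper=5.5)
```

    If the two bounds were meant to be, say, the 10th and 90th percentiles, a well-calibrated forecast would show coverage close to 0.8 when averaged over many intervals.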

  17. A deep learning-based multi-model ensemble method for cancer prediction.

    Science.gov (United States)

    Xiao, Yawen; Wu, Jun; Lin, Zongli; Zhao, Xiaodong

    2018-01-01

    Cancer is a complex worldwide health problem associated with high mortality. With the rapid development of the high-throughput sequencing technology and the application of various machine learning methods that have emerged in recent years, progress in cancer prediction has been increasingly made based on gene expression, providing insight into effective and accurate treatment decision making. Thus, developing machine learning methods, which can successfully distinguish cancer patients from healthy persons, is of great current interest. However, among the classification methods applied to cancer prediction so far, no one method outperforms all the others. In this paper, we demonstrate a new strategy, which applies deep learning to an ensemble approach that incorporates multiple different machine learning models. We supply informative gene data selected by differential gene expression analysis to five different classification models. Then, a deep learning method is employed to ensemble the outputs of the five classifiers. The proposed deep learning-based multi-model ensemble method was tested on three public RNA-seq data sets of three kinds of cancers, Lung Adenocarcinoma, Stomach Adenocarcinoma and Breast Invasive Carcinoma. The test results indicate that it increases the prediction accuracy of cancer for all the tested RNA-seq data sets as compared to using a single classifier or the majority voting algorithm. By taking full advantage of different classifiers, the proposed deep learning-based multi-model ensemble method is shown to be accurate and effective for cancer prediction. Copyright © 2017 Elsevier B.V. All rights reserved.
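
    The abstract above compares its learned combiner against majority voting. For orientation, the sketch below shows only that baseline: each sample receives the label chosen by most classifiers. The function name, labels and classifier outputs are invented, and the deep learning combiner proposed in the paper is not reproduced here.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    predictions -- list of lists; predictions[i][j] is classifier i's label
                   for sample j. All inner lists must have the same length.
    """
    if not predictions:
        raise ValueError("need at least one classifier")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all classifiers must label the same samples")
    # zip(*predictions) regroups the labels sample by sample.
    return [Counter(labels).most_common(1)[0][0] for labels in zip(*predictions)]


# Hypothetical outputs of five classifiers on four samples.
outputs = [
    ["cancer", "healthy", "cancer", "healthy"],
    ["cancer", "healthy", "healthy", "healthy"],
    ["healthy", "healthy", "cancer", "cancer"],
    ["cancer", "cancer", "cancer", "healthy"],
    ["healthy", "healthy", "cancer", "healthy"],
]
voted = majority_vote(outputs)
```

    With an odd number of classifiers and two classes no ties can occur; with other configurations a tie-breaking rule would have to be chosen explicitly.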

  18. Operational water management of Rijnland water system and pilot of ensemble forecasting system for flood control

    Science.gov (United States)

    van der Zwan, Rene

    2013-04-01

    The Rijnland water system is situated in the western part of the Netherlands, and is a low-lying area of which 90% is below sea-level. The area covers 1,100 square kilometres, where 1.3 million people live, work, travel and enjoy leisure. The District Water Control Board of Rijnland is responsible for flood defence, water quantity and quality management. This includes design and maintenance of flood defence structures, control of regulating structures for an adequate water level management, and waste water treatment. For water quantity management Rijnland uses, besides an online monitoring network for collecting water level and precipitation data, a real time control decision support system. This decision support system consists of deterministic hydro-meteorological forecasts with a 24-hr forecast horizon, coupled with a control module that provides optimal operation schedules for the storage basin pumping stations. The uncertainty of the rainfall forecast is not forwarded in the hydrological prediction. At this moment 65% of the pumping capacity of the storage basin pumping stations can be automatically controlled by the decision control system. Within 5 years, after renovation of two other pumping stations, the total capacity of 200 m3/s will be automatically controlled. In critical conditions there is a need for both a longer forecast horizon and a probabilistic forecast. Therefore ensemble precipitation forecasts of the ECMWF are already consulted off-line during dry spells, and Rijnland is running a pilot operational system providing 10-day water level ensemble forecasts. The use of EPS during dry spells and the findings of the pilot will be presented. Challenges and next steps towards on-line implementation of ensemble forecasts for risk-based operational management of the Rijnland water system will be discussed. An important element in that discussion is the question: will policy and decision makers, operators and citizens adapt this Anticipatory Water

  19. Momentum distribution functions in ensembles: the inequivalence of microcanonical and canonical ensembles in a finite ultracold system.

    Science.gov (United States)

    Wang, Pei; Xianlong, Gao; Li, Haibin

    2013-08-01

    It is demonstrated in many thermodynamic textbooks that the equivalence of the different ensembles is achieved in the thermodynamic limit. In this present work we discuss the inequivalence of microcanonical and canonical ensembles in a finite ultracold system at low energies. We calculate the microcanonical momentum distribution function (MDF) in a system of identical fermions (bosons). We find that the microcanonical MDF deviates from the canonical one, which is the Fermi-Dirac (Bose-Einstein) function, in a finite system at low energies where the single-particle density of states and its inverse are finite.

  20. Optimal Initial Perturbations for Ensemble Prediction of the Madden-Julian Oscillation during Boreal Winter

    Science.gov (United States)

    Ham, Yoo-Geun; Schubert, Siegfried; Chang, Yehui

    2012-01-01

    An initialization strategy, tailored to the prediction of the Madden-Julian oscillation (MJO), is evaluated using the Goddard Earth Observing System Model, version 5 (GEOS-5), coupled general circulation model (CGCM). The approach is based on the empirical singular vectors (ESVs) of a reduced-space statistically determined linear approximation of the full nonlinear CGCM. The initial ESV, extracted using 10 years (1990-99) of boreal winter hindcast data, has zonal wind anomalies over the western Indian Ocean, while the final ESV (at a forecast lead time of 10 days) reflects a propagation of the zonal wind anomalies to the east over the Maritime Continent, an evolution that is characteristic of the MJO. A new set of ensemble hindcasts is produced for the boreal winter season from 1990 to 1999 in which the leading ESV provides the initial perturbations. The results are compared with those from a set of control hindcasts generated using random perturbations. It is shown that the ESV-based predictions have a systematically higher bivariate correlation skill in predicting the MJO compared to those using the random perturbations. Furthermore, the improvement in the skill depends on the phase of the MJO. The ESV is particularly effective in increasing the forecast skill during those phases of the MJO in which the control has low skill (with correlations increasing by as much as 0.2 at 20-25-day lead times), as well as during those times in which the MJO is weak.

  1. Evaluation of the Plant-Craig stochastic convection scheme in an ensemble forecasting system

    Science.gov (United States)

    Keane, R. J.; Plant, R. S.; Tennant, W. J.

    2015-12-01

    The Plant-Craig stochastic convection parameterization (version 2.0) is implemented in the Met Office Regional Ensemble Prediction System (MOGREPS-R) and is assessed in comparison with the standard convection scheme with a simple stochastic element only, from random parameter variation. A set of 34 ensemble forecasts, each with 24 members, is considered, over the month of July 2009. Deterministic and probabilistic measures of the precipitation forecasts are assessed. The Plant-Craig parameterization is found to improve probabilistic forecast measures, particularly the results for lower precipitation thresholds. The impact on deterministic forecasts at the grid scale is neutral, although the Plant-Craig scheme does deliver improvements when forecasts are made over larger areas. The improvements found are greater in conditions of relatively weak synoptic forcing, for which convective precipitation is likely to be less predictable.

  2. A Prediction Method of Airport Noise Based on Hybrid Ensemble Learning

    Directory of Open Access Journals (Sweden)

    Tao XU

    2014-05-01

    Using historical monitoring data to build and train a prediction model for airport noise has become a common method in recent years. However, single models built in different ways vary in storage requirements, efficiency and accuracy. In order to predict the noise accurately in complex environments around airports, this paper presents a prediction method based on hybrid ensemble learning. The proposed method combines three algorithms: an artificial neural network as an active learner, nearest neighbor as a passive learner and nonlinear regression as a synthesized learner. The experimental results show that the three learners can meet forecast demands in online, near-line and offline settings, respectively. The accuracy of prediction is improved by integrating these three learners’ results.

  3. Wind power application research on the fusion of the determination and ensemble prediction

    Science.gov (United States)

    Lan, Shi; Lina, Xu; Yuzhu, Hao

    2017-07-01

    The fused product of wind speed for the wind farm is designed through the use of wind speed products of ensemble prediction from the European Centre for Medium-Range Weather Forecasts (ECMWF) and professional numerical model products on wind power based on Mesoscale Model5 (MM5) and Beijing Rapid Update Cycle (BJ-RUC), which are suitable for short-term wind power forecasting and electric dispatch. The single-valued forecast is formed by calculating the different ensemble statistics of the Bayesian probabilistic forecasting representing the uncertainty of ECMWF ensemble prediction. An autoregressive integrated moving average (ARIMA) model is used to improve the time resolution of the single-valued forecast, and the optimal wind speed forecasting curve and the confidence interval are derived from Bayesian model averaging (BMA) and the deterministic numerical model prediction. The results show that the fusion forecast clearly improves accuracy relative to the existing numerical forecasting products. Compared with the existing 0-24 h deterministic forecast in the validation period, the mean absolute error (MAE) is decreased by 24.3% and the correlation coefficient (R) is increased by 12.5%. In comparison with the ECMWF ensemble forecast, the MAE is reduced by 11.7% and R is increased by 14.5%. Additionally, MAE did not increase with longer forecast lead times.
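
    The two scores quoted in the abstract above, mean absolute error (MAE) and the correlation coefficient (R), can be computed as in the following sketch. The wind speed values and variable names are invented for illustration and are unrelated to the study's data.

```python
import math


def mean_absolute_error(forecast, observed):
    """Average absolute difference between forecast and observation."""
    if len(forecast) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(f - o) for f, o in zip(forecast, observed)) / len(observed)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical wind speeds in m/s (values invented for illustration).
observed = [5.0, 6.0, 7.0, 8.0]
baseline = [6.0, 5.0, 9.0, 7.0]   # stand-in for an existing forecast
fused = [5.5, 6.5, 7.5, 7.5]      # stand-in for a fused forecast

mae_baseline = mean_absolute_error(baseline, observed)
mae_fused = mean_absolute_error(fused, observed)
# Relative MAE reduction, the kind of percentage reported in the abstract.
mae_reduction = (mae_baseline - mae_fused) / mae_baseline
r_baseline = pearson_r(baseline, observed)
r_fused = pearson_r(fused, observed)
```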

  4. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Science.gov (United States)

    Drzewiecki, Wojciech

    2016-12-01

    In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques. The results proved that in case of sub-pixel evaluation the most accurate prediction of change may not necessarily be based on the most accurate individual assessments. When single methods are considered, based on obtained results Cubist algorithm may be advised for Landsat based mapping of imperviousness for single dates. However, Random Forest may be endorsed when the most reliable evaluation of imperviousness change is the primary goal. It gave lower accuracies for individual assessments, but better prediction of change due to more correlated errors of individual predictions. Heterogeneous model ensembles performed for individual time points assessments at least as well as the best individual models. In case of imperviousness change assessment the ensembles always outperformed single model approaches. It means that it is possible to improve the accuracy of sub-pixel imperviousness change assessment using ensembles of heterogeneous non-linear regression models.
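
    The remark in the abstract above that more correlated errors can yield a better change estimate follows from a standard identity: for errors e1 and e2 at two dates, Var(e2 - e1) = Var(e1) + Var(e2) - 2·Cov(e1, e2). The sketch below evaluates it for two hypothetical models; the function name and all numbers are invented and do not come from the paper.

```python
import math


def change_error_std(std_t1, std_t2, correlation):
    """Standard deviation of the error of a difference (date 2 minus date 1).

    std_t1, std_t2 -- error standard deviations of the two single-date maps
    correlation    -- correlation between the two single-date errors
    """
    if not -1.0 <= correlation <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    variance = std_t1 ** 2 + std_t2 ** 2 - 2.0 * correlation * std_t1 * std_t2
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding


# Hypothetical model A: smaller single-date errors, weakly correlated.
std_change_a = change_error_std(std_t1=4.0, std_t2=4.0, correlation=0.2)
# Hypothetical model B: larger single-date errors, strongly correlated.
std_change_b = change_error_std(std_t1=5.0, std_t2=5.0, correlation=0.8)
```

    In this made-up case model B is worse at each date yet gives the smaller change error (about 3.2 versus about 5.1 percentage points), which mirrors the pattern the abstract describes.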

  5. Infinite ensemble of support vector machines for prediction of ...

    African Journals Online (AJOL)

    Many researchers have demonstrated the use of artificial neural networks (ANNs) to predict musculoskeletal disorders risk associated with occupational exposures. In order to improve the accuracy of LBDs risk classification, this paper proposes to use the support vector machines (SVMs), a machine learning algorithm used ...

  6. An ensemble method for predicting subnuclear localizations from primary protein structures.

    Directory of Open Access Journals (Sweden)

    Guo Sheng Han

    BACKGROUND: Predicting protein subnuclear localization is a challenging problem. Some previous works based on non-sequence information including Gene Ontology annotations and kernel fusion have respective limitations. The aim of this work is twofold: one is to propose a novel individual feature extraction method; another is to develop an ensemble method to improve prediction performance using comprehensive information represented in the form of high dimensional feature vector obtained by 11 feature extraction methods. METHODOLOGY/PRINCIPAL FINDINGS: A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It only considers those feature extraction methods based on amino acid classifications and physicochemical properties. In order to speed up our system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: Lei dataset, multi-localization dataset, SNL9 dataset and a new independent dataset. The overall accuracy of prediction for 6 localizations on Lei dataset is 75.2% and that for 9 localizations on SNL9 dataset is 72.1% in the leave-one-out cross validation, 71.7% for the multi-localization dataset and 69.8% for the new independent dataset, respectively. Comparisons with those existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis. CONCLUSIONS: It can be concluded that our method is effective and valuable for predicting protein subnuclear localizations. A web server has been designed to implement the proposed method.

  7. Prediction of Coal Face Gas Concentration by Multi-Scale Selective Ensemble Hybrid Modeling

    Directory of Open Access Journals (Sweden)

    WU Xiang

    2014-06-01

    A selective ensemble hybrid modeling prediction method based on wavelet transformation is proposed to improve the fitting and generalization capability of existing prediction models of coal face gas concentration, which has strong stochastic volatility. The Mallat algorithm was employed for the multi-scale decomposition and single-scale reconstruction of the gas concentration time series. Each subsequence was then predicted by sparsely weighted multiple unstable ELM (extreme learning machine) predictors within the SERELM (sparse ensemble regressors of ELM) method. Finally, the predicted values of these models were superimposed to obtain the predicted values of the original sequence. The proposed method takes advantage of the multi-scale analysis of the wavelet transformation, the accuracy and speed of ELM prediction, and the generalization ability of the L1-regularized selective ensemble learning method. The results show that the proposed method substantially increases forecast accuracy: the average relative error is 0.65%, the maximum relative error is 4.16% and the probability of a relative error below 1% reaches 0.785.
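
    The three accuracy figures quoted above (average relative error, maximum relative error, and the share of predictions with a relative error below 1%) can be computed as in the following minimal sketch. The function name and the sample values are hypothetical and are not data from the paper.

```python
def relative_error_summary(actual, predicted, threshold=0.01):
    """Return (mean relative error, max relative error, share below threshold).

    Relative error is |predicted - actual| / |actual|; `actual` must not
    contain zeros. The threshold is a fraction (0.01 means 1%).
    """
    errors = [abs(p - a) / abs(a) for a, p in zip(actual, predicted)]
    mean_error = sum(errors) / len(errors)
    max_error = max(errors)
    share_below = sum(1 for e in errors if e < threshold) / len(errors)
    return mean_error, max_error, share_below


# Hypothetical gas concentrations (%), for illustration only.
actual = [0.50, 0.52, 0.48, 0.51]
predicted = [0.502, 0.515, 0.50, 0.511]

mean_error, max_error, share_below = relative_error_summary(actual, predicted)
print(f"average relative error: {mean_error:.2%}")
print(f"maximum relative error: {max_error:.2%}")
print(f"share of errors below 1%: {share_below:.3f}")
```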

  8. Improvement of Disease Prediction and Modeling through the Use of Meteorological Ensembles: Human Plague in Uganda

    Science.gov (United States)

    Moore, Sean M.; Monaghan, Andrew; Griffith, Kevin S.; Apangu, Titus; Mead, Paul S.; Eisen, Rebecca J.

    2012-01-01

    Climate and weather influence the occurrence, distribution, and incidence of infectious diseases, particularly those caused by vector-borne or zoonotic pathogens. Thus, models based on meteorological data have helped predict when and where human cases are most likely to occur. Such knowledge aids in targeting limited prevention and control resources and may ultimately reduce the burden of diseases. Paradoxically, localities where such models could yield the greatest benefits, such as tropical regions where morbidity and mortality caused by vector-borne diseases is greatest, often lack high-quality in situ local meteorological data. Satellite- and model-based gridded climate datasets can be used to approximate local meteorological conditions in data-sparse regions, however their accuracy varies. Here we investigate how the selection of a particular dataset can influence the outcomes of disease forecasting models. Our model system focuses on plague (Yersinia pestis infection) in the West Nile region of Uganda. The majority of recent human cases have been reported from East Africa and Madagascar, where meteorological observations are sparse and topography yields complex weather patterns. Using an ensemble of meteorological datasets and model-averaging techniques we find that the number of suspected cases in the West Nile region was negatively associated with dry season rainfall (December-February) and positively with rainfall prior to the plague season. We demonstrate that ensembles of available meteorological datasets can be used to quantify climatic uncertainty and minimize its impacts on infectious disease models. These methods are particularly valuable in regions with sparse observational networks and high morbidity and mortality from vector-borne diseases. PMID:23024750

  9. Improvement of disease prediction and modeling through the use of meteorological ensembles: human plague in Uganda.

    Directory of Open Access Journals (Sweden)

    Sean M Moore

    Climate and weather influence the occurrence, distribution, and incidence of infectious diseases, particularly those caused by vector-borne or zoonotic pathogens. Thus, models based on meteorological data have helped predict when and where human cases are most likely to occur. Such knowledge aids in targeting limited prevention and control resources and may ultimately reduce the burden of diseases. Paradoxically, localities where such models could yield the greatest benefits, such as tropical regions where morbidity and mortality caused by vector-borne diseases is greatest, often lack high-quality in situ local meteorological data. Satellite- and model-based gridded climate datasets can be used to approximate local meteorological conditions in data-sparse regions, however their accuracy varies. Here we investigate how the selection of a particular dataset can influence the outcomes of disease forecasting models. Our model system focuses on plague (Yersinia pestis infection) in the West Nile region of Uganda. The majority of recent human cases have been reported from East Africa and Madagascar, where meteorological observations are sparse and topography yields complex weather patterns. Using an ensemble of meteorological datasets and model-averaging techniques we find that the number of suspected cases in the West Nile region was negatively associated with dry season rainfall (December-February) and positively with rainfall prior to the plague season. We demonstrate that ensembles of available meteorological datasets can be used to quantify climatic uncertainty and minimize its impacts on infectious disease models. These methods are particularly valuable in regions with sparse observational networks and high morbidity and mortality from vector-borne diseases.

  10. Efficient multi-scenario Model Predictive Control for water resources management with ensemble streamflow forecasts

    Science.gov (United States)

    Tian, Xin; Negenborn, Rudy R.; van Overloop, Peter-Jules; María Maestre, José; Sadowska, Anna; van de Giesen, Nick

    2017-11-01

    Model Predictive Control (MPC) is one of the most advanced real-time control techniques that has been widely applied to Water Resources Management (WRM). MPC can manage the water system in a holistic manner and has a flexible structure to incorporate specific elements, such as setpoints and constraints. Therefore, MPC has shown its versatile performance in many branches of WRM. Nonetheless, with the in-depth understanding of stochastic hydrology in recent studies, MPC also faces the challenge of how to cope with hydrological uncertainty in its decision-making process. A possible way to embed the uncertainty is to generate an Ensemble Forecast (EF) of hydrological variables, rather than a deterministic one. The combination of MPC and EF results in a more comprehensive approach: Multi-scenario MPC (MS-MPC). In this study, we first assess the model performance of MS-MPC, considering an ensemble streamflow forecast. Notably, computational inefficiency may be a critical obstacle to the applicability of MS-MPC: as more scenarios are taken into account, the computational burden of solving the optimization problem in MS-MPC increases accordingly. To deal with this challenge, we propose the Adaptive Control Resolution (ACR) approach as a computationally efficient scheme to reduce the number of control variables in MS-MPC. In brief, the ACR approach uses a mixed-resolution control time step from the near future to the distant future. The ACR-MPC approach is tested on a real-world case study: an integrated flood control and navigation problem in the North Sea Canal of the Netherlands. The approach reduces the computation time by 18% or more in our case study. At the same time, the model performance of ACR-MPC remains close to that of conventional MPC.
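
    The idea of a mixed-resolution control time step can be sketched as follows. The abstract does not state the actual ACR schedule, so the doubling rule, the function name and the parameter values below are assumptions made purely for illustration.

```python
def mixed_resolution_blocks(horizon, fine_steps, growth=2):
    """Split `horizon` base time steps into control blocks.

    The first `fine_steps` blocks each span one base step (near future);
    every later block is `growth` times longer than the previous one
    (distant future). One control variable is held constant per block.
    The doubling schedule is an assumption, not the published ACR rule.
    """
    blocks = []
    start = 0
    length = 1
    while start < horizon:
        if len(blocks) >= fine_steps:
            length *= growth
        end = min(start + length, horizon)  # truncate the last block
        blocks.append((start, end))
        start = end
    return blocks


blocks = mixed_resolution_blocks(horizon=24, fine_steps=4)
print(blocks)
print(f"control variables: {len(blocks)} instead of 24")
```

    In this hypothetical 24-step horizon, the number of control variables per control structure drops from 24 to 8, which is the kind of reduction that lowers the size of the optimization problem.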

  11. Predicting lymphatic filariasis transmission and elimination dynamics using a multi-model ensemble framework

    Directory of Open Access Journals (Sweden)

    Morgan E. Smith

    2017-03-01

    Mathematical models of parasite transmission provide powerful tools for assessing the impacts of interventions. Owing to complexity and uncertainty, no single model may capture all features of transmission and elimination dynamics. Multi-model ensemble modelling offers a framework to help overcome biases of single models. We report on the development of a first multi-model ensemble of three lymphatic filariasis (LF) models (EPIFIL, LYMFASIM, and TRANSFIL), and evaluate its predictive performance in comparison with that of the constituents using calibration and validation data from three case study sites, one each from the three major LF endemic regions: Africa, Southeast Asia and Papua New Guinea (PNG). We assessed the performance of the respective models for predicting the outcomes of annual MDA strategies for various baseline scenarios thought to exemplify the current endemic conditions in the three regions. The results show that the constructed multi-model ensemble outperformed the single models when evaluated across all sites. Single models that best fitted calibration data tended to do less well in simulating the out-of-sample, or validation, intervention data. Scenario modelling results demonstrate that the multi-model ensemble is able to compensate for variance between single models in order to produce more plausible predictions of intervention impacts. Our results highlight the value of an ensemble approach to modelling parasite control dynamics. However, its optimal use will require further methodological improvements as well as consideration of the organizational mechanisms required to ensure that modelling results and data are shared effectively between all stakeholders.

  12. ECLogger: Cross-Project Catch-Block Logging Prediction Using Ensemble of Classifiers

    Directory of Open Access Journals (Sweden)

    Sangeeta Lal

    2017-01-01

    Background: Software developers insert log statements in the source code to record program execution information. However, optimizing the number of log statements in the source code is challenging. Machine learning based within-project logging prediction tools, proposed in previous studies, may not be suitable for new or small software projects. For such software projects, we can use cross-project logging prediction. Aim: The aim of the study presented here is to investigate cross-project logging prediction methods and techniques. Method: The proposed method is ECLogger, which is a novel, ensemble-based, cross-project, catch-block logging prediction model. Nine base classifiers were used and combined using ensemble techniques. The performance of ECLogger was evaluated on three open-source Java projects: Tomcat, CloudStack and Hadoop. Results: ECLogger Bagging, ECLogger AverageVote, and ECLogger MajorityVote show a considerable improvement in the average Logged F-measure (LF) on 3, 5, and 4 source -> target project pairs, respectively, compared to the baseline classifiers. ECLogger AverageVote performs best and shows improvements of 3.12% (average LF) and 6.08% (average ACC, i.e., accuracy). Conclusion: The classifiers based on ensemble techniques, such as bagging, average vote, and majority vote, outperform the baseline classifier. Overall, the ECLogger AverageVote model performs best. The results show that the CloudStack project is more generalizable than the other projects.
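
    The difference between average voting and majority voting can be shown with a small sketch. This is a generic illustration, not ECLogger's implementation; the function names, the 0.5 threshold and the probabilities are hypothetical.

```python
def average_vote(probabilities, threshold=0.5):
    """Predict 'logged' if the mean of the classifiers' probabilities
    reaches the threshold."""
    return sum(probabilities) / len(probabilities) >= threshold


def majority_vote(probabilities, threshold=0.5):
    """Predict 'logged' if more than half of the classifiers individually
    predict 'logged' (probability at or above the threshold)."""
    votes = sum(1 for p in probabilities if p >= threshold)
    return votes * 2 > len(probabilities)


# Hypothetical outputs of three base classifiers for one catch block:
# one is very confident, the other two are slightly below the threshold.
probabilities = [0.9, 0.45, 0.4]

print("average vote :", average_vote(probabilities))
print("majority vote:", majority_vote(probabilities))
```

    The two rules can disagree: a single confident classifier can carry the average vote, while the majority vote only counts how many classifiers cross the threshold.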

  13. An ensemble method to predict target genes and pathways in uveal melanoma

    Directory of Open Access Journals (Sweden)

    Wei Chao

    2018-04-01

    This work proposes to predict target genes and pathways for uveal melanoma (UM) based on an ensemble method and pathway analyses. Methods: The ensemble method integrated a correlation method (Pearson correlation coefficient, PCC), a causal inference method (IDA) and a regression method (Lasso) using the Borda count election method. Subsequently, to validate the performance of the PIL method, predicted miRNA targets were compared with a database of confirmed targets. Finally, pathway enrichment analysis was conducted on the target genes in the top 1,000 miRNA-mRNA interactions to identify target pathways for UM patients. Results: Thirty-eight of the predicted interactions matched confirmed interactions, indicating that the ensemble method was a suitable and feasible approach to predict miRNA targets. We obtained 50 seed miRNA-mRNA interactions of UM patients and extracted target genes from these interactions, such as ASPG, BSDC1 and C4BP. The 601 target genes in the top 1,000 miRNA-mRNA interactions were enriched in 12 target pathways, of which Phototransduction was the most significant. Conclusion: The target genes and pathways might provide a new way to reveal the molecular mechanism of UM and support targeted treatment and prevention of this malignant tumor.
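
    A Borda count merges several rankings by awarding points according to rank position. The sketch below shows the general technique under simplifying assumptions (every method ranks the same items, ties are broken alphabetically); the interaction labels are placeholders, and the authors' exact scoring scheme may differ.

```python
def borda_merge(rankings):
    """Merge best-first rankings with a Borda count.

    In a ranking of n items, the top item earns n points and the last
    item earns 1 point. Items are returned by descending total score;
    ties are broken alphabetically (an arbitrary choice for this sketch).
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] = scores.get(item, 0) + (n - position)
    return sorted(scores, key=lambda item: (-scores[item], item))


# Placeholder miRNA-mRNA interactions ranked by three hypothetical methods.
pcc_ranking = ["pair_x", "pair_y", "pair_z"]
ida_ranking = ["pair_y", "pair_x", "pair_z"]
lasso_ranking = ["pair_y", "pair_z", "pair_x"]

merged = borda_merge([pcc_ranking, ida_ranking, lasso_ranking])
print(merged)
```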

  14. Prediction of Human Phenotype Ontology terms by means of hierarchical ensemble methods.

    Science.gov (United States)

    Notaro, Marco; Schubach, Max; Robinson, Peter N; Valentini, Giorgio

    2017-10-12

    The prediction of human gene-abnormal phenotype associations is a fundamental step toward the discovery of novel genes associated with human disorders, especially when no genes are known to be associated with a specific disease. In this context the Human Phenotype Ontology (HPO) provides a standard categorization of the abnormalities associated with human diseases. While the problem of the prediction of gene-disease associations has been widely investigated, the related problem of gene-phenotypic feature (i.e., HPO term) associations has been largely overlooked, even if for most human genes no HPO term associations are known and despite the increasing application of the HPO to relevant medical problems. Moreover most of the methods proposed in literature are not able to capture the hierarchical relationships between HPO terms, thus resulting in inconsistent and relatively inaccurate predictions. We present two hierarchical ensemble methods that we formally prove to provide biologically consistent predictions according to the hierarchical structure of the HPO. The modular structure of the proposed methods, that consists in a "flat" learning first step and a hierarchical combination of the predictions in the second step, allows the predictions of virtually any flat learning method to be enhanced. The experimental results show that hierarchical ensemble methods are able to predict novel associations between genes and abnormal phenotypes with results that are competitive with state-of-the-art algorithms and with a significant reduction of the computational complexity. Hierarchical ensembles are efficient computational methods that guarantee biologically meaningful predictions that obey the true path rule, and can be used as a tool to improve and make consistent the HPO terms predictions starting from virtually any flat learning method. The implementation of the proposed methods is available as an R package from the CRAN repository.
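
    Consistency with the true path rule means that a gene's score for a term never exceeds its score for any ancestor of that term. The following sketch shows one simple top-down correction of flat scores on a small DAG. It is a generic illustration and not the two methods proposed in the paper; the term names and scores are hypothetical.

```python
def enforce_true_path(flat_scores, parents):
    """Make flat per-term scores hierarchy-consistent.

    `flat_scores` maps each term to a score in [0, 1]; `parents` maps a
    term to the list of its parent terms (roots may be omitted). Each
    corrected score is the minimum of the term's flat score and the
    corrected scores of all its parents, so no term outscores an ancestor.
    """
    corrected = {}

    def resolve(term):
        if term not in corrected:
            score = flat_scores[term]
            for parent in parents.get(term, []):
                score = min(score, resolve(parent))
            corrected[term] = score
        return corrected[term]

    for term in flat_scores:
        resolve(term)
    return corrected


# Hypothetical flat scores for one gene on a four-term DAG.
flat_scores = {"root": 0.9, "A": 0.6, "B": 0.8, "C": 0.95}
parents = {"A": ["root"], "B": ["A"], "C": ["root", "A"]}

print(enforce_true_path(flat_scores, parents))
```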

  15. An ensemble machine learning approach to predict survival in breast cancer.

    Science.gov (United States)

    Djebbari, Amira; Liu, Ziying; Phan, Sieu; Famili, Fazel

    2008-01-01

    Current breast cancer predictive signatures are not unique. Can we use this fact to our advantage to improve prediction? From the machine learning perspective, it is well known that combining multiple classifiers can improve classification performance. We propose an ensemble machine learning approach which consists of choosing feature subsets and learning predictive models from them. We then combine models based on certain model fusion criteria and we also introduce a tuning parameter to control sensitivity. Our method significantly improves classification performance with a particular emphasis on sensitivity which is critical to avoid misclassifying poor prognosis patients as good prognosis.

  16. Experimental real-time multi-model ensemble (MME) prediction of ...

    Indian Academy of Sciences (India)

    calibration (training) has to be of good quality. Otherwise, it might degrade the MME results. Early works by ... ECMWF ensemble data (Evans et al 2000), and they showed the superiority of the multi-model system over the ..... eral idea of the quality of rainfall forecasts in terms of error statistics for monsoon for the member.

  17. Predicting diabetes mellitus using SMOTE and ensemble machine learning approach: The Henry Ford ExercIse Testing (FIT) project.

    Science.gov (United States)

    Alghamdi, Manal; Al-Mallah, Mouaz; Keteyian, Steven; Brawner, Clinton; Ehrman, Jonathan; Sakr, Sherif

    2017-01-01

    Machine learning is becoming a popular and important approach in the field of medical research. In this study, we investigate the relative performance of various machine learning methods such as Decision Tree, Naïve Bayes, Logistic Regression, Logistic Model Tree and Random Forests for predicting incident diabetes using medical records of cardiorespiratory fitness. In addition, we apply different techniques to uncover potential predictors of diabetes. This FIT project study used data of 32,555 patients who are free of any known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 5-year follow-up. At the completion of the fifth year, 5,099 of those patients have developed diabetes. The dataset contained 62 attributes classified into four categories: demographic characteristics, disease history, medication use history, and stress test vital signs. We developed an Ensembling-based predictive model using 13 attributes that were selected based on their clinical importance, Multiple Linear Regression, and Information Gain Ranking methods. The negative effect of the imbalance class of the constructed model was handled by Synthetic Minority Oversampling Technique (SMOTE). The overall performance of the predictive model classifier was improved by the Ensemble machine learning approach using the Vote method with three Decision Trees (Naïve Bayes Tree, Random Forest, and Logistic Model Tree) and achieved high accuracy of prediction (AUC = 0.92). The study shows the potential of ensembling and SMOTE approaches for predicting incident diabetes using cardiorespiratory fitness data.
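
    SMOTE balances classes by creating synthetic minority samples on the line segments between a minority sample and one of its nearest minority neighbours. The sketch below is a minimal pure-Python version of that idea, not the implementation used in the study; the function name, parameters and data points are hypothetical.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate `n_new` synthetic minority samples.

    For each new sample: pick a minority point, pick one of its `k`
    nearest minority neighbours (squared Euclidean distance), and
    interpolate at a random position between the two points.
    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature minority class (e.g. patients who developed diabetes).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote(minority, n_new=6, k=2, seed=1)
print(len(new_points), "synthetic samples")
```

    Oversampling of this kind would normally be applied to the training split only, so that synthetic points do not leak into the evaluation data.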

  18. Predicting diabetes mellitus using SMOTE and ensemble machine learning approach: The Henry Ford ExercIse Testing (FIT) project.

    Directory of Open Access Journals (Sweden)

    Manal Alghamdi

    Machine learning is becoming a popular and important approach in the field of medical research. In this study, we investigate the relative performance of various machine learning methods such as Decision Tree, Naïve Bayes, Logistic Regression, Logistic Model Tree and Random Forests for predicting incident diabetes using medical records of cardiorespiratory fitness. In addition, we apply different techniques to uncover potential predictors of diabetes. This FIT project study used data of 32,555 patients who are free of any known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 5-year follow-up. At the completion of the fifth year, 5,099 of those patients have developed diabetes. The dataset contained 62 attributes classified into four categories: demographic characteristics, disease history, medication use history, and stress test vital signs. We developed an Ensembling-based predictive model using 13 attributes that were selected based on their clinical importance, Multiple Linear Regression, and Information Gain Ranking methods. The negative effect of the imbalance class of the constructed model was handled by Synthetic Minority Oversampling Technique (SMOTE). The overall performance of the predictive model classifier was improved by the Ensemble machine learning approach using the Vote method with three Decision Trees (Naïve Bayes Tree, Random Forest, and Logistic Model Tree) and achieved high accuracy of prediction (AUC = 0.92). The study shows the potential of ensembling and SMOTE approaches for predicting incident diabetes using cardiorespiratory fitness data.

  19. Bayesian network ensemble as a multivariate strategy to predict radiation pneumonitis risk

    International Nuclear Information System (INIS)

    Lee, Sangkyu; Ybarra, Norma; Jeyaseelan, Krishinima; Seuntjens, Jan; El Naqa, Issam; Faria, Sergio; Kopek, Neil; Brisebois, Pascale; Bradley, Jeffrey D.; Robinson, Clifford

    2015-01-01

    Purpose: Prediction of radiation pneumonitis (RP) has been shown to be challenging due to the involvement of a variety of factors including dose–volume metrics and radiosensitivity biomarkers. Some of these factors are highly correlated and might affect prediction results when combined. Bayesian network (BN) provides a probabilistic framework to represent variable dependencies in a directed acyclic graph. The aim of this study is to integrate the BN framework and a systems’ biology approach to detect possible interactions among RP risk factors and exploit these relationships to enhance both the understanding and prediction of RP. Methods: The authors studied 54 nonsmall-cell lung cancer patients who received curative 3D-conformal radiotherapy. Nineteen RP events were observed (common toxicity criteria for adverse events grade 2 or higher). Serum concentration of the following four candidate biomarkers were measured at baseline and midtreatment: alpha-2-macroglobulin, angiotensin converting enzyme (ACE), transforming growth factor, interleukin-6. Dose-volumetric and clinical parameters were also included as covariates. Feature selection was performed using a Markov blanket approach based on the Koller–Sahami filter. The Markov chain Monte Carlo technique estimated the posterior distribution of BN graphs built from the observed data of the selected variables and causality constraints. RP probability was estimated using a limited number of high posterior graphs (ensemble) and was averaged for the final RP estimate using Bayes’ rule. A resampling method based on bootstrapping was applied to model training and validation in order to control under- and overfit pitfalls. Results: RP prediction power of the BN ensemble approach reached its optimum at a size of 200. The optimized performance of the BN model recorded an area under the receiver operating characteristic curve (AUC) of 0.83, which was significantly higher than multivariate logistic regression (0
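
    Averaging the RP probability over an ensemble of high-posterior graphs amounts to a posterior-weighted mean of the per-graph predictions. The sketch below illustrates that averaging step only; the function name and all numbers are hypothetical, and the study's graphs, posteriors and inference are not reproduced here.

```python
def ensemble_probability(graph_predictions, graph_posteriors):
    """Posterior-weighted average of per-graph event probabilities.

    `graph_predictions[i]` is the event probability predicted by graph i;
    `graph_posteriors[i]` is that graph's (possibly unnormalized)
    posterior weight. Weights are renormalized over the retained graphs.
    """
    total = sum(graph_posteriors)
    weighted = sum(p * w for p, w in zip(graph_predictions, graph_posteriors))
    return weighted / total


# Hypothetical predictions from three retained graphs for one patient.
predictions = [0.2, 0.5, 0.8]
posteriors = [0.5, 0.3, 0.2]

print(f"ensemble RP probability: {ensemble_probability(predictions, posteriors):.2f}")
```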

  20. Bayesian network ensemble as a multivariate strategy to predict radiation pneumonitis risk

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Sangkyu, E-mail: sangkyu.lee@mail.mcgill.ca; Ybarra, Norma; Jeyaseelan, Krishinima; Seuntjens, Jan; El Naqa, Issam [Medical Physics Unit, McGill University, Montreal, Quebec H3G1A4 (Canada); Faria, Sergio; Kopek, Neil; Brisebois, Pascale [Department of Radiation Oncology, Montreal General Hospital, Montreal, H3G1A4 (Canada); Bradley, Jeffrey D.; Robinson, Clifford [Radiation Oncology, Washington University School of Medicine in St. Louis, St. Louis, Missouri 63110 (United States)

    2015-05-15

    Purpose: Prediction of radiation pneumonitis (RP) has been shown to be challenging due to the involvement of a variety of factors including dose–volume metrics and radiosensitivity biomarkers. Some of these factors are highly correlated and might affect prediction results when combined. Bayesian network (BN) provides a probabilistic framework to represent variable dependencies in a directed acyclic graph. The aim of this study is to integrate the BN framework and a systems’ biology approach to detect possible interactions among RP risk factors and exploit these relationships to enhance both the understanding and prediction of RP. Methods: The authors studied 54 nonsmall-cell lung cancer patients who received curative 3D-conformal radiotherapy. Nineteen RP events were observed (common toxicity criteria for adverse events grade 2 or higher). Serum concentration of the following four candidate biomarkers were measured at baseline and midtreatment: alpha-2-macroglobulin, angiotensin converting enzyme (ACE), transforming growth factor, interleukin-6. Dose-volumetric and clinical parameters were also included as covariates. Feature selection was performed using a Markov blanket approach based on the Koller–Sahami filter. The Markov chain Monte Carlo technique estimated the posterior distribution of BN graphs built from the observed data of the selected variables and causality constraints. RP probability was estimated using a limited number of high posterior graphs (ensemble) and was averaged for the final RP estimate using Bayes’ rule. A resampling method based on bootstrapping was applied to model training and validation in order to control under- and overfit pitfalls. Results: RP prediction power of the BN ensemble approach reached its optimum at a size of 200. The optimized performance of the BN model recorded an area under the receiver operating characteristic curve (AUC) of 0.83, which was significantly higher than multivariate logistic regression (0

  1. Visualizing uncertainties in a storm surge ensemble data assimilation and forecasting system

    KAUST Repository

    Hollt, Thomas

    2015-01-15

    We present a novel integrated visualization system that enables the interactive visual analysis of ensemble simulations and estimates of the sea surface height and other model variables that are used for storm surge prediction. Coastal inundation, caused by hurricanes and tropical storms, poses large risks for today's societies. High-fidelity numerical models of water levels driven by hurricane-force winds are required to predict these events, posing a challenging computational problem, and even though computational models continue to improve, uncertainties in storm surge forecasts are inevitable. Today, this uncertainty is often exposed to the user by running the simulation many times with different parameters or inputs following a Monte-Carlo framework in which uncertainties are represented as stochastic quantities. This results in multidimensional, multivariate and multivalued data, so-called ensemble data. While the resulting datasets are very comprehensive, they are also huge in size and thus hard to visualize and interpret. In this paper, we tackle this problem by means of an interactive and integrated visual analysis system. By harnessing the power of modern graphics processing units for visualization as well as computation, our system allows the user to browse through the simulation ensembles in real time, view specific parameter settings or simulation models and move between different spatial and temporal regions without delay. In addition, our system provides advanced visualizations to highlight the uncertainty or show the complete distribution of the simulations at user-defined positions over the complete time series of the prediction. We highlight the benefits of our system by presenting its application in a real-world scenario using a simulation of Hurricane Ike.

  2. Wave ensemble forecast in the Western Mediterranean Sea, application to an early warning system.

    Science.gov (United States)

    Pallares, Elena; Hernandez, Hector; Moré, Jordi; Espino, Manuel; Sairouni, Abdel

    2015-04-01

    The Western Mediterranean Sea is a highly heterogeneous and variable area, as reflected in the wind field, the current field, and the waves, mainly in the first kilometers offshore. As a result of this variability, wave forecasting in these regions is quite complicated, usually with some accuracy problems during energetic storm events. Moreover, it is in these areas that most economic activities take place, including fisheries, sailing, tourism, coastal management and offshore renewable energy platforms. In order to introduce an indicator of the probability of occurrence of the different sea states and give more detailed forecast information to end users, an ensemble wave forecast system is considered. Ensemble prediction systems have been used in recent decades for meteorological forecasting: to deal with the uncertainties of the initial conditions and the different parametrizations used in the models, which may introduce errors in the forecast, a set of perturbed meteorological simulations is considered as possible future scenarios and compared with the deterministic forecast. In the present work, the SWAN wave model (v41.01) has been implemented for the Western Mediterranean Sea, forced with wind fields produced by the deterministic Global Forecast System (GFS) and the Global Ensemble Forecast System (GEFS). The wind fields include a deterministic forecast (also named control), between 11 and 21 ensemble members, and some intelligent members obtained from the ensemble, such as the mean of all the members. Four buoys located in the study area, moored in coastal waters, have been used to validate the results. The outputs include all the time series, with a forecast horizon of 8 days and represented in spaghetti diagrams, the spread of the system and the probability at different thresholds. The main goal of this exercise is to be able to determine the degree of the uncertainty of the wave forecast, meaningful
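
    The ensemble outputs mentioned above (ensemble mean, spread, and probability at a threshold) can be derived from the member forecasts at a single location and lead time, as in this minimal sketch. The member values and the 2.5 m threshold are hypothetical, and the spread is taken here as the standard deviation about the ensemble mean, which is an assumption about the definition used.

```python
import statistics


def ensemble_summary(members, threshold):
    """Return (mean, spread, exceedance probability) for one forecast time.

    `members` holds one value per ensemble member (e.g. significant wave
    height in metres). Spread is the population standard deviation about
    the ensemble mean; the probability is the fraction of members that
    exceed `threshold`.
    """
    mean = statistics.fmean(members)
    spread = statistics.pstdev(members)
    probability = sum(1 for m in members if m > threshold) / len(members)
    return mean, spread, probability


# Hypothetical significant wave heights (m) from eight ensemble members.
members = [1.8, 2.1, 2.4, 2.0, 3.1, 2.6, 1.9, 2.2]
mean, spread, probability = ensemble_summary(members, threshold=2.5)

print(f"ensemble mean: {mean:.2f} m")
print(f"spread:        {spread:.2f} m")
print(f"P(Hs > 2.5 m): {probability:.2f}")
```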

  3. An Integrated Ensemble-Based Operational Framework to Predict Urban Flooding: A Case Study of Hurricane Sandy in the Passaic and Hackensack River Basins

    Science.gov (United States)

    Saleh, F.; Ramaswamy, V.; Georgas, N.; Blumberg, A. F.; Wang, Y.

    2016-12-01

    Advances in computational resources and modeling techniques are opening the path to effectively integrate existing complex models. In the context of flood prediction, recent extreme events have demonstrated the importance of integrating components of the hydrosystem to better represent the interactions amongst different physical processes and phenomena. As such, there is a pressing need to develop holistic and cross-disciplinary modeling frameworks that effectively integrate existing models and better represent the operative dynamics. This work presents a novel Hydrologic-Hydraulic-Hydrodynamic Ensemble (H3E) flood prediction framework that operationally integrates existing predictive models representing coastal (New York Harbor Observing and Prediction System, NYHOPS), hydrologic (US Army Corps of Engineers Hydrologic Modeling System, HEC-HMS) and hydraulic (2-dimensional River Analysis System, HEC-RAS) components. The state-of-the-art framework is forced with 125 ensemble meteorological inputs from numerical weather prediction models including the Global Ensemble Forecast System, the European Centre for Medium-Range Weather Forecasts (ECMWF), the Canadian Meteorological Centre (CMC), the Short Range Ensemble Forecast (SREF) and the North American Mesoscale Forecast System (NAM). The framework produces, within a 96-hour forecast horizon, on-the-fly Google Earth flood maps that provide critical information for decision makers and emergency preparedness managers. The utility of the framework was demonstrated by retrospectively forecasting an extreme flood event, hurricane Sandy in the Passaic and Hackensack watersheds (New Jersey, USA). Hurricane Sandy caused significant damage to a number of critical facilities in this area including the New Jersey Transit's main storage and maintenance facility. The results of this work demonstrate that ensemble based frameworks provide improved flood predictions and useful information about associated uncertainties, thus

  4. Predicting gene function using hierarchical multi-label decision tree ensembles

    Directory of Open Access Journals (Sweden)

    Kocev Dragi

    2010-01-01

    Full Text Available Abstract Background: S. cerevisiae, A. thaliana and M. musculus are well-studied organisms in biology and the sequencing of their genomes was completed many years ago. It is still a challenge, however, to develop methods that assign biological functions to the ORFs in these genomes automatically. Different machine learning methods have been proposed to this end, but it remains unclear which method is to be preferred in terms of predictive performance, efficiency and usability. Results: We study the use of decision tree based models for predicting the multiple functions of ORFs. First, we describe an algorithm for learning hierarchical multi-label decision trees. These can simultaneously predict all the functions of an ORF, while respecting a given hierarchy of gene functions (such as FunCat or GO). We present new results obtained with this algorithm, showing that the trees found by it exhibit clearly better predictive performance than the trees found by previously described methods. Nevertheless, the predictive performance of individual trees is lower than that of some recently proposed statistical learning methods. We show that ensembles of such trees are more accurate than single trees and are competitive with state-of-the-art statistical learning and functional linkage methods. Moreover, the ensemble method is computationally efficient and easy to use. Conclusions: Our results suggest that decision tree based methods are a state-of-the-art, efficient and easy-to-use approach to ORF function prediction.

  5. Cortical ensemble activity increasingly predicts behaviour outcomes during learning of a motor task

    Science.gov (United States)

    Laubach, Mark; Wessberg, Johan; Nicolelis, Miguel A. L.

    2000-06-01

    When an animal learns to make movements in response to different stimuli, changes in activity in the motor cortex seem to accompany and underlie this learning. The precise nature of modifications in cortical motor areas during the initial stages of motor learning, however, is largely unknown. Here we address this issue by chronically recording from neuronal ensembles located in the rat motor cortex, throughout the period required for rats to learn a reaction-time task. Motor learning was demonstrated by a decrease in the variance of the rats' reaction times and an increase in the time the animals were able to wait for a trigger stimulus. These behavioural changes were correlated with a significant increase in our ability to predict the correct or incorrect outcome of single trials based on three measures of neuronal ensemble activity: average firing rate, temporal patterns of firing, and correlated firing. This increase in prediction indicates that an association between sensory cues and movement emerged in the motor cortex as the task was learned. Such modifications in cortical ensemble activity may be critical for the initial learning of motor tasks.

  6. A comparison of the performance of the 3-D super-ensemble and an ensemble Kalman filter for short-range regional ocean prediction

    Directory of Open Access Journals (Sweden)

    Baptiste Mourre

    2014-01-01

    Full Text Available This study compares the ability of two approaches integrating models and data to forecast the Ligurian Sea regional oceanographic conditions in the short-term range (0–72 hours) when constrained by a common observation dataset. The post-processing 3-D super-ensemble (3DSE) algorithm, which uses observations to optimally combine multi-model forecasts into a single prediction of the oceanic variable, is first considered. The 3DSE predictive skills are compared to those of the Regional Ocean Modeling System model in which observations are assimilated through a more conventional ensemble Kalman filter (EnKF) approach. Assimilated measurements include sea surface temperature maps, and temperature and salinity subsurface observations from a fleet of five underwater gliders. Retrospective analyses are carried out to produce daily predictions during the 11-d period of the REP10 sea trial experiment. The forecast skill evaluation based on a distributed multi-sensor validation dataset indicates an overall superior performance of the EnKF, both at the surface and at depth. While the 3DSE and EnKF perform comparably well in the area spanned by the incorporated measurements, the 3DSE accuracy is found to rapidly decrease outside this area. In particular, the univariate formulation of the method combined with the absence of regular surface salinity measurements produces large errors in the 3DSE salinity forecast. On the contrary, the EnKF leads to more homogeneous forecast errors over the modelling domain for both temperature and salinity. The EnKF is found to consistently improve the predictions with respect to the control solution without assimilation and to be positively skilled when compared to the climatological estimate. For typical regional oceanographic applications with scarce subsurface observations, the lack of physical spatial and multivariate error covariances applicable to the individual model weights in the 3DSE formulation constitutes a major

  7. Ensemble Linear Neighborhood Propagation for Predicting Subchloroplast Localization of Multi-Location Proteins.

    Science.gov (United States)

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2016-12-02

    In the postgenomic era, the number of unreviewed protein sequences is remarkably larger and grows tremendously faster than that of reviewed ones. However, existing methods for protein subchloroplast localization often ignore the information from these unlabeled proteins. This paper proposes a multi-label predictor based on ensemble linear neighborhood propagation (LNP), namely, LNP-Chlo, which leverages hybrid sequence-based feature information from both labeled and unlabeled proteins for predicting localization of both single- and multi-label chloroplast proteins. Experimental results on a stringent benchmark dataset and a novel independent dataset suggest that LNP-Chlo performs at least 6% (absolute) better than state-of-the-art predictors. This paper also demonstrates that ensemble LNP significantly outperforms LNP based on individual features. For readers' convenience, the online Web server LNP-Chlo is freely available at http://bioinfo.eie.polyu.edu.hk/LNPChloServer/ .

  8. Ensemble approach combining multiple methods improves human transcription start site prediction

    LENUS (Irish Health Repository)

    Dineen, David G

    2010-11-30

    Abstract Background: The computational prediction of transcription start sites is an important unsolved problem. Some recent progress has been made, but many promoters, particularly those not associated with CpG islands, are still difficult to locate using current methods. These methods use different features and training sets, along with a variety of machine learning techniques, and result in different prediction sets. Results: We demonstrate the heterogeneity of current prediction sets, and take advantage of this heterogeneity to construct a two-level classifier ('Profisi Ensemble') using predictions from 7 programs, along with 2 other data sources. Support vector machines using 'full' and 'reduced' data sets are combined in an either/or approach. We achieve a 14% increase in performance over the current state-of-the-art, as benchmarked by a third-party tool. Conclusions: Supervised learning methods are a useful way to combine predictions from diverse sources.

  9. An ensemble prediction approach to weekly Dengue cases forecasting based on climatic and terrain conditions

    Directory of Open Access Journals (Sweden)

    Sougata Deb

    2017-11-01

    Full Text Available Introduction: Dengue fever has been one of the most concerning endemic diseases of recent times. Every year, 50-100 million people get infected by the dengue virus across the world. Historically, it has been most prevalent in Southeast Asia and the Pacific Islands. In recent years, frequent dengue epidemics have started occurring in Latin America as well. This study focused on assessing the impact of different short- and long-term lagged climatic predictors on dengue cases. Additionally, it assessed the impact of building an ensemble model, using multiple time series and regression models, on prediction accuracy. Materials and Methods: Experimental data were based on two Latin American cities, viz. San Juan (Puerto Rico) and Iquitos (Peru). Due to weather and geographic differences, San Juan recorded higher dengue incidences than Iquitos. Using lagged cross-correlations, this study confirmed the impact of temperature and vegetation on the number of dengue cases for both cities, though in varied degrees and time lags. An ensemble of multiple predictive models using an elaborate set of derived predictors was built and validated. Results: The proposed ensemble prediction achieved a mean absolute error of 21.55, 4.26 points lower than the 25.81 obtained by a standard negative binomial model. Changes in climatic conditions and urbanization were found to be strong predictors, as established empirically in other research. Some of the predictors were new and informative and have not yet been explored in other relevant studies. Discussion and Conclusions: Two original contributions were made in this research: first, focused and extensive feature engineering aligned with the mosquito lifecycle; second, a novel covariate pattern-matching based prediction approach using the past time series trend of the predictor variables. Increased accuracy of the proposed model over the benchmark model proved the appropriateness of the analytical approach
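
    The comparison reported above rests on the mean absolute error (MAE) of an ensemble against that of a single benchmark model. The following is a minimal sketch of that metric and of an equal-weight ensemble. The study's own ensemble combines several time series and regression models in a way the abstract does not detail, so the equal weighting, the function names and the toy case counts are all assumptions made for illustration.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted counts."""
    if len(observed) != len(predicted):
        raise ValueError("series must have the same length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def average_ensemble(*forecasts):
    """Equal-weight ensemble: average the member forecasts week by week."""
    return [sum(values) / len(values) for values in zip(*forecasts)]


# Toy weekly case counts (illustrative values only, not data from the study).
observed = [10, 20, 30]
model_a = [12, 25, 28]  # hypothetical time series model
model_b = [6, 19, 35]   # hypothetical regression model

ensemble = average_ensemble(model_a, model_b)
mae_a = mean_absolute_error(observed, model_a)
mae_b = mean_absolute_error(observed, model_b)
mae_ensemble = mean_absolute_error(observed, ensemble)
```

    In this toy case the two members err in opposite directions, so their errors partly cancel in the average. Averaging is not guaranteed to beat every member in general.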

  10. Dynamic Security Assessment of Western Danish Power System Based on Ensemble Decision Trees

    DEFF Research Database (Denmark)

    Liu, Leo; Bak, Claus Leth; Chen, Zhe

    2014-01-01

    With the increasing penetration of renewable energy resources and other forms of dispersed generation, more and more uncertainties will be brought to the dynamic security assessment (DSA) of power systems. This paper proposes an approach that uses ensemble decision trees (EDT) for online DSA. Fed...... with online wide-area measurement data, it is capable of not only predicting the security states of current operating conditions (OC) with high accuracy, but also indicating the confidence of the security states 1 minute ahead of the real time by an outlier identification method. The results of EDT together...

  11. The GMAO Hybrid Ensemble-Variational Atmospheric Data Assimilation System: Version 2.0

    Science.gov (United States)

    Todling, Ricardo; El Akkraoui, Amal

    2018-01-01

    should point out that Release 1.0 of this document was made available to GMAO in mid-2013, when we introduced Hybrid 3D-Var capability to GEOS ADAS. This initial version of the documentation included a considerably different state-of-science introductory section but many of the same detailed description of the mechanisms of GEOS EnADAS. We are glad to report that a few of the desirable Future Works listed in Release 1.0 have now been added to the present version of GEOS EnADAS. These include the ability to exercise an Ensemble Prediction System that uses the ensemble analyses of GEOS EnADAS and (a very early, but functional version of) a tool to support Ensemble Forecast Sensitivity and Observation Impact applications.

  12. AUC-based biomarker ensemble with an application on gene scores predicting low bone mineral density.

    Science.gov (United States)

    Zhao, X G; Dai, W; Li, Y; Tian, L

    2011-11-01

    The area under the receiver operating characteristic (ROC) curve (AUC), long regarded as a 'golden' measure for the predictiveness of a continuous score, has propelled the need to develop AUC-based predictors. However, AUC-based ensemble methods are rather scant, largely due to the fact that the associated objective function is neither continuous nor concave. Indeed, there is no reliable numerical algorithm for identifying the optimal combination of a set of biomarkers to maximize the AUC, especially when the number of biomarkers is large. We have proposed novel AUC-based statistical ensemble methods for combining multiple biomarkers to differentiate a binary response of interest. Specifically, we propose to replace the non-continuous and non-convex AUC objective function by a convex surrogate loss function, whose minimizer can be efficiently identified. With the established framework, the lasso and other regularization techniques enable feature selection. Extensive simulations have demonstrated the superiority of the new methods to the existing methods. The proposal has been applied to a gene expression dataset to construct gene expression scores to differentiate elderly women with low bone mineral density (BMD) from those with normal BMD. The AUCs of the resulting scores in the independent test dataset have been satisfactory. Aiming to directly maximize the AUC, the proposed AUC-based ensemble method provides an efficient means of generating a stable combination of multiple biomarkers, which is especially useful in high-dimensional settings. Contact: lutian@stanford.edu. Supplementary data are available at Bioinformatics online.
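
    To make the contrast between the AUC and a convex surrogate concrete, here is a minimal sketch. The empirical AUC is computed as the fraction of correctly ordered positive/negative pairs, which is a step function of the combination weights. The abstract does not say which surrogate the authors use, so the pairwise logistic loss below is an assumed stand-in, and the function names are hypothetical.

```python
import math


def empirical_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def combined_score(weights, biomarkers):
    """Linear combination of biomarker values for one subject."""
    return sum(w * x for w, x in zip(weights, biomarkers))


def pairwise_logistic_loss(weights, samples, labels):
    """Assumed convex surrogate: mean of log(1 + exp(-(s_pos - s_neg))) over pairs."""
    scores = [combined_score(weights, x) for x in samples]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    total = sum(math.log1p(math.exp(-(p - n))) for p in positives for n in negatives)
    return total / (len(positives) * len(negatives))
```

    Unlike the pairwise indicator inside the AUC, each logistic term is smooth and convex in the weights, so a penalty such as the lasso can be added and the sum minimized with standard convex solvers.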

  13. Adaptive Encoding of Outcome Prediction by Prefrontal Cortex Ensembles Supports Behavioral Flexibility.

    Science.gov (United States)

    Del Arco, Alberto; Park, Junchol; Wood, Jesse; Kim, Yunbok; Moghaddam, Bita

    2017-08-30

    The prefrontal cortex (PFC) is thought to play a critical role in behavioral flexibility by monitoring action-outcome contingencies. How PFC ensembles represent shifts in behavior in response to changes in these contingencies remains unclear. We recorded single-unit activity and local field potentials in the dorsomedial PFC (dmPFC) of male rats during a set-shifting task that required them to update their behavior, among competing options, in response to changes in action-outcome contingencies. As behavior was updated, a subset of PFC ensembles encoded the current trial outcome before the outcome was presented. This novel outcome-prediction encoding was absent in a control task, in which actions were rewarded pseudorandomly, indicating that PFC neurons are not merely providing an expectancy signal. In both control and set-shifting tasks, dmPFC neurons displayed postoutcome discrimination activity, indicating that these neurons also monitor whether a behavior is successful in generating rewards. Gamma-power oscillatory activity increased before the outcome in both tasks but did not differentiate between expected outcomes, suggesting that this measure is not related to set-shifting behavior but reflects expectation of an outcome after action execution. These results demonstrate that PFC neurons support flexible rule-based action selection by predicting outcomes that follow a particular action. SIGNIFICANCE STATEMENT Tracking action-outcome contingencies and modifying behavior when those contingencies change is critical to behavioral flexibility. We find that ensembles of dorsomedial prefrontal cortex neurons differentiate between expected outcomes when action-outcome contingencies change. This predictive mode of signaling may be used to promote a new response strategy at the service of behavioral flexibility. Copyright © 2017 the authors 0270-6474/17/378363-11$15.00/0.

  14. Impacto da utilização de previsões "defasadas" no sistema de previsão de tempo por conjunto do CPTEC/INPE The impact of using lagged forecasts on the CPTEC/INPE ensemble prediction system

    Directory of Open Access Journals (Sweden)

    Lúcia Helena Ribas Machado

    2010-03-01

    improves the performance of the operational ensemble, contributing to an increase in the ensemble spread and, consequently, to a reduction in the under-dispersion of the system. We also observed that the lagged average forecast (LAF) shows performance similar to that of the operational EPS-CPTEC/INPE and that there is a tendency to higher performance when the forecast spread is low, for 5 and 7 day forecasts. These results provide the basis for the operational implementation of the LAF technique, which has low computational cost, and contribute to a more efficient utilization of the CPTEC/INPE ensemble predictions.

  15. Stochastic Prediction of Wind Generating Resources Using the Enhanced Ensemble Model for Jeju Island’s Wind Farms in South Korea

    OpenAIRE

    Deockho Kim; Jin Hur

    2017-01-01

    Due to the intermittency of wind power generation, it is very hard to manage its system operation and planning. In order to incorporate higher wind power penetrations into power systems that maintain secure and economic power system operation, an accurate and efficient estimation of wind power outputs is needed. In this paper, we propose the stochastic prediction of wind generating resources using an enhanced ensemble model for Jeju Island’s wind farms in South Korea. When selecting the poten...

  16. Probability weighted ensemble transfer learning for predicting interactions between HIV-1 and human proteins.

    Directory of Open Access Journals (Sweden)

    Suyu Mei

    Full Text Available Reconstruction of host-pathogen protein interaction networks is of great significance for revealing the underlying microbial pathogenesis. However, the current experimentally-derived networks are generally small and should be augmented by computational methods for less-biased biological inference. From the point of view of computational modelling, data scarcity, data unavailability and negative data sampling are the three major problems for host-pathogen protein interaction network reconstruction. In this work, we are motivated to address the three concerns and propose a probability weighted ensemble transfer learning model for HIV-human protein interaction prediction (PWEN-TLM), where a support vector machine (SVM) is adopted as the individual classifier of the ensemble model. In the model, data scarcity and data unavailability are tackled by homolog knowledge transfer. The importance of homolog knowledge is measured by the ROC-AUC metric of the individual classifiers, whose outputs are probability weighted to yield the final decision. In addition, we further validate the assumption that the homolog knowledge alone is sufficient to train a satisfactory model for host-pathogen protein interaction prediction. Thus the model is more robust against data unavailability, with less demanding data constraints. As regards negative data construction, experiments show that exclusiveness of subcellular co-localized proteins is unbiased and more reliable than random sampling. Last, we conduct an analysis of overlapping predictions between our model and the existing models, and apply the model to the recognition of novel host-pathogen PPIs for further biological research.
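
    The weighting idea described above, where each classifier's output probability counts in proportion to its ROC-AUC, can be sketched as follows. The abstract does not give the exact formula used in PWEN-TLM, so the normalized AUC weights, the 0.5 decision threshold and the function names are assumptions made for illustration.

```python
def auc_weighted_probability(probabilities, aucs):
    """Combine per-classifier probabilities using normalized AUC values as weights."""
    if len(probabilities) != len(aucs):
        raise ValueError("need one AUC per classifier output")
    total = sum(aucs)
    if total <= 0:
        raise ValueError("AUC weights must sum to a positive value")
    return sum(a * p for a, p in zip(aucs, probabilities)) / total


def predict_interaction(probabilities, aucs, threshold=0.5):
    """Final decision: call an interaction if the weighted probability reaches the threshold."""
    return auc_weighted_probability(probabilities, aucs) >= threshold
```

    For example, with two classifiers reporting probabilities 0.8 and 0.4 and validation AUCs of 0.9 and 0.6, the combined probability is (0.9 × 0.8 + 0.6 × 0.4) / 1.5 = 0.64, so the more reliable classifier dominates the decision.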

  17. EMUDRA: Ensemble of Multiple Drug Repositioning Approaches to Improve Prediction Accuracy.

    Science.gov (United States)

    Zhou, Xianxiao; Wang, Minghui; Katsyv, Igor; Irie, Hanna; Zhang, Bin

    2018-04-24

    Availability of large-scale genomic, epigenetic and proteomic data in complex diseases makes it possible to objectively and comprehensively identify therapeutic targets that can lead to new therapies. The Connectivity Map has been widely used to explore novel indications of existing drugs. However, the prediction accuracy of the existing methods, such as the Kolmogorov-Smirnov statistic, remains low. Here we present a novel high-performance drug repositioning approach that improves over the state-of-the-art methods. We first designed an expression weighted cosine method (EWCos) to minimize the influence of the uninformative expression changes and then developed an ensemble approach termed EMUDRA (Ensemble of Multiple Drug Repositioning Approaches) to integrate EWCos and three existing state-of-the-art methods. EMUDRA significantly outperformed individual drug repositioning methods when applied to simulated and independent evaluation datasets. Using EMUDRA, we predicted and experimentally validated the antibiotic rifabutin as an inhibitor of cell growth in triple negative breast cancer. EMUDRA can identify drugs that more effectively target disease gene signatures and will thus be a useful tool for identifying novel therapies for complex diseases and predicting new indications for existing drugs. The EMUDRA R package is available at doi:10.7303/syn11510888. Contact: bin.zhang@mssm.edu or zhangb@hotmail.com. Supplementary data are available at Bioinformatics online.

  18. A Machine Learning Ensemble Classifier for Early Prediction of Diabetic Retinopathy.

    Science.gov (United States)

    S K, Somasundaram; P, Alli

    2017-11-09

    of DR screening system using Bagging Ensemble Classifier (BEC) is investigated. With the help of the voting process in ML-BEC, bagging minimizes the error due to the variance of the base classifier. Using the publicly available retinal image databases, our classifier is trained with 25% of RI. Results show that the ensemble classifier can achieve better classification accuracy (CA) than single classification models. Empirical experiments suggest that the machine learning-based ensemble classifier is efficient for further reducing DR classification time (CT).

  19. A Unified Air-Sea Interface in Fully Coupled Atmosphere-Wave-Ocean Models for Data Assimilation and Ensemble Prediction

    Science.gov (United States)

    Chen, Shuyi; Curcic, Milan; Donelan, Mark; Campbell, Tim; Smith, Travis; Chen, Sue; Allard, Rick; Michalakes, John

    2014-05-01

    The goals of this study are to 1) better understand the physical processes controlling air-sea interaction and their impact on coastal marine and storm predictions, 2) explore the use of coupled atmosphere-ocean observations in model verification and data assimilation, and 3) develop a physically based and computationally efficient coupling at the air-sea interface that is flexible for use in a multi-model system and portable for transition to the next generation research and operational coupled atmosphere-wave-ocean-land models. We have developed a unified air-sea interface module that couples multiple atmosphere, wave, and ocean models using the Earth System Modeling Framework (ESMF). This standardized coupling framework allows researchers to develop and test air-sea coupling parameterizations and coupled data assimilation, and to better facilitate research-to-operation activities. It also allows for future ensemble forecasts using coupled models that can be used for coupled data assimilation and assessment of uncertainties in coupled model predictions. The current component models include two atmospheric models (WRF and COAMPS), two ocean models (HYCOM and NCOM), and two wave models (UMWM and SWAN). The coupled modeling systems have been tested and evaluated using the coupled air-sea observations (e.g., GPS dropsondes and AXBTs, drifters and floats) collected in recent field campaigns in the Gulf of Mexico and tropical cyclones in the Atlantic and Pacific basins. This talk will provide an overview of the unified air-sea interface model and fully coupled atmosphere-wave-ocean model predictions over various coastal regions and tropical cyclones in the Pacific and Atlantic basins including an example from coupled ensemble prediction of Superstorm Sandy (2012).

  20. WE-E-BRE-05: Ensemble of Graphical Models for Predicting Radiation Pneumontis Risk

    Energy Technology Data Exchange (ETDEWEB)

    Lee, S; Ybarra, N; Jeyaseelan, K; El Naqa, I [McGill University, Montreal, Quebec (Canada); Faria, S; Kopek, N [Montreal General Hospital, Montreal, Quebec (Canada)

    2014-06-15

    Purpose: We propose a prior knowledge-based approach to construct an interaction graph of biological and dosimetric radiation pneumonitis (RP) covariates for the purpose of developing an RP risk classifier. Methods: We recruited 59 NSCLC patients who received curative radiotherapy with a minimum 6 month follow-up. 16 RP events were observed (CTCAE grade ≥2). Blood serum was collected from every patient before (pre-RT) and during RT (mid-RT). From each sample the concentrations of the following five candidate biomarkers were taken as covariates: alpha-2-macroglobulin (α2M), angiotensin converting enzyme (ACE), transforming growth factor β (TGF-β), interleukin-6 (IL-6), and osteopontin (OPN). Dose-volumetric parameters were also included as covariates. The number of biological and dosimetric covariates was reduced by a variable selection scheme implemented by L1-regularized logistic regression (LASSO). The posterior probability distribution of interaction graphs between the selected variables was estimated from the data under the literature-based prior knowledge to weight more heavily the graphs that contain the expected associations. A graph ensemble was formed by averaging the most probable graphs weighted by their posterior, creating a Bayesian Network (BN)-based RP risk classifier. Results: The LASSO selected the following 7 RP covariates: (1) pre-RT concentration level of α2M, (2) α2M level mid-RT/pre-RT, (3) pre-RT IL6 level, (4) IL6 level mid-RT/pre-RT, (5) ACE mid-RT/pre-RT, (6) PTV volume, and (7) mean lung dose (MLD). The ensemble BN model achieved the maximum sensitivity/specificity of 81%/84% and outperformed univariate dosimetric predictors as shown by larger AUC values (0.78∼0.81) compared with MLD (0.61), V20 (0.65) and V30 (0.70). The ensembles obtained by incorporating the prior knowledge improved classification performance for the ensemble size 5∼50. Conclusion: We demonstrated a probabilistic ensemble method to detect robust associations between

  1. Predictor-Year Subspace Clustering Based Ensemble Prediction of Indian Summer Monsoon

    Directory of Open Access Journals (Sweden)

    Moumita Saha

    2016-01-01

    Full Text Available Forecasting the Indian summer monsoon is a challenging task due to its complex and nonlinear behavior. A large number of global climatic variables, with interaction patterns that vary over the years, influence the monsoon. Various statistical and neural prediction models have been proposed for forecasting the monsoon, but many of them fail to capture variability over years. The skill of predictor variables of the monsoon also evolves over time. In this article, we propose a joint clustering of monsoon years and predictors for understanding and predicting the monsoon. This is achieved by a subspace clustering algorithm. It groups the years based on the prevailing global climatic condition using a statistical clustering technique and subsequently, for each such group, identifies significant climatic predictor variables which assist in better prediction. A prediction model is built for each individual cluster using a random forest of regression trees. Prediction of the aggregate and regional monsoon is attempted. A mean absolute error of 5.2% is obtained for forecasting the aggregate Indian summer monsoon. Errors in predicting the regional monsoons are also comparable, given the high variation of regional precipitation. The proposed joint-clustering based ensemble model is observed to be superior to existing monsoon prediction models and it also surpasses general nonclustering based prediction models.

  2. Predicting protein subcellular locations using hierarchical ensemble of Bayesian classifiers based on Markov chains

    Directory of Open Access Journals (Sweden)

    Eils Roland

    2006-06-01

    Full Text Available Abstract Background The subcellular location of a protein is closely related to its function. It would be worthwhile to develop a method to predict the subcellular location for a given protein when only the amino acid sequence of the protein is known. Although many efforts have been made to predict subcellular location from sequence information only, there is the need for further research to improve the accuracy of prediction. Results A novel method called HensBC is introduced to predict protein subcellular location. HensBC is a recursive algorithm which constructs a hierarchical ensemble of classifiers. The classifiers used are Bayesian classifiers based on Markov chain models. We tested our method on six various datasets; among them are Gram-negative bacteria dataset, data for discriminating outer membrane proteins and apoptosis proteins dataset. We observed that our method can predict the subcellular location with high accuracy. Another advantage of the proposed method is that it can improve the accuracy of the prediction of some classes with few sequences in training and is therefore useful for datasets with imbalanced distribution of classes. Conclusion This study introduces an algorithm which uses only the primary sequence of a protein to predict its subcellular location. The proposed recursive scheme represents an interesting methodology for learning and combining classifiers. The method is computationally efficient and competitive with the previously reported approaches in terms of prediction accuracies as empirical results indicate. The code for the software is available upon request.

  3. Modified ensemble Kalman filter for nuclear accident atmospheric dispersion: prediction improved and source estimated.

    Science.gov (United States)

    Zhang, X L; Su, G F; Yuan, H Y; Chen, J G; Huang, Q Y

    2014-09-15

    Atmospheric dispersion models play an important role in nuclear power plant accident management. A reliable estimation of radioactive material distribution in short range (about 50 km) is in urgent need for population sheltering and evacuation planning. However, the meteorological data and the source term which greatly influence the accuracy of the atmospheric dispersion models are usually poorly known at the early phase of the emergency. In this study, a modified ensemble Kalman filter data assimilation method in conjunction with a Lagrangian puff-model is proposed to simultaneously improve the model prediction and reconstruct the source terms for short range atmospheric dispersion using the off-site environmental monitoring data. Four main uncertainty parameters are considered: source release rate, plume rise height, wind speed and wind direction. Twin experiments show that the method effectively improves the predicted concentration distribution, and the temporal profiles of source release rate and plume rise height are also successfully reconstructed. Moreover, the time lag in the response of ensemble Kalman filter is shortened. The method proposed here can be a useful tool not only in the nuclear power plant accident emergency management but also in other similar situation where hazardous material is released into the atmosphere. Copyright © 2014 Elsevier B.V. All rights reserved.
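
    For orientation, the basic ensemble Kalman filter analysis step can be sketched for a single scalar quantity that is observed directly. This is the standard stochastic (perturbed-observation) form, not the modified filter of the study: it has no dispersion model, and it does not augment the state with source release rate, plume rise height or wind parameters. The function name and arguments are hypothetical.

```python
import random


def enkf_update(forecast, observation, obs_variance, perturbations=None, rng=None):
    """Stochastic EnKF analysis step for a directly observed scalar state.

    forecast      -- list of forecast ensemble members
    observation   -- the measured value
    obs_variance  -- observation error variance R
    perturbations -- optional per-member observation noise (drawn if omitted)
    """
    n = len(forecast)
    mean = sum(forecast) / n
    # Sample variance of the forecast ensemble plays the role of P.
    variance = sum((x - mean) ** 2 for x in forecast) / (n - 1)
    gain = variance / (variance + obs_variance)  # Kalman gain K = P / (P + R)
    if perturbations is None:
        rng = rng or random.Random()
        std = obs_variance ** 0.5
        perturbations = [rng.gauss(0.0, std) for _ in range(n)]
    # Each member is nudged toward its own perturbed copy of the observation.
    return [x + gain * (observation + e - x) for x, e in zip(forecast, perturbations)]
```

    With forecast members 1, 2 and 3 (sample variance 1), an observation of 4 with error variance 1 and zero perturbations, the gain is 0.5 and every member moves halfway toward the observation.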

  4. Assessing the predictive capability of randomized tree-based ensembles in streamflow modelling

    Science.gov (United States)

    Galelli, S.; Castelletti, A.

    2013-07-01

    Combining randomization methods with ensemble prediction is emerging as an effective option to balance accuracy and computational efficiency in data-driven modelling. In this paper, we investigate the prediction capability of extremely randomized trees (Extra-Trees), in terms of accuracy, explanation ability and computational efficiency, in a streamflow modelling exercise. Extra-Trees are a totally randomized tree-based ensemble method that (i) alleviates the poor generalisation property and tendency to overfit of traditional standalone decision trees (e.g. CART); (ii) is computationally efficient; and (iii) allows the relative importance of the input variables to be inferred, which might help in the ex-post physical interpretation of the model. The Extra-Trees potential is analysed on two real-world case studies - Marina catchment (Singapore) and Canning River (Western Australia) - representing two different morphoclimatic contexts. The evaluation is performed against other tree-based methods (CART and M5) and parametric data-driven approaches (ANNs and multiple linear regression). Results show that Extra-Trees perform comparably to the best of the benchmarks (i.e. M5) in both watersheds, while outperforming the other approaches in terms of computational requirement when adopted on large datasets. In addition, the ranking of the input variables provided can be given a physically meaningful interpretation.

  5. Multidimensional generalized-ensemble algorithms for complex systems.

    Science.gov (United States)

    Mitsutake, Ayori; Okamoto, Yuko

    2009-06-07

    We give general formulations of the multidimensional multicanonical algorithm, simulated tempering, and replica-exchange method. We generalize the original potential energy function E(0) by adding any physical quantity V of interest as a new energy term. These multidimensional generalized-ensemble algorithms then perform a random walk not only in E(0) space but also in V space. Among the three algorithms, the replica-exchange method is the easiest to perform because the weight factor is just a product of regular Boltzmann-like factors, while the weight factors for the multicanonical algorithm and simulated tempering are not a priori known. We give a simple procedure for obtaining the weight factors for these two latter algorithms, which uses a short replica-exchange simulation and the multiple-histogram reweighting techniques. As an example of applications of these algorithms, we have performed a two-dimensional replica-exchange simulation and a two-dimensional simulated-tempering simulation using an alpha-helical peptide system. From these simulations, we study the helix-coil transitions of the peptide in gas phase and in aqueous solution.
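
    The remark that the replica-exchange weight is a product of Boltzmann-like factors can be made concrete with a small sketch. It assumes the generalized energy takes the form E0 + lam * V, where the coupling parameter `lam` and all numbers are illustrative assumptions, and it evaluates the standard Metropolis probability of exchanging the states held at two parameter settings.

```python
import math


def log_weight(beta, lam, e0, v):
    """Log of the Boltzmann-like factor exp(-beta * (E0 + lam * V))."""
    return -beta * (e0 + lam * v)


def swap_acceptance(params_a, params_b, state_a, state_b):
    """Metropolis probability of exchanging the states held at two (beta, lam) settings.

    params_* are (beta, lam) pairs; state_* are (E0, V) pairs.
    """
    before = log_weight(*params_a, *state_a) + log_weight(*params_b, *state_b)
    after = log_weight(*params_a, *state_b) + log_weight(*params_b, *state_a)
    delta = after - before
    return 1.0 if delta >= 0.0 else math.exp(delta)


cold, hot = (1.0, 0.0), (0.5, 0.0)   # hypothetical (beta, lam) settings
# The cold replica holds the higher energy, so the swap is always accepted.
p_up = swap_acceptance(cold, hot, (-1.0, 0.0), (-3.0, 0.0))
# The cold replica already holds the lower energy, so the swap is accepted with exp(-1).
p_down = swap_acceptance(cold, hot, (-3.0, 0.0), (-1.0, 0.0))
print(p_up, p_down)
```

    Because only products of known exponential factors appear, no weight factor has to be estimated beforehand, which is the practical advantage the abstract attributes to replica exchange.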

  6. An ensemble approach to predicting the impact of vaccination on rotavirus disease in Niger.

    Science.gov (United States)

    Park, Jaewoo; Goldstein, Joshua; Haran, Murali; Ferrari, Matthew

    2017-10-13

    Recently developed vaccines provide a new way of controlling rotavirus in sub-Saharan Africa. Models for the transmission dynamics of rotavirus are critical both for estimating current burden from imperfect surveillance and for assessing potential effects of vaccine intervention strategies. We examine rotavirus infection in the Maradi area in southern Niger using hospital surveillance data provided by Epicentre collected over two years. Additionally, a cluster survey of households in the region allows us to estimate the proportion of children with diarrhea who consulted at a health structure. Model fit and future projections are necessarily particular to a given model; thus, where there are competing models for the underlying epidemiology an ensemble approach can account for that uncertainty. We compare our results across several variants of Susceptible-Infectious-Recovered (SIR) compartmental models to quantify the impact of modeling assumptions on our estimates. Model-specific parameters are estimated by Bayesian inference using Markov chain Monte Carlo. We then use Bayesian model averaging to generate ensemble estimates of the current dynamics, including estimates of R0, the burden of infection in the region, as well as the impact of vaccination on both the short-term dynamics and the long-term reduction of rotavirus incidence under varying levels of coverage. The ensemble of models predicts that the current burden of severe rotavirus disease is 2.6-3.7% of the population each year and that a 2-dose vaccine schedule achieving 70% coverage could reduce burden by 39-42%. Copyright © 2017. Published by Elsevier Ltd.
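
    Bayesian model averaging, as mentioned above, weights each candidate model by its posterior probability. The following sketch assumes equal prior model probabilities and uses made-up log marginal likelihoods and made-up per-model estimates; none of these numbers come from the study.

```python
import math


def bma_weights(log_evidences):
    """Posterior model probabilities, assuming equal prior model probabilities."""
    m = max(log_evidences)                        # subtract the max for numerical stability
    unnorm = [math.exp(le - m) for le in log_evidences]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Illustrative values only: three hypothetical SIR variants.
log_evidences = [-120.0, -121.0, -125.0]
estimates = [0.40, 0.42, 0.39]                    # hypothetical per-model burden reductions

weights = bma_weights(log_evidences)
ensemble_estimate = sum(w * e for w, e in zip(weights, estimates))
print(weights, ensemble_estimate)
```

    The ensemble estimate always lies between the smallest and largest single-model estimates, with better-supported models pulling it harder.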

  7. A study of fuzzy logic ensemble system performance on face recognition problem

    Science.gov (United States)

    Polyakova, A.; Lipinskiy, L.

    2017-02-01

    Some problems are difficult to solve by using a single intelligent information technology (IIT). An ensemble of various data mining (DM) techniques is a set of models, each of which is able to solve the problem by itself, but whose combination increases the efficiency of the system as a whole. Using IIT ensembles can improve the reliability and efficiency of the final decision, since it exploits the diversity of the components. A new method of intelligent information technology ensemble design is considered in this paper. It is based on fuzzy logic and is designed to solve classification and regression problems. The ensemble consists of several data mining algorithms: artificial neural network, support vector machine and decision trees. These algorithms and their ensemble have been tested by solving face recognition problems. Principal components analysis (PCA) is used for feature selection.

  8. Potential predictability and forecast skill in ensemble climate forecast: a skill-persistence rule

    Science.gov (United States)

    Jin, Yishuai; Rong, Xinyao; Liu, Zhengyu

    2017-12-01

    This study investigates the relationship between the forecast skills for the real world (actual skill) and perfect model (perfect skill) in ensemble climate model forecast with a series of fully coupled general circulation model forecast experiments. It is found that the actual skill for sea surface temperature (SST) in seasonal forecast is substantially higher than the perfect skill on a large part of the tropical oceans, especially the tropical Indian Ocean and the central-eastern Pacific Ocean. The higher actual skill is found to be related to the higher observational SST persistence, suggesting a skill-persistence rule: a higher SST persistence in the real world than in the model could overwhelm the model bias to produce a higher forecast skill for the real world than for the perfect model. The relation between forecast skill and persistence is further proved using a first-order autoregressive model (AR1) analytically for theoretical solutions and numerically for analogue experiments. The AR1 model study shows that the skill-persistence rule is strictly valid in the case of infinite ensemble size, but could be distorted by sampling errors and non-AR1 processes. This study suggests that the so-called "perfect skill" is model dependent and cannot serve as an accurate estimate of the true upper limit of real world prediction skill, unless the model can capture at least the persistence property of the observation.
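
    The link between persistence and skill in an AR1 setting can be sketched numerically. The example assumes the process x(t+1) = a * x(t) + noise with 0 < a < 1; in that case the infinite-ensemble-mean forecast at lead tau is a**tau * x(t), so its correlation skill equals the lag-tau autocorrelation a**tau. The coefficient values and series length are arbitrary assumptions, and this is not the authors' experimental setup.

```python
import random


def ar1_skill(a, lead):
    """Correlation skill of the infinite-ensemble-mean forecast for an AR1 process (0 < a < 1)."""
    return a ** lead


def simulate_ar1(a, n, rng):
    """Generate n steps of x(t+1) = a * x(t) + unit-variance Gaussian noise."""
    x, series = 0.0, []
    for _ in range(n):
        x = a * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


def lag_correlation(series, lag):
    """Sample correlation between the series and itself shifted by `lag` steps."""
    head, tail = series[:-lag], series[lag:]
    mh, mt = sum(head) / len(head), sum(tail) / len(tail)
    cov = sum((h - mh) * (t - mt) for h, t in zip(head, tail))
    var_h = sum((h - mh) ** 2 for h in head)
    var_t = sum((t - mt) ** 2 for t in tail)
    return cov / (var_h * var_t) ** 0.5


series = simulate_ar1(0.8, 50000, random.Random(1))
print(lag_correlation(series, 3), ar1_skill(0.8, 3))
```

    A more persistent process (larger a) yields a higher skill at every lead, which is the sense in which persistence bounds the achievable skill in this idealised case.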

  9. Very short-term rainfall forecasting by effectively using the ensemble outputs of numerical weather prediction models

    Science.gov (United States)

    Wu, Ming-Chang; Lin, Gwo-Fong; Feng, Lei; Hwang, Gong-Do

    2017-04-01

    In Taiwan, heavy rainfall brought by typhoons often causes serious disasters and leads to loss of life and property. In order to reduce the impact of these disasters, accurate rainfall forecasts are always important for civil protection authorities to prepare proper measures in advance. In this study, a methodology is proposed for providing very short-term (1- to 6-h ahead) rainfall forecasts in a basin-scale area. The proposed methodology is based on an analogy reasoning approach that effectively integrates the ensemble precipitation forecasts from a numerical weather prediction system in Taiwan. To demonstrate the potential of the proposed methodology, an application to a basin-scale area (the Choshui River basin located in west-central Taiwan) during five typhoons is conducted. The results indicate that the proposed methodology yields more accurate hourly rainfall forecasts, especially the forecasts with a lead time of 1 to 3 hours. On average, improvement of the Nash-Sutcliffe efficiency coefficient is about 14% due to the effective use of the ensemble forecasts through the proposed methodology. The proposed methodology is expected to be useful for providing accurate very short-term rainfall forecasts during typhoons.
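
    For readers unfamiliar with the Nash-Sutcliffe efficiency coefficient used above, a minimal sketch of its standard definition follows. The rainfall values are invented for illustration only.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus squared error relative to the observed variance.

    1.0 is a perfect forecast; 0.0 is no better than always forecasting the observed mean.
    """
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


observed = [2.0, 4.0, 6.0, 8.0]   # hypothetical hourly rainfall (mm)
forecast = [2.5, 3.5, 6.5, 7.0]   # hypothetical forecast for the same hours
print(nash_sutcliffe(observed, forecast))
```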

  10. Seasonal Climate Predictability in a Coupled OAGCM Using a Different Approach for Ensemble Forecasts.

    Science.gov (United States)

    Luo, Jing-Jia; Masson, Sebastien; Behera, Swadhin; Shingu, Satoru; Yamagata, Toshio

    2005-11-01

    Predictabilities of tropical climate signals are investigated using a relatively high resolution Scale Interaction Experiment Frontier Research Center for Global Change (FRCGC) coupled GCM (SINTEX-F). Five ensemble forecast members are generated by perturbing the model’s coupling physics, which accounts for the uncertainties of both initial conditions and model physics. Because of the model’s good performance in simulating the climatology and ENSO in the tropical Pacific, a simple coupled SST-nudging scheme generates realistic thermocline and surface wind variations in the equatorial Pacific. Several westerly and easterly wind bursts in the western Pacific are also captured. Hindcast results for the period 1982-2001 show a high predictability of ENSO. All past El Niño and La Niña events, including the strongest 1997/98 warm episode, are successfully predicted with the anomaly correlation coefficient (ACC) skill scores above 0.7 at the 12-month lead time. The predicted signals of some particular events, however, become weak with a delay in the phase at mid and long lead times. This is found to be related to the intraseasonal wind bursts that are unpredicted beyond a few months of lead time. The model forecasts also show a “spring prediction barrier” similar to that in observations. Spatial SST anomalies, teleconnection, and global drought/flood during three different phases of ENSO are successfully predicted at 9-12-month lead times. In the tropical North Atlantic and southwestern Indian Ocean, where ENSO has predominant influences, the model shows skillful predictions at the 7-12-month lead times. The distinct signal of the Indian Ocean dipole (IOD) event in 1994 is predicted at the 6-month lead time. SST anomalies near the western coast of Australia are also predicted beyond the 12-month lead time because of pronounced decadal signals there.

  11. Development of web-based services for an ensemble flood forecasting and risk assessment system

    Science.gov (United States)

    Yaw Manful, Desmond; He, Yi; Cloke, Hannah; Pappenberger, Florian; Li, Zhijia; Wetterhall, Fredrik; Huang, Yingchun; Hu, Yuzhong

    2010-05-01

    Flooding is a widespread and devastating natural disaster worldwide. Floods that took place in the last decade in China were ranked the worst amongst recorded floods worldwide in terms of the number of human fatalities and economic losses (Munich Re-Insurance). Rapid economic development and population expansion into low lying flood plains has worsened the situation. Current conventional flood prediction systems in China are suited neither to the perceptible climate variability nor to the rapid pace of urbanization sweeping the country. Flood prediction, from short-term (a few hours) to medium-term (a few days), needs to be revisited and adapted to changing socio-economic and hydro-climatic realities. The latest technology requires implementation of multiple numerical weather prediction systems. The availability of twelve global ensemble weather prediction systems through the 'THORPEX Interactive Grand Global Ensemble' (TIGGE) offers a good opportunity for an effective state-of-the-art early forecasting system. A prototype of a Novel Flood Early Warning System (NEWS) using the TIGGE database is tested in the Huai River basin in east-central China. It is the first early flood warning system in China that uses the massive TIGGE database cascaded with river catchment models, the Xinanjiang hydrologic model and a 1-D hydraulic model, to predict river discharge and flood inundation. The NEWS algorithm is also designed to provide web-based services to a broad spectrum of end-users. The latter presents challenges as both databases and proprietary codes reside in different locations and converge at dissimilar times. NEWS will thus make use of a ready-to-run grid system that makes distributed computing and data resources available in a seamless and secure way. An ability to run on different operating systems and to provide an interface that is accessible to a broad spectrum of end-users is an additional requirement.
The aim is to achieve robust interoperability

  12. Visualizing uncertainties in a storm surge ensemble data assimilation and forecasting system

    KAUST Repository

    Hollt, Thomas; Altaf, Muhammad; Mandli, Kyle T.; Hadwiger, Markus; Dawson, Clint N.; Hoteit, Ibrahim

    2015-01-01

    allows the user to browse through the simulation ensembles in real time, view specific parameter settings or simulation models and move between different spatial and temporal regions without delay. In addition, our system provides advanced visualizations

  13. Estimating predictive hydrological uncertainty by dressing deterministic and ensemble forecasts; a comparison, with application to Meuse and Rhine

    Science.gov (United States)

    Verkade, J. S.; Brown, J. D.; Davids, F.; Reggiani, P.; Weerts, A. H.

    2017-12-01

    Two statistical post-processing approaches for estimation of predictive hydrological uncertainty are compared: (i) 'dressing' of a deterministic forecast by adding a single, combined estimate of both hydrological and meteorological uncertainty and (ii) 'dressing' of an ensemble streamflow forecast by adding an estimate of hydrological uncertainty to each individual streamflow ensemble member. Both approaches aim to produce an estimate of the 'total uncertainty' that captures both the meteorological and hydrological uncertainties. They differ in the degree to which they make use of statistical post-processing techniques. In the 'lumped' approach, both sources of uncertainty are lumped by post-processing deterministic forecasts using their verifying observations. In the 'source-specific' approach, the meteorological uncertainties are estimated by an ensemble of weather forecasts. These ensemble members are routed through a hydrological model and a realization of the probability distribution of hydrological uncertainties (only) is then added to each ensemble member to arrive at an estimate of the total uncertainty. The techniques are applied to one location in the Meuse basin and three locations in the Rhine basin. Resulting forecasts are assessed for their reliability and sharpness, as well as compared in terms of multiple verification scores including the relative mean error, Brier Skill Score, Mean Continuous Ranked Probability Skill Score, Relative Operating Characteristic Score and Relative Economic Value. The dressed deterministic forecasts are generally more reliable than the dressed ensemble forecasts, but the latter are sharper. On balance, however, they show similar quality across a range of verification metrics, with the dressed ensembles coming out slightly better. Some additional analyses are suggested. 
Notably, these include statistical post-processing of the meteorological forecasts in order to increase their reliability, thus increasing the reliability

  14. Drug-target interaction prediction via class imbalance-aware ensemble learning.

    Science.gov (United States)

    Ezzat, Ali; Wu, Min; Li, Xiao-Li; Kwoh, Chee-Keong

    2016-12-22

    Multiple computational methods for predicting drug-target interactions have been developed to facilitate the drug discovery process. These methods use available data on known drug-target interactions to train classifiers with the purpose of predicting new undiscovered interactions. However, a key challenge regarding this data that has not yet been addressed by these methods, namely class imbalance, is potentially degrading the prediction performance. Class imbalance can be divided into two sub-problems. Firstly, the number of known interacting drug-target pairs is much smaller than that of non-interacting drug-target pairs. This imbalance ratio between interacting and non-interacting drug-target pairs is referred to as the between-class imbalance. Between-class imbalance degrades prediction performance due to the bias in prediction results towards the majority class (i.e. the non-interacting pairs), leading to more prediction errors in the minority class (i.e. the interacting pairs). Secondly, there are multiple types of drug-target interactions in the data with some types having relatively fewer members (or are less represented) than others. This variation in representation of the different interaction types leads to another kind of imbalance referred to as the within-class imbalance. In within-class imbalance, prediction results are biased towards the better represented interaction types, leading to more prediction errors in the less represented interaction types. We propose an ensemble learning method that incorporates techniques to address the issues of between-class imbalance and within-class imbalance. Experiments show that the proposed method improves results over 4 state-of-the-art methods. In addition, we simulated cases for new drugs and targets to see how our method would perform in predicting their interactions. New drugs and targets are those for which no prior interactions are known. Our method displayed satisfactory prediction performance and was

  15. Random matrix ensembles for PT-symmetric systems

    International Nuclear Information System (INIS)

    Graefe, Eva-Maria; Mudute-Ndumbe, Steve; Taylor, Matthew

    2015-01-01

    Recently much effort has been made towards the introduction of non-Hermitian random matrix models respecting PT-symmetry. Here we show that there is a one-to-one correspondence between complex PT-symmetric matrices and split-complex and split-quaternionic versions of Hermitian matrices. We introduce two new random matrix ensembles of (a) Gaussian split-complex Hermitian; and (b) Gaussian split-quaternionic Hermitian matrices, of arbitrary sizes. We conjecture that these ensembles represent universality classes for PT-symmetric matrices. For the case of 2 × 2 matrices we derive analytic expressions for the joint probability distributions of the eigenvalues, the one-level densities and the level spacings in the case of real eigenvalues. (fast track communication)

  16. Ensembl 2004.

    Science.gov (United States)

    Birney, E; Andrews, D; Bevan, P; Caccamo, M; Cameron, G; Chen, Y; Clarke, L; Coates, G; Cox, T; Cuff, J; Curwen, V; Cutts, T; Down, T; Durbin, R; Eyras, E; Fernandez-Suarez, X M; Gane, P; Gibbins, B; Gilbert, J; Hammond, M; Hotz, H; Iyer, V; Kahari, A; Jekosch, K; Kasprzyk, A; Keefe, D; Keenan, S; Lehvaslaiho, H; McVicker, G; Melsopp, C; Meidl, P; Mongin, E; Pettett, R; Potter, S; Proctor, G; Rae, M; Searle, S; Slater, G; Smedley, D; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Storey, R; Ureta-Vidal, A; Woodwark, C; Clamp, M; Hubbard, T

    2004-01-01

    The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organize biology around the sequences of large genomes. It is a comprehensive and integrated source of annotation of large genome sequences, available via interactive website, web services or flat files. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. The facilities of the system range from sequence analysis to data storage and visualization and installations exist around the world both in companies and at academic sites. With a total of nine genome sequences available from Ensembl and more genomes to follow, recent developments have focused mainly on closer integration between genomes and external data.

  17. Predicting Hepatotoxicity of Drug Metabolites Via an Ensemble Approach Based on Support Vector Machine

    Science.gov (United States)

    Lu, Yin; Liu, Lili; Lu, Dong; Cai, Yudong; Zheng, Mingyue; Luo, Xiaomin; Jiang, Hualiang; Chen, Kaixian

    2017-11-20

    Drug-induced liver injury (DILI) is a major cause of drug withdrawal. The chemical properties of the drug, especially drug metabolites, play key roles in DILI. Our goal is to construct a QSAR model to predict drug hepatotoxicity based on drug metabolites. 64 hepatotoxic drug metabolites and 3,339 non-hepatotoxic drug metabolites were gathered from MDL Metabolite Database. Considering the imbalance of the dataset, we randomly split the negative samples and combined each portion with all the positive samples to construct individually balanced datasets for constructing independent classifiers. Then, we adopted an ensemble approach to make predictions based on the results of all individual classifiers and applied the minimum Redundancy Maximum Relevance (mRMR) feature selection method to select the molecular descriptors. Eventually, for the drugs in the external test set, a Bayesian inference method was used to predict the hepatotoxicity of a drug based on its metabolites. The model showed an average balanced accuracy of 78.47%, a sensitivity of 74.17%, and a specificity of 82.77%. Five molecular descriptors characterizing molecular polarity, intramolecular bonding strength, and molecular frontier orbital energy were obtained. When predicting the hepatotoxicity of a drug based on all its metabolites, the sensitivity, specificity and balanced accuracy were 60.38%, 70.00%, and 65.19%, respectively, indicating that this method is useful for identifying the hepatotoxicity of drugs. We developed an in silico model to predict hepatotoxicity of drug metabolites. Moreover, Bayesian inference was applied to predict the hepatotoxicity of a drug based on its metabolites, which yielded useful sensitivity and specificity. Copyright © Bentham Science Publishers.
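
    The balancing strategy and the reported figures can be illustrated with a short sketch. Balanced accuracy is the mean of sensitivity and specificity, which reproduces the values quoted above. The way the negatives are partitioned here (chunks roughly the size of the positive set) is an assumption, since the abstract does not state how many portions were used, and the sample identifiers are placeholders.

```python
import random


def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy is the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


def balanced_subsets(positives, negatives, rng):
    """Split shuffled negatives into chunks and pair each chunk with all positives."""
    shuffled = negatives[:]
    rng.shuffle(shuffled)
    n_chunks = max(1, len(shuffled) // len(positives))   # assumed chunking rule
    return [positives + shuffled[i::n_chunks] for i in range(n_chunks)]


positives = [("pos", i) for i in range(64)]      # placeholder hepatotoxic metabolites
negatives = [("neg", i) for i in range(3339)]    # placeholder non-hepatotoxic metabolites
subsets = balanced_subsets(positives, negatives, random.Random(0))

print(len(subsets), len(subsets[0]))
print(balanced_accuracy(74.17, 82.77), balanced_accuracy(60.38, 70.00))
```

    Each subset would train one base classifier, and the ensemble would combine their votes; the combination rule itself is not specified in the abstract.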

  18. Ensemble-free configurational temperature for spin systems

    Science.gov (United States)

    Palma, G.; Gutiérrez, G.; Davis, S.

    2016-12-01

    An estimator for the dynamical temperature in an arbitrary ensemble is derived in the framework of the conjugate variables theorem. We prove directly that its average indeed gives the inverse temperature and that it is independent of the ensemble. We test this estimator numerically by a simulation of the two-dimensional XY model in the canonical ensemble. As this model is critical in the whole region of temperatures below the Berezinski-Kosterlitz-Thouless critical temperature T_BKT, we use a generalization of Wolff's unicluster algorithm. The numerical results allow us to confirm the robustness of the analytical expression for the microscopic estimator of the temperature. This microscopic estimator has also the advantage that it gives a direct measure of the thermalization process and can be used to compute absolute errors associated with statistical fluctuations. In consequence, this estimator allows for a direct, absolute, and stringent test of the ergodicity of the underlying Markov process, which encodes the algorithm used in a numerical simulation.

  19. HPSLPred: An Ensemble Multi-Label Classifier for Human Protein Subcellular Location Prediction with Imbalanced Source.

    Science.gov (United States)

    Wan, Shixiang; Duan, Yucong; Zou, Quan

    2017-09-01

    Predicting the subcellular localization of proteins is an important and challenging problem. Traditional experimental approaches are often expensive and time-consuming. Consequently, a growing number of research efforts employ a series of machine learning approaches to predict the subcellular location of proteins. There are two main challenges among the state-of-the-art prediction methods. First, most of the existing techniques are designed to deal with multi-class rather than multi-label classification, which ignores connections between multiple labels. In reality, multiple locations of particular proteins imply that there are vital and unique biological significances that deserve special focus and cannot be ignored. Second, techniques for handling imbalanced data in multi-label classification problems are necessary, but never employed. For solving these two issues, we have developed an ensemble multi-label classifier called HPSLPred, which can be applied for multi-label classification with an imbalanced protein source. For convenience, a user-friendly webserver has been established at http://server.malab.cn/HPSLPred. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. Invariant methods for an ensemble-based sensitivity analysis of a passive containment cooling system of an AP1000 nuclear power plant

    International Nuclear Information System (INIS)

    Di Maio, Francesco; Nicola, Giancarlo; Borgonovo, Emanuele; Zio, Enrico

    2016-01-01

    Sensitivity Analysis (SA) is performed to gain fundamental insights on a system behavior that is usually reproduced by a model and to identify the most relevant input variables whose variations affect the system model functional response. For the reliability analysis of passive safety systems of Nuclear Power Plants (NPPs), models are Best Estimate (BE) Thermal Hydraulic (TH) codes, that predict the system functional response in normal and accidental conditions and, in this paper, an ensemble of three alternative invariant SA methods is innovatively set up for a SA on the TH code input variables. The ensemble aggregates the input variables ranking orders provided by Pearson correlation ratio, Delta method and Beta method. The capability of the ensemble is shown on a BE–TH code of the Passive Containment Cooling System (PCCS) of an Advanced Pressurized water reactor AP1000, during a Loss Of Coolant Accident (LOCA), whose output probability density function (pdf) is approximated by a Finite Mixture Model (FMM), on the basis of a limited number of simulations. - Highlights: • We perform the reliability analysis of a passive safety system of Nuclear Power Plant (NPP). • We use a Thermal Hydraulic (TH) code for predicting the NPP response to accidents. • We propose an ensemble of Invariant Methods for the sensitivity analysis of the TH code. • The ensemble aggregates the rankings of Pearson correlation, Delta and Beta methods. • The approach is tested on a Passive Containment Cooling System of an AP1000 NPP.

  1. Intelligent and robust prediction of short term wind power using genetic programming based ensemble of neural networks

    International Nuclear Information System (INIS)

    Zameer, Aneela; Arshad, Junaid; Khan, Asifullah; Raja, Muhammad Asif Zahoor

    2017-01-01

    Highlights: • Genetic programming based ensemble of neural networks is employed for short term wind power prediction. • Proposed predictor shows resilience against abrupt changes in weather. • Genetic programming evolves nonlinear mapping between meteorological measures and wind-power. • Proposed approach gives mathematical expressions of wind power to its independent variables. • Proposed model shows relatively accurate and steady wind-power prediction performance. - Abstract: The inherent instability of wind power production leads to critical problems for smooth power generation from wind turbines, which then requires an accurate forecast of wind power. In this study, an effective short term wind power prediction methodology is presented, which uses an intelligent ensemble regressor that comprises Artificial Neural Networks and Genetic Programming. In contrast to existing series based combination of wind power predictors, whereby the error or variation in the leading predictor is propagated down the stream to the next predictors, the proposed intelligent ensemble predictor avoids this shortcoming by introducing a Genetic Programming based semi-stochastic combination of neural networks. It is observed that the decision of the individual base regressors may vary due to the frequent and inherent fluctuations in the atmospheric conditions and thus meteorological properties. The novelty of the reported work lies in creating an ensemble to generate an intelligent, collective and robust decision space, thereby avoiding large errors due to the sensitivity of the individual wind predictors. The proposed ensemble based regressor, Genetic Programming based ensemble of Artificial Neural Networks, has been implemented and tested on data taken from five different wind farms located in Europe. Obtained numerical results of the proposed model in terms of various error measures are compared with the recent artificial intelligence based strategies to demonstrate the

  2. An Ensemble Nonlinear Model Predictive Control Algorithm in an Artificial Pancreas for People with Type 1 Diabetes

    DEFF Research Database (Denmark)

    Boiroux, Dimitri; Hagdrup, Morten; Mahmoudi, Zeinab

    2016-01-01

    This paper presents a novel ensemble nonlinear model predictive control (NMPC) algorithm for glucose regulation in type 1 diabetes. In this approach, we consider a number of scenarios describing different uncertainties, for instance meals or metabolic variations. We simulate a population of 9 patients with different physiological parameters and a time-varying insulin sensitivity using the Medtronic Virtual Patient (MVP) model. We augment the MVP model with stochastic diffusion terms, time-varying insulin sensitivity and noise-corrupted CGM measurements. We consider meal challenges where the uncertainty in meal size is ±50%. Numerical results show that the ensemble NMPC reduces the risk of hypoglycemia compared to standard NMPC in the case where the meal size is overestimated or correctly estimated at the expense of a slightly increased number of hyperglycemia. Therefore, ensemble MPC...

  3. Ensemble Architecture for Prediction of Enzyme-ligand Binding Residues Using Evolutionary Information.

    Science.gov (United States)

    Pai, Priyadarshini P; Dattatreya, Rohit Kadam; Mondal, Sukanta

    2017-11-01

    Enzyme interactions with ligands are crucial for various biochemical reactions governing life. Over many years attempts to identify these residues for biotechnological manipulations have been made using experimental and computational techniques. The computational approaches have gathered impetus with the accruing availability of sequence and structure information, broadly classified into template-based and de novo methods. One of the predominant de novo methods using sequence information involves application of biological properties for supervised machine learning. Here, we propose a support vector machines-based ensemble for prediction of protein-ligand interacting residues using one of the most important discriminative contributing properties in the interacting residue neighbourhood, i.e., evolutionary information in the form of position-specific scoring matrix (PSSM). The study has been performed on a non-redundant dataset comprising 9269 interacting and 91773 non-interacting residues for prediction model generation and further evaluation. Of the various PSSM-based models explored, the proposed method named ROBBY (pRediction Of Biologically relevant small molecule Binding residues on enzYmes) shows an accuracy of 84.0 %, Matthews Correlation Coefficient of 0.343 and F-measure of 39.0 % on 78 test enzymes. Further, scope of adding domain knowledge such as pocket information has also been investigated; results showed significant enhancement in method precision. Findings are hoped to boost the reliability of small-molecule ligand interaction prediction for enzyme applications and drug design. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Forecasting skills of the ensemble hydro-meteorological system for the Po river floods

    Science.gov (United States)

    Ricciardi, Giuseppe; Montani, Andrea; Paccagnella, Tiziana; Pecora, Silvano; Tonelli, Fabrizio

    2013-04-01

    The Po basin is the largest and most economically important river-basin in Italy. Extreme hydrological events, including floods, flash floods and droughts, are expected to become more severe in the near future due to climate change, and related ground effects are linked both with environmental and social resilience. A Warning Operational Center (WOC) for hydrological event management was created in Emilia Romagna region. In the last years, the WOC faced challenges in legislation, organization, technology and economics, achieving improvements in forecasting skill and information dissemination. Since 2005, an operational system for flood modelling and forecasting has been implemented, aimed at supporting and coordinating flood control and emergency management on the whole Po basin. This system, referred to as FEWSPo, has also taken care of environmental aspects of flood forecast. The FEWSPo system has reached a very high level of complexity, due to the combination of three different hydrological-hydraulic chains (HEC-HMS/RAS - MIKE11 NAM/HD, Topkapi/Sobek), with several meteorological inputs (forecasted - COSMOI2, COSMOI7, COSMO-LEPS among others - and observed). In this hydrological and meteorological ensemble the management of the relative predictive uncertainties, which have to be established and communicated to decision makers, is a debated scientific and social challenge. Real time activities face professional, modelling and technological aspects but are also strongly interrelated with organization and human aspects. The authors will report a case study using the operational flood forecast hydro-meteorological ensemble, provided by the MIKE11 chain fed by COSMO-LEPS EQPF. The basic aim of the proposed approach is to analyse limits and opportunities of the long term forecast (with a lead time ranging from 3 to 5 days), for the implementation of low cost actions, also looking for a well informed decision making and the improvement of

  5. On Ensemble Nonlinear Kalman Filtering with Symmetric Analysis Ensembles

    KAUST Repository

    Luo, Xiaodong

    2010-09-19

    The ensemble square root filter (EnSRF) [1, 2, 3, 4] is a popular method for data assimilation in high dimensional systems (e.g., geophysics models). Essentially the EnSRF is a Monte Carlo implementation of the conventional Kalman filter (KF) [5, 6]. It differs from the KF mainly at the prediction steps, where ensembles of the system state, rather than its means and covariance matrices, are propagated forward. In doing this, the EnSRF is computationally more efficient than the KF, since propagating a covariance matrix forward in high dimensional systems is prohibitively expensive. In addition, the EnSRF is also very convenient to implement. By propagating the ensembles of the system state, the EnSRF can be directly applied to nonlinear systems without any change in comparison to the assimilation procedures in linear systems. However, by adopting the Monte Carlo method, the EnSRF also incurs certain sampling errors. One way to alleviate this problem is to introduce certain symmetry to the ensembles, which can reduce the sampling errors and spurious modes in evaluation of the means and covariances of the ensembles [7]. In this contribution, we present two methods to produce symmetric ensembles. One is based on the unscented transform [8, 9], which leads to the unscented Kalman filter (UKF) [8, 9] and its variant, the ensemble unscented Kalman filter (EnUKF) [7]. The other is based on Stirling’s interpolation formula (SIF), which results in the divided difference filter (DDF) [10]. Here we propose a simplified divided difference filter (sDDF) in the context of ensemble filtering. The similarity and difference between the sDDF and the EnUKF will be discussed. Numerical experiments will also be conducted to investigate the performance of the sDDF and the EnUKF, and compare them to a well‐established EnSRF, the ensemble transform Kalman filter (ETKF) [2].
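
    The idea of a symmetric ensemble can be illustrated with the standard unscented-transform construction. The sketch below is not the sDDF or EnUKF of the record above; the function name and the scaling parameter kappa are illustrative assumptions. It places 2n+1 points symmetrically about the mean so that the weighted sample mean and covariance match the prescribed ones exactly.

```python
import numpy as np

def symmetric_sigma_points(mean, cov, kappa=1.0):
    """Return (points, weights): 2n+1 sigma points symmetric about `mean`.

    `kappa` is a hypothetical scaling parameter; kappa > -n is required.
    """
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    n = mean.size
    # Lower-triangular L with L @ L.T == (n + kappa) * cov
    L = np.linalg.cholesky((n + kappa) * cov)
    # Rows of L.T are the columns of L; add and subtract them from the mean
    points = np.vstack([mean, mean + L.T, mean - L.T])
    weights = np.full(2 * n + 1, 0.5 / (n + kappa))
    weights[0] = kappa / (n + kappa)
    return points, weights

if __name__ == "__main__":
    pts, w = symmetric_sigma_points([1.0, 2.0], [[2.0, 0.5], [0.5, 1.0]])
    print(w @ pts)  # weighted mean of the sigma points
```

    Because every perturbation appears with both signs, odd-order sampling errors cancel in the weighted mean, which is the motivation for symmetry mentioned in the abstract.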

  6. Regression trees for predicting mortality in patients with cardiovascular disease: What improvement is achieved by using ensemble-based methods?

    Science.gov (United States)

    Austin, Peter C; Lee, Douglas S; Steyerberg, Ewout W; Tu, Jack V

    2012-01-01

    In biomedical research, the logistic regression model is the most commonly used method for predicting the probability of a binary outcome. While many clinical researchers have expressed an enthusiasm for regression trees, this method may have limited accuracy for predicting health outcomes. We aimed to evaluate the improvement that is achieved by using ensemble-based methods, including bootstrap aggregation (bagging) of regression trees, random forests, and boosted regression trees. We analyzed 30-day mortality in two large cohorts of patients hospitalized with either acute myocardial infarction (N = 16,230) or congestive heart failure (N = 15,848) in two distinct eras (1999–2001 and 2004–2005). We found that both the in-sample and out-of-sample prediction of ensemble methods offered substantial improvement in predicting cardiovascular mortality compared to conventional regression trees. However, conventional logistic regression models that incorporated restricted cubic smoothing splines had even better performance. We conclude that ensemble methods from the data mining and machine learning literature increase the predictive performance of regression trees, but may not lead to clear advantages over conventional logistic regression models for predicting short-term mortality in population-based samples of subjects with cardiovascular disease. PMID:22777999
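
    Bootstrap aggregation can be sketched in a few lines. The example below is a toy illustration, not the study's models or data: it uses a single feature and depth-1 regression trees (stumps), and all function names are hypothetical. With a 0/1 outcome, the averaged leaf means act as an estimated event probability.

```python
import numpy as np

def fit_stump(x, y):
    """Depth-1 regression tree on one feature; split chosen by squared error."""
    # Fallback when no split is possible: predict the overall mean on both sides
    best = (np.inf, x.min(), y.mean(), y.mean())
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]  # (threshold, left value, right value)

def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return np.where(x <= threshold, left_value, right_value)

def bagged_predict(x_train, y_train, x_new, n_trees=50, seed=0):
    """Average the predictions of stumps fitted to bootstrap resamples."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(x_train), len(x_train))  # sample with replacement
        stump = fit_stump(x_train[idx], y_train[idx])
        preds.append(predict_stump(stump, x_new))
    return np.mean(preds, axis=0)

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 200)
    y = (x > 0.5).astype(float)
    print(bagged_predict(x, y, np.array([0.1, 0.9])))
```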

  7. Problems of a Statistical Ensemble Theory for Systems Far from Equilibrium

    Science.gov (United States)

    Ebeling, Werner

    The development of a general statistical physics of nonequilibrium systems was one of the main unfinished tasks of statistical physics of the 20th century. The aim of this work is the study of a special class of nonequilibrium systems where the formulation of an ensemble theory of some generality is possible. These are the so-called canonical-dissipative systems, where the driving terms are determined by invariants of motion. We construct canonical-dissipative systems which are ergodic on certain surfaces on the phase plane. These systems may be described by a non-equilibrium microcanonical ensemble, corresponding to an equal distribution on the target surface. Next we construct and solve Fokker-Planck equations; this leads to a kind of canonical-dissipative ensemble. In the last part we discuss the theoretical problem of how to define bifurcations in the framework of nonequilibrium statistics and several possible applications.

  8. SANDPUMA: ensemble predictions of nonribosomal peptide chemistry reveal biosynthetic diversity across Actinobacteria.

    Science.gov (United States)

    Chevrette, Marc G; Aicheler, Fabian; Kohlbacher, Oliver; Currie, Cameron R; Medema, Marnix H

    2017-10-15

    Nonribosomally synthesized peptides (NRPs) are natural products with widespread applications in medicine and biotechnology. Many algorithms have been developed to predict the substrate specificities of nonribosomal peptide synthetase adenylation (A) domains from DNA sequences, which enables prioritization and dereplication, and integration with other data types in discovery efforts. However, insufficient training data and a lack of clarity regarding prediction quality have impeded optimal use. Here, we introduce prediCAT, a new phylogenetics-inspired algorithm, which quantitatively estimates the degree of predictability of each A-domain. We then systematically benchmarked all algorithms on a newly gathered, independent test set of 434 A-domain sequences, showing that active-site-motif-based algorithms outperform whole-domain-based methods. Subsequently, we developed SANDPUMA, a powerful ensemble algorithm, based on newly trained versions of all high-performing algorithms, which significantly outperforms individual methods. Finally, we deployed SANDPUMA in a systematic investigation of 7635 Actinobacteria genomes, suggesting that NRP chemical diversity is much higher than previously estimated. SANDPUMA has been integrated into the widely used antiSMASH biosynthetic gene cluster analysis pipeline and is also available as an open-source, standalone tool. SANDPUMA is freely available at https://bitbucket.org/chevrm/sandpuma and as a docker image at https://hub.docker.com/r/chevrm/sandpuma/ under the GNU Public License 3 (GPL3). Contact: chevrette@wisc.edu or marnix.medema@wur.nl. Supplementary data are available at Bioinformatics online.

  9. A hybrid nudging-ensemble Kalman filter approach to data assimilation. Part I: application in the Lorenz system

    Directory of Open Access Journals (Sweden)

    Lili Lei

    2012-05-01

    A hybrid data assimilation approach combining nudging and the ensemble Kalman filter (EnKF) for dynamic analysis and numerical weather prediction is explored here using the non-linear Lorenz three-variable model system with the goal of a smooth, continuous and accurate data assimilation. The hybrid nudging-EnKF (HNEnKF) computes the hybrid nudging coefficients from the flow-dependent, time-varying error covariance matrix from the EnKF's ensemble forecasts. It extends the standard diagonal nudging terms to additional off-diagonal statistical correlation terms for greater inter-variable influence of the innovations in the model's predictive equations to assist in the data assimilation process. The HNEnKF promotes a better fit of an analysis to data compared to that achieved by either nudging or incremental analysis update (IAU). When model error is introduced, it produces similar or better root mean square errors compared to the EnKF while minimising the error spikes/discontinuities created by the intermittent EnKF. It provides a continuous data assimilation with better inter-variable consistency and improved temporal smoothness than that of the EnKF. Data assimilation experiments are also compared to the ensemble Kalman smoother (EnKS). The HNEnKF has similar or better temporal smoothness than that of the EnKS, and with much smaller central processing unit (CPU) time and data storage requirements.
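
    Plain nudging in the Lorenz three-variable system can be sketched as a twin experiment. The code below is a simplified illustration under stated assumptions, not the HNEnKF: the nudging matrix G is fixed by hand rather than derived from an EnKF covariance, the truth is observed perfectly and continuously, and the function names and initial conditions are hypothetical. Off-diagonal entries of G would correspond to the inter-variable terms the abstract describes.

```python
import numpy as np

def lorenz63(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz three-variable model."""
    return np.array([sigma * (x[1] - x[0]),
                     x[0] * (rho - x[2]) - x[1],
                     x[0] * x[1] - beta * x[2]])

def coupled_rhs(state, G):
    """Truth (first 3 entries) runs freely; the model (last 3) is nudged toward it."""
    truth, model = state[:3], state[3:]
    return np.concatenate([lorenz63(truth),
                           lorenz63(model) + G @ (truth - model)])

def twin_experiment(G, n_steps=2000, dt=0.01):
    """Integrate truth and nudged model with RK4; return the error norm per step."""
    state = np.array([1.0, 1.0, 1.0, -5.0, 5.0, 20.0])
    errors = np.empty(n_steps)
    for i in range(n_steps):
        k1 = coupled_rhs(state, G)
        k2 = coupled_rhs(state + 0.5 * dt * k1, G)
        k3 = coupled_rhs(state + 0.5 * dt * k2, G)
        k4 = coupled_rhs(state + dt * k3, G)
        state = state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        errors[i] = np.linalg.norm(state[3:] - state[:3])
    return errors

if __name__ == "__main__":
    nudged = twin_experiment(20.0 * np.eye(3))
    free = twin_experiment(np.zeros((3, 3)))
    print(nudged[-1], free[-500:].mean())
```

    With G set to zero the two trajectories stay far apart, while a sufficiently strong diagonal G pulls the model onto the truth.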

  10. Real-time prediction of hand trajectory by ensembles of cortical neurons in primates

    Science.gov (United States)

    Wessberg, Johan; Stambaugh, Christopher R.; Kralik, Jerald D.; Beck, Pamela D.; Laubach, Mark; Chapin, John K.; Kim, Jung; Biggs, S. James; Srinivasan, Mandayam A.; Nicolelis, Miguel A. L.

    2000-11-01

    Signals derived from the rat motor cortex can be used for controlling one-dimensional movements of a robot arm. It remains unknown, however, whether real-time processing of cortical signals can be employed to reproduce, in a robotic device, the kind of complex arm movements used by primates to reach objects in space. Here we recorded the simultaneous activity of large populations of neurons, distributed in the premotor, primary motor and posterior parietal cortical areas, as non-human primates performed two distinct motor tasks. Accurate real-time predictions of one- and three-dimensional arm movement trajectories were obtained by applying both linear and nonlinear algorithms to cortical neuronal ensemble activity recorded from each animal. In addition, cortically derived signals were successfully used for real-time control of robotic devices, both locally and through the Internet. These results suggest that long-term control of complex prosthetic robot arm movements can be achieved by simple real-time transformations of neuronal population signals derived from multiple cortical areas in primates.

  11. The role of model dynamics in ensemble Kalman filter performance for chaotic systems

    Science.gov (United States)

    Ng, G.-H.C.; McLaughlin, D.; Entekhabi, D.; Ahanin, A.

    2011-01-01

    The ensemble Kalman filter (EnKF) is susceptible to losing track of observations, or 'diverging', when applied to large chaotic systems such as atmospheric and ocean models. Past studies have demonstrated the adverse impact of sampling error during the filter's update step. We examine how system dynamics affect EnKF performance, and whether the absence of certain dynamic features in the ensemble may lead to divergence. The EnKF is applied to a simple chaotic model, and ensembles are checked against singular vectors of the tangent linear model, corresponding to short-term growth, and Lyapunov vectors, corresponding to long-term growth. Results show that the ensemble strongly aligns itself with the subspace spanned by unstable Lyapunov vectors. Furthermore, the filter avoids divergence only if the full linearized long-term unstable subspace is spanned. However, short-term dynamics also become important as non-linearity in the system increases. Non-linear movement prevents errors in the long-term stable subspace from decaying indefinitely. If these errors then undergo linear intermittent growth, a small ensemble may fail to properly represent all important modes, causing filter divergence. A combination of long and short-term growth dynamics is thus critical to EnKF performance. These findings can help in developing practical robust filters based on model dynamics. © 2011 The Authors. Tellus A © 2011 John Wiley & Sons A/S.
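
    For orientation, the update step referred to above can be written compactly. This is a generic sketch of the stochastic (perturbed-observation) EnKF analysis with a linear observation operator, not the configuration used in the study; the function name and argument layout are assumptions. The sample covariance P has rank at most n_members - 1, which is why a small ensemble cannot span every growing direction.

```python
import numpy as np

def enkf_analysis(ensemble, obs, H, R, rng):
    """Stochastic EnKF update.

    ensemble: (n_members, n_state), obs: (n_obs,),
    H: (n_obs, n_state) linear observation operator, R: (n_obs, n_obs).
    """
    n_members = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    P = anomalies.T @ anomalies / (n_members - 1)   # sample forecast covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)    # Kalman gain
    # Each member assimilates its own noisy copy of the observation
    perturbed = obs + rng.multivariate_normal(np.zeros(len(obs)), R, size=n_members)
    return ensemble + (perturbed - ensemble @ H.T) @ K.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    prior = rng.normal(0.0, 2.0, size=(200, 3))
    H = np.array([[1.0, 0.0, 0.0]])
    R = np.array([[0.01]])
    posterior = enkf_analysis(prior, np.array([1.5]), H, R, rng)
    print(posterior[:, 0].mean(), posterior[:, 0].var())
```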

  12. Uncertainty analysis of neural network based flood forecasting models: An ensemble based approach for constructing prediction interval

    Science.gov (United States)

    Kasiviswanathan, K.; Sudheer, K.

    2013-05-01

    Artificial neural network (ANN) based hydrologic models have gained a lot of attention among water resources engineers and scientists, owing to their potential for accurate prediction of flood flows as compared to conceptual or physics based hydrologic models. The ANN approximates the non-linear functional relationship between the complex hydrologic variables in arriving at the river flow forecast values. Despite a large number of applications, there is still some criticism that the ANN's point prediction lacks reliability since the uncertainty of predictions is not quantified, which limits its use in practical applications. A major concern in applying traditional uncertainty analysis techniques to the neural network framework is its parallel computing architecture with large degrees of freedom, which makes the uncertainty assessment a challenging task. Very few studies have considered assessment of predictive uncertainty of ANN based hydrologic models. In this study, a novel method is proposed that helps construct the prediction interval of an ANN flood forecasting model during calibration itself. The method is designed to have two stages of optimization during calibration: at stage 1, the ANN model is trained with a genetic algorithm (GA) to obtain the optimal set of weights and biases vector, and during stage 2, the optimal variability of ANN parameters (obtained in stage 1) is identified so as to create an ensemble of predictions. During the 2nd stage, the optimization is performed with multiple objectives: (i) minimum residual variance for the ensemble mean, (ii) maximum measured data points to fall within the estimated prediction interval and (iii) minimum width of prediction interval. The method is illustrated using a real world case study of an Indian basin. The method was able to produce an ensemble that has an average prediction interval width of 23.03 m³/s, with 97.17% of the total validation data points (measured) lying within the interval. The derived

  13. Climate Prediction Center (CPC) Ensemble Canonical Correlation Analysis 90-Day Seasonal Forecast of Precipitation

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Ensemble Canonical Correlation Analysis (ECCA) precipitation forecast is a 90-day (seasonal) outlook of US surface precipitation anomalies. The ECCA uses...

  14. Climate Prediction Center (CPC) Ensemble Canonical Correlation Analysis Forecast of Temperature

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Ensemble Canonical Correlation Analysis (ECCA) temperature forecast is a 90-day (seasonal) outlook of US surface temperature anomalies. The ECCA uses Canonical...

  15. The Ensembl genome database project.

    Science.gov (United States)

    Hubbard, T; Barker, D; Birney, E; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Huminiecki, L; Kasprzyk, A; Lehvaslaiho, H; Lijnzaad, P; Melsopp, C; Mongin, E; Pettett, R; Pocock, M; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Clamp, M

    2002-01-01

    The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.

  16. The Experimental Regional Ensemble Forecast System (ExREF): Its Use in NWS Forecast Operations and Preliminary Verification

    Science.gov (United States)

    Reynolds, David; Rasch, William; Kozlowski, Daniel; Burks, Jason; Zavodsky, Bradley; Bernardet, Ligia; Jankov, Isidora; Albers, Steve

    2014-01-01

    The Experimental Regional Ensemble Forecast (ExREF) system is a tool for the development and testing of new Numerical Weather Prediction (NWP) methodologies. ExREF is run in near-realtime by the Global Systems Division (GSD) of the NOAA Earth System Research Laboratory (ESRL) and its products are made available through a website, an ftp site, and via the Unidata Local Data Manager (LDM). The ExREF domain covers most of North America and has 9-km horizontal grid spacing. The ensemble has eight members, all employing WRF-ARW. The ensemble uses a variety of initial conditions from LAPS and the Global Forecasting System (GFS) and multiple boundary conditions from the GFS ensemble. Additionally, a diversity of physical parameterizations is used to increase ensemble spread and to account for the uncertainty in forecasting extreme precipitation events. ExREF has been a component of the Hydrometeorology Testbed (HMT) NWP suite in the 2012-2013 and 2013-2014 winters. A smaller domain covering just the West Coast was created to minimize bandwidth consumption for the NWS. This smaller domain has been and is being distributed to the National Weather Service (NWS) Weather Forecast Office and California Nevada River Forecast Center in Sacramento, California, where it is ingested into the Advanced Weather Interactive Processing System (AWIPS I and II) to provide guidance on the forecasting of extreme precipitation events. This paper will review the cooperative effort employed by NOAA ESRL, NASA SPoRT (Short-term Prediction Research and Transition Center), and the NWS to facilitate the ingest and display of ExREF data utilizing the AWIPS I and II D2D and GFE (Graphical Forecast Editor) software. Within GFE is a very useful verification software package called BoiVer that allows the NWS to utilize the River Forecast Center's 4 km gridded QPE to compare with all operational NWP models' 6-hr QPF along with the ExREF mean 6-hr QPF so the forecasters can build confidence in the use of the

  17. Optimized expanded ensembles for simulations involving molecular insertions and deletions. II. Open systems

    Science.gov (United States)

    Escobedo, Fernando A.

    2007-11-01

    In the Grand Canonical, osmotic, and Gibbs ensembles, chemical potential equilibrium is attained via transfers of molecules between the system and either a reservoir or another subsystem. In this work, the expanded ensemble (EXE) methods described in part I [F. A. Escobedo and F. J. Martínez-Veracoechea, J. Chem. Phys. 127, 174103 (2007)] of this series are extended to these ensembles to overcome the difficulties associated with implementing such whole-molecule transfers. In EXE, such moves occur via a target molecule that undergoes transitions through a number of intermediate coupling states. To minimize the tunneling time between the fully coupled and fully decoupled states, the intermediate states could be either: (i) sampled with an optimal frequency distribution (the sampling problem) or (ii) selected with an optimal spacing distribution (the staging problem). The sampling issue is addressed by determining the biasing weights that would allow generating an optimal ensemble; discretized versions of this algorithm (well suited for a small number of coupling stages) are also presented. The staging problem is addressed by selecting the intermediate stages in such a way that a flat histogram is the optimized ensemble. The validity of the advocated methods is demonstrated by their application to two model problems, the solvation of large hard spheres into a fluid of small and large spheres, and the vapor-liquid equilibrium of a chain system.

  18. Mechanisms of appearance of amplitude and phase chimera states in ensembles of nonlocally coupled chaotic systems

    Science.gov (United States)

    Bogomolov, Sergey A.; Slepnev, Andrei V.; Strelkova, Galina I.; Schöll, Eckehard; Anishchenko, Vadim S.

    2017-02-01

    We explore the bifurcation transition from coherence to incoherence in ensembles of nonlocally coupled chaotic systems. It is first shown that two types of chimera states, namely, amplitude and phase, can be found in a network of coupled logistic maps, while only amplitude chimera states can be observed in a ring of continuous-time chaotic systems. We reveal a bifurcation mechanism by analyzing the evolution of space-time profiles and the coupling function with varying coupling coefficient and formulate the necessary and sufficient conditions for realizing the chimera states in the ensembles.
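
    A ring of nonlocally coupled logistic maps is commonly written as x_i(t+1) = f(x_i) + (sigma / 2P) * sum over the P nearest neighbours on each side of [f(x_j) - f(x_i)], with f(x) = a x (1 - x). The sketch below implements one such update; the parameter values, ring size and function name are illustrative assumptions and are not taken from the record above.

```python
import numpy as np

def step_ring(x, a=3.8, sigma=0.3, P=32):
    """One update of a ring of logistic maps with coupling range P on each side."""
    fx = a * x * (1.0 - x)
    # Sum f(x_j) over the P neighbours to the left and right, with periodic wrap
    neighbours = sum(np.roll(fx, k) + np.roll(fx, -k) for k in range(1, P + 1))
    return fx + sigma / (2 * P) * (neighbours - 2 * P * fx)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.random(100)
    for _ in range(200):
        x = step_ring(x)
    print(x.min(), x.max())  # spatial profile extremes after the transient
```

    Since the update is a convex combination of logistic-map values for 0 <= sigma <= 1, the state stays within [0, 1]; plotting x against the site index i for different sigma is the usual way to look for coherent and incoherent domains.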

  19. Impact of Representing Model Error in a Hybrid Ensemble-Variational Data Assimilation System for Track Forecast of Tropical Cyclones over the Bay of Bengal

    Science.gov (United States)

    Kutty, Govindan; Muraleedharan, Rohit; Kesarkar, Amit P.

    2018-03-01

    Uncertainties in numerical weather prediction models are generally not well represented in ensemble-based data assimilation (DA) systems. The performance of an ensemble-based DA system becomes suboptimal if the sources of error are undersampled in the forecast system. The present study examines the effect of accounting for model error treatments in the hybrid ensemble transform Kalman filter—three-dimensional variational (3DVAR) DA system (hybrid) in the track forecast of two tropical cyclones, viz. Hudhud and Thane, formed over the Bay of Bengal, using the Advanced Research Weather Research and Forecasting (ARW-WRF) model. We investigated the effect of two types of model error treatment schemes and their combination on the hybrid DA system: (i) a multiphysics approach, which uses different combinations of cumulus, microphysics and planetary boundary layer schemes, (ii) a stochastic kinetic energy backscatter (SKEB) scheme, which perturbs the horizontal wind and potential temperature tendencies, and (iii) a combination of both multiphysics and SKEB schemes. Substantial improvements are noticed in the track positions of both cyclones when flow-dependent ensemble covariance is used in the 3DVAR framework. Explicit model error representation is found to be beneficial in treating the underdispersive ensembles. Among the model error schemes used in this study, the combination of multiphysics and SKEB schemes outperformed the other two schemes, with improved track forecasts for both tropical cyclones.

  20. A Novel Multiscale Ensemble Carbon Price Prediction Model Integrating Empirical Mode Decomposition, Genetic Algorithm and Artificial Neural Network

    Directory of Open Access Journals (Sweden)

    Bangzhu Zhu

    2012-02-01

    Due to the movement and complexity of the carbon market, traditional monoscale forecasting approaches often fail to capture its nonstationary and nonlinear properties and accurately describe its moving tendencies. In this study, a multiscale ensemble forecasting model integrating empirical mode decomposition (EMD), genetic algorithm (GA) and artificial neural network (ANN) is proposed to forecast carbon price. Firstly, the proposed model uses EMD to decompose carbon price data into several intrinsic mode functions (IMFs) and one residue. Then, the IMFs and residue are composed into a high frequency component, a low frequency component and a trend component which have similar frequency characteristics, simple components and strong regularity using the fine-to-coarse reconstruction algorithm. Finally, those three components are predicted using an ANN trained by GA, i.e., a GAANN model, and the final forecasting results can be obtained by the sum of these three forecasting results. For verification and testing, two main carbon future prices with different maturity in the European Climate Exchange (ECX) are used to test the effectiveness of the proposed multiscale ensemble forecasting model. Empirical results obtained demonstrate that the proposed multiscale ensemble forecasting model can outperform the single random walk (RW), ARIMA, ANN and GAANN models without EMD preprocessing and the ensemble ARIMA model with EMD preprocessing.

  1. Method of collective variables with reference system for the grand canonical ensemble

    International Nuclear Information System (INIS)

    Yukhnovskii, I.R.

    1989-01-01

    A method of collective variables with a special reference system for the grand canonical ensemble is presented. An explicit form is obtained for the basis sixth-degree measure density needed to describe the liquid-gas phase transition. Here the author presents the fundamentals of the method, which are as follows: (1) the functional form for the partition function in the grand canonical ensemble; (2) derivation of thermodynamic relations for the coefficients of the Jacobian; (3) transition to the problem on an adequate lattice; and (4) derivation of the explicit form for the functional of the partition function.

  2. On the v-representability of ensemble densities of electron systems

    Science.gov (United States)

    Gonis, A.; Däne, M.

    2018-05-01

    Analogously to the case at zero temperature, where the density of the ground state of an interacting many-particle system determines uniquely (within an arbitrary additive constant) the external potential acting on the system, the thermal average of the density over an ensemble defined by the Boltzmann distribution at the minimum of the thermodynamic potential, or the free energy, determines the external potential uniquely (and not just modulo a constant) acting on a system described by this thermodynamic potential or free energy. The paper describes a formal procedure that generates the domain of a constrained search over general ensembles (at zero or elevated temperatures) that lead to a given density, including as a special case a density thermally averaged at a given temperature, and in the case of a v-representable density determines the external potential leading to the ensemble density. As an immediate consequence of the general formalism, the concept of v-representability is extended beyond the hitherto discussed case of ground state densities to encompass excited states as well. Specific application to thermally averaged densities solves the v-representability problem in connection with the Mermin functional in a manner analogous to that in which this problem was recently settled with respect to the Hohenberg and Kohn functional. The main formalism is illustrated with numerical results for ensembles of one-dimensional, non-interacting systems of particles under a harmonic potential.

  3. Wave ensemble forecast system for tropical cyclones in the Australian region

    Science.gov (United States)

    Zieger, Stefan; Greenslade, Diana; Kepert, Jeffrey D.

    2018-05-01

    Forecasting of waves under extreme conditions such as tropical cyclones is vitally important for many offshore industries, but there remain many challenges. For Northwest Western Australia (NW WA), wave forecasts issued by the Australian Bureau of Meteorology have previously been limited to products from deterministic operational wave models forced by deterministic atmospheric models. The wave models are run over global (resolution 1/4°) and regional (resolution 1/10°) domains with forecast ranges of +7 and +3 days, respectively. Because of this relatively coarse resolution (both in the wave models and in the forcing fields), the accuracy of these products is limited under tropical cyclone conditions. Given this limited accuracy, a new ensemble-based wave forecasting system for the NW WA region has been developed. To achieve this, a new dedicated 8-km resolution grid was nested in the global wave model. Over this grid, the wave model is forced with winds from a bias-corrected European Centre for Medium-Range Weather Forecasts atmospheric ensemble that comprises 51 ensemble members to take into account the uncertainties in location, intensity and structure of a tropical cyclone system. A unique technique is used to select restart files for each wave ensemble member. The system is designed to operate in real time during the cyclone season, providing +10-day forecasts. This paper will describe the wave forecast components of this system and present the verification metrics and skill for specific events.

  4. Predictive systems ecology.

    Science.gov (United States)

    Evans, Matthew R; Bithell, Mike; Cornell, Stephen J; Dall, Sasha R X; Díaz, Sandra; Emmott, Stephen; Ernande, Bruno; Grimm, Volker; Hodgson, David J; Lewis, Simon L; Mace, Georgina M; Morecroft, Michael; Moustakas, Aristides; Murphy, Eugene; Newbold, Tim; Norris, K J; Petchey, Owen; Smith, Matthew; Travis, Justin M J; Benton, Tim G

    2013-11-22

    Human societies, and their well-being, depend to a significant extent on the state of the ecosystems that surround them. These ecosystems are changing rapidly, usually in response to anthropogenic changes in the environment. To determine the likely impact of environmental change on ecosystems and the best ways to manage them, it would be desirable to be able to predict their future states. We present a proposal to develop the paradigm of predictive systems ecology, explicitly to understand and predict the properties and behaviour of ecological systems. We discuss the necessary and desirable features of predictive systems ecology models. There are places where predictive systems ecology is already being practised and we summarize a range of terrestrial and marine examples. Significant challenges remain, but we suggest that ecology would both benefit as a scientific discipline and increase its impact in society if it were to embrace the need to become more predictive.

  5. An Assessment of the Subseasonal Forecast Performance in the Extended Global Ensemble Forecast System (GEFS)

    Science.gov (United States)

    Sinsky, E.; Zhu, Y.; Li, W.; Guan, H.; Melhauser, C.

    2017-12-01

    Optimal forecast quality is crucial for the preservation of life and property. Improving monthly forecast performance over both the tropics and extra-tropics requires attention to various physical aspects such as the representation of the underlying SST, model physics and the representation of the model physics uncertainty for an ensemble forecast system. This work focuses on the impact of stochastic physics, SST and the convection scheme on forecast performance for the sub-seasonal scale over the tropics and extra-tropics with emphasis on the Madden-Julian Oscillation (MJO). A 2-year period is evaluated using the National Centers for Environmental Prediction (NCEP) Global Ensemble Forecast System (GEFS). Three experiments with configurations different from the operational GEFS were performed to illustrate the impact of the stochastic physics, SST and convection scheme. These experiments are compared against a control experiment (CTL), which consists of the operational GEFS with its integration extended from 16 to 35 days. The three configurations are: 1) SPs, which uses Stochastically Perturbed Physics Tendencies (SPPT), Stochastic Perturbed Humidity (SHUM) and Stochastic Kinetic Energy Backscatter (SKEB); 2) SPs+SST_bc, which uses a combination of SPs and a bias-corrected forecast SST from the NCEP Climate Forecast System Version 2 (CFSv2); and 3) SPs+SST_bc+SA_CV, which combines SPs, a bias-corrected forecast SST and a scale aware convection scheme. When compared to the CTL experiment, SPs shows substantial improvement. The MJO skill has improved by about 4 lead days during the 2-year period. Improvement is also seen over the extra-tropics due to the updated stochastic physics, where there is a 3.1% and a 4.2% improvement during weeks 3 and 4 over the northern hemisphere and southern hemisphere, respectively. Improvement is also seen when the bias-corrected CFSv2 SST is combined with SPs. Additionally, forecast performance improves when the scale aware

  6. Impacts of calibration strategies and ensemble methods on ensemble flood forecasting over Lanjiang basin, Southeast China

    Science.gov (United States)

    Liu, Li; Xu, Yue-Ping

    2017-04-01

    Ensemble flood forecasting driven by numerical weather prediction products is increasingly used in operational flood forecasting. In this study, a hydrological ensemble flood forecasting system based on the Variable Infiltration Capacity (VIC) model and quantitative precipitation forecasts from the TIGGE dataset is constructed for the Lanjiang Basin, Southeast China. The impacts of calibration strategies and ensemble methods on the performance of the system are then evaluated. The hydrological model is optimized with a parallelized ɛ-NSGAII multi-objective algorithm, and two separately parameterized models are determined to simulate daily flows and peak flows, coupled through a modular approach. The results indicate that the ɛ-NSGAII algorithm permits more efficient optimization and a rational determination of the parameter settings. The multimodel ensemble streamflow mean has better skill than the best single-model ensemble mean (ECMWF), and the multimodel ensembles weighted on members and skill scores outperform other multimodel ensembles. For a typical flood event, the flood can be predicted 3-4 days in advance, but the flows in the rising limb can be captured only 1-2 days ahead because of the flash nature of the flood. With respect to peak flows selected by the Peaks Over Threshold approach, the ensemble means from both single-model and multimodel ensembles generally underestimate the peaks, as the extreme values are smoothed out by the ensemble averaging.

  7. Simulating ensembles of nonlinear continuous time dynamical systems via active ultra wideband wireless network

    Energy Technology Data Exchange (ETDEWEB)

    Dmitriev, Alexander S.; Yemelyanov, Ruslan Yu. [V.A. Kotelnikov Institute of Radio Engineering and Electronics of the RAS, Mokhovaya 11-7, Moscow, 125009 (Russian Federation); Moscow Institute of Physics and Technology (State University), 9 Institutskiy per., Dolgoprudny, Moscow, 141700 (Russian Federation)]; Gerasimov, Mark Yu. [V.A. Kotelnikov Institute of Radio Engineering and Electronics of the RAS, Mokhovaya 11-7, Moscow, 125009 (Russian Federation)]; Itskov, Vadim V. [Moscow Institute of Physics and Technology (State University), 9 Institutskiy per., Dolgoprudny, Moscow, 141700 (Russian Federation)]

    2016-06-08

    The paper deals with a new multi-element processor platform designed for modelling the behaviour of interacting dynamical systems, i.e., an active wireless network. Experimentally, this ensemble is implemented in an active network whose active nodes include direct chaotic transceivers and special actuator boards containing microcontrollers for modelling the dynamical systems and an information display unit (colored LEDs). The modelling technique and experimental results are described and analyzed.

  8. Level-statistics in Disordered Systems: A single parametric scaling and Connection to Brownian Ensembles

    OpenAIRE

    Shukla, Pragya

    2004-01-01

    We find that the statistics of levels undergoing metal-insulator transition in systems with multi-parametric Gaussian disorders and non-interacting electrons behaves in a way similar to that of the single parametric Brownian ensembles \cite{dy}. The latter appear during a Poisson $\to$ Wigner-Dyson transition, driven by a random perturbation. The analogy provides the analytical evidence for the single parameter scaling of the level-correlations in disordered systems as well as a tool to obtai...

  9. Validation of the Air Force Weather Agency Ensemble Prediction Systems

    Science.gov (United States)

    2014-03-27

    by Mr. Evan L. Kuchera. Also, I would like to express my gratitude to Mr. Jeff H. Zaunter for painstakingly working with me to provided station...my fellow AFIT classmates, Capt Jeremy J. Hromsco, Capt Haley A. Homan, Capt Kyle R. Thurmond and 2Lt Coy C. Fischer for their support and...Codes. The raw METARs and SPECIs were decoded and provided for this research by Mr. Jeff Zautner, 14/WS Meteorologist, Tailored Product Analyst

  10. Monthly hydrometeorological ensemble prediction of streamflow droughts and corresponding drought indices

    Directory of Open Access Journals (Sweden)

    F. Fundel

    2013-01-01

    Streamflow droughts, characterized by low runoff as consequence of a drought event, affect numerous aspects of life. Economic sectors that are impacted by low streamflow are, e.g., power production, agriculture, tourism, water quality management and shipping. Those sectors could potentially benefit from forecasts of streamflow drought events, even of short events on the monthly time scales or below. Numerical hydrometeorological models have increasingly been used to forecast low streamflow and have become the focus of recent research. Here, we consider daily ensemble runoff forecasts for the river Thur, which has its source in the Swiss Alps. We focus on the evaluation of low streamflow and of the derived indices as duration, severity and magnitude, characterizing streamflow droughts up to a lead time of one month.

    The ECMWF VarEPS 5-member ensemble reforecast, which covers 18 yr, is used as forcing for the hydrological model PREVAH. A thorough verification reveals that, compared to probabilistic peak-flow forecasts, which show skill up to a lead time of two weeks, forecasts of streamflow droughts are skilful over the entire forecast range of one month. For forecasts at the lower end of the runoff regime, the quality of the initial state seems to be crucial to achieve a good forecast quality in the longer range. It is shown that the states used in this study to initialize forecasts satisfy this requirement. The produced forecasts of streamflow drought indices, derived from the ensemble forecasts, could be beneficially included in a decision-making process. This is valid for probabilistic forecasts of streamflow drought events falling below a daily varying threshold, based on a quantile derived from a runoff climatology. Although the forecasts have a tendency to overpredict streamflow droughts, it is shown that the relative economic value of the ensemble forecasts reaches up to 60%, in case a forecast user is able to take preventive

  11. Monthly hydrometeorological ensemble prediction of streamflow droughts and corresponding drought indices

    Science.gov (United States)

    Fundel, F.; Jörg-Hess, S.; Zappa, M.

    2013-01-01

    Streamflow droughts, characterized by low runoff as consequence of a drought event, affect numerous aspects of life. Economic sectors that are impacted by low streamflow are, e.g., power production, agriculture, tourism, water quality management and shipping. Those sectors could potentially benefit from forecasts of streamflow drought events, even of short events on the monthly time scales or below. Numerical hydrometeorological models have increasingly been used to forecast low streamflow and have become the focus of recent research. Here, we consider daily ensemble runoff forecasts for the river Thur, which has its source in the Swiss Alps. We focus on the evaluation of low streamflow and of the derived indices as duration, severity and magnitude, characterizing streamflow droughts up to a lead time of one month. The ECMWF VarEPS 5-member ensemble reforecast, which covers 18 yr, is used as forcing for the hydrological model PREVAH. A thorough verification reveals that, compared to probabilistic peak-flow forecasts, which show skill up to a lead time of two weeks, forecasts of streamflow droughts are skilful over the entire forecast range of one month. For forecasts at the lower end of the runoff regime, the quality of the initial state seems to be crucial to achieve a good forecast quality in the longer range. It is shown that the states used in this study to initialize forecasts satisfy this requirement. The produced forecasts of streamflow drought indices, derived from the ensemble forecasts, could be beneficially included in a decision-making process. This is valid for probabilistic forecasts of streamflow drought events falling below a daily varying threshold, based on a quantile derived from a runoff climatology. Although the forecasts have a tendency to overpredict streamflow droughts, it is shown that the relative economic value of the ensemble forecasts reaches up to 60%, in case a forecast user is able to take preventive action based on the forecast.
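    The relative economic value quoted above can be illustrated with a small sketch. It assumes the static cost-loss decision model that is commonly used for this score (a user pays a cost C to protect, or suffers a loss L if the event occurs unprotected); the abstract does not state the exact formulation the authors used, and the function name and contingency-table arguments are hypothetical.

```python
def relative_economic_value(hits, misses, false_alarms, correct_negatives,
                            cost_loss_ratio):
    """Relative economic value under a static cost-loss model (illustrative).

    Assumes 0 < cost_loss_ratio < 1 and an event base rate strictly
    between 0 and 1; otherwise the denominator vanishes.
    """
    n = hits + misses + false_alarms + correct_negatives
    base_rate = (hits + misses) / n          # climatological event frequency
    a = cost_loss_ratio                      # C / L

    # Mean expense per unit loss when acting on the forecast:
    # pay the cost on every warning, suffer the full loss on every miss.
    e_forecast = (hits + false_alarms) / n * a + misses / n
    # Climatology: either always protect (a) or never protect (base_rate).
    e_climate = min(a, base_rate)
    # Perfect forecast: protect only when the event occurs.
    e_perfect = base_rate * a

    return (e_climate - e_forecast) / (e_climate - e_perfect)


if __name__ == "__main__":
    # Made-up contingency table for a drought-threshold forecast.
    print(relative_economic_value(8, 2, 10, 80, cost_loss_ratio=0.1))
```

    A value of 1 corresponds to a perfect forecast and 0 to a forecast that is worth no more than climatology; the value depends on the user's cost-loss ratio, which is why such scores are usually reported as a maximum over users.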

  12. Flow-dependent empirical singular vector with an ensemble Kalman filter data assimilation for El Nino prediction

    Energy Technology Data Exchange (ETDEWEB)

    Ham, Yoo-Geun [NASA/GSFC Code 610.1, Global Modeling and Assimilation Office, Greenbelt, MD (United States); Universities Space Research Association, Goddard Earth Sciences Technology and Research Studies and Investigations, Baltimore, MD (United States); Rienecker, Michele M. [NASA/GSFC Code 610.1, Global Modeling and Assimilation Office, Greenbelt, MD (United States)

    2012-10-15

    In this study, a new approach for extracting flow-dependent empirical singular vectors (FESVs) for seasonal prediction using ensemble perturbations obtained from an ensemble Kalman filter (EnKF) assimilation is presented. Due to the short interval between analyses, EnKF perturbations primarily contain instabilities related to fast weather variability. To isolate slower, coupled instabilities that would be more suitable for seasonal prediction, an empirical linear operator for seasonal time-scales (i.e. several months) is formulated using a causality hypothesis; then, the most unstable mode from the linear operator is extracted for seasonal time-scales. It is shown that the flow-dependent operator represents nonlinear integration results better than a conventional empirical linear operator static in time. Through 20 years of retrospective seasonal predictions, it is shown that the skill of forecasting equatorial SST anomalies using the FESV is systematically improved over that using Conventional ESV (CESV). For example, the correlation skill of the NINO3 SST index using FESV is higher, by about 0.1, than that of CESV at 8-month leads. In addition, the forecast skill improvement is significant over the locations where the correlation skill of conventional methods is relatively low, indicating that the FESV is effective where the initial uncertainty is large. (orig.)

  13. Combining structural modeling with ensemble machine learning to accurately predict protein fold stability and binding affinity effects upon mutation.

    Directory of Open Access Journals (Sweden)

    Niklas Berliner

    Advances in sequencing have led to a rapid accumulation of mutations, some of which are associated with diseases. However, to draw mechanistic conclusions, a biochemical understanding of these mutations is necessary. For coding mutations, accurate prediction of significant changes in either the stability of proteins or their affinity to their binding partners is required. Traditional methods have used semi-empirical force fields, while newer methods employ machine learning of sequence and structural features. Here, we show how combining both of these approaches leads to a marked boost in accuracy. We introduce ELASPIC, a novel ensemble machine learning approach that is able to predict stability effects upon mutation in both domain cores and domain-domain interfaces. We combine semi-empirical energy terms, sequence conservation, and a wide variety of molecular details with a Stochastic Gradient Boosting of Decision Trees (SGB-DT) algorithm. The accuracy of our predictions surpasses existing methods by a considerable margin, achieving correlation coefficients of 0.77 for stability, and 0.75 for affinity predictions. Notably, we integrated homology modeling to enable proteome-wide prediction and show that accurate prediction on modeled structures is possible. Lastly, ELASPIC showed significant differences between various types of disease-associated mutations, as well as between disease and common neutral mutations. Unlike pure sequence-based prediction methods that try to predict phenotypic effects of mutations, our predictions unravel the molecular details governing the protein instability, and help us better understand the molecular causes of diseases.

  14. Ensemble of different approaches for a reliable person re-identification system

    Directory of Open Access Journals (Sweden)

    Loris Nanni

    2016-07-01

    An ensemble of approaches for reliable person re-identification is proposed in this paper. The proposed ensemble is built by combining widely used person re-identification systems that use different color spaces with some variants of state-of-the-art approaches that are proposed in this paper. Different descriptors are tested, and both texture and color features are extracted from the images; the different descriptors are then compared using different distance measures (e.g., the Euclidean distance, angle, and the Jeffrey distance). To improve performance, a method based on skeleton detection, extracted from the depth map, is also applied when the depth map is available. The proposed ensemble is validated on three widely used datasets (CAVIAR4REID, IAS, and VIPeR), keeping the same parameter set of each approach constant across all tests to avoid overfitting and to demonstrate that the proposed system can be considered a general-purpose person re-identification system. Our experimental results show that the proposed system offers significant improvements over baseline approaches. The source code used for the approaches tested in this paper will be available at https://www.dei.unipd.it/node/2357 and http://robotics.dei.unipd.it/reid/.

  15. Improved predictive mapping of indoor radon concentrations using ensemble regression trees based on automatic clustering of geological units

    International Nuclear Information System (INIS)

    Kropat, Georg; Bochud, Francois; Jaboyedoff, Michel; Laedermann, Jean-Pascal; Murith, Christophe; Palacios, Martha; Baechler, Sébastien

    2015-01-01

    Purpose: According to estimations around 230 people die as a result of radon exposure in Switzerland. This public health concern makes reliable indoor radon prediction and mapping methods necessary in order to improve risk communication to the public. The aim of this study was to develop an automated method to classify lithological units according to their radon characteristics and to develop mapping and predictive tools in order to improve local radon prediction. Method: About 240 000 indoor radon concentration (IRC) measurements in about 150 000 buildings were available for our analysis. The automated classification of lithological units was based on k-medoids clustering via pair-wise Kolmogorov distances between IRC distributions of lithological units. For IRC mapping and prediction we used random forests and Bayesian additive regression trees (BART). Results: The automated classification groups lithological units well in terms of their IRC characteristics. Especially the IRC differences in metamorphic rocks like gneiss are well revealed by this method. The maps produced by random forests soundly represent the regional difference of IRCs in Switzerland and improve the spatial detail compared to existing approaches. We could explain 33% of the variations in IRC data with random forests. Additionally, the influence of a variable evaluated by random forests shows that building characteristics are less important predictors for IRCs than spatial/geological influences. BART could explain 29% of IRC variability and produced maps that indicate the prediction uncertainty. Conclusion: Ensemble regression trees are a powerful tool to model and understand the multidimensional influences on IRCs. Automatic clustering of lithological units complements this method by facilitating the interpretation of radon properties of rock types. This study provides an important element for radon risk communication. Future approaches should consider taking into account further variables

  16. An automatic counting and recording system (1963); Ensemble de comptage a enregistrement automatique (1963)

    Energy Technology Data Exchange (ETDEWEB)

    Pierre, B. [Commissariat a l' Energie Atomique, Saclay (France). Centre d' Etudes Nucleaires

    1961-09-15

    An automatic control, counting and programming system for the collection of single-crystal diffractometry data was designed by the author for a neutron diffractometer in 1958 at C.E.N - Grenoble. One part of the whole instrument, 'The Automatic Counting and Recording System', is described in this paper. Its applications are numerous and can be extended to new fields: the system was designed for a neutron diffractometer, but it can easily be adapted for use with X-rays or, for example, for the measurement of mean life in β decay analysis. (author)

  17. An automatic counting and recording system (1963); Ensemble de comptage a enregistrement automatique (1963)

    Energy Technology Data Exchange (ETDEWEB)

    Pierre, B [Commissariat a l' Energie Atomique, Saclay (France). Centre d' Etudes Nucleaires

    1961-09-15

    An automatic control, counting and programming system for the collection of single-crystal diffractometry data was designed by the author for a neutron diffractometer in 1958 at C.E.N - Grenoble. One part of the whole instrument, 'The Automatic Counting and Recording System', is described in this paper. Its applications are numerous and can be extended to new fields: the system was designed for a neutron diffractometer, but it can easily be adapted for use with X-rays or, for example, for the measurement of mean life in β decay analysis. (author)

  18. A Bayesian posterior predictive framework for weighting ensemble regional climate models

    Directory of Open Access Journals (Sweden)

    Y. Fan

    2017-06-01

    We present a novel Bayesian statistical approach to computing model weights in climate change projection ensembles in order to create probabilistic projections. The weight of each climate model is obtained by weighting the current day observed data under the posterior distribution admitted under competing climate models. We use a linear model to describe the model output and observations. The approach accounts for uncertainty in model bias, trend and internal variability, including error in the observations used. Our framework is general, requires very little problem-specific input, and works well with default priors. We carry out cross-validation checks that confirm that the method produces the correct coverage.

  19. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Directory of Open Access Journals (Sweden)

    Drzewiecki Wojciech

    2016-12-01

    In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas, both for the accuracy of imperviousness coverage evaluation at individual points in time and for the accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.

  20. Pre- and post-processing of hydro-meteorological ensembles for the Norwegian flood forecasting system in 145 basins.

    Science.gov (United States)

    Jahr Hegdahl, Trine; Steinsland, Ingelin; Merete Tallaksen, Lena; Engeland, Kolbjørn

    2016-04-01

    Probabilistic flood forecasting has an added value for decision making. The Norwegian flood forecasting service is based on a flood forecasting model that runs for 145 basins. Covering all of Norway, the basins differ in both size and hydrological regime. Currently the flood forecasting is based on deterministic meteorological forecasts, and an auto-regressive procedure is used to achieve probabilistic forecasts. An alternative approach is to use meteorological and hydrological ensemble forecasts to quantify the uncertainty in forecasted streamflow. The hydrological ensembles are based on forcing a hydrological model with meteorological ensemble forecasts of precipitation and temperature. However, the ensembles of precipitation are often biased and the spread is too small, especially for the shortest lead times, i.e. they are not calibrated. These properties will, to some extent, propagate to the hydrological ensembles, which will most likely be uncalibrated as well. Pre- and post-processing methods are commonly used to obtain calibrated meteorological and hydrological ensembles, respectively. Quantitative studies showing the effect of the combined processing of the meteorological (pre-processing) and the hydrological (post-processing) ensembles are, however, few. The aim of this study is to evaluate the influence of pre- and post-processing on the skill of streamflow predictions, and we will especially investigate whether the forecasting skill depends on lead time, basin size and hydrological regime. This aim is achieved by applying the 51-member medium-range ensemble forecasts of precipitation and temperature provided by the European Centre for Medium-Range Weather Forecasts (ECMWF). These ensembles are used as input to the operational Norwegian flood forecasting model, both raw and pre-processed. Precipitation ensembles are calibrated using a zero-adjusted gamma distribution. Temperature ensembles are calibrated using a Gaussian distribution and altitude corrected by a constant gradient

  1. Short-range ensemble predictions based on convection perturbations in the Eta Model for the Serra do Mar region in Brazil

    Science.gov (United States)

    Bustamante, J. F. F.; Chou, S. C.; Gomes, J. L.

    2009-04-01

    Southeast Brazil, in the coastal and mountain region called Serra do Mar, between Sao Paulo and Rio de Janeiro, is subject to frequent landslides and floods. The Eta Model has been producing good quality forecasts over South America at about 40-km horizontal resolution. For these types of hazards, however, more detailed and probabilistic information on the risks should be provided with the forecasts. Thus, a short-range ensemble prediction system (SREPS) based on the Eta Model is being constructed. Ensemble members derived from perturbed initial and lateral boundary conditions did not provide enough spread for the forecasts. Members with model physics perturbations are being included and tested. The objective of this work is to construct more members for the Eta SREPS by adding physics-perturbed members. The Eta Model is configured at 10-km resolution with 38 layers in the vertical. The domain covers most of Southeast Brazil, centered over the Serra do Mar region. The constructed members comprise variations of the Betts-Miller-Janjic (BMJ) and Kain-Fritsch (KF) cumulus parameterization schemes. Three members were constructed from the BMJ scheme by varying the deficit of saturation pressure profile over land and sea, and 2 members of the KF scheme were included, using the standard KF scheme and a version of the KF scheme with an added momentum flux. One of the runs with the BMJ scheme is the control run, as it was used for the initial condition perturbation SREPS. The forecasts were tested for 6 cases of South Atlantic Convergence Zone (SACZ) events. The SACZ is a common summer season feature of the Southern Hemisphere that causes persistent rain for a few days over Southeast Brazil, and it frequently organizes over the Serra do Mar region. These events are particularly interesting because the persistent rains can accumulate large amounts and cause widespread landslides and deaths. With respect to precipitation, the KF scheme versions have been shown to be able to reach the

  2. Potential predictability and forecast skill in ensemble climate forecast: the skill-persistence rule

    Science.gov (United States)

    Jin, Y.; Rong, X.; Liu, Z.

    2017-12-01

    This study investigates the factors that impact the forecast skill for the real world (actual skill) and perfect model (perfect skill) in ensemble climate model forecast with a series of fully coupled general circulation model forecast experiments. It is found that the actual skill of sea surface temperature (SST) in seasonal forecast is substantially higher than the perfect skill on a large part of the tropical oceans, especially the tropical Indian Ocean and the central-eastern Pacific Ocean. The higher actual skill is found to be related to the higher observational SST persistence, suggesting a skill-persistence rule: a higher SST persistence in the real world than in the model could overwhelm the model bias to produce a higher forecast skill for the real world than for the perfect model. The relation between forecast skill and persistence is further examined using a first-order autoregressive model (AR1) analytically for theoretical solutions and numerically for analogue experiments. The AR1 model study shows that the skill-persistence rule is strictly valid in the case of infinite ensemble size, but can be distorted by the sampling error and non-AR1 processes.
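    The skill-persistence rule described above can be illustrated with a minimal numerical sketch. It is not the authors' experiment: it assumes unit-variance AR1 processes for both the "real world" and the model, an infinite ensemble (so the forecast is the model's conditional mean), and correlation as the skill measure. The function name and the persistence values are hypothetical.

```python
import numpy as np


def ar1_skills(a_obs, a_mod, lead, n_cases=200_000, seed=0):
    """Correlation skill of an infinite-ensemble AR1 forecast (illustrative).

    a_obs, a_mod: lag-1 persistence of the observed and the model process,
    both assumed to lie strictly between 0 and 1.
    Returns (actual_skill, perfect_skill).
    """
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(n_cases)      # shared initial anomalies
    forecast = a_mod**lead * x0            # ensemble mean of the model AR1

    def verifying_state(a):
        # State after `lead` steps of a stationary, unit-variance AR1 process.
        noise = rng.standard_normal(n_cases) * np.sqrt(1.0 - a**(2 * lead))
        return a**lead * x0 + noise

    actual = np.corrcoef(forecast, verifying_state(a_obs))[0, 1]
    perfect = np.corrcoef(forecast, verifying_state(a_mod))[0, 1]
    return actual, perfect


if __name__ == "__main__":
    actual, perfect = ar1_skills(a_obs=0.9, a_mod=0.7, lead=3)
    print(f"actual skill {actual:.3f}, perfect skill {perfect:.3f}")
```

    Under these assumptions the actual skill tends to a_obs**lead and the perfect skill to a_mod**lead, so a real world that is more persistent than the model yields a higher actual than perfect skill. With a finite ensemble or non-AR1 dynamics this clean ordering need not hold, consistent with the caveat in the abstract.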

  3. In silico prediction of toxicity of non-congeneric industrial chemicals using ensemble learning based modeling approaches

    Energy Technology Data Exchange (ETDEWEB)

    Singh, Kunwar P.; Gupta, Shikha

    2014-03-15

    Decision treeboost (DTB) and decision tree forest (DTF) models based on an ensemble learning approach are introduced to establish a quantitative structure–toxicity relationship (QSTR) for predicting the toxicity of 1450 diverse chemicals. Eight non-quantum mechanical molecular descriptors were derived. Structural diversity of the chemicals was evaluated using the Tanimoto similarity index. DTB and DTF models, supplemented with stochastic gradient boosting and bagging algorithms, were constructed for classification and function optimization problems using the toxicity end-point in T. pyriformis. Special attention was paid to the prediction ability and robustness of the models, investigated in both external and 10-fold cross-validation processes. In the complete data, the optimal DTB and DTF models rendered accuracies of 98.90%, 98.83% in two-category and 98.14%, 98.14% in four-category toxicity classifications. Both models further yielded classification accuracies of 100% in external toxicity data of T. pyriformis. The constructed regression models (DTB and DTF) using five descriptors yielded correlation coefficients (R²) of 0.945, 0.944 between the measured and predicted toxicities, with mean squared errors (MSEs) of 0.059 and 0.064 in the complete T. pyriformis data. The T. pyriformis regression models (DTB and DTF) applied to the external toxicity data sets yielded R² and MSE values of 0.637, 0.655; 0.534, 0.507 (marine bacteria) and 0.741, 0.691; 0.155, 0.173 (algae). The results suggest wide applicability of the inter-species models in predicting the toxicity of new chemicals for regulatory purposes. These approaches provide a useful strategy and robust tools for screening the ecotoxicological risk or environmental hazard potential of chemicals. - Graphical abstract: Importance of input variables in DTB and DTF classification models for (a) two-category, and (b) four-category toxicity intervals in T. pyriformis data. Generalization and predictive abilities of the

  4. In silico prediction of toxicity of non-congeneric industrial chemicals using ensemble learning based modeling approaches

    International Nuclear Information System (INIS)

    Singh, Kunwar P.; Gupta, Shikha

    2014-01-01

    Decision treeboost (DTB) and decision tree forest (DTF) models based on an ensemble learning approach are introduced to establish a quantitative structure–toxicity relationship (QSTR) for predicting the toxicity of 1450 diverse chemicals. Eight non-quantum mechanical molecular descriptors were derived. Structural diversity of the chemicals was evaluated using the Tanimoto similarity index. DTB and DTF models, supplemented with stochastic gradient boosting and bagging algorithms, were constructed for classification and function optimization problems using the toxicity end-point in T. pyriformis. Special attention was paid to the prediction ability and robustness of the models, investigated in both external and 10-fold cross-validation processes. In the complete data, the optimal DTB and DTF models rendered accuracies of 98.90%, 98.83% in two-category and 98.14%, 98.14% in four-category toxicity classifications. Both models further yielded classification accuracies of 100% in external toxicity data of T. pyriformis. The constructed regression models (DTB and DTF) using five descriptors yielded correlation coefficients (R²) of 0.945, 0.944 between the measured and predicted toxicities, with mean squared errors (MSEs) of 0.059 and 0.064 in the complete T. pyriformis data. The T. pyriformis regression models (DTB and DTF) applied to the external toxicity data sets yielded R² and MSE values of 0.637, 0.655; 0.534, 0.507 (marine bacteria) and 0.741, 0.691; 0.155, 0.173 (algae). The results suggest wide applicability of the inter-species models in predicting the toxicity of new chemicals for regulatory purposes. These approaches provide a useful strategy and robust tools for screening the ecotoxicological risk or environmental hazard potential of chemicals. - Graphical abstract: Importance of input variables in DTB and DTF classification models for (a) two-category, and (b) four-category toxicity intervals in T. pyriformis data. Generalization and predictive abilities of the

  5. Exploring uncertainty of Amazon dieback in a perturbed parameter Earth system ensemble.

    Science.gov (United States)

    Boulton, Chris A; Booth, Ben B B; Good, Peter

    2017-12-01

    The future of the Amazon rainforest is unknown due to uncertainties in projected climate change and the response of the forest to this change (forest resiliency). Here, we explore the effect of some uncertainties in climate and land surface processes on the future of the forest, using a perturbed physics ensemble of HadCM3C. This is the first time Amazon forest changes are presented using an ensemble exploring both land vegetation processes and physical climate feedbacks in a fully coupled modelling framework. Under three different emissions scenarios, we measure the change in the forest coverage by the end of the 21st century (the transient response) and make a novel adaptation to a previously used method known as "dry-season resilience" to predict the long-term committed response of the forest, should the state of the climate remain constant past 2100. Our analysis of this ensemble suggests that there will be a high chance of greater forest loss on longer timescales than is realized by 2100, especially for mid-range and low emissions scenarios. In both the transient and predicted committed responses, there is an increasing uncertainty in the outcome of the forest as the strength of the emissions scenarios increases. It is important to note however, that very few of the simulations produce future forest loss of the magnitude previously shown under the standard model configuration. We find that low optimum temperatures for photosynthesis and a high minimum leaf area index needed for the forest to compete for space appear to be precursors for dieback. We then decompose the uncertainty into that associated with future climate change and that associated with forest resiliency, finding that it is important to reduce the uncertainty in both of these if we are to better determine the Amazon's outcome. © 2017 John Wiley & Sons Ltd.

  6. Predictability of Precipitation Over the Conterminous U.S. Based on the CMIP5 Multi-Model Ensemble

    Science.gov (United States)

    Jiang, Mingkai; Felzer, Benjamin S.; Sahagian, Dork

    2016-01-01

    Characterizing precipitation seasonality and variability in the face of future uncertainty is important for a well-informed climate change adaptation strategy. Using the Colwell index of predictability and monthly normalized precipitation data from the Coupled Model Intercomparison Project Phase 5 (CMIP5) multi-model ensembles, this study identifies spatial hotspots of changes in precipitation predictability in the United States under various climate scenarios. Over the historic period (1950–2005), the recurrent pattern of precipitation is highly predictable in the East and along the coastal Northwest, and is less so in the arid Southwest. Comparing the future (2040–2095) to the historic period, larger changes in precipitation predictability are observed under Representative Concentration Pathways (RCP) 8.5 than those under RCP 4.5. Finally, there are region-specific hotspots of future changes in precipitation predictability, and these hotspots often coincide with regions of little projected change in total precipitation, with exceptions along the wetter East and parts of the drier central West. Therefore, decision-makers are advised to not rely on future total precipitation as an indicator of water resources. Changes in precipitation predictability and the subsequent changes on seasonality and variability are equally, if not more, important factors to be included in future regional environmental assessment. PMID:27425819

  7. Comparison of different incremental analysis update schemes in a realistic assimilation system with Ensemble Kalman Filter

    Science.gov (United States)

    Yan, Y.; Barth, A.; Beckers, J. M.; Brankart, J. M.; Brasseur, P.; Candille, G.

    2017-07-01

    In this paper, three incremental analysis update schemes (IAU 0, IAU 50 and IAU 100) are compared in the same assimilation experiments with a realistic eddy-permitting primitive equation model of the North Atlantic Ocean using the Ensemble Kalman Filter. The difference between the three IAU schemes lies in the position of the increment update window. The relevance of each IAU scheme is evaluated through analyses on both thermohaline and dynamical variables. The validation of the assimilation results is performed according to both deterministic and probabilistic metrics against different sources of observations. For deterministic validation, the ensemble mean and the ensemble spread are compared to the observations. For probabilistic validation, the continuous ranked probability score (CRPS) is used to evaluate the ensemble forecast system according to reliability and resolution. The reliability is further decomposed into bias and dispersion by the reduced centred random variable (RCRV) score. The obtained results show that (1) the IAU 50 scheme performs the same as the IAU 100 scheme; (2) the IAU 50/100 schemes outperform the IAU 0 scheme in error covariance propagation for thermohaline variables in relatively stable regions, while the IAU 0 scheme outperforms the IAU 50/100 schemes in estimating dynamical variables in dynamically active regions; and (3) given a sufficient number of observations and good error specification, the impact of the IAU schemes is negligible. The differences between the IAU 0 scheme and the IAU 50/100 schemes are mainly due to the different model integration times and the different instabilities (density inversion, large vertical velocity, etc.) induced by the increment update. The longer model integration time with the IAU 50/100 schemes, especially the free model integration, on the one hand allows for better re-establishment of the equilibrium model state and on the other hand smooths the strong gradients in dynamically active regions.
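
    The CRPS mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single scalar observation and equally weighted ensemble members, and it computes only the total score via the standard identity CRPS = E|X - y| - 0.5 E|X - X'|; the reliability/resolution decomposition and the RCRV score used by the authors are not reproduced. The function name ensemble_crps is hypothetical.

```python
def ensemble_crps(members, observation):
    """Total CRPS of one ensemble forecast against one scalar observation.

    Assumption: members are equally weighted. Uses
    CRPS = E|X - y| - 0.5 * E|X - X'|, with X, X' drawn from the ensemble.
    """
    x = [float(m) for m in members]
    n = len(x)
    if n == 0:
        raise ValueError("ensemble must contain at least one member")
    # Mean absolute distance between the members and the observation.
    term_obs = sum(abs(xi - observation) for xi in x) / n
    # Half the mean absolute distance between all member pairs (spread term).
    term_spread = sum(abs(xi - xj) for xi in x for xj in x) / (2 * n * n)
    return term_obs - term_spread
```

    Lower values are better; for a one-member ensemble the score reduces to the absolute error.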

  8. Reconstruction of ensembles of coupled time-delay systems from time series.

    Science.gov (United States)

    Sysoev, I V; Prokhorov, M D; Ponomarenko, V I; Bezruchko, B P

    2014-06-01

    We propose a method to recover from time series the parameters of coupled time-delay systems and the architecture of couplings between them. The method is based on a reconstruction of model delay-differential equations and estimation of statistical significance of couplings. It can be applied to networks composed of nonidentical nodes with an arbitrary number of unidirectional and bidirectional couplings. We test our method on chaotic and periodic time series produced by model equations of ensembles of diffusively coupled time-delay systems in the presence of noise, and apply it to experimental time series obtained from electronic oscillators with delayed feedback coupled by resistors.

  9. New approach to information fusion for Lipschitz classifiers ensembles: Application in multi-channel C-OTDR-monitoring systems

    Energy Technology Data Exchange (ETDEWEB)

    Timofeev, Andrey V.; Egorov, Dmitry V. [LPP “EqualiZoom”, Astana, 010000 (Kazakhstan)]

    2016-06-08

    This paper presents new results concerning the selection of an optimal information fusion formula for an ensemble of Lipschitz classifiers. The goal of information fusion is to create an integral classifier that provides better generalization ability for the ensemble while achieving a practically acceptable level of effectiveness. The problem of information fusion is highly relevant for data processing in multi-channel C-OTDR-monitoring systems. In this case we have to effectively classify targeted events which appear in the vicinity of the monitored object. The solution of this problem is based on an ensemble of Lipschitz classifiers, each of which corresponds to a respective channel. We suggest a new method for information fusion for an ensemble of Lipschitz classifiers. This method is called “The Weighing of Inversely as Lipschitz Constants” (WILC). Results of the practical use of the WILC method in multi-channel C-OTDR-monitoring systems are presented.

  10. Ensemble data assimilation in the Red Sea: sensitivity to ensemble selection and atmospheric forcing

    KAUST Repository

    Toye, Habib

    2017-05-26

    We present our efforts to build an ensemble data assimilation and forecasting system for the Red Sea. The system consists of the high-resolution Massachusetts Institute of Technology general circulation model (MITgcm) to simulate ocean circulation and of the Data Assimilation Research Testbed (DART) for ensemble data assimilation. DART has been configured to integrate all members of an ensemble adjustment Kalman filter (EAKF) in parallel, based on which we adapted the ensemble operations in DART to use an invariant ensemble, i.e., an ensemble Optimal Interpolation (EnOI) algorithm. This approach requires only a single forward model integration in the forecast step and therefore saves substantial computational cost. To deal with the strong seasonal variability of the Red Sea, the EnOI ensemble is then seasonally selected from a climatology of long-term model outputs. Remotely sensed observations of sea surface height (SSH) and sea surface temperature (SST) are assimilated every 3 days. Real-time atmospheric fields from the National Centers for Environmental Prediction (NCEP) and the European Centre for Medium-Range Weather Forecasts (ECMWF) are used as forcing in different assimilation experiments. We investigate the behaviors of the EAKF and (seasonal-) EnOI and compare their performances for assimilating and forecasting the circulation of the Red Sea. We further assess the sensitivity of the assimilation system to various filtering parameters (ensemble size, inflation) and atmospheric forcing.

  11. A new strategy for snow-cover mapping using remote sensing data and ensemble based systems techniques

    Science.gov (United States)

    Roberge, S.; Chokmani, K.; De Sève, D.

    2012-04-01

    The snow cover plays an important role in the hydrological cycle of Quebec (Eastern Canada). Consequently, evaluating its spatial extent interests the authorities responsible for the management of water resources, especially hydropower companies. The main objective of this study is the development of a snow-cover mapping strategy using remote sensing data and ensemble based systems techniques. Planned to be tested in a near real-time operational mode, this snow-cover mapping strategy has the advantage of providing the probability that a pixel is snow covered, together with its uncertainty. Ensemble systems are made of two key components. First, a method is needed to build an ensemble of classifiers that is as diverse as possible. Second, an approach is required to combine the outputs of the individual classifiers that make up the ensemble in such a way that correct decisions are amplified and incorrect ones are cancelled out. In this study, we demonstrate the potential of ensemble systems for snow-cover mapping using remote sensing data. The chosen classifier is a sequential thresholds algorithm using NOAA-AVHRR data adapted to conditions over Eastern Canada. Its special feature is the use of a combination of six sequential thresholds that vary according to the day in the winter season. Two versions of the snow-cover mapping algorithm have been developed: one is specific to autumn (from October 1st to December 31st) and the other to spring (from March 16th to May 31st). In order to build the ensemble based system, different versions of the algorithm are created by randomly varying its parameters. One hundred of these versions are included in the ensemble. The probability of a pixel being snow, no-snow or cloud covered corresponds to the number of votes by which the pixel has been classified as such across all classifiers. The overall performance of ensemble based mapping is compared to the overall performance of the chosen classifier, and also with ground observations at meteorological
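
    The vote-counting step described in this abstract can be sketched as follows. This is an illustration only: the class labels come from the abstract, while the function name vote_probabilities and the input layout (one label per ensemble member for a single pixel) are assumptions, and the sequential-thresholds classifier itself is not reproduced.

```python
from collections import Counter

def vote_probabilities(votes, classes=("snow", "no-snow", "cloud")):
    """Turn the labels assigned to one pixel by each ensemble member
    into per-class probabilities (fraction of votes per class)."""
    if not votes:
        raise ValueError("at least one vote is required")
    counts = Counter(votes)
    # Classes that received no vote get probability 0.0.
    return {c: counts[c] / len(votes) for c in classes}
```

    With one hundred ensemble members, as in the abstract, each vote would change a class probability by 0.01.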

  12. An OSSE Study for Deep Argo Array using the GFDL Ensemble Coupled Data Assimilation System

    Science.gov (United States)

    Chang, You-Soon; Zhang, Shaoqing; Rosati, Anthony; Vecchi, Gabriel A.; Yang, Xiaosong

    2018-03-01

    An observing system simulation experiment (OSSE) using an ensemble coupled data assimilation system was designed to investigate the impact of deep ocean Argo profile assimilation in a biased numerical climate system. Based on the modern Argo observational array and an artificial extension to full depth, "observations" drawn from one coupled general circulation model (CM2.0) were assimilated into another model (CM2.1). Our results showed that coupled data assimilation with simultaneous atmospheric and oceanic constraints plays a significant role in preventing deep ocean drift. However, the extension of the Argo array to full depth did not significantly improve the quality of the oceanic climate estimation within the bias magnitude in the twin experiment. Even in the "identical" twin experiment for the deep Argo array from the same model (CM2.1) with the assimilation model, no significant changes were shown in the deep ocean, such as in the Atlantic meridional overturning circulation and the Antarctic bottom water cell. The small ensemble spread and corresponding weak constraints by the deep Argo profiles with medium spatial and temporal resolution may explain why the deep Argo profiles did not improve the deep ocean features in the assimilation system. Additional studies using different assimilation methods with improved spatial and temporal resolution of the deep Argo array are necessary in order to more thoroughly understand the impact of the deep Argo array on the assimilation system.

  13. System size effects on the mechanical response of cohesive-frictional granular ensembles

    Directory of Open Access Journals (Sweden)

    Singh Saurabh

    2017-01-01

    Shear resistance in granular ensembles is a result of interparticle interaction and friction. However, even the presence of small amounts of cohesion between the particles changes the landscape of the mechanical response considerably. Very often such cohesive frictional (c-ϕ) granular ensembles are encountered in nature as well as during the handling and storage of granular materials in the pharmaceutical, construction and mining industries. Modeling of these c-ϕ materials, especially in engineering applications, has relied on the oft-made assumption of a “continuum” and has utilized the popular tenets of continuum plasticity theory. We present an experimental investigation of the fundamental mechanics of c-ϕ materials; specifically, we investigate whether there exists a system size effect and any additional length scales beyond the continuum length scale in their mechanical response. For this purpose, we conduct a series of 1-D compression (UC) tests on cylindrical specimens reconstituted in the laboratory with a range of model particle–binder combinations such as sand-cement, sand-epoxy, and glass ballotini-epoxy mixtures. Specimens are reconstituted to various diameters ranging from 10 mm to 150 mm (with an aspect ratio of 2) to a predefined packing fraction. In addition to the effect of the type of binder (cement, epoxy) and system size, the mean particle size is also varied from 0.5 to 2.5 mm. The peak strength of these materials is significant as it signals the initiation of cohesive-bond breaking and the onset of mobilization of the interparticle frictional resistance. For these model systems, the peak strength is a strong function of the system size of the ensemble as well as the mean particle size. This intriguing observation runs counter to the traditional notion of a typical granular ensemble as a plastic continuum. Microstructure studies in a computed-tomograph have revealed the existence of a web patterned ‘entangled-chain’ like structure

  14. System size effects on the mechanical response of cohesive-frictional granular ensembles

    Science.gov (United States)

    Singh, Saurabh; Kandasami, Ramesh Kannan; Mahendran, Rupesh Kumar; Murthy, Tejas

    2017-06-01

    Shear resistance in granular ensembles is a result of interparticle interaction and friction. However, even the presence of small amounts of cohesion between the particles changes the landscape of the mechanical response considerably. Very often such cohesive frictional (c-ϕ) granular ensembles are encountered in nature as well as during the handling and storage of granular materials in the pharmaceutical, construction and mining industries. Modeling of these c-ϕ materials, especially in engineering applications, has relied on the oft-made assumption of a "continuum" and has utilized the popular tenets of continuum plasticity theory. We present an experimental investigation of the fundamental mechanics of c-ϕ materials; specifically, we investigate whether there exists a system size effect and any additional length scales beyond the continuum length scale in their mechanical response. For this purpose, we conduct a series of 1-D compression (UC) tests on cylindrical specimens reconstituted in the laboratory with a range of model particle-binder combinations such as sand-cement, sand-epoxy, and glass ballotini-epoxy mixtures. Specimens are reconstituted to various diameters ranging from 10 mm to 150 mm (with an aspect ratio of 2) to a predefined packing fraction. In addition to the effect of the type of binder (cement, epoxy) and system size, the mean particle size is also varied from 0.5 to 2.5 mm. The peak strength of these materials is significant as it signals the initiation of cohesive-bond breaking and the onset of mobilization of the interparticle frictional resistance. For these model systems, the peak strength is a strong function of the system size of the ensemble as well as the mean particle size. This intriguing observation runs counter to the traditional notion of a typical granular ensemble as a plastic continuum. Microstructure studies in a computed-tomograph have revealed the existence of a web patterned 'entangled-chain' like structure, we argue that this ushers

  15. A CN-Based Ensembled Hydrological Model for Enhanced Watershed Runoff Prediction

    Directory of Open Access Journals (Sweden)

    Muhammad Ajmal

    2016-01-01

    A major structural inconsistency of the traditional curve number (CN) model is its dependence on an unstable fixed initial abstraction, which normally results in sudden jumps in runoff estimation. Likewise, the lack of a pre-storm soil moisture accounting (PSMA) procedure is another inherent limitation of the model. To circumvent those problems, we used a variable initial abstraction after ensembling the traditional CN model and a French four-parameter (GR4J) model to better quantify direct runoff from ungauged watersheds. To mimic the natural rainfall-runoff transformation at the watershed scale, our new parameterization designates intrinsic parameters and uses a simple structure. It exhibited more accurate and consistent results than earlier methods in evaluating data from 39 forest-dominated watersheds, both for small and large watersheds. In addition, based on different performance evaluation indicators, the runoff reproduction results show that the proposed model produced more consistent results for dry, normal, and wet watershed conditions than the other models used in this study.
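
    For orientation, the traditional CN model that this abstract takes as its starting point can be sketched as below. This is the textbook SCS-CN relation in metric units with the conventional fixed initial abstraction Ia = 0.2 S; it is not the authors' variable-abstraction CN/GR4J ensemble, and the function name scs_cn_runoff and the parameter ia_ratio are hypothetical.

```python
def scs_cn_runoff(rainfall_mm, curve_number, ia_ratio=0.2):
    """Direct runoff depth (mm) from the traditional SCS-CN relation.

    S  = 25400 / CN - 254          potential maximum retention (mm)
    Ia = ia_ratio * S              fixed initial abstraction (mm)
    Q  = (P - Ia)^2 / (P - Ia + S) if P > Ia, else 0
    """
    if not 0 < curve_number <= 100:
        raise ValueError("curve_number must be in (0, 100]")
    s = 25400.0 / curve_number - 254.0
    ia = ia_ratio * s
    if rainfall_mm <= ia:
        # All rainfall is absorbed by the initial abstraction.
        return 0.0
    return (rainfall_mm - ia) ** 2 / (rainfall_mm - ia + s)
```

    Because Ia is tied to S by a fixed ratio, runoff stays at zero until rainfall exceeds Ia; the fixed abstraction is the feature the abstract identifies as problematic.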

  16. The prediction of surface temperature in the new seasonal prediction system based on the MPI-ESM coupled climate model

    Science.gov (United States)

    Baehr, J.; Fröhlich, K.; Botzet, M.; Domeisen, D. I. V.; Kornblueh, L.; Notz, D.; Piontek, R.; Pohlmann, H.; Tietsche, S.; Müller, W. A.

    2015-05-01

    A seasonal forecast system is presented, based on the global coupled climate model MPI-ESM as used for CMIP5 simulations. We describe the initialisation of the system and analyse its predictive skill for surface temperature. The presented system is initialised in the atmospheric, oceanic, and sea ice component of the model from reanalysis/observations with full field nudging in all three components. For the initialisation of the ensemble, bred vectors with a vertically varying norm are implemented in the ocean component to generate initial perturbations. In a set of ensemble hindcast simulations, starting each May and November between 1982 and 2010, we analyse the predictive skill. Bias-corrected ensemble forecasts for each start date reproduce the observed surface temperature anomalies at 2-4 months lead time, particularly in the tropics. Niño3.4 sea surface temperature anomalies show a small root-mean-square error and predictive skill up to 6 months. Away from the tropics, predictive skill is mostly limited to the ocean, and to regions which are strongly influenced by ENSO teleconnections. In summary, the presented seasonal prediction system based on a coupled climate model shows predictive skill for surface temperature at seasonal time scales comparable to other seasonal prediction systems using different underlying models and initialisation strategies. As the same model underlying our seasonal prediction system—with a different initialisation—is presently also used for decadal predictions, this is an important step towards seamless seasonal-to-decadal climate predictions.

  17. DYNAMIC STABILITY OF THE SOLAR SYSTEM: STATISTICALLY INCONCLUSIVE RESULTS FROM ENSEMBLE INTEGRATIONS

    Energy Technology Data Exchange (ETDEWEB)

    Zeebe, Richard E., E-mail: zeebe@soest.hawaii.edu [School of Ocean and Earth Science and Technology, University of Hawaii at Manoa, 1000 Pope Road, MSB 629, Honolulu, HI 96822 (United States)

    2015-01-01

    Due to the chaotic nature of the solar system, the question of its long-term stability can only be answered in a statistical sense, for instance, based on numerical ensemble integrations of nearby orbits. Destabilization of the inner planets, leading to close encounters and/or collisions, can be initiated through a large increase in Mercury's eccentricity, with a currently assumed likelihood of ∼1%. However, little is known at present about the robustness of this number. Here I report ensemble integrations of the full equations of motion of the eight planets and Pluto over 5 Gyr, including contributions from general relativity. The results show that different numerical algorithms lead to statistically different results for the evolution of Mercury's eccentricity (e_M). For instance, starting at present initial conditions (e_M ≃ 0.21), Mercury's maximum eccentricity achieved over 5 Gyr is, on average, significantly higher in symplectic ensemble integrations using heliocentric rather than Jacobi coordinates and stricter error control. In contrast, starting at a possible future configuration (e_M ≃ 0.53), Mercury's maximum eccentricity achieved over the subsequent 500 Myr is, on average, significantly lower using heliocentric rather than Jacobi coordinates. For example, the probability for e_M to increase beyond 0.53 over 500 Myr is >90% (Jacobi) versus only 40%-55% (heliocentric). This poses a dilemma because the physical evolution of the real system, and its probabilistic behavior, cannot depend on the coordinate system or the numerical algorithm chosen to describe it. Some tests of the numerical algorithms suggest that symplectic integrators using heliocentric coordinates underestimate the odds for destabilization of Mercury's orbit at high initial e_M.

  18. An ensemble based top performing approach for NCI-DREAM drug sensitivity prediction challenge.

    Directory of Open Access Journals (Sweden)

    Qian Wan

    We consider the problem of predicting sensitivity of cancer cell lines to new drugs based on supervised learning on genomic profiles. The genetic and epigenetic characterization of a cell line provides observations on various aspects of regulation including DNA copy number variations, gene expression, DNA methylation and protein abundance. To extract relevant information from the various data types, we applied a random forest based approach to generate sensitivity predictions from each type of data and combined the predictions in a linear regression model to generate the final drug sensitivity prediction. Our approach when applied to the NCI-DREAM drug sensitivity prediction challenge was a top performer among 47 teams and produced high accuracy predictions. Our results show that the incorporation of multiple genomic characterizations lowered the mean and variance of the estimated bootstrap prediction error. We also applied our approach to the Cancer Cell Line Encyclopedia database for sensitivity prediction and the ability to extract the top targets of an anti-cancer drug. The results illustrate the effectiveness of our approach in predicting drug sensitivity from heterogeneous genomic datasets.
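
    The combining step described in this abstract (per-data-type predictions merged by a linear regression) can be sketched as follows. The sketch assumes the per-data-type predictions are already available as columns of a matrix, so the random forests themselves are not reproduced; it uses ordinary least squares with an intercept via NumPy, and the function names fit_linear_combiner and combine are hypothetical.

```python
import numpy as np

def fit_linear_combiner(base_predictions, targets):
    """Fit weights and intercept that combine base-model predictions.

    base_predictions: array (n_samples, n_models), one column per data type.
    targets: array (n_samples,) of observed drug sensitivities.
    Returns (weights, intercept) from ordinary least squares.
    """
    p = np.asarray(base_predictions, dtype=float)
    y = np.asarray(targets, dtype=float)
    # Append a column of ones so the last coefficient acts as the intercept.
    design = np.column_stack([p, np.ones(len(p))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[:-1], coef[-1]

def combine(base_predictions, weights, intercept):
    """Apply the fitted linear combiner to new base-model predictions."""
    return np.asarray(base_predictions, dtype=float) @ weights + intercept
```

    In practice the combiner would normally be fitted on held-out base predictions, since fitting it on the same data used to train the base models tends to overweight the most overfitted one.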

  19. Relaxation in a two-body Fermi-Pasta-Ulam system in the canonical ensemble

    Science.gov (United States)

    Sen, Surajit; Barrett, Tyler

    The study of the dynamics of the Fermi-Pasta-Ulam (FPU) chain remains a challenging problem. Inspired by the recent work of Onorato et al. on thermalization in the FPU system, we report a study of relaxation processes in a two-body FPU system in the canonical ensemble. The studies have been carried out using the Recurrence Relations Method introduced by Zwanzig, Mori, Lee and others. We have obtained exact analytical expressions for the first thirteen levels of the continued fraction representation of the Laplace transformed velocity autocorrelation function of the system. Using simple and reasonable extrapolation schemes and known limits we are able to estimate the relaxation behavior of the oscillators in the two-body FPU system and recover the expected behavior in the harmonic limit. Generalizations of the calculations to larger systems will be discussed.

  20. Ensemble approach combining multiple methods improves human transcription start site prediction.

    LENUS (Irish Health Repository)

    Dineen, David G

    2010-01-01

    The computational prediction of transcription start sites is an important unsolved problem. Some recent progress has been made, but many promoters, particularly those not associated with CpG islands, are still difficult to locate using current methods. These methods use different features and training sets, along with a variety of machine learning techniques and result in different prediction sets.

  1. Ensemble modeling to predict habitat suitability for a large-scale disturbance specialist

    Science.gov (United States)

    Quresh S. Latif; Victoria A. Saab; Jonathan G. Dudley; Jeff P. Hollenbeck

    2013-01-01

    To conserve habitat for disturbance specialist species, ecologists must identify where individuals will likely settle in newly disturbed areas. Habitat suitability models can predict which sites at new disturbances will most likely attract specialists. Without validation data from newly disturbed areas, however, the best approach for maximizing predictive accuracy can...

  2. Predicting X-ray diffuse scattering from translation–libration–screw structural ensembles

    International Nuclear Information System (INIS)

    Van Benschoten, Andrew H.; Afonine, Pavel V.; Terwilliger, Thomas C.; Wall, Michael E.; Jackson, Colin J.; Sauter, Nicholas K.; Adams, Paul D.; Urzhumtsev, Alexandre; Fraser, James S.

    2015-01-01

    A method of simulating X-ray diffuse scattering from multi-model PDB files is presented. Despite similar agreement with Bragg data, different translation–libration–screw refinement strategies produce unique diffuse intensity patterns. Identifying the intramolecular motions of proteins and nucleic acids is a major challenge in macromolecular X-ray crystallography. Because Bragg diffraction describes the average positional distribution of crystalline atoms with imperfect precision, the resulting electron density can be compatible with multiple models of motion. Diffuse X-ray scattering can reduce this degeneracy by reporting on correlated atomic displacements. Although recent technological advances are increasing the potential to accurately measure diffuse scattering, computational modeling and validation tools are still needed to quantify the agreement between experimental data and different parameterizations of crystalline disorder. A new tool, phenix.diffuse, addresses this need by employing Guinier’s equation to calculate diffuse scattering from Protein Data Bank (PDB)-formatted structural ensembles. As an example case, phenix.diffuse is applied to translation–libration–screw (TLS) refinement, which models rigid-body displacement for segments of the macromolecule. To enable the calculation of diffuse scattering from TLS-refined structures, phenix.tls-as-xyz builds multi-model PDB files that sample the underlying T, L and S tensors. In the glycerophosphodiesterase GpdQ, alternative TLS-group partitioning and different motional correlations between groups yield markedly dissimilar diffuse scattering maps with distinct implications for molecular mechanism and allostery. These methods demonstrate how, in principle, X-ray diffuse scattering could extend macromolecular structural refinement, validation and analysis

  3. Predicting X-ray diffuse scattering from translation–libration–screw structural ensembles

    Energy Technology Data Exchange (ETDEWEB)

    Van Benschoten, Andrew H. [University of California San Francisco, San Francisco, CA 94158 (United States); Afonine, Pavel V. [Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States); Terwilliger, Thomas C.; Wall, Michael E. [Los Alamos National Laboratory, Los Alamos, NM 87545 (United States); Jackson, Colin J. [Australian National University, Canberra, ACT 2601 (Australia); Sauter, Nicholas K. [Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States); Adams, Paul D. [Lawrence Berkeley National Laboratory, Berkeley, CA 94720 (United States); University of California Berkeley, Berkeley, CA 94720 (United States); Urzhumtsev, Alexandre [Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS–INSERM–UdS, 1 Rue Laurent Fries, BP 10142, 67404 Illkirch (France); Université de Lorraine, BP 239, 54506 Vandoeuvre-les-Nancy (France); Fraser, James S., E-mail: james.fraser@ucsf.edu [University of California San Francisco, San Francisco, CA 94158 (United States)

    2015-07-28

    A method of simulating X-ray diffuse scattering from multi-model PDB files is presented. Despite similar agreement with Bragg data, different translation–libration–screw refinement strategies produce unique diffuse intensity patterns. Identifying the intramolecular motions of proteins and nucleic acids is a major challenge in macromolecular X-ray crystallography. Because Bragg diffraction describes the average positional distribution of crystalline atoms with imperfect precision, the resulting electron density can be compatible with multiple models of motion. Diffuse X-ray scattering can reduce this degeneracy by reporting on correlated atomic displacements. Although recent technological advances are increasing the potential to accurately measure diffuse scattering, computational modeling and validation tools are still needed to quantify the agreement between experimental data and different parameterizations of crystalline disorder. A new tool, phenix.diffuse, addresses this need by employing Guinier’s equation to calculate diffuse scattering from Protein Data Bank (PDB)-formatted structural ensembles. As an example case, phenix.diffuse is applied to translation–libration–screw (TLS) refinement, which models rigid-body displacement for segments of the macromolecule. To enable the calculation of diffuse scattering from TLS-refined structures, phenix.tls-as-xyz builds multi-model PDB files that sample the underlying T, L and S tensors. In the glycerophosphodiesterase GpdQ, alternative TLS-group partitioning and different motional correlations between groups yield markedly dissimilar diffuse scattering maps with distinct implications for molecular mechanism and allostery. These methods demonstrate how, in principle, X-ray diffuse scattering could extend macromolecular structural refinement, validation and analysis.

  4. Predictive Systems Toxicology

    KAUST Repository

    Kiani, Narsis A.; Shang, Ming-Mei; Zenil, Hector; Tegner, Jesper

    2018-01-01

    In this review we address to what extent computational techniques can augment our ability to predict toxicity. The first section provides a brief history of empirical observations on toxicity dating back to the dawn of Sumerian civilization. Interestingly, the concept of dose emerged very early on, leading up to the modern emphasis on kinetic properties, which in turn encodes the insight that toxicity is not solely a property of a compound but instead depends on the interaction with the host organism. The next logical step is the current conception of evaluating drugs from a personalized medicine point-of-view. We review recent work on integrating what could be referred to as classical pharmacokinetic analysis with emerging systems biology approaches incorporating multiple omics data. These systems approaches employ advanced statistical analytical data processing complemented with machine learning techniques and use both pharmacokinetic and omics data. We find that such integrated approaches not only provide improved predictions of toxicity but also enable mechanistic interpretations of the molecular mechanisms underpinning toxicity and drug resistance. We conclude the chapter by discussing some of the main challenges, such as how to balance the inherent tension between the predictive capacity of models, which in practice amounts to constraining the number of features in the models versus allowing for rich mechanistic interpretability, i.e. equipping models with numerous molecular features. This challenge also requires patient-specific predictions on toxicity, which in turn requires proper stratification of patients as regards how they respond, with or without adverse toxic effects. In summary, the transformation of the ancient concept of dose is currently successfully operationalized using rich integrative data encoded in patient-specific models.

  5. Predictive Systems Toxicology

    KAUST Repository

    Kiani, Narsis A.

    2018-01-15

    In this review we address to what extent computational techniques can augment our ability to predict toxicity. The first section provides a brief history of empirical observations on toxicity dating back to the dawn of Sumerian civilization. Interestingly, the concept of dose emerged very early on, leading up to the modern emphasis on kinetic properties, which in turn encodes the insight that toxicity is not solely a property of a compound but instead depends on the interaction with the host organism. The next logical step is the current conception of evaluating drugs from a personalized medicine point-of-view. We review recent work on integrating what could be referred to as classical pharmacokinetic analysis with emerging systems biology approaches incorporating multiple omics data. These systems approaches employ advanced statistical analytical data processing complemented with machine learning techniques and use both pharmacokinetic and omics data. We find that such integrated approaches not only provide improved predictions of toxicity but also enable mechanistic interpretations of the molecular mechanisms underpinning toxicity and drug resistance. We conclude the chapter by discussing some of the main challenges, such as how to balance the inherent tension between the predictive capacity of models, which in practice amounts to constraining the number of features in the models versus allowing for rich mechanistic interpretability, i.e. equipping models with numerous molecular features. This challenge also requires patient-specific predictions on toxicity, which in turn requires proper stratification of patients as regards how they respond, with or without adverse toxic effects. In summary, the transformation of the ancient concept of dose is currently successfully operationalized using rich integrative data encoded in patient-specific models.

  6. Ensemble encoding of nociceptive stimulus intensity in the rat medial and lateral pain systems

    Directory of Open Access Journals (Sweden)

    Woodward Donald J

    2011-08-01

    Background: The ability to encode noxious stimulus intensity is essential for the neural processing of pain perception. It is well accepted that the intensity information is transmitted within both sensory and affective pathways. However, it remains unclear what the encoding patterns are in the thalamocortical brain regions, and whether the dual pain systems share similar responsibility in intensity coding. Results: Multichannel single-unit recordings were used to investigate the activity of individual neurons and neuronal ensembles in the rat brain following the application of noxious laser stimuli of increasing intensity to the hindpaw. Four brain regions were monitored, including two within the lateral sensory pain pathway, namely, the ventral posterior lateral thalamic nuclei and the primary somatosensory cortex, and two in the medial pathway, namely, the medial dorsal thalamic nuclei and the anterior cingulate cortex. Neuron number, firing rate, and ensemble spike count codings were examined in this study. Our results showed that the noxious laser stimulation evoked double-peak responses in all recorded brain regions. Significant correlations were found between the laser intensity and the number of responsive neurons, the firing rates, as well as the mass spike counts (MSCs). MSC coding was generally more efficient than the other two methods. Moreover, the coding capacities of neurons in the two pathways were comparable. Conclusion: This study demonstrated the collective contribution of medial and lateral pathway neurons to the noxious intensity coding. Additionally, we provide evidence that ensemble spike count may be the most reliable method for coding pain intensity in the brain.

  7. SANDPUMA: ensemble predictions of nonribosomal peptide chemistry reveal biosynthetic diversity across Actinobacteria

    NARCIS (Netherlands)

    Chevrette, Marc G.; Aicheler, Fabian; Kohlbacher, Oliver; Currie, Cameron R.; Medema, M.H.

    2017-01-01

    Nonribosomally synthesized peptides (NRPs) are natural products with widespread applications in medicine and biotechnology. Many algorithms have been developed to predict the substrate specificities of nonribosomal peptide synthetase adenylation (A) domains from DNA sequences, which enables

  8. Creating Weather System Ensembles Through Synergistic Process Modeling and Machine Learning

    Science.gov (United States)

    Chen, B.; Posselt, D. J.; Nguyen, H.; Wu, L.; Su, H.; Braverman, A. J.

    2017-12-01

    Earth's weather and climate are sensitive to a variety of control factors (e.g., initial state, forcing functions, etc.). Characterizing the response of the atmosphere to a change in initial conditions or model forcing is critical for weather forecasting (ensemble prediction) and climate change assessment. Input-response relationships can be quantified by generating an ensemble of multiple (100s to 1000s) realistic realizations of weather and climate states. Atmospheric numerical models generate simulated data through discretized numerical approximation of the partial differential equations (PDEs) governing the underlying physics. However, the computational expense of running high resolution atmospheric state models makes generation of more than a few simulations infeasible. Here, we discuss an experiment wherein we approximate the numerical PDE solver within the Weather Research and Forecasting (WRF) Model using neural networks trained on a subset of model run outputs. Once trained, these neural nets can produce large numbers of realizations of weather states from a small number of deterministic simulations with speeds that are orders of magnitude faster than the underlying PDE solver. Our neural network architecture is inspired by the governing partial differential equations. These equations are location-invariant, and consist of first and second derivatives. As such, we use a 3x3 lon-lat grid of atmospheric profiles as the predictor in the neural net to provide the network the information necessary to compute the first and second moments. Results indicate that the neural network algorithm can approximate the PDE outputs with a high degree of accuracy (less than 1% error), and that this error increases as a function of the prediction time lag.
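    A 3x3 neighbourhood is the smallest stencil from which centred finite differences for first and second spatial derivatives can be formed, which may be why the abstract uses it as the predictor. The sketch below is only an illustration of that idea, not the authors' network: the function name `stencil_derivatives`, the uniform grid spacings `dx` and `dy`, and the row/column layout of `patch` are all assumptions.

```python
def stencil_derivatives(patch, dx=1.0, dy=1.0):
    """Centred finite differences at the middle cell of a 3x3 patch.

    patch[j][i] is the field value at row j (y direction) and column i
    (x direction); the grid is assumed uniform with spacings dx and dy.
    """
    centre = patch[1][1]
    west, east = patch[1][0], patch[1][2]
    south, north = patch[0][1], patch[2][1]
    return {
        "d_dx": (east - west) / (2 * dx),
        "d_dy": (north - south) / (2 * dy),
        "d2_dx2": (east - 2 * centre + west) / dx ** 2,
        "d2_dy2": (north - 2 * centre + south) / dy ** 2,
        # The mixed derivative needs the four corner cells.
        "d2_dxdy": (patch[2][2] - patch[2][0] - patch[0][2] + patch[0][0])
        / (4 * dx * dy),
    }


# Example field f(x, y) = x^2 + 3xy + 2y sampled at x, y in {-1, 0, 1}.
field = [[x * x + 3 * x * y + 2 * y for x in (-1, 0, 1)] for y in (-1, 0, 1)]
derivs = stencil_derivatives(field)
```

    For a quadratic field such as the one in the example, these differences are exact; for a real atmospheric field they are second-order approximations.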

  9. Visualization of uncertainty and ensemble data: Exploration of climate modeling and weather forecast data with integrated ViSUS-CDAT systems

    International Nuclear Information System (INIS)

    Potter, Kristin; Pascucci, Valerio; Johnson, Chris; Wilson, Andrew; Bremer, Peer-Timo; Williams, Dean; Doutriaux, Charles

    2009-01-01

    Climate scientists and meteorologists are working towards a better understanding of atmospheric conditions and global climate change. To explore the relationships present in numerical predictions of the atmosphere, ensemble datasets are produced that combine time- and spatially-varying simulations generated using multiple numeric models, sampled input conditions, and perturbed parameters. These data sets mitigate as well as describe the uncertainty present in the data by providing insight into the effects of parameter perturbation, sensitivity to initial conditions, and inconsistencies in model outcomes. As such, massive amounts of data are produced, creating challenges both in data analysis and in visualization. This work presents an approach to understanding ensembles by using a collection of statistical descriptors to summarize the data, and displaying these descriptors using a variety of visualization techniques which are familiar to domain experts. The resulting techniques are integrated into the ViSUS/Climate Data and Analysis Tools (CDAT) system designed to provide a directly accessible, complex visualization framework to atmospheric researchers.

  10. Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments.

    Science.gov (United States)

    Zheng, Ce; Kurgan, Lukasz

    2008-10-10

    beta-turn is a secondary protein structure type that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for prediction of beta-turns from protein sequences were developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based beta-turn predictor stems from the usage of window-based information extracted from four predicted three-state secondary structures, which together with a selected set of position specific scoring matrix (PSSM) values serve as an input to the support vector machine (SVM) predictor. We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top ranked features, which corroborates recent studies. The Asn, Asp, Gly, and Pro indicate potential beta-turns, while the remaining four amino acids are useful to predict non-beta-turns. Empirical evaluation using three nonredundant datasets shows favorable Q total, Q predicted and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Q total barrier and achieves Q total = 80.9%, MCC = 0.47, and Q predicted higher by over 6% when compared with the second best method. We use feature selection to reduce the dimensionality of the feature vector used as the input for the proposed prediction method. The applied feature set is smaller by 86, 62 and 37% when compared with the second and two third-best (with respect to MCC) competing methods, respectively. Experiments show that the proposed method constitutes an improvement over the competing prediction

  11. Predicting and understanding law-making with word vectors and an ensemble model.

    Science.gov (United States)

    Nay, John J

    2017-01-01

    Out of nearly 70,000 bills introduced in the U.S. Congress from 2001 to 2015, only 2,513 were enacted. We developed a machine learning approach to forecasting the probability that any bill will become law. Starting in 2001 with the 107th Congress, we trained models on data from previous Congresses, predicted all bills in the current Congress, and repeated until the 113th Congress served as the test. For prediction we scored each sentence of a bill with a language model that embeds legislative vocabulary into a high-dimensional, semantic-laden vector space. This language representation enables our investigation into which words increase the probability of enactment for any topic. To test the relative importance of text and context, we compared the text model to a context-only model that uses variables such as whether the bill's sponsor is in the majority party. To test the effect of changes to bills after their introduction on our ability to predict their final outcome, we compared using the bill text and meta-data available at the time of introduction with using the most recent data. At the time of introduction context-only predictions outperform text-only, and with the newest data text-only outperforms context-only. Combining text and context always performs best. We conducted a global sensitivity analysis on the combined model to determine important variables predicting enactment.

  12. Advance and prospectus of seasonal prediction: assessment of the APCC/CliPAS 14-model ensemble retrospective seasonal prediction (1980-2004)

    Science.gov (United States)

    Wang, Bin; Lee, June-Yi; Kang, In-Sik; Shukla, J.; Park, C.-K.; Kumar, A.; Schemm, J.; Cocke, S.; Kug, J.-S.; Luo, J.-J.; Zhou, T.; Wang, B.; Fu, X.; Yun, W.-T.; Alves, O.; Jin, E. K.; Kinter, J.; Kirtman, B.; Krishnamurti, T.; Lau, N. C.; Lau, W.; Liu, P.; Pegion, P.; Rosati, T.; Schubert, S.; Stern, W.; Suarez, M.; Yamagata, T.

    2009-07-01

    We assessed the current status of multi-model ensemble (MME) deterministic and probabilistic seasonal prediction based on 25-year (1980-2004) retrospective forecasts performed by 14 climate model systems (7 one-tier and 7 two-tier systems) that participate in the Climate Prediction and its Application to Society (CliPAS) project sponsored by the Asian-Pacific Economic Cooperation Climate Center (APCC). We also evaluated seven DEMETER models' MME for the period of 1981-2001 for comparison. Based on the assessment, future direction for improvement of seasonal prediction is discussed. We found that two measures of probabilistic forecast skill, the Brier Skill Score (BSS) and Area under the Relative Operating Characteristic curve (AROC), display spatial patterns similar to those represented by the temporal correlation coefficient (TCC) score of the deterministic MME forecast. A TCC score of 0.6 corresponds approximately to a BSS of 0.1 and an AROC of 0.7, and beyond these critical threshold values they are almost linearly correlated. The MME method is demonstrated to be a valuable approach for reducing errors and quantifying forecast uncertainty due to model formulation. The MME prediction skill is substantially better than the averaged skill of all individual models. For instance, the TCC score of the CliPAS one-tier MME forecast of the Niño 3.4 index at a 6-month lead initiated from 1 May is 0.77, which is significantly higher than the corresponding averaged skill of seven individual coupled models (0.63). The MME made by using 14 coupled models from both DEMETER and CliPAS shows an even higher TCC score of 0.87. Effectiveness of MME depends on the averaged skill of individual models and their mutual independency. For probabilistic forecast the CliPAS MME gains considerable skill from increased forecast reliability as the number of models being used increases; the forecast resolution also increases for 2 m temperature but slightly decreases for precipitation. Equatorial Sea Surface
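    The Brier Skill Score mentioned above can be sketched in a few lines. This is a generic textbook form, assuming binary outcomes (1 if the event occurred, 0 otherwise) and a climatological reference forecast equal to the sample base rate; the function names are hypothetical and the reference used in the study itself is not stated in the abstract.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)


def brier_skill_score(probs, outcomes):
    """BSS = 1 - BS / BS_ref, with the sample base rate as reference forecast.

    Assumption: climatology is taken from the same sample; an operational
    verification would normally use an independent climatology.
    """
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    if reference == 0:
        raise ValueError("outcomes never vary, so the reference score is zero")
    return 1.0 - brier_score(probs, outcomes) / reference


observed = [1, 0, 1, 0]
perfect_bss = brier_skill_score([1.0, 0.0, 1.0, 0.0], observed)
climatology_bss = brier_skill_score([0.5, 0.5, 0.5, 0.5], observed)
```

    A perfect forecast gives a BSS of 1 and a forecast no better than climatology gives 0, which puts the abstract's threshold value of 0.1 in context.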

  13. Advance and prospectus of seasonal prediction: assessment of the APCC/CliPAS 14-model ensemble retrospective seasonal prediction (1980-2004)

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Bin; Lee, June-Yi; Fu, X.; Liu, P. [University of Hawaii, Department of Meteorology and International Pacific Research Center, IPRC, School of Ocean and Earth Science and Technology, Honolulu, HI (United States)]; Kang, In-Sik; Kug, J.S. [Seoul National University, School of Earth and Environmental Sciences, Seoul (Korea)]; Shukla, J.; Jin, E.K.; Kinter, J.; Kirtman, B. [George Mason University and COLA, Climate Dynamics Program, Calverton, MD (United States)]; Park, C.K. [APEC Climate Center, Busan (Korea)]; Kumar, A.; Schemm, J. [Climate Prediction Center/NCEP, Camp Springs, MD (United States)]; Cocke, S.; Krishnamurti, T. [Florida State University, Tallahassee, FL (United States)]; Luo, J.J. [Frontier Research Center for Global Change, Yokohama (Japan)]; Zhou, T.; Wang, B. [Chinese Academy of Sciences, LASG/Institute of Atmospheric Physics, Beijing (China)]; Yun, W.T. [Korean Meteorological Administration, Seoul (Korea)]; Alves, O. [Bureau of Meteorology Research Center, Melbourne (Australia)]; Lau, N.C.; Rosati, T.; Stern, W. [Princeton University, Geophysical Fluid Dynamics Laboratory/NOAA, Princeton, NJ (United States)]; Lau, W.; Pegion, P.; Schubert, S.; Suarez, M. [Goddard Space Flight Center/NASA, Greenbelt, MD (United States)]

    2009-07-15

    We assessed the current status of multi-model ensemble (MME) deterministic and probabilistic seasonal prediction based on 25-year (1980-2004) retrospective forecasts performed by 14 climate model systems (7 one-tier and 7 two-tier systems) that participate in the Climate Prediction and its Application to Society (CliPAS) project sponsored by the Asian-Pacific Economic Cooperation Climate Center (APCC). We also evaluated seven DEMETER models' MME for the period of 1981-2001 for comparison. Based on the assessment, future direction for improvement of seasonal prediction is discussed. We found that two measures of probabilistic forecast skill, the Brier Skill Score (BSS) and Area under the Relative Operating Characteristic curve (AROC), display spatial patterns similar to those represented by the temporal correlation coefficient (TCC) score of the deterministic MME forecast. A TCC score of 0.6 corresponds approximately to a BSS of 0.1 and an AROC of 0.7, and beyond these critical threshold values they are almost linearly correlated. The MME method is demonstrated to be a valuable approach for reducing errors and quantifying forecast uncertainty due to model formulation. The MME prediction skill is substantially better than the averaged skill of all individual models. For instance, the TCC score of the CliPAS one-tier MME forecast of the Niño 3.4 index at a 6-month lead initiated from 1 May is 0.77, which is significantly higher than the corresponding averaged skill of seven individual coupled models (0.63). The MME made by using 14 coupled models from both DEMETER and CliPAS shows an even higher TCC score of 0.87. Effectiveness of MME depends on the averaged skill of individual models and their mutual independency. For probabilistic forecast the CliPAS MME gains considerable skill from increased forecast reliability as the number of models being used increases; the forecast resolution also increases for 2 m temperature but slightly decreases for precipitation. Equatorial Sea Surface

  14. Combining Combined Sampling and Ensemble of Support Vector Machine (EnSVM) to Handle Churn Prediction for Telecommunication Companies

    Directory of Open Access Journals (Sweden)

    Fernandy Marbun

    2010-07-01

    Churn prediction is a way of predicting which customers are likely to churn. Data mining, and classification in particular, appears to be a viable approach for building an accurate churn prediction model. However, classification results become inaccurate because churn data are imbalanced. The classes become unstable because the data lean toward the portion of the data with the larger share. One way to handle this problem is to modify the dataset used, an approach better known as resampling. Resampling techniques include over-sampling, under-sampling, and combined sampling. The Ensemble of SVM (EnSVM) method is expected to minimize the misclassification of the majority and minority classes produced by a single SVM classifier. In this study, we attempt to combine combined sampling and EnSVM for churn prediction. Testing was carried out by comparing the classification results of CombinedSampling-EnSVM with those of SMOTE-SVM (a combination of over-sampling and SVM) and pure-SVM. The test results show that, in general, the CombinedSampling-EnSVM method is only able to produce better Gini Index performance than the SMOTE-SVM method and the method without resampling (pure-SVM).

  15. Prediction of beta-turns at over 80% accuracy based on an ensemble of predicted secondary structures and multiple alignments

    Directory of Open Access Journals (Sweden)

    Kurgan Lukasz

    2008-10-01

    Background: β-turn is a secondary protein structure type that plays a significant role in protein folding, stability, and molecular recognition. To date, several methods for prediction of β-turns from protein sequences were developed, but they are characterized by relatively poor prediction quality. The novelty of the proposed sequence-based β-turn predictor stems from the usage of window-based information extracted from four predicted three-state secondary structures, which together with a selected set of position specific scoring matrix (PSSM) values serve as an input to the support vector machine (SVM) predictor. Results: We show that (1) all four predicted secondary structures are useful; (2) the most useful information extracted from the predicted secondary structure includes the structure of the predicted residue, secondary structure content in a window around the predicted residue, and features that indicate whether the predicted residue is inside a secondary structure segment; (3) the PSSM values of Asn, Asp, Gly, Ile, Leu, Met, Pro, and Val were among the top ranked features, which corroborates recent studies. The Asn, Asp, Gly, and Pro indicate potential β-turns, while the remaining four amino acids are useful to predict non-β-turns. Empirical evaluation using three nonredundant datasets shows favorable Qtotal, Qpredicted and MCC values when compared with over a dozen modern competing methods. Our method is the first to break the 80% Qtotal barrier and achieves Qtotal = 80.9%, MCC = 0.47, and Qpredicted higher by over 6% when compared with the second best method. We use feature selection to reduce the dimensionality of the feature vector used as the input for the proposed prediction method. The applied feature set is smaller by 86, 62 and 37% when compared with the second and two third-best (with respect to MCC) competing methods, respectively. Conclusion: Experiments show that the proposed method constitutes an

  16. An Ensemble Learning for Predicting Breakdown Field Strength of Polyimide Nanocomposite Films

    Directory of Open Access Journals (Sweden)

    Hai Guo

    2015-01-01

    Using the method of Stochastic Gradient Boosting, ten SMO-SVR models are combined into a strong prediction model (the SGBS model) that is efficient in predicting the breakdown field strength. Adopting the method of in situ polymerization, thirty-two samples of nanocomposite films with different percentage compositions, components, and thicknesses are prepared. Then, the breakdown field strength is tested by using voltage test equipment. From the test results, the correlation coefficient (CC), the mean absolute error (MAE), the root mean squared error (RMSE), the relative absolute error (RAE), and the root relative squared error (RRSE) are 0.9664, 14.2598, 19.684, 22.26%, and 25.01%, respectively, with the SGBS model. The result indicates that the predicted values fit well with the measured ones. Comparisons between models such as linear regression, BP, GRNN, SVR, and SMO-SVR have also been made under the same conditions. They show that the CC of the SGBS model is higher than those of the other models, while its MAE, RMSE, RAE, and RRSE are lower. This demonstrates that the SGBS model is better than the other models in predicting the breakdown field strength of polyimide nanocomposite films.
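    The five scores reported above can be computed as in the following sketch. It assumes the commonly used definitions, in which RAE and RRSE normalize the model's error by the error of always predicting the mean of the measured values; the abstract does not spell out its formulas, and the function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(actual, predicted):
    """Return CC, MAE, RMSE, RAE and RRSE for paired measured/predicted values.

    RAE and RRSE are returned as fractions (multiply by 100 for percent).
    """
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    abs_err = [abs(p - a) for a, p in zip(actual, predicted)]
    sq_err = [(p - a) ** 2 for a, p in zip(actual, predicted)]
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return {
        "CC": cov / math.sqrt(var_a * var_p),
        "MAE": sum(abs_err) / n,
        "RMSE": math.sqrt(sum(sq_err) / n),
        # Error relative to the "predict the mean" baseline.
        "RAE": sum(abs_err) / sum(abs(a - mean_a) for a in actual),
        "RRSE": math.sqrt(sum(sq_err) / var_a),
    }


# Toy data: every prediction is exactly one unit too high.
scores = regression_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

    The toy data illustrate why CC alone is not enough: a constant offset leaves the correlation at 1 while MAE, RMSE, RAE and RRSE all register the error.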

  17. Integrating piecewise linear representation and ensemble neural network for stock price prediction

    OpenAIRE

    Asaduzzaman, Md.; Shahjahan, Md.; Ahmed, Fatema Johera; Islam, Md. Monirul; Murase, Kazuyuki

    2014-01-01

    Stock Prices are considered to be very dynamic and susceptible to quick changes because of the underlying nature of the financial domain, and in part because of the interchange between known parameters and unknown factors. Of late, several researchers have used Piecewise Linear Representation (PLR) to predict the stock market pricing. However, some improvements are needed to avoid the appropriate threshold of the trading decision, choosing the input index as well as improving the overall perf...

  18. Flood Forecasting Based on TIGGE Precipitation Ensemble Forecast

    Directory of Open Access Journals (Sweden)

    Jinyin Ye

    2016-01-01

    TIGGE (THORPEX International Grand Global Ensemble) was a major part of THORPEX (The Observing System Research and Predictability Experiment). It integrates ensemble precipitation products from all the major forecast centers in the world and provides systematic evaluation on the multimodel ensemble prediction system. Development of a meteorologic-hydrologic coupled flood forecasting model and early warning model based on the TIGGE precipitation ensemble forecast can provide flood probability forecasts, extend the lead time of the flood forecast, and gain more time for decision-makers to make the right decision. In this study, precipitation ensemble forecast products from ECMWF, NCEP, and CMA are used to drive the distributed hydrologic model TOPX. We focus on the Yi River catchment and aim to build a flood forecast and early warning system. The results show that the meteorologic-hydrologic coupled model can satisfactorily predict the flow process of four flood events. The predicted occurrence time of peak discharges is close to the observations. However, the magnitude of the peak discharges is significantly different due to various performances of the ensemble prediction systems. The coupled forecasting model can accurately predict occurrence of the peak time and the corresponding risk probability of peak discharge based on the probability distribution of peak time and flood warning, which can provide users a strong theoretical foundation and valuable information as a promising new approach.

  19. Similarity-based multi-model ensemble approach for 1-15-day advance prediction of monsoon rainfall over India

    Science.gov (United States)

    Jaiswal, Neeru; Kishtawal, C. M.; Bhomia, Swati

    2018-04-01

    The southwest (SW) monsoon season (June, July, August and September) is the major period of rainfall over the Indian region. The present study focuses on the development of a new multi-model ensemble approach based on the similarity criterion (SMME) for the prediction of SW monsoon rainfall in the extended range. This approach is based on the assumption that training with similar conditions may provide better forecasts than the sequential training used in the conventional MME approaches. In this approach, the training dataset has been selected by matching the present-day condition to the archived dataset; days with the most similar conditions were identified and used for training the model. The coefficients thus generated were used for the rainfall prediction. The precipitation forecasts from four general circulation models (GCMs), viz. European Centre for Medium-Range Weather Forecasts (ECMWF), United Kingdom Meteorological Office (UKMO), National Centers for Environmental Prediction (NCEP) and China Meteorological Administration (CMA) have been used for developing the SMME forecasts. The forecasts of 1-5, 6-10 and 11-15 days were generated using the newly developed approach for each pentad of June-September during the years 2008-2013 and the skill of the model was analysed using verification scores, viz. equitable threat score (ETS), mean absolute error (MAE), Pearson's correlation coefficient and Nash-Sutcliffe model efficiency index. Statistical analysis of SMME forecasts shows superior forecast skill compared to the conventional MME and the individual models for all the pentads, viz. 1-5, 6-10 and 11-15 days.
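    Two of the verification scores listed above, the ETS and the Nash-Sutcliffe efficiency, can be sketched as follows. These are the standard textbook forms, not code from the study; the function names, the "value >= threshold" event definition and the sample rainfall values are assumptions made for illustration.

```python
def equitable_threat_score(forecast, observed, threshold):
    """ETS for the event 'value >= threshold' over paired forecast/observed values."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, observed):
        f_event, o_event = f >= threshold, o >= threshold
        if f_event and o_event:
            hits += 1
        elif o_event:
            misses += 1
        elif f_event:
            false_alarms += 1
    total = len(observed)
    # Hits expected by chance given the forecast and observed event frequencies.
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denominator = hits + misses + false_alarms - hits_random
    # Convention chosen here: return 0.0 when the score is undefined.
    return (hits - hits_random) / denominator if denominator else 0.0


def nash_sutcliffe(simulated, observed):
    """1 minus the ratio of squared model error to the variance of the observations."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for s, o in zip(simulated, observed))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / spread


# Made-up rainfall values (mm/day), purely illustrative.
rain_forecast = [5.0, 0.0, 12.0, 3.0, 20.0]
rain_observed = [6.0, 0.0, 1.0, 8.0, 15.0]
ets = equitable_threat_score(rain_forecast, rain_observed, threshold=5.0)
```

    In this example there are two hits, one miss and one false alarm, so the ETS is small but positive; a Nash-Sutcliffe value of 0 corresponds to a model that is no better than the observed mean.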

  20. Entropy of network ensembles

    Science.gov (United States)

    Bianconi, Ginestra

    2009-03-01

    In this paper we generalize the concept of random networks to describe network ensembles with nontrivial features by a statistical mechanics approach. This framework is able to describe undirected and directed network ensembles as well as weighted network ensembles. These networks might have nontrivial community structure or, in the case of networks embedded in a given space, they might have a link probability with a nontrivial dependence on the distance between the nodes. These ensembles are characterized by their entropy, which evaluates the cardinality of networks in the ensemble. In particular, in this paper we define and evaluate the structural entropy, i.e., the entropy of the ensembles of undirected uncorrelated simple networks with given degree sequence. We stress the apparent paradox that scale-free degree distributions are characterized by having small structural entropy while they are so widely encountered in natural, social, and technological complex systems. We propose a solution to the paradox by proving that scale-free degree distributions are the most likely degree distribution with the corresponding value of the structural entropy. Finally, the general framework we present in this paper is able to describe microcanonical ensembles of networks as well as canonical or hidden-variable network ensembles with significant implications for the formulation of network-constructing algorithms.

  1. Ensembles-based predictions of climate change impacts on bioclimatic zones in Northeast Asia

    Science.gov (United States)

    Choi, Y.; Jeon, S. W.; Lim, C. H.; Ryu, J.

    2017-12-01

    Biodiversity is rapidly declining globally and efforts are needed to mitigate this continually increasing loss of species. Clustering of areas with similar habitats can be used to prioritize protected areas and distribute resources for the conservation of species, selection of representative sample areas for research, and evaluation of impacts due to environmental changes. In this study, Northeast Asia (NEA) was classified into 14 bioclimatic zones using statistical techniques, namely correlation analysis and principal component analysis (PCA), and the iterative self-organizing data analysis technique algorithm (ISODATA). Based on this bioclimatic classification, we predicted the shift of bioclimatic zones due to climate change. The input variables include the current climatic data (1960-1990) and the future climatic data of the HadGEM2-AO model (RCP 4.5 (2050, 2070) and RCP 8.5 (2050, 2070)) provided by WorldClim. Using these data, multi-modeling methods including maximum likelihood classification, random forest, and species distribution modelling have been used to project the impact of climate change on the spatial distribution of bioclimatic zones within NEA. The results of the various models were compared and analyzed by overlaying them. As a result, significant changes in bioclimatic conditions can be expected throughout NEA by the 2050s and 2070s. Overall, the zones moved upward and some zones were predicted to disappear. This analysis provides the basis for understanding potential impacts of climate change on biodiversity and ecosystems. It could also be used to support decision making on climate change adaptation more effectively.

  2. Phthalocyanine-nanocarbon ensembles: from discrete molecular and supramolecular systems to hybrid nanomaterials.

    Science.gov (United States)

    Bottari, Giovanni; de la Torre, Gema; Torres, Tomas

    2015-04-21

    Phthalocyanines (Pcs) are macrocyclic and aromatic compounds that present unique electronic features such as high molar absorption coefficients, rich redox chemistry, and photoinduced energy/electron transfer abilities that can be modulated as a function of the electronic character of their counterparts in donor-acceptor (D-A) ensembles. In this context, carbon nanostructures such as fullerenes, carbon nanotubes (CNTs), and, more recently, graphene are among the most suitable Pc "companions". Pc-C60 ensembles have been for a long time the main actors in this field, due to the commercial availability of C60 and the well-established synthetic methods for its functionalization. As a result, many Pc-C60 architectures have been prepared, featuring different connectivities (covalent or supramolecular), intermolecular interactions (self-organized or molecularly dispersed species), and Pc HOMO/LUMO levels. All these elements provide a versatile toolbox for tuning the photophysical properties in terms of the type of process (photoinduced energy/electron transfer), the nature of the interactions between the electroactive units (through bond or space), and the kinetics of the formation/decay of the photogenerated species. Some recent trends in this field include the preparation of stimuli-responsive multicomponent systems with tunable photophysical properties and highly ordered nanoarchitectures and surface-supported systems showing high charge mobilities. A breakthrough in the Pc-nanocarbon field was the appearance of CNTs and graphene, which opened a new avenue for the preparation of intriguing photoresponsive hybrid ensembles showing light-stimulated charge separation. The scarce solubility of these 1-D and 2-D nanocarbons, together with their lower reactivity with respect to C60 stemming from their less strained sp² carbon networks, has not been an insurmountable obstacle to the preparation of a variety of Pc-based hybrids. These systems, which show improved

  3. Downscaling Satellite Data for Predicting Catchment-scale Root Zone Soil Moisture with Ground-based Sensors and an Ensemble Kalman Filter

    Science.gov (United States)

    Lin, H.; Baldwin, D. C.; Smithwick, E. A. H.

    2015-12-01

    Predicting root zone (0-100 cm) soil moisture (RZSM) content at the catchment scale is essential for drought and flood predictions, irrigation planning, weather forecasting, and many other applications. Satellites, such as the NASA Soil Moisture Active Passive (SMAP), can estimate near-surface (0-5 cm) soil moisture content globally at coarse spatial resolutions. We develop a hierarchical Ensemble Kalman Filter (EnKF) data assimilation modeling system to downscale satellite-based near-surface soil moisture and to estimate RZSM content across the Shale Hills Critical Zone Observatory at a 1-m resolution in combination with ground-based soil moisture sensor data. In this example, a simple infiltration model within the EnKF model has been parameterized for 6 soil-terrain units to forecast daily RZSM content in the catchment from 2009 to 2012 based on AMSR-E. LiDAR-derived terrain variables define intra-unit RZSM variability using a novel covariance localization technique. This method also allows the uncertainty of our RZSM estimates to be mapped for each time-step. A catchment-wide satellite-to-surface downscaling parameter, which nudges the satellite measurement closer to in situ near-surface data, is also calculated for each time-step. We find significant differences in predicted root zone moisture storage for different terrain units across the experimental time-period. Root mean square error from a cross-validation analysis of RZSM predictions using an independent dataset of catchment-wide in situ Time-Domain Reflectometry (TDR) measurements ranges from 0.060 to 0.096 cm³ cm⁻³, and the RZSM predictions are significantly (p < 0.05) correlated with TDR measurements [r = 0.47-0.68]. The predictive skill of this data assimilation system is similar to that of the Penn State Integrated Hydrologic Modeling (PIHM) system. Uncertainty estimates are significantly (p < 0.05) correlated with cross-validation error during wet and dry conditions, but more so in dry summer seasons.

  4. Investigating properties of the cardiovascular system using innovative analysis algorithms based on ensemble empirical mode decomposition.

    Science.gov (United States)

    Yeh, Jia-Rong; Lin, Tzu-Yu; Chen, Yun; Sun, Wei-Zen; Abbod, Maysam F; Shieh, Jiann-Shing

    2012-01-01

    The cardiovascular system is known to be nonlinear and nonstationary. Traditional linear assessment algorithms for arterial stiffness and systemic resistance of the cardiac system suffer from nonstationarity or are inconvenient in practical applications. In this pilot study, two new assessment methods were developed: the first is an ensemble empirical mode decomposition based reflection index (EEMD-RI), while the second is based on the phase shift between ECG and BP on cardiac oscillation. Both methods utilise the EEMD algorithm, which is suitable for nonlinear and nonstationary systems. These methods were used to investigate the properties of arterial stiffness and systemic resistance for a pig's cardiovascular system via ECG and blood pressure (BP). The experiment simulated a sequence of continuous changes in blood pressure, from a steady condition to high blood pressure by clamping the artery, and the reverse by relaxing the artery. The hypothesis was that arterial stiffness and systemic resistance should vary with the blood pressure changes caused by clamping and relaxing the artery. The results show statistically significant correlations between BP, EEMD-based RI, and the phase shift between ECG and BP on cardiac oscillation. The results of the two assessments demonstrate the merits of the EEMD for signal analysis.

  5. NYYD Ensemble

    Index Scriptorium Estoniae

    2002-01-01

    On the NYYD Ensemble duo Traksmann - Lukk with E.-S. Tüür's work "Symbiosis", which is also recorded on the NYYD Ensemble's recently released CD. Concerts on 2 March in the small hall of the Rakvere Theatre and on 3 March at the Rotermann Salt Storage, with a programme of Tüür, Kaumann, Berio, Reich, Yun, Hauta-aho and Buckinx.

  6. Systematic Analysis of Quantitative Logic Model Ensembles Predicts Drug Combination Effects on Cell Signaling Networks

    Science.gov (United States)

    2016-08-27

    bovine serum albumin (BSA) diluted to the amount corresponding to that in the media of the stimulated cells. Phospho-JNK comprises two isoforms whose...information accompanies this paper on the CPT: Pharmacometrics & Systems Pharmacology website (http://www.wileyonlinelibrary.com/psp4) Systematic Analysis of Quantitative Logic Model Morris et al. 553 www.wileyonlinelibrary/psp4

  7. Ensemble Data Mining Methods

    Science.gov (United States)

    Oza, Nikunj C.

    2004-01-01

    Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods that leverage the power of multiple models to achieve better prediction accuracy than any of the individual models could on their own. The basic goal when designing an ensemble is the same as when establishing a committee of people: each member of the committee should be as competent as possible, but the members should be complementary to one another. If the members are not complementary, i.e., if they always agree, then the committee is unnecessary: any one member is sufficient. If the members are complementary, then when one or a few members make an error, the probability is high that the remaining members can correct this error. Research in ensemble methods has largely revolved around designing ensembles consisting of competent yet complementary models.
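
    The committee argument above can be illustrated with a small calculation. The sketch below assumes a two-class problem and members whose errors are statistically independent, which is an idealisation introduced here and not a claim made in the abstract; the function name majority_vote_accuracy is hypothetical.

```python
from math import comb


def majority_vote_accuracy(n_members: int, p_correct: float) -> float:
    """Probability that a strict majority of n independent members is correct.

    Assumes a binary decision and that every member is right with the same
    probability p_correct, independently of the others.
    """
    needed = n_members // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_members, k) * p_correct**k * (1 - p_correct) ** (n_members - k)
        for k in range(needed, n_members + 1)
    )


# One member alone, then committees of 3 and 11 members at 70 % accuracy each.
for size in (1, 3, 11):
    print(size, round(majority_vote_accuracy(size, 0.7), 3))
```

    Under these assumptions the committee only helps when each member is better than chance; with members that always agree, the vote is no better than a single member, which is the case the abstract describes as unnecessary.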

  8. Parameter estimation for stiff deterministic dynamical systems via ensemble Kalman filter

    International Nuclear Information System (INIS)

    Arnold, Andrea; Calvetti, Daniela; Somersalo, Erkki

    2014-01-01

    A commonly encountered problem in numerous areas of applications is to estimate the unknown coefficients of a dynamical system from direct or indirect observations at discrete times of some of the components of the state vector. A related problem is to estimate unobserved components of the state. An egregious example of such a problem is provided by metabolic models, in which the numerous model parameters and the concentrations of the metabolites in tissue are to be estimated from concentration data in the blood. A popular method for addressing similar questions in stochastic and turbulent dynamics is the ensemble Kalman filter (EnKF), a particle-based filtering method that generalizes classical Kalman filtering. In this work, we adapt the EnKF algorithm for deterministic systems in which the numerical approximation error is interpreted as a stochastic drift with variance based on classical error estimates of numerical integrators. This approach, which is particularly suitable for stiff systems where the stiffness may depend on the parameters, allows us to effectively exploit the parallel nature of particle methods. Moreover, we demonstrate how spatial prior information about the state vector, which helps the stability of the computed solution, can be incorporated into the filter. The viability of the approach is shown by computed examples, including a metabolic system modeling an ischemic episode in skeletal muscle, with a high number of unknown parameters. (paper)
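
    For readers unfamiliar with the EnKF, the following is a minimal sketch of a single analysis step in its stochastic (perturbed-observation) form, for a scalar state that is observed directly. It is not the authors' algorithm: the augmented-state parameter estimation and the treatment of integration error as stochastic drift are not reproduced, and the function name enkf_update and all numerical values are illustrative assumptions.

```python
import random
from statistics import mean, variance


def enkf_update(ensemble, obs, obs_var, rng):
    """One stochastic EnKF analysis step for a directly observed scalar state.

    ensemble: forecast (prior) members; obs: the measurement;
    obs_var: measurement error variance; rng: a random.Random instance.
    """
    prior_var = variance(ensemble)            # sample variance of the forecast
    gain = prior_var / (prior_var + obs_var)  # Kalman gain with observation operator H = 1
    # Each member assimilates its own perturbed copy of the observation.
    return [
        x + gain * (obs + rng.gauss(0.0, obs_var ** 0.5) - x)
        for x in ensemble
    ]


rng = random.Random(0)
prior = [rng.gauss(0.0, 2.0) for _ in range(200)]   # hypothetical forecast ensemble
posterior = enkf_update(prior, obs=3.0, obs_var=0.5, rng=rng)
print(mean(prior), mean(posterior))          # the mean moves toward the observation
print(variance(prior), variance(posterior))  # the spread shrinks after the update
```

    Because every member is updated independently once the gain is known, the loop parallelises naturally, which is the property of particle methods that the abstract refers to.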

  9. Development of multimodel ensemble based district level medium ...

    Indian Academy of Sciences (India)

    tively by computing the anomaly correlation coefficient between the predicted rainfall and observed rainfall. High resolution (lat./long.) gridded data ..... particularly in the prediction of intensity and mesoscale rainfall features causing inland flooding. During recent years, Ensemble Prediction System (EPS) has emerged as ...

  10. Improving quantitative precipitation nowcasting with a local ensemble transform Kalman filter radar data assimilation system: observing system simulation experiments

    Directory of Open Access Journals (Sweden)

    Chih-Chien Tsai

    2014-03-01

    This study develops a Doppler radar data assimilation system, which couples the local ensemble transform Kalman filter with the Weather Research and Forecasting model. The benefits of this system to quantitative precipitation nowcasting (QPN) are evaluated with observing system simulation experiments on Typhoon Morakot (2009), which brought record-breaking rainfall and extensive damage to central and southern Taiwan. The results indicate that the assimilation of radial velocity and reflectivity observations improves the three-dimensional winds and rain-mixing ratio most significantly because of the direct relations in the observation operator. The patterns of spiral rainbands become more consistent between different ensemble members after radar data assimilation. The rainfall intensity and distribution during the 6-hour deterministic nowcast are also improved, especially for the first 3 hours. The nowcasts with and without radar data assimilation have similar evolution trends driven by synoptic-scale conditions. Furthermore, we carry out a series of sensitivity experiments to develop proper assimilation strategies, in which a mixed localisation method is proposed for the first time and found to give further QPN improvement in this typhoon case.

  11. Extracting foreground ensemble features to detect abnormal crowd behavior in intelligent video-surveillance systems

    Science.gov (United States)

    Chan, Yi-Tung; Wang, Shuenn-Jyi; Tsai, Chung-Hsien

    2017-09-01

    Public safety is a matter of national security and people's livelihoods. In recent years, intelligent video-surveillance systems have become important active-protection systems. A surveillance system that provides early detection and threat assessment could protect people from crowd-related disasters and ensure public safety. Image processing is commonly used to extract features, e.g., people, from a surveillance video. However, little research has been conducted on the relationship between foreground detection and feature extraction. Most current video-surveillance research has been developed for restricted environments, in which the extracted features are limited by having information from a single foreground; they do not effectively represent the diversity of crowd behavior. This paper presents a general framework based on extracting ensemble features from the foreground of a surveillance video to analyze a crowd. The proposed method can flexibly integrate different foreground-detection technologies to adapt to various monitored environments. Furthermore, the extractable representative features depend on the heterogeneous foreground data. Finally, a classification algorithm is applied to these features to automatically model crowd behavior and distinguish an abnormal event from normal patterns. The experimental results demonstrate that the proposed method's performance is both comparable to that of state-of-the-art methods and satisfies the requirements of real-time applications.

  12. An artificial neural network ensemble method for fault diagnosis of proton exchange membrane fuel cell system

    International Nuclear Information System (INIS)

    Shao, Meng; Zhu, Xin-Jian; Cao, Hong-Fei; Shen, Hai-Feng

    2014-01-01

    The commercial viability of proton exchange membrane fuel cell (PEMFC) systems depends on effective fault diagnosis technologies. However, many researchers have studied PEMFC systems experimentally without considering certain fault conditions. In this paper, an artificial neural network (ANN) ensemble method is presented that improves the stability and reliability of PEMFC systems. First, a transient model that can be applied flexibly to some exceptional conditions is built; this PEMFC dynamic model is built and simulated using MATLAB. Second, using this model and experiments, the mechanisms of four different faults in PEMFC systems are analyzed in detail. Third, the ANN ensemble for fault diagnosis is built and modeled, then trained and tested on the data. The test results show that, compared with the previous method for fault diagnosis of PEMFC systems, the proposed method has a higher diagnostic rate and better generalization ability. Moreover, part of the method's structure can be altered easily as the PEMFC system changes. In general, this method for diagnosis of PEMFC has value for certain applications. - Highlights: • We analyze the principles and mechanisms of the four faults in the PEMFC system. • We design and model an ANN ensemble method for the fault diagnosis of the PEMFC system. • This method has a high diagnostic rate and strong generalization ability

  13. Ensemble based system for whole-slide prostate cancer probability mapping using color texture features.

    LENUS (Irish Health Repository)

    DiFranco, Matthew D

    2011-01-01

    We present a tile-based approach for producing clinically relevant probability maps of prostatic carcinoma in histological sections from radical prostatectomy. Our methodology incorporates ensemble learning for feature selection and classification on expert-annotated images. Random forest feature selection performed over varying training sets provides a subset of generalized CIEL*a*b* co-occurrence texture features, while sample selection strategies with minimal constraints reduce training data requirements to achieve reliable results. Ensembles of classifiers are built using expert-annotated tiles from training images, and scores for the probability of cancer presence are calculated from the responses of each classifier in the ensemble. Spatial filtering of tile-based texture features prior to classification results in increased heat-map coherence as well as AUC values of 95% using ensembles of either random forests or support vector machines. Our approach is designed for adaptation to different imaging modalities, image features, and histological decision domains.

  14. Ensembl 2017

    OpenAIRE

    Aken, Bronwen L.; Achuthan, Premanand; Akanni, Wasiu; Amode, M. Ridwan; Bernsdorff, Friederike; Bhai, Jyothish; Billis, Konstantinos; Carvalho-Silva, Denise; Cummins, Carla; Clapham, Peter; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E.

    2016-01-01

    Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access ...

  15. Ensemble Sampling

    OpenAIRE

    Lu, Xiuyuan; Van Roy, Benjamin

    2017-01-01

    Thompson sampling has emerged as an effective heuristic for a broad range of online decision problems. In its basic form, the algorithm requires computing and sampling from a posterior distribution over models, which is tractable only for simple special cases. This paper develops ensemble sampling, which aims to approximate Thompson sampling while maintaining tractability even in the face of complex models such as neural networks. Ensemble sampling dramatically expands on the range of applica...

  16. The canonical ensemble redefined - 1: Formalism

    International Nuclear Information System (INIS)

    Venkataraman, R.

    1984-12-01

    For studying the thermodynamic properties of systems we propose an ensemble that lies in between the familiar canonical and microcanonical ensembles. We point out the transition from the canonical to microcanonical ensemble and prove from a comparative study that all these ensembles do not yield the same results even in the thermodynamic limit. An investigation of the coupling between two or more systems with these ensembles suggests that the state of thermodynamical equilibrium is a special case of statistical equilibrium. (author)

  17. JEnsembl: a version-aware Java API to Ensembl data systems.

    Science.gov (United States)

    Paterson, Trevor; Law, Andy

    2012-11-01

    The Ensembl Project provides release-specific Perl APIs for efficient high-level programmatic access to data stored in various Ensembl database schema. Although Perl scripts are perfectly suited for processing large volumes of text-based data, Perl is not ideal for developing large-scale software applications nor embedding in graphical interfaces. The provision of a novel Java API would facilitate type-safe, modular, object-orientated development of new Bioinformatics tools with which to access, analyse and visualize Ensembl data. The JEnsembl API implementation provides basic data retrieval and manipulation functionality from the Core, Compara and Variation databases for all species in Ensembl and EnsemblGenomes and is a platform for the development of a richer API to Ensembl datasources. The JEnsembl architecture uses a text-based configuration module to provide evolving, versioned mappings from database schema to code objects. A single installation of the JEnsembl API can therefore simultaneously and transparently connect to current and previous database instances (such as those in the public archive) thus facilitating better analysis repeatability and allowing 'through time' comparative analyses to be performed. Project development, released code libraries, Maven repository and documentation are hosted at SourceForge (http://jensembl.sourceforge.net).

  18. Prediction of N-Methyl-D-Aspartate Receptor GluN1-Ligand Binding Affinity by a Novel SVM-Pose/SVM-Score Combinatorial Ensemble Docking Scheme.

    Science.gov (United States)

    Leong, Max K; Syu, Ren-Guei; Ding, Yi-Lung; Weng, Ching-Feng

    2017-01-06

    The glycine-binding site of the N-methyl-D-aspartate receptor (NMDAR) subunit GluN1 is a potential pharmacological target for neurodegenerative disorders. A novel combinatorial ensemble docking scheme using ligand and protein conformation ensembles and customized support vector machine (SVM)-based models to select the docked pose and to predict the docking score was generated for predicting the NMDAR GluN1-ligand binding affinity. The predicted root mean square deviation (RMSD) values in pose by SVM-Pose models were found to be in good agreement with the observed values (n = 30, r² = 0.928-0.988,  = 0.894-0.954, RMSE = 0.002-0.412, s = 0.001-0.214), and the predicted pKi values by SVM-Score were found to be in good agreement with the observed values for the training samples (n = 24, r² = 0.967,  = 0.899, RMSE = 0.295, s = 0.170) and test samples (n = 13, q² = 0.894, RMSE = 0.437, s = 0.202). When subjected to various statistical validations, the developed SVM-Pose and SVM-Score models consistently met the most stringent criteria. A mock test asserted the predictivity of this novel docking scheme. Collectively, this accurate novel combinatorial ensemble docking scheme can be used to predict the NMDAR GluN1-ligand binding affinity for facilitating drug discovery.

  19. Flood susceptibility mapping using novel ensembles of adaptive neuro fuzzy inference system and metaheuristic algorithms.

    Science.gov (United States)

    Razavi Termeh, Seyed Vahid; Kornejady, Aiding; Pourghasemi, Hamid Reza; Keesstra, Saskia

    2018-02-15

    Floods are among the most destructive natural disasters, causing great financial losses and loss of life every year. Therefore, producing susceptibility maps for flood management is necessary in order to reduce their harmful effects. The aim of the present study is to map flood hazard over the Jahrom Township in Fars Province using a combination of adaptive neuro-fuzzy inference systems (ANFIS) with different metaheuristic algorithms such as ant colony optimization (ACO), genetic algorithm (GA), and particle swarm optimization (PSO), and to compare their accuracy. A total of 53 flood locations were identified, 35 locations of which were randomly selected in order to model flood susceptibility and the remaining 16 locations were used to validate the models. Learning vector quantization (LVQ), one of the supervised neural network methods, was employed to estimate the importance of the factors. Nine flood conditioning factors, namely slope degree, plan curvature, altitude, topographic wetness index (TWI), stream power index (SPI), distance from river, land use/land cover, rainfall, and lithology, were selected and the corresponding maps were prepared in ArcGIS. The frequency ratio (FR) model was used to assign weights to each class within each controlling factor; the weights were then transferred into MATLAB for further analyses and for combination with the metaheuristic models. The ANFIS-PSO was found to be the most practical model in terms of producing a highly focused flood susceptibility map with a smaller spatial distribution of the highly susceptible classes. The chi-square result attests the same, where the ANFIS-PSO had the highest spatial differentiation within flood susceptibility classes over the study area. The area under the curve (AUC) obtained from the ROC curve indicated accuracies of 91.4%, 91.8%, 92.6% and 94.5% for the FR, ANFIS-ACO, ANFIS-GA, and ANFIS-PSO ensembles, respectively.
So, the ensemble of ANFIS-PSO was introduced as the

  20. Predictive systems ecology

    OpenAIRE

    Evans, Matthew R.; Bithell, Mike; Cornell, Stephen J.; Dall, Sasha R. X.; Díaz, Sandra; Emmott, Stephen; Ernande, Bruno; Grimm, Volker; Hodgson, David J.; Lewis, Simon L.; Mace, Georgina M.; Morecroft, Michael; Moustakas, Aristides; Murphy, Eugene; Newbold, Tim

    2013-01-01

    Human societies, and their well-being, depend to a significant extent on the state of the ecosystems that surround them. These ecosystems are changing rapidly usually in response to anthropogenic changes in the environment. To determine the likely impact of environmental change on ecosystems and the best ways to manage them, it would be desirable to be able to predict their future states. We present a proposal to develop the paradigm of ...

  1. Diversity in random subspacing ensembles

    NARCIS (Netherlands)

    Tsymbal, A.; Pechenizkiy, M.; Cunningham, P.; Kambayashi, Y.; Mohania, M.K.; Wöß, W.

    2004-01-01

    Ensembles of learnt models constitute one of the main current directions in machine learning and data mining. It was shown experimentally and theoretically that in order for an ensemble to be effective, it should consist of classifiers having diversity in their predictions. A number of ways are

  2. An Integrated Scenario Ensemble-Based Framework for Hurricane Evacuation Modeling: Part 1-Decision Support System.

    Science.gov (United States)

    Davidson, Rachel A; Nozick, Linda K; Wachtendorf, Tricia; Blanton, Brian; Colle, Brian; Kolar, Randall L; DeYoung, Sarah; Dresback, Kendra M; Yi, Wenqi; Yang, Kun; Leonardo, Nicholas

    2018-03-30

    This article introduces a new integrated scenario-based evacuation (ISE) framework to support hurricane evacuation decision making. It explicitly captures the dynamics, uncertainty, and human-natural system interactions that are fundamental to the challenge of hurricane evacuation, but have not been fully captured in previous formal evacuation models. The hazard is represented with an ensemble of probabilistic scenarios, population behavior with a dynamic decision model, and traffic with a dynamic user equilibrium model. The components are integrated in a multistage stochastic programming model that minimizes risk and travel times to provide a tree of evacuation order recommendations and an evaluation of the risk and travel time performance for that solution. The ISE framework recommendations offer an advance in the state of the art because they: (1) are based on an integrated hazard assessment (designed to ultimately include inland flooding), (2) explicitly balance the sometimes competing objectives of minimizing risk and minimizing travel time, (3) offer a well-hedged solution that is robust under the range of ways the hurricane might evolve, and (4) leverage the substantial value of increasing information (or decreasing degree of uncertainty) over the course of a hurricane event. A case study for Hurricane Isabel (2003) in eastern North Carolina is presented to demonstrate how the framework is applied, the type of results it can provide, and how it compares to available methods of a single scenario deterministic analysis and a two-stage stochastic program. © 2018 Society for Risk Analysis.

  3. Bias Correction Techniques to Improve Air Quality Ensemble Predictions: Focus on O3 and PM Over Portugal

    NARCIS (Netherlands)

    Monteiro, A.; Ribeiro, I.; Tchepel, O.; Sá, E.; Ferreira, J.; Carvalho, A.; Martins, V.; Strunk, A.; Galmarini, S.; Elbern, H.; Schaap, M.; Builtjes, P.; Miranda, A.I.; Borrego, C.

    2013-01-01

    Five air quality models were applied over Portugal for July 2006 and used as ensemble members. Each model was used, with its original set up in terms of meteorology, parameterizations, boundary conditions and chemical mechanisms, but with the same emission data. The validation of the individual

  4. Stochastic Prediction of Wind Generating Resources Using the Enhanced Ensemble Model for Jeju Island’s Wind Farms in South Korea

    Directory of Open Access Journals (Sweden)

    Deockho Kim

    2017-05-01

    Due to the intermittency of wind power generation, it is very hard to manage its system operation and planning. In order to incorporate higher wind power penetrations into power systems that maintain secure and economic power system operation, an accurate and efficient estimation of wind power outputs is needed. In this paper, we propose the stochastic prediction of wind generating resources using an enhanced ensemble model for Jeju Island’s wind farms in South Korea. When selecting the potential sites of wind farms, wind speed data at points of interest are not always available. We apply the Kriging method, a spatial interpolation technique, to estimate wind speed at potential sites. We also consider a wind profile power law to correct wind speed for turbine height and terrain characteristics. We then use the estimated wind speed data to calculate wind power output and select the best wind farm sites using a Weibull distribution. A probability density function (PDF) or cumulative distribution function (CDF) is used to estimate the probability of wind speed. The wind speed data are classified according to the manufacturer’s power curve data, so the probability of wind speed is also given for the classified values. The average wind power output is estimated in the form of a confidence interval. Empirical data from meteorological towers on Jeju Island in Korea are used to interpolate the wind speed data spatially at potential sites. Finally, we propose the best wind farm site among the four potential wind farm sites.
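
    Two of the steps named above, the wind profile power law and the Weibull probability of a wind speed class, have simple closed forms. The sketch below is a generic illustration, not the authors' implementation: the shear exponent of 1/7 is a commonly quoted default and an assumption here, and the Weibull shape and scale values in the example are invented for demonstration.

```python
import math


def wind_speed_at_height(v_ref, h_ref, h, alpha=1 / 7):
    """Wind profile power law: v(h) = v_ref * (h / h_ref) ** alpha.

    alpha is the terrain-dependent shear exponent (1/7 is an assumed default).
    """
    return v_ref * (h / h_ref) ** alpha


def weibull_cdf(v, shape, scale):
    """Weibull CDF: F(v) = 1 - exp(-(v / scale) ** shape) for v > 0."""
    if v <= 0:
        return 0.0
    return 1.0 - math.exp(-((v / scale) ** shape))


def bin_probability(lo, hi, shape, scale):
    """Probability that the wind speed falls in the class [lo, hi)."""
    return weibull_cdf(hi, shape, scale) - weibull_cdf(lo, shape, scale)


# Hypothetical example: 6 m/s measured at 10 m, corrected to an 80 m hub height,
# then the probability of the 4-12 m/s class for shape 2 and scale 8 m/s.
hub_speed = wind_speed_at_height(6.0, 10.0, 80.0)
print(round(hub_speed, 2))
print(round(bin_probability(4.0, 12.0, shape=2.0, scale=8.0), 3))
```

    Class probabilities of this kind can be multiplied by the power-curve output of each class and summed to give an expected power output, which is one plausible reading of how the classified values are used.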

  5. Ensembl variation resources

    Directory of Open Access Journals (Sweden)

    Marin-Garcia Pablo

    2010-05-01

    Background: The maturing field of genomics is rapidly increasing the number of sequenced genomes and producing more information from those previously sequenced. Much of this additional information is variation data derived from sampling multiple individuals of a given species with the goal of discovering new variants and characterising the population frequencies of the variants that are already known. These data have immense value for many studies, including those designed to understand evolution and connect genotype to phenotype. Maximising the utility of the data requires that it be stored in an accessible manner that facilitates the integration of variation data with other genome resources such as gene annotation and comparative genomics. Description: The Ensembl project provides comprehensive and integrated variation resources for a wide variety of chordate genomes. This paper provides a detailed description of the sources of data and the methods for creating the Ensembl variation databases. It also explores the utility of the information by explaining the range of query options available, from using interactive web displays, to online data mining tools and connecting directly to the data servers programmatically. It gives a good overview of the variation resources and future plans for expanding the variation data within Ensembl. Conclusions: Variation data is an important key to understanding the functional and phenotypic differences between individuals. The development of new sequencing and genotyping technologies is greatly increasing the amount of variation data known for almost all genomes. The Ensembl variation resources are integrated into the Ensembl genome browser and provide a comprehensive way to access this data in the context of a widely used genome bioinformatics system. All Ensembl data is freely available at http://www.ensembl.org and from the public MySQL database server at ensembldb.ensembl.org.

  6. Neural network ensemble based CAD system for focal liver lesions from B-mode ultrasound.

    Science.gov (United States)

    Virmani, Jitendra; Kumar, Vinod; Kalra, Naveen; Khandelwal, Niranjan

    2014-08-01

    A neural network ensemble (NNE) based computer-aided diagnostic (CAD) system is proposed in the present work to assist radiologists in differential diagnosis between focal liver lesions (FLLs), including (1) typical and atypical cases of Cyst, hemangioma (HEM) and metastatic carcinoma (MET) lesions, (2) small and large hepatocellular carcinoma (HCC) lesions, along with (3) normal (NOR) liver tissue. Expert radiologists visualize the textural characteristics of regions inside and outside the lesions to differentiate between different FLLs. Accordingly, texture features computed from inside lesion regions of interest (IROIs) and texture ratio features computed from IROIs and surrounding lesion regions of interest (SROIs) are taken as input. Principal component analysis (PCA) is used for reducing the dimensionality of the feature space before classifier design. The first step of the classification module consists of a five-class PCA-NN based primary classifier which yields probability outputs for the five liver image classes. The second step of the classification module consists of ten binary PCA-NN based secondary classifiers for the NOR/Cyst, NOR/HEM, NOR/HCC, NOR/MET, Cyst/HEM, Cyst/HCC, Cyst/MET, HEM/HCC, HEM/MET and HCC/MET class pairs. The probability outputs of the five-class PCA-NN based primary classifier are used to determine the two most probable classes for a test instance, based on which it is directed to the corresponding binary PCA-NN based secondary classifier for crisp classification between the two classes. By including the second step of the classification module, classification accuracy increases from 88.7 % to 95 %. The promising results obtained by the proposed system indicate its usefulness in assisting radiologists in differential diagnosis of FLLs.
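
    The two-step routing described above can be sketched as follows. Only the control flow is shown: the primary and secondary classifiers are hypothetical stubs standing in for the trained PCA-NN models, and the probability values are invented for illustration.

```python
from itertools import combinations

CLASSES = ("NOR", "Cyst", "HEM", "HCC", "MET")
# Five classes give C(5, 2) = 10 unordered pairs, one binary classifier each.
PAIRS = [frozenset(pair) for pair in combinations(CLASSES, 2)]


def top_two(probs):
    """Return the two most probable class labels, most probable first."""
    ranked = sorted(probs, key=probs.get, reverse=True)
    return ranked[0], ranked[1]


def classify(features, primary, secondary):
    """Route an instance to the binary classifier of its two most probable classes."""
    first, second = top_two(primary(features))
    return secondary[frozenset((first, second))](features)


def stub_primary(features):
    # Hypothetical stand-in for the five-class classifier's probability outputs.
    return {"NOR": 0.05, "Cyst": 0.10, "HEM": 0.40, "HCC": 0.35, "MET": 0.10}


def make_stub_secondary(pair):
    # Hypothetical stand-in: always answers with the alphabetically first label.
    label = sorted(pair)[0]
    return lambda features: label


secondary = {pair: make_stub_secondary(pair) for pair in PAIRS}
result = classify([0.0], stub_primary, secondary)
print(result)  # decided by the HEM/HCC binary classifier stub
```

    Keying the secondary classifiers by an unordered pair means the routing does not depend on which of the two candidate classes the primary classifier ranked first.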

  7. Surface drift prediction in the Adriatic Sea using hyper-ensemble statistics on atmospheric, ocean and wave models: Uncertainties and probability distribution areas

    Science.gov (United States)

    Rixen, M.; Ferreira-Coelho, E.; Signell, R.

    2008-01-01

    Despite numerous and regular improvements in underlying models, surface drift prediction in the ocean remains a challenging task because of our yet limited understanding of all processes involved. Hence, deterministic approaches to the problem are often limited by empirical assumptions on underlying physics. Multi-model hyper-ensemble forecasts, which exploit the power of an optimal local combination of available information including ocean, atmospheric and wave models, may show superior forecasting skills when compared to individual models because they allow for local correction and/or bias removal. In this work, we explore in greater detail the potential and limitations of the hyper-ensemble method in the Adriatic Sea, using a comprehensive surface drifter database. The performance of the hyper-ensembles and the individual models are discussed by analyzing associated uncertainties and probability distribution maps. Results suggest that the stochastic method may reduce position errors significantly for 12 to 72 h forecasts and hence compete with pure deterministic approaches. © 2007 NATO Undersea Research Centre (NURC).

  8. Evaluation of medium-range ensemble flood forecasting based on calibration strategies and ensemble methods in Lanjiang Basin, Southeast China

    Science.gov (United States)

    Liu, Li; Gao, Chao; Xuan, Weidong; Xu, Yue-Ping

    2017-11-01

    Ensemble flood forecasts by hydrological models using numerical weather prediction products as forcing data are becoming more commonly used in operational flood forecasting applications. In this study, a hydrological ensemble flood forecasting system comprised of an automatically calibrated Variable Infiltration Capacity model and quantitative precipitation forecasts from TIGGE dataset is constructed for Lanjiang Basin, Southeast China. The impacts of calibration strategies and ensemble methods on the performance of the system are then evaluated. The hydrological model is optimized by the parallel programmed ε-NSGA II multi-objective algorithm. According to the solutions by ε-NSGA II, two differently parameterized models are determined to simulate daily flows and peak flows at each of the three hydrological stations. Then a simple yet effective modular approach is proposed to combine these daily and peak flows at the same station into one composite series. Five ensemble methods and various evaluation metrics are adopted. The results show that ε-NSGA II can provide an objective determination on parameter estimation, and the parallel program permits a more efficient simulation. It is also demonstrated that the forecasts from ECMWF have more favorable skill scores than other Ensemble Prediction Systems. The multimodel ensembles have advantages over all the single model ensembles and the multimodel methods weighted on members and skill scores outperform other methods. Furthermore, the overall performance at three stations can be satisfactory up to ten days, however the hydrological errors can degrade the skill score by approximately 2 days, and the influence persists until a lead time of 10 days with a weakening trend. With respect to peak flows selected by the Peaks Over Threshold approach, the ensemble means from single models or multimodels are generally underestimated, indicating that the ensemble mean can bring overall improvement in forecasting of flows. For

  9. EnsembleGASVR: A novel ensemble method for classifying missense single nucleotide polymorphisms

    KAUST Repository

    Rapakoulia, Trisevgeni; Theofilatos, Konstantinos A.; Kleftogiannis, Dimitrios A.; Likothanasis, Spiridon D.; Tsakalidis, Athanasios K.; Mavroudi, Seferina P.

    2014-01-01

    do not support their predictions with confidence scores. Results: To overcome these limitations, a novel ensemble computational methodology is proposed. EnsembleGASVR facilitates a two-step algorithm, which in its first step applies a novel

  10. Applying a Multi-Model Ensemble Method for Long-Term Runoff Prediction under Climate Change Scenarios for the Yellow River Basin, China

    Directory of Open Access Journals (Sweden)

    Linus Zhang

    2018-03-01

    Given the substantial impacts that are expected due to climate change, it is crucial that accurate rainfall–runoff results are provided for various decision-making purposes. However, these modeling results often generate uncertainty or bias due to the imperfect character of individual models. In this paper, a genetic algorithm together with a Bayesian model averaging method are employed to provide a multi-model ensemble (MME) and combined runoff prediction under climate change scenarios produced from eight rainfall–runoff models for the Yellow River Basin. The results show that the multi-model ensemble method, especially the genetic algorithm method, can produce more reliable predictions than the other considered rainfall–runoff models. These results show that it is possible to reduce the uncertainty and thus improve the accuracy for future projections using different models because an MME approach evens out the bias involved in the individual model. For the study area, the final combined predictions reveal that less runoff is expected under most climatic scenarios, which will threaten water security of the basin.
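
    The idea that a weighted multi-model combination evens out individual model bias can be illustrated with a small sketch. Note the assumptions: the weights below are simple inverse-MSE weights, used here as a much simpler stand-in for the genetic algorithm and Bayesian model averaging of the paper, and all function names and numbers are invented for illustration.

```python
def inverse_mse_weights(predictions, observed):
    """Weight each model by 1/MSE over a calibration period, normalised to sum to 1.

    Assumes no model reproduces the observations exactly (MSE > 0).
    """
    inverse = []
    for series in predictions:
        mse = sum((p - o) ** 2 for p, o in zip(series, observed)) / len(observed)
        inverse.append(1.0 / mse)
    total = sum(inverse)
    return [w / total for w in inverse]


def combine(predictions, weights):
    """Weighted multi-model mean at every time step."""
    return [
        sum(w * series[t] for w, series in zip(weights, predictions))
        for t in range(len(predictions[0]))
    ]


# Toy calibration data (invented): two models, three time steps.
observed = [10.0, 12.0, 11.0]
model_a = [11.0, 13.0, 12.0]  # constant bias of +1, so MSE = 1
model_b = [8.0, 10.0, 9.0]    # constant bias of -2, so MSE = 4

weights = inverse_mse_weights([model_a, model_b], observed)
combined = combine([model_a, model_b], weights)
print(weights)   # approximately [0.8, 0.2]
print(combined)  # approximately [10.4, 12.4, 11.4]
```

    In this toy case the opposing biases partly cancel, so the combined series lies closer to the observations than either model alone.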

  11. Sensitivity of monthly streamflow forecasts to the quality of rainfall forcing: When do dynamical climate forecasts outperform the Ensemble Streamflow Prediction (ESP) method?

    Science.gov (United States)

    Tanguy, M.; Prudhomme, C.; Harrigan, S.; Smith, K. A.; Parry, S.

    2017-12-01

    Forecasting hydrological extremes is challenging, especially at lead times over 1 month for catchments with limited hydrological memory and variable climates. One simple way to derive monthly or seasonal hydrological forecasts is to use historical climate data to drive hydrological models using the Ensemble Streamflow Prediction (ESP) method. This gives a range of possible future streamflow given known initial hydrologic conditions alone. The degree of skill of ESP depends highly on the forecast initialisation month and catchment type. Using dynamic rainfall forecasts as driving data instead of historical data could potentially improve streamflow predictions. A lot of effort is being invested within the meteorological community to improve these forecasts. However, while recent progress shows promise (e.g. NAO in winter), the skill of these forecasts at monthly to seasonal timescales is generally still limited, and the extent to which they might lead to improved hydrological forecasts is an area of active research. Additionally, these meteorological forecasts are currently being produced at 1 month or seasonal time-steps in the UK, whereas hydrological models require forcings at daily or sub-daily time-steps. Keeping in mind these limitations of available rainfall forecasts, the objectives of this study are to find out (i) how accurate monthly dynamical rainfall forecasts need to be to outperform ESP, and (ii) how the method used to disaggregate monthly rainfall forecasts into daily rainfall time series affects results. For the first objective, synthetic rainfall time series were created by increasingly degrading observed data (proxy for a `perfect forecast') from 0 % to +/-50 % error. For the second objective, three different methods were used to disaggregate monthly rainfall data into daily time series. These were used to force a simple lumped hydrological model (GR4J) to generate streamflow predictions at a one-month lead time for over 300 catchments

  12. Combining NMR ensembles and molecular dynamics simulations provides more realistic models of protein structures in solution and leads to better chemical shift prediction

    International Nuclear Information System (INIS)

    Lehtivarjo, Juuso; Tuppurainen, Kari; Hassinen, Tommi; Laatikainen, Reino; Peräkylä, Mikael

    2012-01-01

    While chemical shifts are invaluable for obtaining structural information from proteins, they also offer one of the rare ways to obtain information about protein dynamics. A necessary tool in transforming chemical shifts into structural and dynamic information is chemical shift prediction. In our previous work we developed a method for 4D prediction of protein ¹H chemical shifts in which molecular motions, the 4th dimension, were modeled using molecular dynamics (MD) simulations. Although the approach clearly improved the prediction, the X-ray structures and single NMR conformers used in the model cannot be considered fully realistic models of protein in solution. In this work, NMR ensembles (NMRE) were used to expand the conformational space of proteins (e.g. side chains, flexible loops, termini), followed by MD simulations for each conformer to map the local fluctuations. Compared with the non-dynamic model, the NMRE+MD model gave 6–17% lower root-mean-square (RMS) errors for different backbone nuclei. The improved prediction indicates that NMR ensembles with MD simulations can be used to obtain a more realistic picture of protein structures in solutions and moreover underlines the importance of short and long time-scale dynamics for the prediction. The RMS errors of the NMRE+MD model were 0.24, 0.43, 0.98, 1.03, 1.16 and 2.39 ppm for ¹Hα, ¹HN, ¹³Cα, ¹³Cβ, ¹³CO and backbone ¹⁵N chemical shifts, respectively. The model is implemented in the prediction program 4DSPOT, available at http://www.uef.fi/4dspot.

  13. Combining NMR ensembles and molecular dynamics simulations provides more realistic models of protein structures in solution and leads to better chemical shift prediction

    Energy Technology Data Exchange (ETDEWEB)

    Lehtivarjo, Juuso, E-mail: juuso.lehtivarjo@uef.fi; Tuppurainen, Kari; Hassinen, Tommi; Laatikainen, Reino [University of Eastern Finland, School of Pharmacy (Finland); Peraekylae, Mikael [University of Eastern Finland, Institute of Biomedicine (Finland)

    2012-03-15

    While chemical shifts are invaluable for obtaining structural information from proteins, they also offer one of the rare ways to obtain information about protein dynamics. A necessary tool in transforming chemical shifts into structural and dynamic information is chemical shift prediction. In our previous work we developed a method for 4D prediction of protein ¹H chemical shifts in which molecular motions, the 4th dimension, were modeled using molecular dynamics (MD) simulations. Although the approach clearly improved the prediction, the X-ray structures and single NMR conformers used in the model cannot be considered fully realistic models of protein in solution. In this work, NMR ensembles (NMRE) were used to expand the conformational space of proteins (e.g. side chains, flexible loops, termini), followed by MD simulations for each conformer to map the local fluctuations. Compared with the non-dynamic model, the NMRE+MD model gave 6-17% lower root-mean-square (RMS) errors for different backbone nuclei. The improved prediction indicates that NMR ensembles with MD simulations can be used to obtain a more realistic picture of protein structures in solutions and moreover underlines the importance of short and long time-scale dynamics for the prediction. The RMS errors of the NMRE+MD model were 0.24, 0.43, 0.98, 1.03, 1.16 and 2.39 ppm for ¹Hα, ¹HN, ¹³Cα, ¹³Cβ, ¹³CO and backbone ¹⁵N chemical shifts, respectively. The model is implemented in the prediction program 4DSPOT, available at http://www.uef.fi/4dspot.

  14. Evaluation of the Plant-Craig stochastic convection scheme (v2.0) in the ensemble forecasting system MOGREPS-R (24 km) based on the Unified Model (v7.3)

    Science.gov (United States)

    Keane, Richard J.; Plant, Robert S.; Tennant, Warren J.

    2016-05-01

    The Plant-Craig stochastic convection parameterization (version 2.0) is implemented in the Met Office Regional Ensemble Prediction System (MOGREPS-R) and is assessed in comparison with the standard convection scheme with a simple stochastic scheme only, from random parameter variation. A set of 34 ensemble forecasts, each with 24 members, is considered, over the month of July 2009. Deterministic and probabilistic measures of the precipitation forecasts are assessed. The Plant-Craig parameterization is found to improve probabilistic forecast measures, particularly the results for lower precipitation thresholds. The impact on deterministic forecasts at the grid scale is neutral, although the Plant-Craig scheme does deliver improvements when forecasts are made over larger areas. The improvements found are greater in conditions of relatively weak synoptic forcing, for which convective precipitation is likely to be less predictable.

  15. On the incidence of meteorological and hydrological processors: Effect of resolution, sharpness and reliability of hydrological ensemble forecasts

    Science.gov (United States)

    Abaza, Mabrouk; Anctil, François; Fortin, Vincent; Perreault, Luc

    2017-12-01

    Meteorological and hydrological ensemble prediction systems are imperfect. Their outputs could often be improved through the use of a statistical processor, opening up the question of the necessity of using both processors (meteorological and hydrological), only one of them, or none. This experiment compares the predictive distributions from four hydrological ensemble prediction systems (H-EPS) utilising the Ensemble Kalman filter (EnKF) probabilistic sequential data assimilation scheme. They differ in the inclusion or not of the Distribution Based Scaling (DBS) method for post-processing meteorological forecasts and the ensemble Bayesian Model Averaging (ensemble BMA) method for hydrological forecast post-processing. The experiment is implemented on three large watersheds and relies on the combination of two meteorological reforecast products: the 4-member Canadian reforecasts from the Canadian Centre for Meteorological and Environmental Prediction (CCMEP) and the 10-member American reforecasts from the National Oceanic and Atmospheric Administration (NOAA), leading to 14 members at each time step. Results show that all four tested H-EPS lead to resolution and sharpness values that are quite similar, with an advantage to DBS + EnKF. The ensemble BMA is unable to compensate for any bias left in the precipitation ensemble forecasts. On the other hand, it succeeds in calibrating ensemble members that are otherwise under-dispersed. If reliability is preferred over resolution and sharpness, DBS + EnKF + ensemble BMA performs best, making use of both processors in the H-EPS system. Conversely, for enhanced resolution and sharpness, DBS is the preferred method.

  16. Ensemble Kalman filter assimilation of temperature and altimeter data with bias correction and application to seasonal prediction

    Directory of Open Access Journals (Sweden)

    C. L. Keppenne

    2005-01-01

    To compensate for a poorly known geoid, satellite altimeter data is usually analyzed in terms of anomalies from the time mean record. When such anomalies are assimilated into an ocean model, the bias between the climatologies of the model and data is problematic. An ensemble Kalman filter (EnKF) is modified to account for the presence of a forecast-model bias and applied to the assimilation of TOPEX/Poseidon (T/P) altimeter data. The online bias correction (OBC) algorithm uses the same ensemble of model state vectors to estimate biased-error and unbiased-error covariance matrices. Covariance localization is used but the bias covariances have different localization scales from the unbiased-error covariances, thereby accounting for the fact that the bias in a global ocean model could have much larger spatial scales than the random error. The method is applied to a 27-layer version of the Poseidon global ocean general circulation model with about 30-million state variables. Experiments in which T/P altimeter anomalies are assimilated show that the OBC reduces the RMS observation minus forecast difference for sea-surface height (SSH) over a similar EnKF run in which OBC is not used. Independent in situ temperature observations show that the temperature field is also improved. When the T/P data and in situ temperature data are assimilated in the same run and the configuration of the ensemble at the end of the run is used to initialize the ocean component of the GMAO coupled forecast model, seasonal SSH hindcasts made with the coupled model are generally better than those initialized with optimal interpolation of temperature observations without altimeter data. The analysis of the corresponding sea-surface temperature hindcasts is not as conclusive.

  17. Generalized ensemble method applied to study systems with strong first order transitions

    Science.gov (United States)

    Małolepsza, E.; Kim, J.; Keyes, T.

    2015-09-01

    At strong first-order phase transitions, the entropy versus energy or, at constant pressure, enthalpy, exhibits convex behavior, and the statistical temperature curve correspondingly exhibits an S-loop or back-bending. In the canonical and isothermal-isobaric ensembles, with temperature as the control variable, the probability density functions become bimodal with peaks localized outside of the S-loop region. Inside, states are unstable, and as a result simulation of equilibrium phase coexistence becomes impossible. To overcome this problem, a method was proposed by Kim, Keyes and Straub [1], where optimally designed generalized ensemble sampling was combined with replica exchange, and denoted generalized replica exchange method (gREM). This new technique uses parametrized effective sampling weights that lead to a unimodal energy distribution, transforming unstable states into stable ones. In the present study, the gREM, originally developed as a Monte Carlo algorithm, was implemented to work with molecular dynamics in an isobaric ensemble and coded into LAMMPS, a highly optimized open source molecular simulation package. The method is illustrated in a study of the very strong solid/liquid transition in water.

  18. Ensemble Forecasts with Useful Skill-Spread Relationships for African meningitis and Asia Streamflow Forecasting

    Science.gov (United States)

    Hopson, T. M.

    2014-12-01

    One potential benefit of an ensemble prediction system (EPS) is its capacity to forecast its own forecast error through the ensemble spread-error relationship. In practice, an EPS is often quite limited in its ability to represent the variable expectation of forecast error through the variable dispersion of the ensemble, and perhaps more fundamentally, in its ability to provide enough variability in the ensemble's dispersion to make the skill-spread relationship even potentially useful (irrespective of whether the EPS is well-calibrated or not). In this paper we examine the ensemble skill-spread relationship of an ensemble constructed from the TIGGE (THORPEX Interactive Grand Global Ensemble) dataset of global forecasts and a combination of multi-model and post-processing approaches. Both the multi-model and post-processing techniques are based on quantile regression (QR) under a step-wise forward selection framework leading to ensemble forecasts with both good reliability and sharpness. The methodology utilizes the ensemble's ability to self-diagnose forecast instability to produce calibrated forecasts with informative skill-spread relationships. A context for these concepts is provided by assessing the constructed ensemble in forecasting district-level humidity impacting the incidence of meningitis in the meningitis belt of Africa, and in forecasting flooding events in the Brahmaputra and Ganges basins of South Asia.
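
    A spread-error relationship of the kind discussed here is often examined by correlating ensemble spread with the error of the ensemble mean across many forecast cases. The sketch below shows one minimal way to do that; the choice of standard deviation for spread, absolute error for skill, Pearson correlation as the summary, and the toy numbers are all assumptions rather than the method of the paper.

```python
import math
import statistics


def spread_and_error(members, observation):
    """Ensemble standard deviation and absolute error of the ensemble mean."""
    mean = statistics.fmean(members)
    return statistics.pstdev(members), abs(mean - observation)


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented forecast cases: (ensemble members, verifying observation).
cases = [
    ([9.0, 10.0, 11.0], 10.5),
    ([6.0, 10.0, 14.0], 13.0),
    ([9.5, 10.0, 10.5], 10.0),
]
spreads, errors = zip(*(spread_and_error(m, o) for m, o in cases))

# A value near 1 would mean that wide ensembles tend to accompany large errors.
print(pearson(spreads, errors))
```

    If every case had nearly the same spread, the correlation would carry little information, which corresponds to the abstract's point that the dispersion itself must vary enough to be useful.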

  19. HIGH-RESOLUTION ATMOSPHERIC ENSEMBLE MODELING AT SRNL

    Energy Technology Data Exchange (ETDEWEB)

    Buckley, R.; Werth, D.; Chiswell, S.; Etherton, B.

    2011-05-10

    The High-Resolution Mid-Atlantic Forecasting Ensemble (HME) is a federated effort to improve operational forecasts related to precipitation, convection and boundary layer evolution, and fire weather utilizing data and computing resources from a diverse group of cooperating institutions in order to create a mesoscale ensemble from independent members. Collaborating organizations involved in the project include universities, National Weather Service offices, and national laboratories, including the Savannah River National Laboratory (SRNL). The ensemble system is produced from an overlapping numerical weather prediction model domain and parameter subsets provided by each contributing member. The coordination, synthesis, and dissemination of the ensemble information are performed by the Renaissance Computing Institute (RENCI) at the University of North Carolina-Chapel Hill. This paper discusses background related to the HME effort, SRNL participation, and example results available from the RENCI website.

  20. A multi-model ensemble approach to seabed mapping

    Science.gov (United States)

    Diesing, Markus; Stephens, David

    2015-06-01

    Seabed habitat mapping based on swath acoustic data and ground-truth samples is an emergent and active marine science discipline. Significant progress could be achieved by transferring techniques and approaches that have been successfully developed and employed in such fields as terrestrial land cover mapping. One such promising approach is the multiple classifier system, which aims at improving classification performance by combining the outputs of several classifiers. Here we present results of a multi-model ensemble applied to multibeam acoustic data covering more than 5000 km2 of seabed in the North Sea with the aim to derive accurate spatial predictions of seabed substrate. A suite of six machine learning classifiers (k-Nearest Neighbour, Support Vector Machine, Classification Tree, Random Forest, Neural Network and Naïve Bayes) was trained with ground-truth sample data classified into seabed substrate classes and their prediction accuracy was assessed with an independent set of samples. The three and five best performing models were combined to classifier ensembles. Both ensembles led to increased prediction accuracy as compared to the best performing single classifier. The improvements were however not statistically significant at the 5% level. Although the three-model ensemble did not perform significantly better than its individual component models, we noticed that the five-model ensemble did perform significantly better than three of the five component models. A classifier ensemble might therefore be an effective strategy to improve classification performance. Another advantage is the fact that the agreement in predicted substrate class between the individual models of the ensemble could be used as a measure of confidence. We propose a simple and spatially explicit measure of confidence that is based on model agreement and prediction accuracy.
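
    Using model agreement as a confidence measure can be sketched with a simple majority vote. This is only an illustration under stated assumptions: the paper's measure also incorporates prediction accuracy, which is omitted here, and the function name and class labels are hypothetical.

```python
from collections import Counter


def vote_with_agreement(predicted_classes):
    """Return the majority class and the fraction of models that agree with it.

    Ties are resolved in favour of the class that appears first in the input.
    """
    counts = Counter(predicted_classes)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(predicted_classes)


# One grid cell, predictions from five hypothetical classifiers.
cell_predictions = ["sand", "sand", "gravel", "sand", "mud"]
print(vote_with_agreement(cell_predictions))  # ('sand', 0.6)
```

    Applying the function to every grid cell would give a spatially explicit agreement map alongside the predicted substrate map.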

  1. On the forecast skill of a convection-permitting ensemble

    Science.gov (United States)

    Schellander-Gorgas, Theresa; Wang, Yong; Meier, Florian; Weidle, Florian; Wittmann, Christoph; Kann, Alexander

    2017-01-01

    The 2.5 km convection-permitting (CP) ensemble AROME-EPS (Applications of Research to Operations at Mesoscale - Ensemble Prediction System) is evaluated by comparison with the regional 11 km ensemble ALADIN-LAEF (Aire Limitée Adaptation dynamique Développement InterNational - Limited Area Ensemble Forecasting) to show whether a benefit is provided by a CP EPS. The evaluation focuses on the abilities of the ensembles to quantitatively predict precipitation during a 3-month convective summer period over areas consisting of mountains and lowlands. The statistical verification uses surface observations and 1 km × 1 km precipitation analyses, and the verification scores involve state-of-the-art statistical measures for deterministic and probabilistic forecasts as well as novel spatial verification methods. The results show that the convection-permitting ensemble with higher-resolution AROME-EPS outperforms its mesoscale counterpart ALADIN-LAEF for precipitation forecasts. The positive impact is larger for the mountainous areas than for the lowlands. In particular, the diurnal precipitation cycle is improved in AROME-EPS, which leads to a significant improvement of scores at the concerned times of day (up to approximately one-third of the scored verification measure). Moreover, there are advantages for higher precipitation thresholds at small spatial scales, which are due to the improved simulation of the spatial structure of precipitation.

  2. Sub-Ensemble Coastal Flood Forecasting: A Case Study of Hurricane Sandy

    Directory of Open Access Journals (Sweden)

    Justin A. Schulte

    2017-12-01

    In this paper, it is proposed that coastal flood ensemble forecasts be partitioned into sub-ensemble forecasts using cluster analysis in order to produce representative statistics and to measure forecast uncertainty arising from the presence of clusters. After clustering the ensemble members, the ability to predict the cluster into which the observation will fall can be measured using a cluster skill score. Additional sub-ensemble and composite skill scores are proposed for assessing the forecast skill of a clustered ensemble forecast. A recently proposed method for statistically increasing the number of ensemble members is used to improve sub-ensemble probabilistic estimates. Through the application of the proposed methodology to Sandy coastal flood reforecasts, it is demonstrated that statistics computed using only ensemble members belonging to a specific cluster are more representative than those computed using all ensemble members simultaneously. A cluster skill-cluster uncertainty index relationship is identified, which is the cluster analog of the documented spread-skill relationship. Two sub-ensemble skill scores are shown to be positively correlated with cluster forecast skill, suggesting that skillfully forecasting the cluster into which the observation will fall is important to overall forecast skill. The identified relationships also suggest that the number of ensemble members within each cluster can be used as guidance for assessing the potential for forecast error. The inevitable existence of ensemble member clusters in tidally dominated total water level prediction systems suggests that clustering is a necessary post-processing step for producing representative and skillful total water level forecasts.
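
    Why sub-ensemble statistics can be more representative than whole-ensemble statistics is easy to see on a toy bimodal ensemble. The sketch below splits one-dimensional forecasts at the largest gap between sorted values, which is a deliberately crude stand-in for the cluster analysis used in the paper; the function names and the peak water levels are invented for illustration.

```python
import statistics


def split_at_largest_gap(values):
    """Partition 1-D values into two clusters at the largest gap between sorted neighbours."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


def nearest_cluster(clusters, observation):
    """Index of the cluster whose mean is closest to the observation."""
    means = [statistics.fmean(c) for c in clusters]
    return min(range(len(clusters)), key=lambda i: abs(means[i] - observation))


# Invented peak total water levels (m) from seven ensemble members.
peaks = [2.1, 2.3, 2.2, 3.4, 3.6, 3.5, 2.0]
low, high = split_at_largest_gap(peaks)

print(low, high)
# The full-ensemble mean falls between the two groups, where no member lies.
print(statistics.fmean(peaks), statistics.fmean(low), statistics.fmean(high))
print(nearest_cluster([low, high], 3.5))
```

    Here the whole-ensemble mean describes a water level that none of the members forecast, while each sub-ensemble mean summarises a scenario that actually occurs in the ensemble.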

  3. Squeezing of Collective Excitations in Spin Ensembles

    DEFF Research Database (Denmark)

    Kraglund Andersen, Christian; Mølmer, Klaus

    2012-01-01

    We analyse the possibility to create two-mode spin squeezed states of two separate spin ensembles by inverting the spins in one ensemble and allowing spin exchange between the ensembles via a near resonant cavity field. We investigate the dynamics of the system using a combination of numerical an...

  4. Room-temperature and temperature-dependent QSRR modelling for predicting the nitrate radical reaction rate constants of organic chemicals using ensemble learning methods.

    Science.gov (United States)

    Gupta, S; Basant, N; Mohan, D; Singh, K P

    2016-07-01

    Experimental determinations of the rate constants of the reaction of NO3 with a large number of organic chemicals are tedious, and time and resource intensive; and the development of computational methods has widely been advocated. In this study, we have developed room-temperature (298 K) and temperature-dependent quantitative structure-reactivity relationship (QSRR) models based on the ensemble learning approaches (decision tree forest (DTF) and decision treeboost (DTB)) for predicting the rate constant of the reaction of NO3 radicals with diverse organic chemicals, under OECD guidelines. Predictive powers of the developed models were established in terms of statistical coefficients. In the test phase, the QSRR models yielded a correlation (r²) of >0.94 between experimental and predicted rate constants. The applicability domains of the constructed models were determined. An attempt has been made to provide the mechanistic interpretation of the selected features for QSRR development. The proposed QSRR models outperformed the previous reports, and the temperature-dependent models offered a much wider applicability domain. This is the first report presenting a temperature-dependent QSRR model for predicting the nitrate radical reaction rate constant at different temperatures. The proposed models can be useful tools in predicting the reactivities of chemicals towards NO3 radicals in the atmosphere, hence, their persistence and exposure risk assessment.

  5. Ensembl 2002: accommodating comparative genomics.

    Science.gov (United States)

    Clamp, M; Andrews, D; Barker, D; Bevan, P; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Hubbard, T; Kasprzyk, A; Keefe, D; Lehvaslaiho, H; Iyer, V; Melsopp, C; Mongin, E; Pettett, R; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Birney, E

    2003-01-01

    The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of human, mouse and other genome sequences, available as either an interactive web site or as flat files. Ensembl also integrates manually annotated gene structures from external sources where available. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. These range from sequence analysis to data storage and visualisation, and installations exist around the world in both companies and at academic sites. With both human and mouse genome sequences available and more vertebrate sequences to follow, many of the recent developments in Ensembl have focused on developing automatic comparative genome analysis and visualisation.

  6. Ensemble composition and activity levels of insectivorous bats in response to management intensification in coffee agroforestry systems.

    Science.gov (United States)

    Williams-Guillén, Kimberly; Perfecto, Ivette

    2011-01-26

    Shade coffee plantations have received attention for their role in biodiversity conservation. Bats are among the most diverse mammalian taxa in these systems; however, previous studies of bats in coffee plantations have focused on the largely herbivorous leaf-nosed bats (Phyllostomidae). In contrast, we have virtually no information on how ensembles of aerial insectivorous bats--nearly half the Neotropical bat species--change in response to habitat modification. To evaluate the effects of agroecosystem management on insectivorous bats, we studied their diversity and activity in southern Chiapas, Mexico, a landscape dominated by coffee agroforestry. We used acoustic monitoring and live captures to characterize the insectivorous bat ensemble in forest fragments and coffee plantations differing in the structural and taxonomic complexity of shade trees. We captured bats of 12 non-phyllostomid species; acoustic monitoring revealed the presence of at least 12 more species of aerial insectivores. Richness of forest bats was the same across all land-use types; in contrast, species richness of open-space bats increased in low shade, intensively managed coffee plantations. Conversely, only forest bats demonstrated significant differences in ensemble structure (as measured by similarity indices) across land-use types. Both overall activity and feeding activity of forest bats declined significantly with increasing management intensity, while the overall activity, but not feeding activity, of open-space bats increased. We conclude that diverse shade coffee plantations in our study area serve as valuable foraging and commuting habitat for aerial insectivorous bats, and several species also commute through or forage in low shade coffee monocultures.

  7. Ensemble composition and activity levels of insectivorous bats in response to management intensification in coffee agroforestry systems.

    Directory of Open Access Journals (Sweden)

    Kimberly Williams-Guillén

    Shade coffee plantations have received attention for their role in biodiversity conservation. Bats are among the most diverse mammalian taxa in these systems; however, previous studies of bats in coffee plantations have focused on the largely herbivorous leaf-nosed bats (Phyllostomidae). In contrast, we have virtually no information on how ensembles of aerial insectivorous bats--nearly half the Neotropical bat species--change in response to habitat modification. To evaluate the effects of agroecosystem management on insectivorous bats, we studied their diversity and activity in southern Chiapas, Mexico, a landscape dominated by coffee agroforestry. We used acoustic monitoring and live captures to characterize the insectivorous bat ensemble in forest fragments and coffee plantations differing in the structural and taxonomic complexity of shade trees. We captured bats of 12 non-phyllostomid species; acoustic monitoring revealed the presence of at least 12 more species of aerial insectivores. Richness of forest bats was the same across all land-use types; in contrast, species richness of open-space bats increased in low shade, intensively managed coffee plantations. Conversely, only forest bats demonstrated significant differences in ensemble structure (as measured by similarity indices) across land-use types. Both overall activity and feeding activity of forest bats declined significantly with increasing management intensity, while the overall activity, but not feeding activity, of open-space bats increased. We conclude that diverse shade coffee plantations in our study area serve as valuable foraging and commuting habitat for aerial insectivorous bats, and several species also commute through or forage in low shade coffee monocultures.

  8. Combining 2-m temperature nowcasting and short range ensemble forecasting

    Directory of Open Access Journals (Sweden)

    A. Kann

    2011-12-01

    During recent years, numerical ensemble prediction systems have become an important tool for estimating the uncertainties of dynamical and physical processes as represented in numerical weather models. The latest generation of limited area ensemble prediction systems (LAM-EPSs) allows for probabilistic forecasts at high resolution in both space and time. However, these systems still suffer from systematic deficiencies. Especially for nowcasting (0–6 h) applications the ensemble spread is smaller than the actual forecast error. This paper tries to generate probabilistic short range 2-m temperature forecasts by combining a state-of-the-art nowcasting method and a limited area ensemble system, and compares the results with statistical methods. The Integrated Nowcasting Through Comprehensive Analysis (INCA) system, which has been in operation at the Central Institute for Meteorology and Geodynamics (ZAMG) since 2006 (Haiden et al., 2011), provides short range deterministic forecasts at high temporal (15 min–60 min) and spatial (1 km) resolution. An INCA Ensemble (INCA-EPS) of 2-m temperature forecasts is constructed by applying a dynamical approach, a statistical approach, and a combined dynamic-statistical method. The dynamical method takes uncertainty information (i.e. ensemble variance) from the operational limited area ensemble system ALADIN-LAEF (Aire Limitée Adaptation Dynamique Développement InterNational Limited Area Ensemble Forecasting), which is running operationally at ZAMG (Wang et al., 2011). The purely statistical method assumes a well-calibrated spread-skill relation and applies ensemble spread according to the skill of the INCA forecast of the most recent past. The combined dynamic-statistical approach adapts the ensemble variance gained from ALADIN-LAEF with non-homogeneous Gaussian regression (NGR), which yields a statistical correction of the first and second moment (mean bias and dispersion) for Gaussian distributed continuous
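
    The NGR correction mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common NGR form in which the predictive distribution is Gaussian with mean a + b * (ensemble mean) and variance c + d * (ensemble variance); the coefficient values and member temperatures below are hypothetical, and in practice the coefficients would be fitted on past forecast-observation pairs, typically by minimizing the mean CRPS. Whether the paper uses exactly this parameterization is not stated in the abstract.

```python
import math

def ngr_predictive(members, a, b, c, d):
    """Return (mu, sigma) of the NGR Gaussian: mu = a + b*mean, sigma^2 = c + d*var."""
    n = len(members)
    mean = sum(members) / n
    var = sum((x - mean) ** 2 for x in members) / (n - 1)
    return a + b * mean, math.sqrt(c + d * var)

def crps_gaussian(mu, sigma, obs):
    """Closed-form CRPS of N(mu, sigma^2) against a scalar observation."""
    z = (obs - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

# Hypothetical 2-m temperature members (K) and hypothetical coefficients.
members = [271.2, 271.9, 272.4, 272.0, 271.5]
mu, sigma = ngr_predictive(members, a=0.3, b=1.0, c=0.4, d=1.5)
score = crps_gaussian(mu, sigma, obs=273.0)
```

    A lower CRPS indicates a sharper and better calibrated predictive distribution, which is why it is a natural objective for fitting a, b, c and d.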

  9. Aerosol Observability and Predictability: From Research to Operations for Chemical Weather Forecasting. Lagrangian Displacement Ensembles for Aerosol Data Assimilation

    Science.gov (United States)

    da Silva, Arlindo

    2010-01-01

    A challenge common to many constituent data assimilation applications is the fact that one observes a much smaller fraction of the phase space than one wishes to estimate. For example, remotely sensed estimates of the column average concentrations are available, while one is faced with the problem of estimating 3D concentrations for initializing a prognostic model. This problem is exacerbated in the case of aerosols because the observable Aerosol Optical Depth (AOD) is not only a column integrated quantity, but it also sums over a large number of species (dust, sea-salt, carbonaceous and sulfate aerosols). An aerosol transport model when driven by high-resolution, state-of-the-art analysis of meteorological fields and realistic emissions can produce skillful forecasts even when no aerosol data is assimilated. The main task of aerosol data assimilation is to address the bias arising from inaccurate emissions, and Lagrangian misplacement of plumes induced by errors in the driving meteorological fields. As long as one decouples the meteorological and aerosol assimilation as we do here, the classic baroclinic growth of error is no longer the main order of business. We will describe an aerosol data assimilation scheme in which the analysis update step is conducted in observation space, using an adaptive maximum-likelihood scheme for estimating background errors in AOD space. This scheme includes explicit sequential bias estimation as in Dee and da Silva. Unlike existing aerosol data assimilation schemes we do not obtain analysis increments of the 3D concentrations by scaling the background profiles. Instead we explore the Lagrangian characteristics of the problem for generating local displacement ensembles. These high-resolution state-dependent ensembles are then used to parameterize the background errors and generate 3D aerosol increments. The algorithm has computational complexity running at a resolution of 1/4 degree, globally. We will present the result of

  10. On predicting monitoring system effectiveness

    Science.gov (United States)

    Cappello, Carlo; Sigurdardottir, Dorotea; Glisic, Branko; Zonta, Daniele; Pozzi, Matteo

    2015-03-01

    While the objective of structural design is to achieve stability with an appropriate level of reliability, the design of systems for structural health monitoring is performed to identify a configuration that enables acquisition of data with an appropriate level of accuracy in order to understand the performance of a structure or its condition state. However, a rational standardized approach for monitoring system design is not fully available. Hence, when engineers design a monitoring system, their approach is often heuristic with performance evaluation based on experience, rather than on quantitative analysis. In this contribution, we propose a probabilistic model for the estimation of monitoring system effectiveness based on information available in prior condition, i.e. before acquiring empirical data. The presented model is developed considering the analogy between structural design and monitoring system design. We assume that the effectiveness can be evaluated based on the prediction of the posterior variance or covariance matrix of the state parameters, which we assume to be defined in a continuous space. Since the empirical measurements are not available in prior condition, the estimation of the posterior variance or covariance matrix is performed considering the measurements as a stochastic variable. Moreover, the model takes into account the effects of nuisance parameters, which are stochastic parameters that affect the observations but cannot be estimated using monitoring data. Finally, we present an application of the proposed model to a real structure. The results show how the model enables engineers to predict whether a sensor configuration satisfies the required performance.
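
    The idea of predicting the posterior variance before any data are acquired can be illustrated with the simplest possible case. The sketch below assumes a single scalar state parameter with a Gaussian prior, observed directly by sensors with independent Gaussian noise; in that conjugate setting the posterior variance does not depend on the measured values, so it can be computed at design time. The function names and numbers are hypothetical, and the model in the abstract is more general (it also treats nuisance parameters).

```python
def posterior_variance(prior_var, noise_vars):
    """Posterior variance of a scalar Gaussian state observed directly by
    sensors with independent Gaussian noise variances `noise_vars`."""
    precision = 1.0 / prior_var + sum(1.0 / v for v in noise_vars)
    return 1.0 / precision

def meets_requirement(prior_var, noise_vars, required_std):
    """Check a candidate sensor configuration against a target posterior std."""
    return posterior_variance(prior_var, noise_vars) ** 0.5 <= required_std

# Hypothetical design check: prior std 2.0, two sensors with noise std 1.0.
two_sensors = posterior_variance(4.0, [1.0, 1.0])
ok = meets_requirement(4.0, [1.0, 1.0], required_std=0.7)
```

    Adding sensors or choosing less noisy ones increases the total precision, so candidate configurations can be ranked without any measurements in hand.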

  11. ENSEMBLE and AMET: Two Systems and Approaches to a Harmonized, Simplified and Efficient Facility for Air Quality Models Development and Evaluation

    Science.gov (United States)

    The complexity of air quality modeling systems and air quality monitoring data makes ad-hoc systems for model evaluation important aids to the modeling community. Among those are the ENSEMBLE system developed by the EC-Joint Research Center, and the AMET software developed by the US-...

  12. pSuc-Lys: Predict lysine succinylation sites in proteins with PseAAC and ensemble random forest approach.

    Science.gov (United States)

    Jia, Jianhua; Liu, Zi; Xiao, Xuan; Liu, Bingxiang; Chou, Kuo-Chen

    2016-04-07

    Being one type of post-translational modifications (PTMs), protein lysine succinylation is important in regulating varieties of biological processes. It is also involved with some diseases, however. Consequently, from the angles of both basic research and drug development, we are facing a challenging problem: for an uncharacterized protein sequence having many Lys residues therein, which ones can be succinylated, and which ones cannot? To address this problem, we have developed a predictor called pSuc-Lys through (1) incorporating the sequence-coupled information into the general pseudo amino acid composition, (2) balancing out skewed training dataset by random sampling, and (3) constructing an ensemble predictor by fusing a series of individual random forest classifiers. Rigorous cross-validations indicated that it remarkably outperformed the existing methods. A user-friendly web-server for pSuc-Lys has been established at http://www.jci-bioinfo.cn/pSuc-Lys, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved. It has not escaped our notice that the formulation and approach presented here can also be used to analyze many other problems in computational proteomics. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. A novel computer-aided diagnosis system for breast MRI based on feature selection and ensemble learning.

    Science.gov (United States)

    Lu, Wei; Li, Zhe; Chu, Jinghui

    2017-04-01

    Breast cancer is a common cancer among women. With the development of modern medical science and information technology, medical imaging techniques have an increasingly important role in the early detection and diagnosis of breast cancer. In this paper, we propose an automated computer-aided diagnosis (CADx) framework for magnetic resonance imaging (MRI). The scheme consists of an ensemble of several machine learning-based techniques, including ensemble under-sampling (EUS) for imbalanced data processing, the Relief algorithm for feature selection, the subspace method for providing data diversity, and Adaboost for improving the performance of base classifiers. We extracted morphological, various texture, and Gabor features. To clarify the feature subsets' physical meaning, subspaces are built by combining morphological features with each kind of texture or Gabor feature. We tested our proposal using a manually segmented Region of Interest (ROI) data set, which contains 438 images of malignant tumors and 1898 images of normal tissues or benign tumors. Our proposal achieves an area under the ROC curve (AUC) value of 0.9617, which outperforms most other state-of-the-art breast MRI CADx systems. Compared with other methods, our proposal significantly reduces the false-positive classification rate. Copyright © 2017 Elsevier Ltd. All rights reserved.

  14. A gain-loss framework based on ensemble flow forecasts to switch the urban drainage-wastewater system management towards energy optimization during dry periods

    Science.gov (United States)

    Courdent, Vianney; Grum, Morten; Munk-Nielsen, Thomas; Mikkelsen, Peter S.

    2017-05-01

    Precipitation is the cause of major perturbation to the flow in urban drainage and wastewater systems. Flow forecasts, generated by coupling rainfall predictions with a hydrologic runoff model, can potentially be used to optimize the operation of integrated urban drainage-wastewater systems (IUDWSs) during both wet and dry weather periods. Numerical weather prediction (NWP) models have significantly improved in recent years, having increased their spatial and temporal resolution. Finer resolution NWP are suitable for urban-catchment-scale applications, providing longer lead time than radar extrapolation. However, forecasts are inevitably uncertain, and fine resolution is especially challenging for NWP. This uncertainty is commonly addressed in meteorology with ensemble prediction systems (EPSs). Handling uncertainty is challenging for decision makers and hence tools are necessary to provide insight on ensemble forecast usage and to support the rationality of decisions (i.e. forecasts are uncertain and therefore errors will be made; decision makers need tools to justify their choices, demonstrating that these choices are beneficial in the long run). This study presents an economic framework to support the decision-making process by providing information on when acting on the forecast is beneficial and how to handle the EPS. The relative economic value (REV) approach associates economic values with the potential outcomes and determines the preferential use of the EPS forecast. The envelope curve of the REV diagram combines the results from each probability forecast to provide the highest relative economic value for a given gain-loss ratio. This approach is traditionally used at larger scales to assess mitigation measures for adverse events (i.e. the actions are taken when events are forecast). The specificity of this study is to optimize the energy consumption in IUDWS during low-flow periods by exploiting the electrical smart grid market (i.e. the actions are taken
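
    The REV envelope described in this abstract can be sketched as follows. This is a minimal illustration that assumes the classic cost-loss formulation of relative economic value (expected expense of climatology, forecast and perfect forecast, normalized by the loss), not the gain-loss variant developed in the paper; all function names and data are hypothetical, and the ratio and base rate are assumed to lie strictly between 0 and 1.

```python
def relative_economic_value(hit_rate, false_alarm_rate, base_rate, ratio):
    """REV = (E_climate - E_forecast) / (E_climate - E_perfect), expenses per unit loss."""
    e_climate = min(ratio, base_rate)      # cheaper of always acting or never acting
    e_perfect = base_rate * ratio          # act only when the event occurs
    e_forecast = (ratio * hit_rate * base_rate
                  + ratio * false_alarm_rate * (1 - base_rate)
                  + base_rate * (1 - hit_rate))
    return (e_climate - e_forecast) / (e_climate - e_perfect)

def rev_envelope(probabilities, outcomes, ratio, thresholds):
    """Highest REV over all probability thresholds for one cost-loss ratio."""
    n_events = sum(outcomes)
    n_non_events = len(outcomes) - n_events
    base_rate = n_events / len(outcomes)
    best = float("-inf")
    for t in thresholds:
        pairs = list(zip(probabilities, outcomes))
        hits = sum(1 for p, o in pairs if p >= t and o)
        false_alarms = sum(1 for p, o in pairs if p >= t and not o)
        value = relative_economic_value(
            hits / n_events, false_alarms / n_non_events, base_rate, ratio)
        best = max(best, value)
    return best

# Hypothetical ensemble-derived event probabilities and observed outcomes.
probs = [0.9, 0.8, 0.1, 0.2, 0.0, 0.7]
obs = [1, 1, 0, 0, 0, 1]
envelope = rev_envelope(probs, obs, ratio=0.3, thresholds=[0.1, 0.3, 0.5, 0.7, 0.9])
```

    Repeating the envelope computation over a range of ratios traces the curve that tells a decision maker which probability threshold to act on.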

  15. Light localization in cold and dense atomic ensemble

    International Nuclear Information System (INIS)

    Sokolov, Igor

    2017-01-01

    We report results of a theoretical analysis of the possibility of strong (Anderson) localization of light in a cold atomic ensemble. We predict the appearance of localization in dense atomic systems in a strong magnetic field. We prove that in the absence of the field, light localization is impossible.

  16. Ensemble Sensitivity Analysis of a Severe Downslope Windstorm in Complex Terrain: Implications for Forecast Predictability Scales and Targeted Observing Networks

    Science.gov (United States)

    2013-09-01

    observations, linear regression finds the straight line that explains the linear relationship of the sample. This line is given by the equation y = mx + b...
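
    The regression step quoted in this excerpt can be sketched with an ordinary least-squares fit. In an ensemble sensitivity setting, x would typically be an earlier state variable and y a forecast metric, each sampled across ensemble members; that interpretation and the sample values below are assumptions for illustration, since the excerpt does not give the report's actual variables.

```python
def fit_line(x, y):
    """Least-squares slope m and intercept b of y = m*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    m = cov_xy / var_x          # sensitivity of y to x
    b = mean_y - m * mean_x
    return m, b

# Hypothetical samples, one value per ensemble member.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
m, b = fit_line(x, y)
```

    The slope is the covariance of x and y divided by the variance of x, which is the quantity usually reported as the ensemble sensitivity.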

  17. A Standardized Evaluation System for Decadal Climate Prediction

    Science.gov (United States)

    Kadow, C.; Cubasch, U.

    2012-12-01

    The evaluation of decadal prediction systems is a scientific challenge as well as a technical challenge in climate research. The major project MiKlip (www.fona-miklip.de) for medium-term climate prediction funded by the Federal Ministry of Education and Research in Germany (BMBF) has the aim to create a model system that can provide reliable decadal forecasts on climate and weather. The model system to be developed will be novel in several aspects, with great challenges for the methodology development. This concerns especially the determination of the initial conditions, the inclusion into the model of processes relevant to decadal predictions, the increase of the spatial resolution through regionalisation, the improvement or adjustment of statistical post-processing, and finally the synthesis and validation of the entire model system. Therefore, a standardized evaluation system will be part of the MiKlip system to validate it - developed by the project 'Integrated data and evaluation system for decadal scale prediction' (INTEGRATION). The presentation gives an overview of the different linkages of such a project, shows the different development stages and gives an outlook for users and possible end users in climate service. The technical interface combines all projects inside of MiKlip and invites them to participate in a common evaluation system. The system design and the validation strategy from a standalone tool in the beginning to a user friendly web based system using GRID technologies to an integrated part of the operational MiKlip system for industry and society will give the opportunity to enhance the MiKlip strategy. First results of different possibilities of such a system will be shown to present the scientific background through Taylor diagrams, ensemble skill scores and e.g. climatological means to show the usability and possibilities of MiKlip and the INTEGRATION project.

  18. Evaluating model performance of an ensemble-based chemical data assimilation system during INTEX-B field mission

    Directory of Open Access Journals (Sweden)

    A. F. Arellano Jr.

    2007-11-01

    We present a global chemical data assimilation system using a global atmosphere model, the Community Atmosphere Model (CAM3) with simplified chemistry and the Data Assimilation Research Testbed (DART) assimilation package. DART is a community software facility for assimilation studies using the ensemble Kalman filter approach. Here, we apply the assimilation system to constrain global tropospheric carbon monoxide (CO) by assimilating meteorological observations of temperature and horizontal wind velocity and satellite CO retrievals from the Measurement of Pollution in the Troposphere (MOPITT) satellite instrument. We verify the system performance using independent CO observations taken on board the NSF/NCAR C-130 and NASA DC-8 aircrafts during the April 2006 part of the Intercontinental Chemical Transport Experiment (INTEX-B). Our evaluations show that MOPITT data assimilation provides significant improvements in terms of capturing the observed CO variability relative to no MOPITT assimilation (i.e. the correlation improves from 0.62 to 0.71, significant at 99% confidence). The assimilation provides evidence of median CO loading of about 150 ppbv at 700 hPa over the NE Pacific during April 2006. This is marginally higher than the modeled CO with no MOPITT assimilation (~140 ppbv). Our ensemble-based estimates of model uncertainty also show model overprediction over the source region (i.e. China) and underprediction over the NE Pacific, suggesting model errors that cannot be readily explained by emissions alone. These results have important implications for improving regional chemical forecasts and for inverse modeling of CO sources and further demonstrate the utility of the assimilation system in comparing non-coincident measurements, e.g. comparing satellite retrievals of CO with in-situ aircraft measurements.

  19. Impact of Uncertainty Characterization of Satellite Rainfall Inputs and Model Parameters on Hydrological Data Assimilation with the Ensemble Kalman Filter for Flood Prediction

    Science.gov (United States)

    Vergara, H. J.; Kirstetter, P.; Hong, Y.; Gourley, J. J.; Wang, X.

    2013-12-01

    The Ensemble Kalman Filter (EnKF) is arguably the assimilation approach that has found the widest application in hydrologic modeling. Its relatively easy implementation and computational efficiency make it an attractive method for research and operational purposes. However, the scientific literature featuring this approach lacks guidance on how the errors in the forecast need to be characterized so as to get the required corrections from the assimilation process. Moreover, several studies have indicated that the performance of the EnKF is 'sub-optimal' when assimilating certain hydrologic observations. Likewise, some authors have suggested that the underlying assumptions of the Kalman Filter and its dependence on linear dynamics make the EnKF unsuitable for hydrologic modeling. Such assertions are often based on ineffectiveness and poor robustness of EnKF implementations resulting from restrictive specification of error characteristics and the absence of a-priori information of error magnitudes. Therefore, understanding the capabilities and limitations of the EnKF to improve hydrologic forecasts requires studying its sensitivity to the manner in which errors in the hydrologic modeling system are represented through ensembles. This study presents a methodology that explores various uncertainty representation configurations to characterize the errors in the hydrologic forecasts in a data assimilation context. The uncertainty in rainfall inputs is represented through a Generalized Additive Model for Location, Scale, and Shape (GAMLSS), which provides information about second-order statistics of quantitative precipitation estimates (QPE) error. The uncertainty in model parameters is described adding perturbations based on parameters covariance information. The method allows for the identification of rainfall and parameter perturbation combinations for which the performance of the EnKF is 'optimal' given a set of objective functions. In this process, information about
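
    To make the role of the ensemble spread concrete, the following is a minimal sketch of one EnKF analysis step for a single scalar state that is observed directly, using the perturbed-observation variant. The function names, the streamflow-like numbers and the choice of variant are assumptions for illustration; the study's actual configuration is not given in the abstract.

```python
import random

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def enkf_update(ensemble, obs, obs_var, rng):
    """Scalar EnKF analysis with perturbed observations (state observed directly)."""
    forecast_var = variance(ensemble)                 # error estimate from the spread
    gain = forecast_var / (forecast_var + obs_var)    # Kalman gain
    return [x + gain * (obs + rng.gauss(0.0, obs_var ** 0.5) - x) for x in ensemble]

rng = random.Random(0)
prior = [rng.gauss(10.0, 1.0) for _ in range(200)]    # hypothetical forecast ensemble
posterior = enkf_update(prior, obs=12.0, obs_var=0.25, rng=rng)
prior_mean = sum(prior) / len(prior)
posterior_mean = sum(posterior) / len(posterior)
```

    Because the gain is computed from the ensemble variance, an ensemble that under-represents forecast error yields a gain that is too small, which is one way the error characterization discussed in the abstract affects the correction.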

  20. Integrated cumulus ensemble and turbulence (ICET): An integrated parameterization system for general circulation models (GCMs)

    Energy Technology Data Exchange (ETDEWEB)

    Evans, J.L.; Frank, W.M.; Young, G.S. [Pennsylvania State Univ., University Park, PA (United States)

    1996-04-01

    Successful simulations of the global circulation and climate require accurate representation of the properties of shallow and deep convective clouds, stable-layer clouds, and the interactions between various cloud types, the boundary layer, and the radiative fluxes. Each of these phenomena plays an important role in the global energy balance, and each must be parameterized in a global climate model. These processes are highly interactive. One major problem limiting the accuracy of parameterizations of clouds and other processes in general circulation models (GCMs) is that most of the parameterization packages are not linked with a common physical basis. Further, these schemes have not, in general, been rigorously verified against observations adequate to the task of resolving subgrid-scale effects. To address these problems, we are designing a new Integrated Cumulus Ensemble and Turbulence (ICET) parameterization scheme, installing it in a climate model (CCM2), and evaluating the performance of the new scheme using data from Atmospheric Radiation Measurement (ARM) Program Cloud and Radiation Testbed (CART) sites.

  1. Predicting the downstream impact of ensembles of small reservoirs with special reference to the Volta Basin, West Africa

    Science.gov (United States)

    van de Giesen, N.; Andreini, M.; Liebe, J.; Steenhuis, T.; Huber-Lee, A.

    2005-12-01

    After a strong reduction in investments in water infrastructure in Sub-Saharan Africa, we now see a revival and increased interest to start water-related projects. The global political willingness to work towards the UN millennium goals is an important driver behind this recent development. Large scale irrigation projects, such as were constructed at tremendous costs in the 1970's and early 1980's, are no longer seen as the way forward. Instead, the construction of a large number of small, village-level irrigation schemes is thought to be a more effective way to improve food production. Such small schemes would fit better in existing and functioning governance structures. An important question now becomes what the cumulative (downstream) impact is of a large number of small irrigation projects, especially when they threaten to deplete transboundary water resources. The Volta Basin in West Africa is a transboundary river catchment, divided over six countries. Of these six countries, upstream Burkina Faso and downstream Ghana are the most important and cover 43% and 42% of the basin, respectively. In Burkina Faso (and also North Ghana), small reservoirs and associated irrigation schemes are already an important means to improve the livelihoods of the rural population. In fact, over two thousand such schemes have already been constructed in Burkina Faso and further construction is to be expected in the light of the UN millennium goals. The cumulative impact of these schemes would affect the Akosombo Reservoir, one of the largest manmade lakes in the world and an important motor behind the economic development in (South) Ghana. This presentation will put forward an analytical framework that allows for the impact assessment of (large) ensembles of small reservoirs. It will be shown that despite their relatively low water use efficiencies, the overall impact remains low compared to the impact of large dams. The tools developed can be used in similar settings elsewhere

  2. Variable-Resolution Ensemble Climatology Modeling of Sierra Nevada Snowpack within the Community Earth System Model (CESM)

    Science.gov (United States)

    Rhoades, A.; Ullrich, P. A.; Zarzycki, C. M.; Levy, M.; Taylor, M.

    2014-12-01

    Snowpack is crucial for the western USA, providing around 75% of the total fresh water supply (Cayan et al., 1996) and buffering against seasonal aridity impacts on agricultural, ecosystem, and urban water demands. The resilience of the California water system is largely dependent on natural stores provided by snowpack. This resilience has shown vulnerabilities due to anthropogenic global climate change. Historically, the northern Sierras showed a net decline of 50-75% in snow water equivalent (SWE) while the southern Sierras showed a net accumulation of 30% (Mote et al., 2005). Future trends of SWE highlight that western USA SWE may decline by 40-70% (Pierce and Cayan, 2013), snowfall may decrease by 25-40% (Pierce and Cayan, 2013), and more winter storms may tend towards rain rather than snow (Bales et al., 2006). The volatility of Sierran snowpack presents a need for scientific tools to help water managers and policy makers assess current and future trends. A burgeoning tool to analyze these trends comes in the form of variable-resolution global climate modeling (VRGCM). VRGCMs serve as a bridge between regional and global models and provide added resolution in areas of need, eliminate lateral boundary forcings, provide model runtime speed up, and utilize a common dynamical core, physics scheme and sub-grid scale parameterization package. A cubed-sphere variable-resolution grid with 25 km horizontal resolution over the western USA was developed for use in the Community Atmosphere Model (CAM) within the Community Earth System Model (CESM). A 25-year three-member ensemble climatology (1980-2005) is presented and major snowpack metrics such as SWE, snow depth, snow cover, and two-meter surface temperature are assessed. The ensemble simulation is also compared to observational, reanalysis, and WRF model datasets. The variable-resolution model provides a mechanism for reaching towards non-hydrostatic scales and simulations are currently being developed with refined

  3. Stacking Ensemble Learning for Short-Term Electricity Consumption Forecasting

    Directory of Open Access Journals (Sweden)

    Federico Divina

    2018-04-01

    The ability to predict short-term electric energy demand would provide several benefits, both at the economic and environmental level. For example, it would allow for an efficient use of resources in order to face the actual demand, reducing the costs associated with production as well as the emission of CO2. To this aim, in this paper we propose a strategy based on ensemble learning in order to tackle the short-term load forecasting problem. In particular, our approach is based on a stacking ensemble learning scheme, where the predictions produced by three base learning methods are used by a top level method in order to produce final predictions. We tested the proposed scheme on a dataset reporting the energy consumption in Spain over more than nine years. The obtained experimental results show that an approach for short-term electricity consumption forecasting based on ensemble learning can help in combining predictions produced by weaker learning methods in order to obtain superior results. In particular, the system produces a lower error with respect to the existing state-of-the-art techniques used on the same dataset. More importantly, this case study has shown that using an ensemble scheme can achieve very accurate predictions, and thus that it is a suitable approach for addressing the short-term load forecasting problem.
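
    The stacking scheme described here (three base predictors feeding a top-level method) can be sketched in plain Python. The three base forecasters and the linear least-squares meta-learner below are simple stand-ins chosen for illustration, not the learners used in the paper, and the load series is hypothetical. The meta-weights are fitted in-sample for brevity; a real stacking setup would fit them on held-out predictions.

```python
def solve(a, b):
    """Solve a small linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

# Three hypothetical base forecasters, each mapping a history to a one-step forecast.
BASE = [
    lambda h: h[-1],               # persistence
    lambda h: sum(h[-3:]) / 3.0,   # short moving average
    lambda h: h[-4],               # same position in the previous 4-step cycle
]

def base_predictions(series, start):
    rows = [[f(series[:t]) for f in BASE] for t in range(start, len(series))]
    return rows, series[start:]

def fit_meta(rows, targets):
    """Least-squares weights of the top-level linear combiner (normal equations)."""
    k = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    rhs = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(gram, rhs)

def sse(preds, targets):
    return sum((p - y) ** 2 for p, y in zip(preds, targets))

load = [30.0, 42.0, 55.0, 38.0, 31.0, 44.0, 58.0, 40.0,
        33.0, 45.0, 60.0, 41.0, 34.0, 47.0, 62.0, 43.0]   # hypothetical demand
rows, targets = base_predictions(load, start=4)
weights = fit_meta(rows, targets)
stacked = [sum(w * p for w, p in zip(weights, r)) for r in rows]
```

    On the data used to fit the weights, the combined forecast cannot have a larger squared error than any single base forecaster, because putting all weight on one base learner is one of the combinations the least-squares fit considers.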

  4. Representing Color Ensembles.

    Science.gov (United States)

    Chetverikov, Andrey; Campana, Gianluca; Kristjánsson, Árni

    2017-10-01

    Colors are rarely uniform, yet little is known about how people represent color distributions. We introduce a new method for studying color ensembles based on intertrial learning in visual search. Participants looked for an oddly colored diamond among diamonds with colors taken from either uniform or Gaussian color distributions. On test trials, the targets had various distances in feature space from the mean of the preceding distractor color distribution. Targets on test trials therefore served as probes into probabilistic representations of distractor colors. Test-trial response times revealed a striking similarity between the physical distribution of colors and their internal representations. The results demonstrate that the visual system represents color ensembles in a more detailed way than previously thought, coding not only mean and variance but, most surprisingly, the actual shape (uniform or Gaussian) of the distribution of colors in the environment.

  5. Tailored Random Graph Ensembles

    International Nuclear Information System (INIS)

    Roberts, E S; Annibale, A; Coolen, A C C

    2013-01-01

    Tailored graph ensembles are a developing bridge between biological networks and statistical mechanics. The aim is to use this concept to generate a suite of rigorous tools that can be used to quantify and compare the topology of cellular signalling networks, such as protein-protein interaction networks and gene regulation networks. We calculate exact and explicit formulae for the leading orders in the system size of the Shannon entropies of random graph ensembles constrained with degree distribution and degree-degree correlation. We also construct an ergodic detailed balance Markov chain with non-trivial acceptance probabilities which converges to a strictly uniform measure and is based on edge swaps that conserve all degrees. The acceptance probabilities can be generalized to define Markov chains that target any alternative desired measure on the space of directed or undirected graphs, in order to generate graphs with more sophisticated topological features.
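
    The edge-swap move that this abstract builds on can be sketched for a simple undirected graph. Note the important simplification: this sketch accepts every valid swap, whereas the abstract describes non-trivial acceptance probabilities that are needed to reach a strictly uniform measure, and those are not reproduced here. The function names and the example graph are hypothetical.

```python
import random

def norm(u, v):
    """Store an undirected edge with its endpoints in sorted order."""
    return (u, v) if u < v else (v, u)

def degrees(edges):
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def try_swap(edges, rng):
    """Attempt the swap (a-b, c-d) -> (a-d, c-b); return the edge set unchanged
    if the move would create a self-loop or a duplicate edge."""
    e1, e2 = rng.sample(sorted(edges), 2)
    a, b = e1
    c, d = e2 if rng.random() < 0.5 else e2[::-1]
    if len({a, b, c, d}) < 4:
        return edges
    n1, n2 = norm(a, d), norm(c, b)
    if n1 in edges or n2 in edges:
        return edges
    return (edges - {e1, e2}) | {n1, n2}

rng = random.Random(1)
graph = {(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)}   # hypothetical 6-cycle
shuffled = graph
for _ in range(200):
    shuffled = try_swap(shuffled, rng)
```

    Each accepted move removes one edge and adds one edge at every node involved, so the full degree sequence is conserved by construction.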

  6. Performance of the FV3-powered Next Generation Global Prediction System for Harvey and Irma, and a vision for a "beyond weather timescale" prediction system for long-range hurricane track and intensity predictions

    Science.gov (United States)

    Lin, S. J.; Bender, M.; Harris, L.; Hazelton, A.

    2017-12-01

    The performance of a GFDL developed FV3-based Next Generation Global Prediction System (NGGPS) for Harvey and Irma will be reported. We will report on aspects of track and intensity errors (vs operational models), heavy precipitation (Harvey), rapid intensification, and simulated structure (in comparison with ground based radar), and point to a need of a future long-range (from day-5 up to 30 days) physically based ensemble hurricane prediction system for providing useful information to the forecasters, beyond the usual weather timescale.

  7. Universal critical wrapping probabilities in the canonical ensemble

    Directory of Open Access Journals (Sweden)

    Hao Hu

    2015-09-01

    Universal dimensionless quantities, such as Binder ratios and wrapping probabilities, play an important role in the study of critical phenomena. We study the finite-size scaling behavior of the wrapping probability for the Potts model in the random-cluster representation, under the constraint that the total number of occupied bonds is fixed, so that the canonical ensemble applies. We derive that, in the limit L→∞, the critical values of the wrapping probability are different from those of the unconstrained model, i.e. the model in the grand-canonical ensemble, but still universal, for systems with 2yt−d>0 where yt=1/ν is the thermal renormalization exponent and d is the spatial dimension. Similar modifications apply to other dimensionless quantities, such as Binder ratios. For systems with 2yt−d≤0, these quantities share the same universal critical values in the two ensembles. It is also derived that new finite-size corrections are induced. These findings apply more generally to systems in the canonical ensemble, e.g. the dilute Potts model with a fixed total number of vacancies. Finally, we formulate an efficient cluster-type algorithm for the canonical ensemble, and confirm these predictions by extensive simulations.

  8. Ensemble methods for seasonal limited area forecasts

    DEFF Research Database (Denmark)

    Arritt, Raymond W.; Anderson, Christopher J.; Takle, Eugene S.

    2004-01-01

    The ensemble prediction methods used for seasonal limited area forecasts were examined by comparing methods for generating ensemble simulations of seasonal precipitation. The summer 1993 model over the north-central US was used as a test case. The four methods examined included the lagged-average...

  9. Gridded Calibration of Ensemble Wind Vector Forecasts Using Ensemble Model Output Statistics

    Science.gov (United States)

    Lazarus, S. M.; Holman, B. P.; Splitt, M. E.

    2017-12-01

    A computationally efficient method is developed that performs gridded post-processing of ensemble wind vector forecasts. An expansive set of idealized WRF model simulations is generated to provide physically consistent high-resolution winds over a coastal domain characterized by an intricate land/water mask. Ensemble model output statistics (EMOS) is used to calibrate the ensemble wind vector forecasts at observation locations. The local EMOS predictive parameters (mean and variance) are then spread throughout the grid utilizing flow-dependent statistical relationships extracted from the downscaled WRF winds. Using data withdrawal and 28 east central Florida stations, the method is applied to one year of 24 h wind forecasts from the Global Ensemble Forecast System (GEFS). Compared to the raw GEFS, the approach improves both the deterministic and probabilistic forecast skill. Analysis of multivariate rank histograms indicates that the post-processed forecasts are calibrated. Two downscaling case studies are presented: a quiescent easterly flow event and a frontal passage. Strengths and weaknesses of the approach are presented and discussed.
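
    The EMOS step described in this abstract can be illustrated with a minimal sketch. It is a univariate simplification (the abstract treats wind vectors), and it assumes the common Gaussian form in which the predictive mean is a + b × ensemble mean and the predictive variance is c + d × ensemble variance. The coefficient names a, b, c, d and all function names are hypothetical, and the fitting of the coefficients (typically by minimizing the mean CRPS over a training set) is left out.

```python
import math

def emos_parameters(ens, a, b, c, d):
    """Map one ensemble forecast to a Gaussian predictive mean and variance.

    Assumed (not taken from the abstract): mean = a + b * ensemble mean,
    variance = c + d * ensemble variance.
    """
    m = sum(ens) / len(ens)
    s2 = sum((x - m) ** 2 for x in ens) / (len(ens) - 1)
    return a + b * m, c + d * s2

def crps_normal(mu, sigma, y):
    """Closed-form CRPS of a normal predictive distribution N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))

def mean_crps(cases, a, b, c, d):
    """Average CRPS over (ensemble, observation) pairs for given coefficients."""
    total = 0.0
    for ens, obs in cases:
        mu, var = emos_parameters(ens, a, b, c, d)
        total += crps_normal(mu, math.sqrt(var), obs)
    return total / len(cases)
```

    A calibration routine would search for the coefficients that minimize `mean_crps` on training data; lower values indicate a sharper and better calibrated predictive distribution.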

  10. Predictive Analytics in Information Systems Research

    OpenAIRE

    Shmueli, Galit; Koppius, Otto

    2011-01-01

    This research essay highlights the need to integrate predictive analytics into information systems research and shows several concrete ways in which this goal can be accomplished. Predictive analytics include empirical methods (statistical and other) that generate data predictions as well as methods for assessing predictive power. Predictive analytics not only assist in creating practically useful models, they also play an important role alongside explanatory modeling in theory bu...

  11. Revealing skill of the MiKlip decadal prediction system by three-dimensional probabilistic evaluation

    Directory of Open Access Journals (Sweden)

    Sophie Stolzenberger

    2016-12-01

    Decadal climate predictions and their verification are part of ongoing research. This article studies different methods applied to decadal hindcasts of three-dimensional atmospheric variables to evaluate the MiKlip (Mittelfristige Klimaprognosen) prediction system. Variables such as upper air temperature are tied to the core of the prediction system and hence help to reveal its power and deficiencies. The verification uses both necessary and sufficient probabilistic measures. We analyze annual and multi-year averages of air temperature and geopotential height and the parametrized quantity net water flux at the ocean surface, the so-called freshwater flux, also known as E-P (evaporation minus precipitation), as an important variable for atmosphere-ocean coupling. The model data stem from various versions of the MiKlip prediction system and constitute different sets of ensemble hindcasts covering 1979–2012. The results reveal that the freshwater flux is far more sensitive to model deficiencies than the basic dynamical variables and that its predictability decays much earlier with prediction lead time. Initializing the atmospheric component is more important for the predictability than the difference in resolution between two model versions. The combined initialization of atmosphere and ocean has the effect of increasing the predictability in the inner tropics from 1 to 2 years compared to the ocean-only initialization. For prediction years 7–10, the hindcasts are still closer to each other than to the uninitialized historical runs, indicating that the prediction system is still influenced by the initial conditions. The skill for prediction years 7–10 is, however, only marginally larger than the skill of the uninitialized ensemble. The three-dimensional skill analysis reveals a clear indication of a mid-tropospheric temperature error developing in the tropical Pacific area.

  12. Ensemble-based Kalman Filters in Strongly Nonlinear Dynamics

    Institute of Scientific and Technical Information of China (English)

    Zhaoxia PU; Joshua HACKER

    2009-01-01

    This study examines the effectiveness of ensemble Kalman filters in data assimilation with the strongly nonlinear dynamics of the Lorenz-63 model, and in particular their use in predicting the regime transition that occurs when the model jumps from one basin of attraction to the other. Four configurations of the ensemble-based Kalman filtering data assimilation techniques, including the ensemble Kalman filter, ensemble adjustment Kalman filter, ensemble square root filter and ensemble transform Kalman filter, are evaluated on their ability to predict the regime transition (also called phase transition) and are also compared in terms of their sensitivity to both observational and sampling errors. The sensitivity of each ensemble-based filter to the size of the ensemble is also examined.
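
    As a rough illustration of the setup in this abstract, the sketch below covers only the stochastic (perturbed-observation) ensemble Kalman filter, one of the four variants named. It uses the classical Lorenz-63 parameters (σ = 10, ρ = 28, β = 8/3) and assumes, for simplicity, that only the x component is observed; the observing configuration and all function names are assumptions, not details from the study.

```python
import random

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt=0.01):
    """Advance one member by one fourth-order Runge-Kutta step."""
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = lorenz63(state)
    k2 = lorenz63(shift(state, k1, dt / 2))
    k3 = lorenz63(shift(state, k2, dt / 2))
    k4 = lorenz63(shift(state, k3, dt))
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def kalman_gain(ensemble, obs_var):
    """Gain for a scalar observation of x: cov(state, x) / (var(x) + R)."""
    n = len(ensemble)
    mean = [sum(m[i] for m in ensemble) / n for i in range(3)]
    cov_with_x = [
        sum((m[i] - mean[i]) * (m[0] - mean[0]) for m in ensemble) / (n - 1)
        for i in range(3)
    ]
    return [c / (cov_with_x[0] + obs_var) for c in cov_with_x]

def enkf_update(ensemble, obs, obs_var, rng):
    """Perturbed-observation analysis step: each member sees a noisy copy of obs."""
    gain = kalman_gain(ensemble, obs_var)
    updated = []
    for member in ensemble:
        perturbed_obs = obs + rng.gauss(0.0, obs_var ** 0.5)
        innovation = perturbed_obs - member[0]
        updated.append(tuple(v + g * innovation for v, g in zip(member, gain)))
    return updated
```

    A full experiment would alternate several `rk4_step` calls per member with one `enkf_update`, then check whether the ensemble follows the truth run across a jump between the two lobes of the attractor.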

  13. Enhancing COSMO-DE ensemble forecasts by inexpensive techniques

    Directory of Open Access Journals (Sweden)

    Zied Ben Bouallègue

    2013-02-01

    COSMO-DE-EPS, a convection-permitting ensemble prediction system based on the high-resolution numerical weather prediction model COSMO-DE, has been pre-operational since December 2010, providing probabilistic forecasts which cover Germany. This ensemble system comprises 20 members based on variations of the lateral boundary conditions, the physics parameterizations and the initial conditions. In order to increase the sample size in a computationally inexpensive way, COSMO-DE-EPS is combined with alternative ensemble techniques: the neighborhood method and the time-lagged approach. Their impact on the quality of the resulting probabilistic forecasts is assessed. Objective verification is performed over a six-month period; scores based on the Brier score and its decomposition are shown for June 2011. The combination of the ensemble system with the alternative approaches improves probabilistic forecasts of precipitation, in particular for high precipitation thresholds. Moreover, combining COSMO-DE-EPS with only the time-lagged approach improves the skill of area probabilities for precipitation and does not deteriorate the skill of 2 m temperature and wind gust forecasts.
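
    The Brier score and its decomposition mentioned in this abstract can be sketched as follows. This assumes the standard Murphy decomposition into reliability, resolution and uncertainty, with cases grouped by their distinct forecast probability (natural for an ensemble, where probabilities are multiples of 1/M). The function names are hypothetical and the code is not the verification package used by the authors.

```python
from collections import defaultdict

def exceedance_probability(members, threshold):
    """Fraction of ensemble members above a threshold (e.g. a precipitation amount)."""
    return sum(1 for m in members if m > threshold) / len(members)

def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def brier_decomposition(probs, outcomes):
    """Return (reliability, resolution, uncertainty); BS = REL - RES + UNC."""
    n = len(probs)
    base_rate = sum(outcomes) / n
    bins = defaultdict(list)          # group outcomes by distinct forecast value
    for p, o in zip(probs, outcomes):
        bins[p].append(o)
    reliability = sum(
        len(o) * (p - sum(o) / len(o)) ** 2 for p, o in bins.items()
    ) / n
    resolution = sum(
        len(o) * (sum(o) / len(o) - base_rate) ** 2 for o in bins.values()
    ) / n
    uncertainty = base_rate * (1.0 - base_rate)
    return reliability, resolution, uncertainty
```

    Pooling members from earlier runs (the time-lagged approach) or from neighboring grid points simply enlarges `members` before `exceedance_probability` is called, which is how such techniques increase the sample size cheaply.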

  14. Power flow prediction in vibrating systems via model reduction

    Science.gov (United States)

    Li, Xianhui

    This dissertation focuses on power flow prediction in vibrating systems. Reduced order models (ROMs) are built based on rational Krylov model reduction, which preserves power flow information in the original systems over a specified frequency band. Stiffness and mass matrices of the ROMs are obtained by projecting the original system matrices onto the subspaces spanned by forced responses. A matrix-free algorithm is designed to construct ROMs directly from the power quantities at selected interpolation frequencies. Strategies for parallel implementation of the algorithm via message passing interface are proposed. The quality of ROMs is iteratively refined according to the error estimate based on residual norms. Band capacity is proposed to provide an a priori estimate of the sizes of good quality ROMs. Frequency averaging is recast as ensemble averaging and a Cauchy distribution is used to simplify the computation. Besides model reduction for deterministic systems, details of constructing ROMs for parametric and nonparametric random systems are also presented. Case studies have been conducted on testbeds from Harwell-Boeing collections. Input and coupling power flow are computed for the original systems and the ROMs. Good agreement is observed in all cases.

  15. Ensemble using different Planetary Boundary Layer schemes in WRF model for wind speed and direction prediction over Apulia region

    Science.gov (United States)

    Tateo, Andrea; Marcello Miglietta, Mario; Fedele, Francesca; Menegotto, Micaela; Monaco, Alfonso; Bellotti, Roberto

    2017-04-01

    The Weather Research and Forecasting mesoscale model (WRF) was used to simulate hourly 10 m wind speed and direction over the city of Taranto, Apulia region (south-eastern Italy). This area is characterized by a large industrial complex including the largest European steel plant and is subject to a Regional Air Quality Recovery Plan. This plan requires industries in the area to reduce the mean daily emissions from diffuse and point sources by 10 % during specific meteorological conditions named wind days. According to the Recovery Plan, the Regional Environmental Agency ARPA-PUGLIA is responsible for forecasting these specific meteorological conditions 72 h in advance and possibly issuing the early warning. In particular, an accurate wind simulation is required. Unfortunately, numerical weather prediction models suffer from errors, especially for near-surface fields. These errors depend primarily on uncertainties in the initial and boundary conditions provided by global models and secondarily on the model formulation, in particular the physical parametrizations used to represent processes such as turbulence, radiation exchange, cumulus and microphysics. In our work, we tried to compensate for the latter limitation by using different Planetary Boundary Layer (PBL) parameterization schemes. Five combinations of PBL and Surface Layer (SL) schemes were considered. Simulations are implemented in a real-time configuration since our intention is to analyze the same configuration implemented by ARPA-PUGLIA for operational runs; the validation is focused on a time range extending from 49 to 72 h with hourly time resolution. The performance was assessed by comparing the WRF model output with ground data measured at a weather monitoring station in Taranto, near the steel plant. After the analysis of the simulations performed with different PBL schemes, both simple (e.g. average) and more complex post-processing methods (e.g. weighted average

  16. World Music Ensemble: Kulintang

    Science.gov (United States)

    Beegle, Amy C.

    2012-01-01

    As instrumental world music ensembles such as steel pan, mariachi, gamelan and West African drums are becoming more the norm than the exception in North American school music programs, there are other world music ensembles just starting to gain popularity in particular parts of the United States. The kulintang ensemble, a drum and gong ensemble…

  17. Daily temperature changes and variability in ENSEMBLES regional models predictions: Evaluation and intercomparison for the Ebro Valley (NE Iberia)

    KAUST Repository

    El Kenawy, Ahmed M.

    2014-12-18

    We employ a suite of regional climate models (RCMs) to assess future changes in summer (JJA) maximum temperature (Tmax) over the Ebro basin, the largest hydrological division in the Iberian Peninsula. Under the A1B emission scenario, future changes in both mean values and their corresponding time-varying percentiles were examined by comparing the control period (1971-2000) with two future time slices: 2021-2050 and 2071-2100. Here, the rationale is to assess how lower/upper tails of temperature distributions will change in the future and whether these changes will be consistent with those of the mean. The model validation results demonstrate significant differences among the models in terms of their ability to represent the statistical characteristics (e.g., mean, skewness and asymmetry) of the observed climate. The results also indicate that the current substantial warming observed in the Ebro basin is expected to continue during the 21st century, with more intense warming occurring at higher altitudes and in areas with greater distance from coastlines. All models suggest that the region will experience significant positive changes in both the cold and warm tails of temperature distributions. However, the results emphasize that future changes in the lower and upper tails of the summer Tmax distribution may not follow the same warming rate as the mean condition. In particular, the projected changes in the warm tail of the summer Tmax are shown to be significantly larger than changes in both mean values and the cold tail, especially at the end of the 21st century. The finding suggests that much of the changes in the summer Tmax percentiles will be driven by a shift in the entire distribution of temperature rather than only changes in the central tendency. Better understanding of the possible implications of future climate systems provides information useful for vulnerability assessments and the development of local adaptation strategies for multi

  18. Predictive Analytics in Information Systems Research

    NARCIS (Netherlands)

    G. Shmueli (Galit); O.R. Koppius (Otto)

    2011-01-01

    This research essay highlights the need to integrate predictive analytics into information systems research and shows several concrete ways in which this goal can be accomplished. Predictive analytics include empirical methods (statistical and other) that generate data predictions as

  19. Improving wave forecasting by integrating ensemble modelling and machine learning

    Science.gov (United States)

    O'Donncha, F.; Zhang, Y.; James, S. C.

    2017-12-01

    Modern smart-grid networks use technologies to instantly relay information on supply and demand to support effective decision making. Integration of renewable-energy resources with these systems demands accurate forecasting of energy production (and demand) capacities. For wave-energy converters, this requires wave-condition forecasting to enable estimates of energy production. Current operational wave forecasting systems exhibit substantial errors with wave-height RMSEs of 40 to 60 cm being typical, which limits the reliability of energy-generation predictions thereby impeding integration with the distribution grid. In this study, we integrate physics-based models with statistical learning aggregation techniques that combine forecasts from multiple, independent models into a single "best-estimate" prediction of the true state. The Simulating Waves Nearshore physics-based model is used to compute wind- and currents-augmented waves in the Monterey Bay area. Ensembles are developed based on multiple simulations perturbing input data (wave characteristics supplied at the model boundaries and winds) to the model. A learning-aggregation technique uses past observations and past model forecasts to calculate a weight for each model. The aggregated forecasts are compared to observation data to quantify the performance of the model ensemble and aggregation techniques. The appropriately weighted ensemble model outperforms an individual ensemble member with regard to forecasting wave conditions.
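
    The abstract does not say which learning-aggregation rule is used. Purely as an illustration, the sketch below uses an exponentially weighted average forecaster, a common choice in which each model's weight decays with its accumulated squared error on past observations. The learning rate `eta` and all function names are assumptions.

```python
import math

def aggregation_weights(past_forecasts, past_obs, eta=1.0):
    """Weights from past performance; past_forecasts[k][t] is model k at time t.

    Assumed rule (not from the abstract): weight_k proportional to
    exp(-eta * cumulative squared error of model k).
    """
    losses = [
        sum((f - o) ** 2 for f, o in zip(series, past_obs))
        for series in past_forecasts
    ]
    best = min(losses)                      # subtract the minimum for numerical safety
    raw = [math.exp(-eta * (loss - best)) for loss in losses]
    total = sum(raw)
    return [r / total for r in raw]

def aggregate(current_forecasts, weights):
    """Single best-estimate forecast as the weighted sum of the model forecasts."""
    return sum(w * f for w, f in zip(weights, current_forecasts))
```

    In use, the weights would be recomputed whenever new wave observations arrive, so a member that has tracked recent conditions well dominates the next aggregated forecast.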

  20. Operational hydrological forecasting in Bavaria. Part II: Ensemble forecasting

    Science.gov (United States)

    Ehret, U.; Vogelbacher, A.; Moritz, K.; Laurent, S.; Meyer, I.; Haag, I.

    2009-04-01

    In part I of this study, the operational flood forecasting system in Bavaria and an approach to identify and quantify forecast uncertainty were introduced. The approach is split into the calculation of an empirical 'overall error' from archived forecasts and the calculation of an empirical 'model error' based on hydrometeorological forecast tests, where rainfall observations were used instead of forecasts. Especially in upstream catchments, where forecast uncertainty depends strongly on the current predictability of the atmosphere, the 'model error' can be superimposed on the spread of a hydrometeorological ensemble forecast. In Bavaria, two meteorological ensemble prediction systems are currently tested for operational use: the 16-member COSMO-LEPS forecast and a poor man's ensemble composed of DWD GME, DWD Cosmo-EU, NCEP GFS, Aladin-Austria, MeteoSwiss Cosmo-7. The determination of the overall forecast uncertainty depends on the catchment characteristics: 1. Upstream catchment with high influence of weather forecast: a) A hydrological ensemble forecast is calculated using each of the meteorological forecast members as forcing. b) Corresponding to the characteristics of the meteorological ensemble forecast, each resulting forecast hydrograph can be regarded as equally likely. c) The 'model error' distribution, with parameters dependent on hydrological case and lead time, is added to each forecast timestep of each ensemble member. d) For each forecast timestep, the overall (i.e. over all 'model error' distributions of each ensemble member) error distribution is calculated. e) From this distribution, the uncertainty range on a desired level (here: the 10% and 90% percentile) is extracted and drawn as forecast envelope. f) As the mean or median of an ensemble forecast does not necessarily exhibit meteorologically sound temporal evolution, a single hydrological forecast termed 'lead forecast' is chosen and shown in addition to the uncertainty bounds. This can be
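
    Steps c) to e) above can be sketched in a few lines. The abstract does not give the form of the 'model error' distribution; the sketch assumes a zero-mean Gaussian whose standard deviation depends only on lead time, and treats the members as equally likely, so the overall distribution at each timestep is an equal-weight Gaussian mixture. All names are hypothetical.

```python
import math

def mixture_cdf(q, members, sigma):
    """CDF of an equal-weight mixture of N(member, sigma^2) distributions."""
    return sum(
        0.5 * (1.0 + math.erf((q - m) / (sigma * math.sqrt(2.0))))
        for m in members
    ) / len(members)

def mixture_quantile(p, members, sigma, iterations=80):
    """Invert the mixture CDF by bisection."""
    lo = min(members) - 10.0 * sigma
    hi = max(members) + 10.0 * sigma
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, members, sigma) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def forecast_envelope(hydrographs, sigmas, lower=0.10, upper=0.90):
    """Return (lower, upper) bounds per timestep.

    hydrographs[i][t]: discharge of ensemble member i at timestep t.
    sigmas[t]: assumed model-error standard deviation at timestep t (> 0).
    """
    envelope = []
    for t, sigma in enumerate(sigmas):
        members = [h[t] for h in hydrographs]
        envelope.append((
            mixture_quantile(lower, members, sigma),
            mixture_quantile(upper, members, sigma),
        ))
    return envelope
```

    With a single member the envelope reduces to the model-error band around that hydrograph; with many members it widens to cover both the ensemble spread and the model error.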

  1. Climate change effects on wildland fire risk in the Northeastern and Great Lakes states predicted by a downscaled multi-model ensemble

    Science.gov (United States)

    Kerr, Gaige Hunter; DeGaetano, Arthur T.; Stoof, Cathelijne R.; Ward, Daniel

    2018-01-01

    This study is among the first to investigate wildland fire risk in the Northeastern and the Great Lakes states under a changing climate. We use a multi-model ensemble (MME) of regional climate models from the Coordinated Regional Downscaling Experiment (CORDEX) together with the Canadian Forest Fire Weather Index System (CFFWIS) to understand changes in wildland fire risk through differences between historical simulations and future projections. Our results are relatively homogeneous across the focus region and indicate modest increases in the magnitude of fire weather indices (FWIs) during northern hemisphere summer. The most pronounced changes occur in the date of the initialization of CFFWIS and peak of the wildland fire season, which in the future are trending earlier in the year, and in the significant increases in the length of high-risk episodes, defined by the number of consecutive days with FWIs above the current 95th percentile. Further analyses show that these changes are most closely linked to expected changes in the focus region's temperature and precipitation. These findings relate to the current understanding of particulate matter vis-à-vis wildfires and have implications for human health and local and regional changes in radiative forcings. When considering current fire management strategies which could be challenged by increasing wildland fire risk, fire management agencies could adapt new strategies to improve awareness, prevention, and resilience to mitigate potential impacts to critical infrastructure and population.

  2. Spin–Orbit Alignment of Exoplanet Systems: Ensemble Analysis Using Asteroseismology

    DEFF Research Database (Denmark)

    Campante, T. L.; Lund, M. N.; Kuszlewicz, James S.

    2016-01-01

    seems to be well aligned with the stellar spin axis ($\psi = (12.6^{+6.7}_{-11.0})^{\circ}$). While the latter result is in apparent contradiction with a statement made previously in the literature that the multi-transiting system Kepler-25...... observed with NASA’s Kepler satellite. Our results for $i_s$ are consistent with alignment at the 2σ level for all stars in the sample, meaning that the system surrounding the red-giant star Kepler-56 remains as the only unambiguous misaligned multiple-planet system detected to date. The availability...... of a measurement of the projected spin–orbit angle λ for two of the systems allows us to estimate ψ. We find that the orbit of the hot Jupiter HAT-P-7b is likely to be retrograde ($\psi = (116.4^{+30.2}_{-14.7})^{\circ}$), whereas that of Kepler-25c...

  3. Development and Testing of a Coupled Ocean-atmosphere Mesoscale Ensemble Prediction System

    Science.gov (United States)

    2011-06-28

    member 0; see text for a detailed description of the physics parameters) Member abl mixlen Flux w-kf tinc-lcl cld-rad precip Graupel Auto-conv Rain-int...increment has an impact on the convective initiation. 7. The cloud updraft radius used in the K–F parameterization: The radius cld-rad (m) varies

  4. Using ensemble weather forecast in a risk based real time optimization of urban drainage systems

    DEFF Research Database (Denmark)

    Courdent, Vianney Augustin Thomas; Vezzaro, Luca; Mikkelsen, Peter Steen

    2015-01-01

    Global Real Time Control (RTC) of urban drainage systems is increasingly seen as a cost-effective solution for responding to increasing performance demands (e.g. reduction of Combined Sewer Overflow, protection of sensitive areas such as bathing water, etc.). The Dynamic Overflow Risk Assessment (DORA......) strategy was developed to operate Urban Drainage Systems (UDS) in order to minimize the expected overflow risk by considering the water volume presently stored in the drainage network, the expected runoff volume based on a 2-hour radar forecast model and an estimated uncertainty of the runoff forecast....... However, such a temporal horizon (1-2 hours) is relatively short when used for the operation of large storage facilities, which may require a few days to be emptied. This limits the performance of the optimization and control in reducing combined sewer overflow and in preparing for possible flooding. Based...

  5. Multimodel hydrological ensemble forecasts for the Baskatong catchment in Canada using the TIGGE database.

    Science.gov (United States)

    Tito Arandia Martinez, Fabian

    2014-05-01

    combined to form a grand ensemble. Results show that the hydrological forecasts derived from the grand ensemble perform better than the pseudo ensemble forecasts actually used operationally at Hydro-Québec. References: [1] M. Verbunt, A. Walser, J. Gurtz et al., "Probabilistic flood forecasting with a limited-area ensemble prediction system: Selected case studies," Journal of Hydrometeorology, vol. 8, no. 4, pp. 897-909, Aug. 2007. [2] N. Evora, Valorisation des prévisions météorologiques d'ensemble, Institut de recherche d'Hydro-Québec, 2005. [3] V. Fortin, Le modèle météo-apport HSAMI: historique, théorie et application, Institut de recherche d'Hydro-Québec, 2000.

  6. Verification and process oriented validation of the MiKlip decadal prediction system

    Directory of Open Access Journals (Sweden)

    Frank Kaspar

    2016-12-01

    Decadal prediction systems are designed to become a valuable tool for decision making in different sectors of economy, administration or politics. Progress in decadal predictions is also expected to improve our scientific understanding of the climate system. The German Federal Ministry for Education and Research (BMBF) therefore funds the German national research project MiKlip (Mittelfristige Klimaprognosen). A network of German research institutions contributes to the development of the system by conducting individual research projects. This special issue presents a collection of papers with results of the evaluation activities within the first phase of MiKlip. They document the improvements of the MiKlip decadal prediction system which were achieved during the first phase. Key aspects are the role of initialization strategies, model resolution or ensemble size. Additional topics are the evaluation of specific weather parameters in selected regions and the use of specific observational datasets for the evaluation.

  7. Advanced Atmospheric Ensemble Modeling Techniques

    Energy Technology Data Exchange (ETDEWEB)

    Buckley, R. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Chiswell, S. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Kurzeja, R. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Maze, G. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Viner, B. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL); Werth, D. [Savannah River Site (SRS), Aiken, SC (United States). Savannah River National Lab. (SRNL)

    2017-09-29

    Ensemble modeling (EM), the creation of multiple atmospheric simulations for a given time period, has become an essential tool for characterizing uncertainties in model predictions. We explore two novel ensemble modeling techniques: (1) perturbation of model parameters (Adaptive Programming, AP), and (2) data assimilation (Ensemble Kalman Filter, EnKF). The current research is an extension to work from last year and examines transport on a small spatial scale (<100 km) in complex terrain, for more rigorous testing of the ensemble technique. Two different release cases were studied, a coastal release (SF6) and an inland release (Freon) which consisted of two release times. Observations of tracer concentration and meteorology are used to judge the ensemble results. In addition, adaptive grid techniques have been developed to reduce required computing resources for transport calculations. Using a 20-member ensemble, the standard approach generated downwind transport that was quantitatively good for both releases; however, the EnKF method produced additional improvement for the coastal release where the spatial and temporal differences due to interior valley heating lead to the inland movement of the plume. The AP technique showed improvements for both release cases, with more improvement shown in the inland release. This research demonstrated that transport accuracy can be improved when models are adapted to a particular location/time or when important local data is assimilated into the simulation and enhances SRNL’s capability in atmospheric transport modeling in support of its current customer base and local site missions, as well as our ability to attract new customers within the intelligence community.

  8. Predictive Model of Systemic Toxicity (SOT)

    Science.gov (United States)

    In an effort to ensure chemical safety in light of regulatory advances away from reliance on animal testing, USEPA and L’Oréal have collaborated to develop a quantitative systemic toxicity prediction model. Prediction of human systemic toxicity has proved difficult and remains a ...

  9. On the calculation of single ion activity coefficients in homogeneous ionic systems by application of the grand canonical ensemble

    DEFF Research Database (Denmark)

    Sloth, Peter

    1993-01-01

    The grand canonical ensemble has been used to study the evaluation of single ion activity coefficients in homogeneous ionic fluids. In this work, the Coulombic interactions are truncated according to the minimum image approximation, and the ions are assumed to be placed in a structureless......, homogeneous dielectric continuum. Grand canonical ensemble Monte Carlo calculation results for two primitive model electrolyte solutions are presented. Also, a formula involving the second moments of the total correlation functions is derived from fluctuation theory, which applies for the derivatives...... of the individual ionic activity coefficients with respect to the total ionic concentration. This formula has previously been proposed on the basis of somewhat different considerations....

  10. Fire spread estimation on forest wildfire using ensemble kalman filter

    Science.gov (United States)

    Syarifah, Wardatus; Apriliani, Erna

    2018-04-01

    Wildfire is one of the most frequent disasters in the world; forest wildfires, for example, cause forest populations to decrease. Forest wildfires, whether naturally occurring or prescribed, pose potential risks to ecosystems and human settlements. These risks can be managed by monitoring the weather, prescribing fires to limit available fuel, and creating firebreaks. With computer simulations we can predict and explore how fires may spread. A model of fire spread in forest wildfires was established to determine the fire properties. The fire spread model is based on a reaction-diffusion equation. There are many methods to estimate the spread of fire. The Ensemble Kalman Filter (EnKF) is a modification of the Kalman filter algorithm that can be used to estimate both linear and non-linear system models. This research applies the EnKF method to estimate the spread of fire in a forest wildfire. Before the EnKF method is applied, the fire spread model is discretized using the finite difference method. Finally, the analysis is illustrated by numerical simulation. The simulation results show that the EnKF estimate is closer to the system model when the ensemble size is larger and when the covariances of the system model and of the measurement are smaller.
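
    The abstract does not give the reaction-diffusion equation or the discretization details. As an illustration only, the sketch below applies an explicit finite difference scheme to a one-dimensional equation u_t = D u_xx + r u (1 − u), where u is a burned fraction between 0 and 1. The logistic reaction term, the zero-flux boundaries and all names are assumptions.

```python
def step_fire_front(u, diffusivity, growth_rate, dx, dt):
    """One explicit finite difference step of u_t = D u_xx + r u (1 - u).

    u: list of burned fractions in [0, 1] on a uniform 1-D grid.
    Zero-flux boundaries are imposed by mirroring the edge cells.
    """
    lam = diffusivity * dt / dx ** 2
    # Sufficient condition for the update to keep values inside [0, 1].
    if 1.0 - 2.0 * lam - growth_rate * dt < 0.0:
        raise ValueError("time step too large for this explicit scheme")
    n = len(u)
    new = []
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]
        right = u[i + 1] if i < n - 1 else u[i]
        diffusion = lam * (left - 2.0 * u[i] + right)
        reaction = growth_rate * dt * u[i] * (1.0 - u[i])
        new.append(u[i] + diffusion + reaction)
    return new

def simulate(u0, steps, diffusivity=1.0, growth_rate=0.4, dx=1.0, dt=0.25):
    """Advance the initial burned-fraction profile by a number of steps."""
    u = list(u0)
    for _ in range(steps):
        u = step_fire_front(u, diffusivity, growth_rate, dx, dt)
    return u
```

    In an EnKF setting, each ensemble member would be one such profile advanced by `simulate` between observation times, with the filter update applied to the grid values.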

  11. Real-Time Ensemble Forecasting of Coronal Mass Ejections Using the WSA-ENLIL+Cone Model

    Science.gov (United States)

    Mays, M. L.; Taktakishvili, A.; Pulkkinen, A. A.; Odstrcil, D.; MacNeice, P. J.; Rastaetter, L.; LaSota, J. A.

    2014-12-01

    complete a parametric event case study of the sensitivity of the CME arrival time prediction to free parameters for ambient solar wind model and CME. The parameter sensitivity study suggests future directions for the system, such as running ensembles using various magnetogram inputs to the WSA model.

  12. Ensemble-based flash-flood modelling: Taking into account hydrodynamic parameters and initial soil moisture uncertainties

    Science.gov (United States)

    Edouard, Simon; Vincendon, Béatrice; Ducrocq, Véronique

    2018-05-01

    Intense precipitation events in the Mediterranean often lead to devastating flash floods (FF). FF modelling is affected by several kinds of uncertainties, and Hydrological Ensemble Prediction Systems (HEPS) are designed to take those uncertainties into account. The major source of uncertainty comes from rainfall forcing, and convective-scale meteorological ensemble prediction systems can manage it for forecasting purposes. But other sources are related to the hydrological modelling part of the HEPS. This study focuses on the uncertainties arising from the hydrological model parameters and initial soil moisture, with the aim of designing an ensemble-based version of a hydrological model dedicated to simulating fast-responding Mediterranean rivers, the ISBA-TOP coupled system. The first step consists in identifying the parameters that have the strongest influence on FF simulations by assuming perfect precipitation. A sensitivity study is carried out first using a synthetic framework and then for several real events and several catchments. Perturbation methods varying the most sensitive parameters as well as initial soil moisture allow designing an ensemble-based version of ISBA-TOP. The first results of this system on some real events are presented. The direct perspective of this work will be to drive this ensemble-based version with the members of a convective-scale meteorological ensemble prediction system to design a complete HEPS for FF forecasting.

  13. MVL spatiotemporal analysis for model intercomparison in EPS: application to the DEMETER multi-model ensemble

    Science.gov (United States)

    Fernández, J.; Primo, C.; Cofiño, A. S.; Gutiérrez, J. M.; Rodríguez, M. A.

    2009-08-01

    In a recent paper, Gutiérrez et al. (Nonlinear Process Geophys 15(1):109-114, 2008) introduced a new characterization of spatiotemporal error growth, the so-called mean-variance logarithmic (MVL) diagram, and applied it to study ensemble prediction systems (EPS); in particular, they analyzed single-model ensembles obtained by perturbing the initial conditions. In the present work, the MVL diagram is applied to multi-model ensembles, also analyzing the effect of model formulation differences. To this aim, the MVL diagram is systematically applied to the multi-model ensemble produced in the EU-funded DEMETER project. It is shown that the shared building blocks (atmospheric and ocean components) impose similar dynamics among different models and, thus, contribute to poorly sampling the model formulation uncertainty. This dynamical similarity should be taken into account, at least as a pre-screening process, before applying any objective weighting method.

  14. Phase structures of the black Dp-D(p+4)-brane system in various ensembles II: electrical and thermodynamic stability

    International Nuclear Information System (INIS)

    Xiao, Zhiguang; Zhou, Da

    2015-01-01

    By incorporating the electrical stability condition into the discussion, we continue the study on the thermodynamic phase structures of the Dp-D(p+4) black brane in GG, GC, CG, CC ensembles defined in our previous paper http://dx.doi.org/10.1007/JHEP07(2015)134. We find that including the electrical stability conditions in addition to the thermal stability conditions does not modify the phase structure of the GG ensemble but puts more constraints on the parameter space where black branes can stably exist in GC, CG, CC ensembles. In particular, the van der Waals-like phase structure which was supposed to be present in these ensembles when only thermal stability condition is considered would no longer be visible, since the phase of the small black brane is unstable under electrical fluctuations. However, the symmetry of the phase structure by interchanging the two kinds of brane charges and potentials is still preserved, which is argued to be the result of T-duality.

  15. Fully automated microchip system for the detection of quantal exocytosis from single and small ensembles of cells

    DEFF Research Database (Denmark)

    Spégel, Christer; Heiskanen, Arto; Pedersen, Simon

    2008-01-01

    A lab-on-a-chip device that enables positioning of single or small ensembles of cells on an aperture in close proximity to a mercaptopropionic acid (MPA) modified sensing electrode has been developed and characterized. The microchip was used for the detection of Ca2+-dependent quantal catecholamine...

  16. IASI Radiance Data Assimilation in Local Ensemble Transform Kalman Filter

    Science.gov (United States)

    Cho, K.; Hyoung-Wook, C.; Jo, Y.

    2016-12-01

    The Korea Institute of Atmospheric Prediction Systems (KIAPS) is developing an NWP model with data assimilation systems. The Local Ensemble Transform Kalman Filter (LETKF) system, one of these data assimilation systems, has been developed for the KIAPS Integrated Model (KIM), which is based on a cubed-sphere grid, and has successfully assimilated real data. The LETKF data assimilation system has been extended to 4D-LETKF, which considers the time-evolving error covariance within the assimilation window, and to IASI radiance data assimilation using KPOP (KIAPS Package for Observation Processing) with RTTOV (Radiative Transfer for TOVS). The LETKF system has been running semi-operational prediction including conventional (sonde, aircraft) observations and AMSU-A (Advanced Microwave Sounding Unit-A) radiance data since April. Recently, in July, the semi-operational prediction system updated its observations to include GPS-RO, AMV, and IASI (Infrared Atmospheric Sounding Interferometer) data. A set of KIM simulations at ne30np4 resolution with 50 vertical levels (model top at 0.3 hPa) was carried out for short-range forecasts (10 days) within the semi-operational LETKF prediction system with a 50-member ensemble forecast. In order to isolate the IASI impact, our experiments used only conventional and IASI radiance data in the same semi-operational prediction setup. We carried out a sensitivity test of the IASI thinning method (3D and 4D). The number of IASI observations was increased by temporal (4D) thinning, and an improvement in the impact of IASI radiance data on the forecast skill of the model is expected.

  17. ESPC Coupled Global Prediction System

    Science.gov (United States)

    2015-09-30

    through an improvement to the sea ice albedo. Fig. 3: 2-m Temperature bias (deg C) of 120-h forecasts for the month of May 2014 for the Arctic...forecast system (NAVGEM) and ocean-sea ice forecast system (HYCOM/CICE) have never been coupled at high resolution. The coupled processes will be...winds and currents across the interface. The sea-ice component of this project requires modification of CICE versions 4 and 5 to run in the coupled

  18. Model Predictive Control for Smart Energy Systems

    DEFF Research Database (Denmark)

    Halvgaard, Rasmus

    pumps, heat tanks, electrical vehicle battery charging/discharging, wind farms, power plants). 2. Embed forecasting methodologies for the weather (e.g. temperature, solar radiation), the electricity consumption, and the electricity price in a predictive control system. 3. Develop optimization algorithms....... Chapter 3 introduces Model Predictive Control (MPC) including state estimation, filtering and prediction for linear models. Chapter 4 simulates the models from Chapter 2 with the certainty equivalent MPC from Chapter 3. An economic MPC minimizes the costs of consumption based on real electricity prices...... that determined the flexibility of the units. A predictive control system easily handles constraints, e.g. limitations in power consumption, and predicts the future behavior of a unit by integrating predictions of electricity prices, consumption, and weather variables. The simulations demonstrate the expected...

  19. Potentialities of ensemble strategies for flood forecasting over the Milano urban area

    Science.gov (United States)

    Ravazzani, Giovanni; Amengual, Arnau; Ceppi, Alessandro; Homar, Víctor; Romero, Romu; Lombardi, Gabriele; Mancini, Marco

    2016-08-01

    Analysis of ensemble forecasting strategies, which can provide a tangible backing for flood early warning procedures and mitigation measures over the Mediterranean region, is one of the fundamental motivations of the international HyMeX programme. Here, we examine two severe hydrometeorological episodes that affected the Milano urban area and for which the complex flood protection system of the city did not completely succeed. Indeed, flood damage has increased exponentially during the last 60 years, due to industrial and urban developments. Thus, the improvement of the Milano flood control system needs a synergism between structural and non-structural approaches. First, we examine how land-use changes due to urban development have altered the hydrological response to intense rainfalls. Second, we test a flood forecasting system which comprises the Flash-flood Event-based Spatially distributed rainfall-runoff Transformation, including Water Balance (FEST-WB) and the Weather Research and Forecasting (WRF) models. Accurate forecasts of deep moist convection and extreme precipitation are difficult to produce due to uncertainties arising from the numerical weather prediction (NWP) physical parameterizations and high sensitivity to misrepresentation of the atmospheric state; however, two hydrological ensemble prediction systems (HEPS) have been designed to explicitly cope with uncertainties in the initial and lateral boundary conditions (IC/LBCs) and physical parameterizations of the NWP model. No substantial differences in skill have been found between the two ensemble strategies when considering an enhanced diversity of IC/LBCs for the perturbed initial conditions ensemble. Furthermore, no additional benefits have been found by considering more frequent LBCs in a mixed physics ensemble, as ensemble spread seems to be reduced. These findings could help to design the most appropriate ensemble strategies before these hydrometeorological extremes, given the computational

  20. Climate change effects on wildland fire risk in the Northeastern and Great Lakes states predicted by a downscaled multi-model ensemble

    NARCIS (Netherlands)

    Kerr, Gaige Hunter; DeGaetano, Arthur T.; Stoof, Cathelijne R.; Ward, Daniel

    2018-01-01

    This study is among the first to investigate wildland fire risk in the Northeastern and the Great Lakes states under a changing climate. We use a multi-model ensemble (MME) of regional climate models from the Coordinated Regional Downscaling Experiment (CORDEX) together with the Canadian Forest

  1. Examining dynamic interactions among experimental factors influencing hydrologic data assimilation with the ensemble Kalman filter

    Science.gov (United States)

    Wang, S.; Huang, G. H.; Baetz, B. W.; Cai, X. M.; Ancell, B. C.; Fan, Y. R.

    2017-11-01

    The ensemble Kalman filter (EnKF) is recognized as a powerful data assimilation technique that generates an ensemble of model variables through stochastic perturbations of forcing data and observations. However, relatively little guidance exists with regard to the proper specification of the magnitude of the perturbation and the ensemble size, posing a significant challenge in optimally implementing the EnKF. This paper presents a robust data assimilation system (RDAS), in which a multi-factorial design of the EnKF experiments is first proposed for hydrologic ensemble predictions. A multi-way analysis of variance is then used to examine potential interactions among factors affecting the EnKF experiments, achieving optimality of the RDAS with maximized performance of hydrologic predictions. The RDAS is applied to the Xiangxi River watershed which is the most representative watershed in China's Three Gorges Reservoir region to demonstrate its validity and applicability. Results reveal that the pairwise interaction between perturbed precipitation and streamflow observations has the most significant impact on the performance of the EnKF system, and their interactions vary dynamically across different settings of the ensemble size and the evapotranspiration perturbation. In addition, the interactions among experimental factors vary greatly in magnitude and direction depending on different statistical metrics for model evaluation including the Nash-Sutcliffe efficiency and the Box-Cox transformed root-mean-square error. It is thus necessary to test various evaluation metrics in order to enhance the robustness of hydrologic prediction systems.
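    The stochastic perturbation of observations mentioned above can be illustrated with a toy example. The following is a minimal sketch, assuming a scalar state that is observed directly (observation operator equal to 1) and Gaussian observation error; the function name and the numbers are hypothetical, and this is not the RDAS described in the abstract.

```python
import random
import statistics


def enkf_update(prior, obs, obs_var, rng):
    """Stochastic EnKF analysis step for a directly observed scalar state.

    prior   : list of prior (forecast) ensemble members
    obs     : the observed value
    obs_var : observation-error variance (assumed known)
    rng     : random.Random instance used to perturb the observation
    """
    prior_var = statistics.variance(prior)      # sample variance of the ensemble
    gain = prior_var / (prior_var + obs_var)    # scalar Kalman gain with H = 1
    obs_sd = obs_var ** 0.5
    # Each member assimilates its own randomly perturbed copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, obs_sd) - x) for x in prior]


rng = random.Random(0)
prior = [rng.gauss(10.0, 2.0) for _ in range(200)]   # hypothetical forecast ensemble
posterior = enkf_update(prior, obs=12.0, obs_var=1.0, rng=rng)
```

    In this sketch the ensemble size (200) and the perturbation magnitude (obs_var) are exactly the kind of settings whose proper specification the abstract describes as an open question.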

  2. MSEBAG: a dynamic classifier ensemble generation based on 'minimum-sufficient ensemble' and bagging

    Science.gov (United States)

    Chen, Lei; Kamel, Mohamed S.

    2016-01-01

    In this paper, we propose a dynamic classifier system, MSEBAG, which is characterised by searching for the 'minimum-sufficient ensemble' and bagging at the ensemble level. It adopts an 'over-generation and selection' strategy and aims to achieve a good bias-variance trade-off. In the training phase, MSEBAG first searches for the 'minimum-sufficient ensemble', which maximises the in-sample fitness with the minimal number of base classifiers. Then, starting from the 'minimum-sufficient ensemble', a backward stepwise algorithm is employed to generate a collection of ensembles. The objective is to create a collection of ensembles with a descending fitness on the data, as well as a descending complexity in the structure. MSEBAG dynamically selects the ensembles from the collection for the decision aggregation. The extended adaptive aggregation (EAA) approach, a bagging-style algorithm performed at the ensemble level, is employed for this task. EAA searches for the competent ensembles using a score function, which takes into consideration both the in-sample fitness and the confidence of the statistical inference, and averages the decisions of the selected ensembles to label the test pattern. The experimental results show that the proposed MSEBAG outperforms the benchmarks on average.

  3. Ensemble models of neutrophil trafficking in severe sepsis.

    Directory of Open Access Journals (Sweden)

    Sang Ok Song

    A hallmark of severe sepsis is systemic inflammation which activates leukocytes and can result in their misdirection. This leads to both impaired migration to the locus of infection and increased infiltration into healthy tissues. In order to better understand the pathophysiologic mechanisms involved, we developed a coarse-grained phenomenological model of the acute inflammatory response in CLP (cecal ligation and puncture)-induced sepsis in rats. This model incorporates distinct neutrophil kinetic responses to the inflammatory stimulus and the dynamic interactions between components of a compartmentalized inflammatory response. Ensembles of model parameter sets consistent with experimental observations were statistically generated using Markov-Chain Monte Carlo sampling. Prediction uncertainty in the model states was quantified over the resulting ensemble parameter sets. Forward simulation of the parameter ensembles successfully captured experimental features and predicted that systemically activated circulating neutrophils display impaired migration to the tissue and neutrophil sequestration in the lung, consequently contributing to tissue damage and mortality. Principal component and multiple regression analyses of the parameter ensembles estimated from survivor and non-survivor cohorts provide insight into pathologic mechanisms dictating outcome in sepsis. Furthermore, the model was extended to incorporate hypothetical mechanisms by which immune modulation using extracorporeal blood purification results in improved outcome in septic rats. Simulations identified a sub-population (about 18% of the treated population) that benefited from blood purification. Survivors displayed enhanced neutrophil migration to tissue and reduced sequestration of lung neutrophils, contributing to improved outcome. The model ensemble presented herein provides a platform for generating and testing hypotheses in silico, as well as motivating further experimental

  4. A probabilistic approach of the Flash Flood Early Warning System (FF-EWS) in Catalonia based on radar ensemble generation

    Science.gov (United States)

    Velasco, David; Sempere-Torres, Daniel; Corral, Carles; Llort, Xavier; Velasco, Enrique

    2010-05-01

    Early Warning Systems (EWS) are commonly identified as the most efficient tools for improving preparedness and risk management against heavy rains and Flash Floods (FF), with the objective of reducing economic losses and human casualties. In particular, flash floods affecting torrential Mediterranean catchments are a key element to be incorporated within operational EWSs. The characteristic high spatial and temporal variability of the storms requires high-resolution data and methods to monitor/forecast the evolution of rainfall and its hydrological impact in small and medium torrential basins. A first version of an operational FF-EWS has been implemented in Catalonia (NE Spain) under the name of EHIMI system (Integrated Tool for Hydrometeorological Forecasting) with the support of the Catalan Water Agency (ACA) and the Meteorological Service of Catalonia (SMC). Flash flood warnings are issued based on radar-rainfall estimates. Rainfall estimation is performed on radar observations with high spatial and temporal resolution (1 km² and 10 minutes) in order to adapt the warning scale to the 1-km grid of the EWS. The method is based on comparing observed accumulated rainfall against rainfall thresholds provided by the regional Intensity-Duration-Frequency (IDF) curves. The so-called "aggregated rainfall warning" at every river cell is obtained as the spatially averaged rainfall over its associated upstream draining area. Regarding the time aggregation of rainfall, the critical duration is thought to be an accumulation period similar to the concentration time of each catchment. The warning is issued once the forecasted rainfall accumulation exceeds the rainfall thresholds mentioned above, which are associated with a certain probability of occurrence. Finally, the hazard warning is provided and shown to the decision-maker in terms of exceeded return periods at every river cell covering the whole area of Catalonia. The objective of the present work includes the
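    The threshold comparison described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the thresholds are made-up numbers rather than values from real IDF curves, and the operational EHIMI system is certainly more elaborate.

```python
def aggregated_rainfall(upstream_cells_mm):
    """Spatially averaged rainfall accumulation (mm) over the upstream draining area."""
    return sum(upstream_cells_mm) / len(upstream_cells_mm)


def exceeded_return_period(rain_mm, thresholds_mm):
    """Return the largest return period (years) whose threshold is exceeded, or None.

    thresholds_mm maps a return period in years to the rainfall threshold (mm)
    for the accumulation period assigned to this river cell.
    """
    exceeded = [period for period, limit in thresholds_mm.items() if rain_mm > limit]
    return max(exceeded) if exceeded else None


# Hypothetical numbers for illustration only (not taken from real IDF curves).
thresholds = {2: 30.0, 10: 55.0, 50: 80.0}
upstream = [48.0, 62.0, 70.0, 60.0]   # accumulated rainfall per upstream cell, mm
rain = aggregated_rainfall(upstream)
level = exceeded_return_period(rain, thresholds)
```

    Here the averaged accumulation is 60 mm, so the 2-year and 10-year thresholds are exceeded and the warning would be reported at the 10-year level.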

  5. Unmanned Aerial System, New System Manning Prediction

    National Research Council Canada - National Science Library

    Hunn, Bruce P

    2006-01-01

    .... System safety and effectiveness, training, contractor operations and working conditions were evaluated for current UASs, including Hunter, Shadow, Predator, Improved Gnat, and to a lesser degree...

  6. Bridging the spectral divide: a case study with PAGES2k, the CESM Last Millennium Ensemble and proxy system models

    Science.gov (United States)

    Zhu, F.; Emile-Geay, J.; Ault, T.; McKay, N.; Dee, S.

    2017-12-01

    A grand challenge for paleoclimatology is to constrain climate model behavior on timescales longer than the instrumental record. Of particular interest is the spectrum of temperature as sensed by climate proxies. The "continuum" of climate variability [Huybers & Curry, Nature 2006] is often characterized by its scaling exponent β, where the spectral density S and the frequency f satisfy the power law S ∝ f^(-β). Recent studies have voiced concern that climate models underestimate scaling behavior compared to proxies [Laepple & Huybers, PNAS 2014]. Part of this discrepancy is known to lie in the complex processes whereby proxies transform climate signals [Dee et al, EPSL in press], yet many questions remain open. Here we leverage a recent multiproxy compilation [PAGES 2k Consortium, Sci Data 2017] to characterize scaling behavior over the Common Era using an interpolation-free method [Kirchner & Neal, PNAS 2013]. Proxy spectra are compared to spectra derived from the CESM Last Millennium Ensemble [Otto-Bliesner et al, BAMS 2016], using: (a) a naive model where proxies are assumed linearly related to annual temperature vs (b) proxy system models [Evans et al, QSR 2013] of varying complexity. Scaling behavior varies considerably by archive: on average the strongest centennial slopes are observed for lake sediments (β = 1.2), while the smallest are observed for glacier ice (β = 0.24). Results confirm that the CESM Last Millennium simulation (LM) exhibits decadal-centennial scaling closer to proxy spectra than the pre-industrial control run (PI): the latter shows a "blue" spectrum (β < 0), suggesting that forcings are essential to reduce the spectral divide. Yet, even with forcings, LM spectra are flatter than the proxy spectra. Subsequent work will investigate the roles of seasonal sensitivity (trees, foraminifera, alkenones), multivariate influences (corals, trees), detrending (trees) and post-depositional processes (ice cores, lake & marine sediments) on spectral
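    To make the scaling exponent concrete: if the spectral density follows a power law in frequency with exponent -β, then log S is linear in log f with slope -β. The sketch below estimates β by ordinary least squares in log-log space on a synthetic spectrum. It is an assumption-laden illustration, not the interpolation-free method cited in the abstract, and the function name is hypothetical.

```python
import math


def scaling_exponent(freqs, power):
    """Estimate beta in S ∝ f**(-beta) by least squares in log-log space."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(s) for s in power]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    return -slope   # beta is minus the log-log slope


# Synthetic, noise-free spectra for illustration only.
freqs = [0.001 * k for k in range(1, 201)]
red = [3.0 * f ** -1.2 for f in freqs]    # power falls with frequency
blue = [f ** 0.5 for f in freqs]          # power rises with frequency
beta_red = scaling_exponent(freqs, red)
beta_blue = scaling_exponent(freqs, blue)
```

    A spectrum whose power rises with frequency yields a negative estimate, which is the sense in which a "blue" spectrum is described in the abstract.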

  7. 100-Mc counting system; Ensemble de comptage a 100 megacycles; Schetnaya sistema na 100 megatsiklov; Sistema contador de 100 megaciclos

    Energy Technology Data Exchange (ETDEWEB)

    Sugarman, R; Higinbotham, W A; Yonda, A H [Brookhaven National Laboratory, Upton, Long Island, NY (United States)

    1962-04-15

    A complete 100-Mc counting system is described for use in experiments with accelerators. Current-switching logic, using both transistors and germanium tunnel diodes, is used for all high-speed logic. All critical circuits have a rise-time and time-jitter of 2 ns or less. The logical elements are a pulse-height limiter, a discriminator, a multichannel coincidence circuit, a four-fold fanout, and a scale of 8. The fanout enables a limiter or discriminator output to drive any combination of four elements. Each element is a separate plug-in module. Elements are interconnected by a 50-Ω cable with at least one termination. Most module inputs and outputs are compatible so that, for example, a discriminator can either drive or be driven from a coincidence circuit by switching cables. To ensure reliable high-speed operation and good time and temperature stability the transistors were operated at unity charge-gain either in a current-switching mode or in a linear mode as a distributed amplifier. Each tunnel diode provided an additional switching charge-gain of from 2 to 5 with the same stability and bandpass as the transistors. Each module was designed for operation up to a continuous counting rate of 100 megapulses per second. High system duty cycles were made possible by DC interconnections and by double-delay-line clipping for recovery between pulses. No loss in system performance is anticipated for counting rates to 50 Mc. The basic discriminator has a sensitivity adjustable from 2 to 10 mA with a DC-coupled output of 10 mA at ground potential. Output rise and fall times are 1 ns; pulse width is set by delay cable; maximum output duty cycle is 50% for 95% input recovery. Time jitter from threshold firing to three times that level is 2 ns or less. A more sophisticated version has 10 times the sensitivity. It has a distributed amplifier and a switching chain of two tunnel diodes. Output specifications are the same. Other logic systems for discriminators will also be

  8. NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

    Directory of Open Access Journals (Sweden)

    Joeri Ruyssinck

    One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made
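    The rankwise averaging step described above can be sketched as follows. This is a simplified illustration, not the NIMEFI implementation: the function names and importance scores are hypothetical, and ties within one method's scores are not given any special treatment.

```python
def ranks(scores):
    """Map each candidate link to its rank (1 = most important) under one method."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {item: position for position, item in enumerate(ordered, start=1)}


def rank_average(score_tables):
    """Average the per-method ranks of every candidate link; lower is better."""
    per_method = [ranks(table) for table in score_tables]
    items = per_method[0].keys()
    return {item: sum(r[item] for r in per_method) / len(per_method) for item in items}


# Hypothetical importance scores for three candidate regulatory links.
method_a = {("g1", "g2"): 0.9, ("g1", "g3"): 0.2, ("g2", "g3"): 0.5}
method_b = {("g1", "g2"): 0.4, ("g1", "g3"): 0.1, ("g2", "g3"): 0.7}
combined = rank_average([method_a, method_b])
best_first = sorted(combined, key=combined.get)
```

    Averaging ranks rather than raw scores sidesteps the fact that different feature importance methods report importances on incomparable scales.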

  9. A meteo-hydrological prediction system based on a multi-model approach for precipitation forecasting

    Directory of Open Access Journals (Sweden)

    S. Davolio

    2008-02-01

    The precipitation forecasted by a numerical weather prediction model, even at high resolution, suffers from errors which can be considerable at the scales of interest for hydrological purposes. In the present study, a fraction of the uncertainty related to meteorological prediction is taken into account by implementing a multi-model forecasting approach, aimed at providing multiple precipitation scenarios driving the same hydrological model. Therefore, the estimation of that uncertainty associated with the quantitative precipitation forecast (QPF), conveyed by the multi-model ensemble, can be exploited by the hydrological model, propagating the error into the hydrological forecast.

    The proposed meteo-hydrological forecasting system is implemented and tested in a real-time configuration for several episodes of intense precipitation affecting the Reno river basin, a medium-sized basin located in northern Italy (Apennines). These episodes are associated with flood events of different intensity and are representative of different meteorological configurations responsible for severe weather affecting the northern Apennines.

    The simulation results show that the coupled system is promising in the prediction of discharge peaks (both in terms of amount and timing) for warning purposes. The ensemble hydrological forecasts provide a range of possible flood scenarios that proved to be useful for the support of civil protection authorities in their decisions.

  10. Multilevel ensemble Kalman filter

    KAUST Repository

    Chernov, Alexey; Hoel, Haakon; Law, Kody; Nobile, Fabio; Tempone, Raul

    2016-01-01

    This work embeds a multilevel Monte Carlo (MLMC) sampling strategy into the Monte Carlo step of the ensemble Kalman filter (EnKF). In terms of computational cost vs. approximation error, the asymptotic performance of the multilevel ensemble Kalman filter (MLEnKF) is superior to that of the EnKF.
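    For orientation, the multilevel Monte Carlo idea rests on a telescoping identity over a hierarchy of approximation levels. The notation below is generic and assumed for illustration (g_ℓ is a quantity computed at resolution level ℓ, N_ℓ the number of samples at that level); it is not taken from the paper itself.

```latex
\mathbb{E}[g_L]
  = \mathbb{E}[g_0] + \sum_{\ell=1}^{L} \mathbb{E}\bigl[g_\ell - g_{\ell-1}\bigr]
  \approx \frac{1}{N_0}\sum_{i=1}^{N_0} g_0^{(i)}
    + \sum_{\ell=1}^{L} \frac{1}{N_\ell}\sum_{i=1}^{N_\ell}
      \bigl(g_\ell^{(i)} - g_{\ell-1}^{(i)}\bigr)
```

    Because the level differences have small variance when consecutive levels are strongly correlated, most samples can be drawn on the cheap coarse levels and only a few on the expensive fine ones.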

  11. Multilevel ensemble Kalman filter

    KAUST Repository

    Chernov, Alexey

    2016-01-06

    This work embeds a multilevel Monte Carlo (MLMC) sampling strategy into the Monte Carlo step of the ensemble Kalman filter (EnKF). In terms of computational cost vs. approximation error, the asymptotic performance of the multilevel ensemble Kalman filter (MLEnKF) is superior to that of the EnKF.

  12. The Ensembl REST API: Ensembl Data for Any Language.

    Science.gov (United States)

    Yates, Andrew; Beal, Kathryn; Keenan, Stephen; McLaren, William; Pignatelli, Miguel; Ritchie, Graham R S; Ruffier, Magali; Taylor, Kieron; Vullo, Alessandro; Flicek, Paul

    2015-01-01

    We present a Web service to access Ensembl data using Representational State Transfer (REST). The Ensembl REST server enables the easy retrieval of a wide range of Ensembl data by most programming languages, using standard formats such as JSON and FASTA while minimizing client work. We also introduce bindings to the popular Ensembl Variant Effect Predictor tool permitting large-scale programmatic variant analysis independent of any specific programming language. The Ensembl REST API can be accessed at http://rest.ensembl.org and source code is freely available under an Apache 2.0 license from http://github.com/Ensembl/ensembl-rest. © The Author 2014. Published by Oxford University Press.
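    As an illustration of the language-independent access described above, the sketch below builds a request to the service's lookup-by-identifier endpoint using only the Python standard library. The helper names are hypothetical, the HTTPS form of the base URL is an assumption (the abstract gives http://rest.ensembl.org), and ENSG00000157764 is used purely as an example stable identifier.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://rest.ensembl.org"   # assumed HTTPS form of the address in the abstract


def lookup_url(stable_id, base_url=BASE_URL):
    """Build the URL for the lookup-by-identifier endpoint, asking for JSON."""
    path = "/lookup/id/" + urllib.parse.quote(stable_id)
    return base_url + path + "?content-type=application/json"


def lookup(stable_id):
    """Fetch and decode the JSON record for one stable identifier (needs network)."""
    with urllib.request.urlopen(lookup_url(stable_id), timeout=30) as response:
        return json.load(response)


url = lookup_url("ENSG00000157764")   # example identifier; no request is sent here
```

    Calling lookup("ENSG00000157764") would perform the actual request; it is left out of the example so that the snippet runs without network access.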

  13. Data assimilation for groundwater flow modelling using Unbiased Ensemble Square Root Filter: Case study in Guantao, North China Plain

    Science.gov (United States)

    Li, N.; Kinzelbach, W.; Li, H.; Li, W.; Chen, F.; Wang, L.

    2017-12-01

    Data assimilation techniques are widely used in hydrology to improve the reliability of hydrological models and to reduce model predictive uncertainties. This provides critical information for decision makers in water resources management. This study aims to evaluate a data assimilation system for the Guantao groundwater flow model coupled with a one-dimensional soil column simulation (Hydrus 1D) using an Unbiased Ensemble Square Root Filter (UnEnSRF), which originates from the Ensemble Kalman Filter (EnKF), to update parameters and states, separately or simultaneously. To simplify the coupling between the unsaturated and saturated zones, a linear relationship obtained from analyzing inputs to and outputs from Hydrus 1D is applied in the data assimilation process. Unlike the EnKF, the UnEnSRF updates the parameter ensemble mean and the ensemble perturbations separately. In order to keep the ensemble filter working well during the data assimilation, two factors are introduced in the study. One, called the damping factor, dampens the update amplitude of the posterior ensemble mean to avoid unrealistic values. The other, called the inflation factor, relaxes the posterior ensemble perturbations toward the prior ones to avoid filter inbreeding problems. The sensitivities of the two factors are studied and their favorable values for the Guantao model are determined. The appropriate observation error and ensemble size were also determined to facilitate the further analysis. This study demonstrated that the data assimilation of both model parameters and states gives a smaller model prediction error but with larger uncertainty, while the data assimilation of only model states provides a smaller predictive uncertainty but with a larger model prediction error. Data assimilation in a groundwater flow model will improve model prediction and at the same time make the model converge to the true parameters, which provides a successful base for applications in real-time modelling or real-time control strategies.
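    The two factors described above can be illustrated with a short sketch. The abstract does not give the exact formulas, so the linear blending forms below are assumptions (a damped mean update and a relaxation of perturbations toward the prior), and all names and numbers are hypothetical.

```python
def damp_mean(prior_mean, analysis_mean, damping):
    """Move the mean only a fraction `damping` (0..1) of the way to the analysis."""
    return prior_mean + damping * (analysis_mean - prior_mean)


def relax_perturbations(prior_perts, analysis_perts, inflation):
    """Blend analysis perturbations back toward the prior ones.

    inflation = 0 keeps the analysis spread; inflation = 1 restores the prior spread.
    """
    return [
        inflation * p + (1.0 - inflation) * a
        for p, a in zip(prior_perts, analysis_perts)
    ]


# Hypothetical values for a single parameter and a three-member ensemble.
mean = damp_mean(prior_mean=2.0, analysis_mean=4.0, damping=0.5)
perts = relax_perturbations([-1.0, 0.0, 1.0], [-0.2, 0.0, 0.2], inflation=0.5)
```

    With these numbers the mean moves halfway to the analysis value, and the ensemble spread ends up between the collapsed analysis spread and the original prior spread.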

  14. Musical ensembles in Ancient Mesopotamia

    NARCIS (Netherlands)

    Krispijn, T.J.H.; Dumbrill, R.; Finkel, I.

    2010-01-01

    Identification of musical instruments from ancient Mesopotamia by comparing musical ensembles attested in Sumerian and Akkadian texts with depicted ensembles. Lexicographical contributions to the Sumerian and Akkadian lexicon.

  15. Extending Climate Analytics as a Service to the Earth System Grid Federation Progress Report on the Reanalysis Ensemble Service

    Science.gov (United States)

    Tamkin, G.; Schnase, J. L.; Duffy, D.; Li, J.; Strong, S.; Thompson, J. H.

    2016-12-01

    We are extending climate analytics-as-a-service, including: (1) A high-performance Virtual Real-Time Analytics Testbed supporting six major reanalysis data sets using advanced technologies like the Cloudera Impala-based SQL and Hadoop-based MapReduce analytics over native NetCDF files. (2) A Reanalysis Ensemble Service (RES) that offers a basic set of commonly used operations over the reanalysis collections that are accessible through NASA's climate data analytics Web services and our client-side Climate Data Services Python library, CDSlib. (3) An Open Geospatial Consortium (OGC) WPS-compliant Web service interface to CDSlib to accommodate ESGF's Web service endpoints. This presentation will report on the overall progress of this effort, with special attention to recent enhancements that have been made to the Reanalysis Ensemble Service, including the following: - A CDSlib Python library that supports full temporal, spatial, and grid-based resolution services - A new reanalysis collections reference model to enable operator design and implementation - An enhanced library of sample queries to demonstrate and develop use case scenarios - Extended operators that enable single- and multiple-reanalysis area average, vertical average, re-gridding, and trend, climatology, and anomaly computations - Full support for the MERRA-2 reanalysis and the initial integration of two additional reanalyses - A prototype Jupyter notebook-based distribution mechanism that combines CDSlib documentation with interactive use case scenarios and personalized project management - Prototyped uncertainty quantification services that combine ensemble products with comparative observational products - Convenient, one-stop shopping for commonly used data products from multiple reanalyses, including basic subsetting and arithmetic operations over the data and extractions of trends, climatologies, and anomalies - The ability to compute and visualize multiple reanalysis intercomparisons

  16. Ensemble mean climatology of snow darkening effect due to deposition of dust, black carbon, and organic carbon as simulated with the NASA GEOS-5 Earth System Model

    Science.gov (United States)

    Yasunari, T. J.; Lau, W. K.; Mahanama, S. P.; Colarco, P. R.; Koster, R. D.; Kim, K.; da Silva, A.

    2013-12-01

    The importance of the snow darkening effect (SDE) caused by solar absorbing aerosols such as dust and black carbon (BC) on climate has been discussed in previous studies. We have developed a snow darkening package for the catchment land surface model coupled to the NASA Goddard Earth Observing System, version 5 (GEOS-5), Earth System Model. Our snow darkening package includes the schemes for snow albedo and mass concentration calculations in snow polluted by dust, BC, and organic carbon (OC) depositions. The snow darkening package is currently available for seasonal snowpack over the model-defined land areas, excluding sea ice and inland of the ice sheets. The depositions of the solar absorbing aerosols are obtained from the GOCART aerosol module in the GEOS-5. Here we show the preliminary results of ensemble mean climatology (EMC) of the full SDE (i.e., dust+BC+OC). Ensemble simulations covering the 10 years 2002-2011 were carried out with the GEOS-5 including and excluding the full SDE, each with 10 ensemble members. Shortwave radiative forcing (RF) at the top of atmosphere under all-sky condition for the 10-member EMC of the full SDE was relatively larger over Europe, Central Asia (CA), the Himalayas, the Tibetan Plateau (TP), East Asia (EA), Eastern Siberia (ES), the US, and Canadian Arctic. The RF was the strongest over the Himalayas and the TP in the northern hemisphere. The increases of surface air temperature also correspond well to the RF pattern. Larger reductions of snow water equivalent in seasonal snowpack were seen over the Himalayas, the TP, Alaska, Western Canada, and Arctic regions. We will discuss more on the day of the presentation.

  17. Stochastic Prediction of Ventilation System Performance

    DEFF Research Database (Denmark)

    Haghighat, F.; Brohus, Henrik; Frier, Christian

    The paper briefly reviews the existing techniques for predicting the airflow rate due to the random nature of forcing functions, e.g. wind speed. The effort is to establish the relationship between the statistics of the output of a system and the statistics of the random input variables and param...

  18. Sensitivity of CAM-Chem/DART MOPITT CO Assimilation Performance to the Choice of Ensemble System Configuration: A Case Study for Fires in the Amazon

    Science.gov (United States)

    Arellano, A. F., Jr.; Tang, W.

    2017-12-01

    Assimilating observational data of chemical constituents into a modeling system is a powerful approach in assessing changes in atmospheric composition and estimating associated emissions. However, the results of such chemical data assimilation (DA) experiments are largely subject to various key factors such as: a) a priori information, b) error specification and representation, and c) structural biases in the modeling system. Here we investigate the sensitivity of ensemble-based data assimilation state and emission estimates to these key factors. We focus on investigating the assimilation performance of the Community Earth System Model (CESM)/CAM-Chem with the Data Assimilation Research Testbed (DART) in representing biomass burning plumes in Amazonia during the 2008 fire season. We conduct the following ensemble DA MOPITT CO experiments: 1) use of monthly-average NCAR's FINN surface fire emissions, 2) use of daily FINN surface fire emissions, 3) use of daily FINN emissions with climatological injection heights, and 4) use of perturbed FINN emission parameters to represent not only the uncertainties in combustion activity but also in combustion efficiency. We show key diagnostics of assimilation performance for these experiments and verify with available ground-based and aircraft-based measurements.

  19. Creating ensembles of decision trees through sampling

    Science.gov (United States)

    Kamath, Chandrika; Cantu-Paz, Erick

    2005-08-30

    A system for decision tree ensembles that includes a module to read the data, a module to sort the data, a module to evaluate a potential split of the data according to some criterion using a random sample of the data, a module to split the data, and a module to combine multiple decision trees in ensembles. The decision tree method is based on statistical sampling techniques and includes the steps of reading the data; sorting the data; evaluating a potential split according to some criterion using a random sample of the data, splitting the data, and combining multiple decision trees in ensembles.
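
    The split-evaluation step described above can be illustrated with a minimal sketch. It is not the patented method: the Gini criterion, binary 0/1 labels, a single numeric feature, and the names `gini` and `best_split_on_sample` are all assumptions chosen for illustration. The sketch only shows the idea of scoring candidate splits on a random sample rather than on the full data.

```python
import random


def gini(labels):
    """Gini impurity for binary labels (0 or 1)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of class 1
    return 2.0 * p * (1.0 - p)


def best_split_on_sample(xs, ys, sample_size, rng):
    """Return (threshold, score) minimizing weighted Gini on a random sample.

    Only the sampled rows are sorted and scored, which is the point of
    evaluating a potential split "using a random sample of the data".
    """
    idx = rng.sample(range(len(xs)), min(sample_size, len(xs)))
    pairs = sorted((xs[i], ys[i]) for i in idx)
    best_thr, best_score = None, float("inf")
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold fits between equal feature values
        thr = (pairs[k - 1][0] + pairs[k][0]) / 2.0
        left = [y for x, y in pairs if x <= thr]
        right = [y for x, y in pairs if x > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_thr, best_score = thr, score
    return best_thr, best_score


# Hypothetical toy data: the class flips between x = 9 and x = 10.
xs = [float(i) for i in range(20)]
ys = [0 if x < 10 else 1 for x in xs]

# Evaluate the split on a sample of 8 of the 20 rows.
print(best_split_on_sample(xs, ys, sample_size=8, rng=random.Random(0)))
```

    In an ensemble, each tree would draw its own samples, so the trees differ even when trained on the same data; combining them (for example by voting) is a separate step not shown here.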

  20. JuPOETs: a constrained multiobjective optimization approach to estimate biochemical model ensembles in the Julia programming language.

    Science.gov (United States)

    Bassen, David M; Vilkhovoy, Michael; Minot, Mason; Butcher, Jonathan T; Varner, Jeffrey D

    2017-01-25

    Ensemble modeling is a promising approach for obtaining robust predictions and coarse grained population behavior in deterministic mathematical models. Ensemble approaches address model uncertainty by using parameter or model families instead of single best-fit parameters or fixed model structures. Parameter ensembles can be selected based upon simulation error, along with other criteria such as diversity or steady-state performance. Simulations using parameter ensembles can estimate confidence intervals on model variables, and robustly constrain model predictions, despite having many poorly constrained parameters. In this software note, we present a multiobjective based technique to estimate parameter or model ensembles, the Pareto Optimal Ensemble Technique in the Julia programming language (JuPOETs). JuPOETs integrates simulated annealing with Pareto optimality to estimate ensembles on or near the optimal tradeoff surface between competing training objectives. We demonstrate JuPOETs on a suite of multiobjective problems, including test functions with parameter bounds and system constraints as well as for the identification of a proof-of-concept biochemical model with four conflicting training objectives. JuPOETs identified optimal or near optimal solutions approximately six-fold faster than a corresponding implementation in Octave for the suite of test functions. For the proof-of-concept biochemical model, JuPOETs produced an ensemble of parameters that gave the mean of the training data for conflicting data sets, while simultaneously estimating parameter sets that performed well on each of the individual objective functions. JuPOETs is a promising approach for the estimation of parameter and model ensembles using multiobjective optimization. JuPOETs can be adapted to solve many problem types, including mixed binary and continuous variable types, bilevel optimization problems and constrained problems without altering the base algorithm. JuPOETs is open

  1. Spatial Ensemble Postprocessing of Precipitation Forecasts Using High Resolution Analyses

    Science.gov (United States)

    Lang, Moritz N.; Schicker, Irene; Kann, Alexander; Wang, Yong

    2017-04-01

    Ensemble prediction systems are designed to account for errors or uncertainties in the initial and boundary conditions, imperfect parameterizations, etc. However, due to sampling errors and underestimation of the model errors, these ensemble forecasts tend to be underdispersive, and to lack both reliability and sharpness. To overcome such limitations, statistical postprocessing methods are commonly applied to these forecasts. In this study, a full-distributional spatial post-processing method is applied to short-range precipitation forecasts over Austria using Standardized Anomaly Model Output Statistics (SAMOS). Following Stauffer et al. (2016), observation and forecast fields are transformed into standardized anomalies by subtracting a site-specific climatological mean and dividing by the climatological standard deviation. Because only a single regression model needs to be fitted for the whole domain, the SAMOS framework provides a computationally inexpensive method to create operationally calibrated probabilistic forecasts for any arbitrary location or for all grid points in the domain simultaneously. Taking advantage of the INCA system (Integrated Nowcasting through Comprehensive Analysis), high resolution analyses are used for the computation of the observed climatology and for model training. The INCA system operationally combines station measurements and remote sensing data into real-time objective analysis fields at 1 km horizontal resolution and 1 h temporal resolution. The precipitation forecast used in this study is obtained from a limited area model ensemble prediction system also operated by ZAMG. The so-called ALADIN-LAEF provides, by applying a multi-physics approach, a 17-member forecast at a horizontal resolution of 10.9 km and a temporal resolution of 1 hour. The performed SAMOS approach statistically combines the in-house developed high resolution analysis and ensemble prediction system. The station-based validation of 6 hour precipitation sums
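
    The standardized-anomaly transformation mentioned in this abstract (subtract a site-specific climatological mean, divide by the climatological standard deviation) can be sketched as follows. This covers only the transformation step, not the SAMOS regression itself; the function names and the toy numbers are hypothetical and chosen purely for illustration.

```python
from statistics import mean, pstdev


def standardized_anomalies(values, clim_mean, clim_sd):
    """Convert raw values to standardized anomalies: (x - mean) / sd."""
    if clim_sd <= 0:
        raise ValueError("climatological standard deviation must be positive")
    return [(x - clim_mean) / clim_sd for x in values]


def restore(anomalies, clim_mean, clim_sd):
    """Map standardized anomalies back to physical units."""
    return [a * clim_sd + clim_mean for a in anomalies]


# Hypothetical climatological sample at one site (e.g. precipitation in mm).
climatology = [0.0, 1.0, 2.0, 3.0, 4.0]
site_mean = mean(climatology)
site_sd = pstdev(climatology)

# Hypothetical ensemble forecast at the same site, in the same units.
ensemble_forecast = [2.0, 4.0, 6.0]
anoms = standardized_anomalies(ensemble_forecast, site_mean, site_sd)
print(anoms)
print(restore(anoms, site_mean, site_sd))
```

    Because every site is expressed relative to its own climatology, values from different locations become comparable, which is what allows one regression model to serve the whole domain.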

  2. Ensemble data assimilation in the Red Sea: sensitivity to ensemble selection and atmospheric forcing

    KAUST Repository

    Toye, Habib; Zhan, Peng; Gopalakrishnan, Ganesh; Kartadikaria, Aditya R.; Huang, Huang; Knio, Omar; Hoteit, Ibrahim

    2017-01-01

    We present our efforts to build an ensemble data assimilation and forecasting system for the Red Sea. The system consists of the high-resolution Massachusetts Institute of Technology general circulation model (MITgcm) to simulate ocean circulation

  3. Evaluating the applicability of using daily forecasts from seasonal prediction systems (SPSs) for agriculture: a case study of Nepal's Terai with the NCEP CFSv2

    Science.gov (United States)

    Jha, Prakash K.; Athanasiadis, Panos; Gualdi, Silvio; Trabucco, Antonio; Mereu, Valentina; Shelia, Vakhtang; Hoogenboom, Gerrit

    2018-03-01

    Ensemble forecasts from dynamic seasonal prediction systems (SPSs) have the potential to improve decision-making for crop management to help cope with interannual weather variability. Because the reliability of crop yield predictions based on seasonal weather forecasts depends on the quality of the forecasts, it is essential to evaluate forecasts prior to agricultural applications. This study analyses the potential of Climate Forecast System version 2 (CFSv2) in predicting the Indian summer monsoon (ISM) for producing meteorological variables relevant to crop modeling. The focus area was Nepal's Terai region, and the local hindcasts were compared with weather station and reanalysis data. The results showed that the CFSv2 model accurately predicts monthly anomalies of daily maximum and minimum air temperature (Tmax and Tmin) as well as incoming total surface solar radiation (Srad). However, the daily climatologies of the respective CFSv2 hindcasts exhibit significant systematic biases compared to weather station data. The CFSv2 is less capable of predicting monthly precipitation anomalies and simulating the respective intra-seasonal variability over the growing season. Nevertheless, the observed daily climatologies of precipitation fall within the ensemble spread of the respective daily climatologies of CFSv2 hindcasts. These limitations in the CFSv2 seasonal forecasts, primarily in precipitation, restrict the potential application for predicting the interannual variability of crop yield associated with weather variability. Despite these limitations, ensemble averaging of the simulated yield using all CFSv2 members after applying bias correction may lead to satisfactory yield predictions.

  4. Impacts of projected maximum temperature extremes for C21 by an ensemble of regional climate models on cereal cropping systems in the Iberian Peninsula

    Directory of Open Access Journals (Sweden)

    M. Ruiz-Ramos

    2011-12-01

    Crops growing in the Iberian Peninsula may be subjected to damagingly high temperatures during the sensitive development periods of flowering and grain filling. Such episodes are considered important hazards and farmers may take insurance to offset their impact. Increases in value and frequency of maximum temperature have been observed in the Iberian Peninsula during the 20th century, and studies on climate change indicate the possibility of further increase by the end of the 21st century. Here, impacts of current and future high temperatures on cereal cropping systems of the Iberian Peninsula are evaluated, focusing on vulnerable development periods of winter and summer crops. Climate change scenarios obtained from an ensemble of ten Regional Climate Models (multimodel ensemble) combined with crop simulation models were used for this purpose and related uncertainty was estimated. Results reveal that higher extremes of maximum temperature represent a threat to summer-grown but not to winter-grown crops in the Iberian Peninsula. The study highlights the different vulnerability of crops in the two growing seasons and the need to account for changes in extreme temperatures in developing adaptations in cereal cropping systems. Finally, this work contributes to clarifying the causes of high-uncertainty impact projections from previous studies.

  5. Ensemble Data Mining Methods

    Data.gov (United States)

    National Aeronautics and Space Administration — Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods that leverage the power of multiple models to achieve...

  6. Nuclear multifragmentation within the framework of different statistical ensembles

    International Nuclear Information System (INIS)

    Aguiar, C.E.; Donangelo, R.; Souza, S.R.

    2006-01-01

    The sensitivity of the statistical multifragmentation model to the underlying statistical assumptions is investigated. We concentrate on its microcanonical, canonical, and isobaric formulations. As far as average values are concerned, our results reveal that all the ensembles make very similar predictions, as long as the relevant macroscopic variables (such as temperature, excitation energy, and breakup volume) are the same in all statistical ensembles. It also turns out that the multiplicity dependence of the breakup volume in the microcanonical version of the model mimics a system at (approximately) constant pressure, at least in the plateau region of the caloric curve. However, in contrast to average values, our results suggest that the distributions of physical observables are quite sensitive to the statistical assumptions. This finding may help in deciding which hypothesis corresponds to the best picture for the freeze-out stage

  7. System Predicts Critical Runway Performance Parameters

    Science.gov (United States)

    Millen, Ernest W.; Person, Lee H., Jr.

    1990-01-01

    Runway-navigation-monitor (RNM) and critical-distances-process electronic equipment designed to provide pilot with timely and reliable predictive navigation information relating to takeoff, landing and runway-turnoff operations. Enables pilot to make critical decisions about runway maneuvers with high confidence during emergencies. Utilizes ground-referenced position data only to drive purely navigational monitor system independent of statuses of systems in aircraft.

  8. An Ensemble of Classifiers based Approach for Prediction of Alzheimer's Disease using fMRI Images based on Fusion of Volumetric, Textural and Hemodynamic Features

    Directory of Open Access Journals (Sweden)

    MALIK, F.

    2018-02-01

    Alzheimer's is a neurodegenerative disease caused by the destruction and death of brain neurons, resulting in memory loss, impaired thinking ability, and certain behavioral changes. Alzheimer's disease is a major cause of dementia and eventually death all around the world. Early diagnosis of the disease is crucial, as it can help patients maintain their level of independence for a comparatively longer time and live the best life possible. For early detection of Alzheimer's disease, we propose a novel approach based on fusion of multiple types of features including hemodynamic, volumetric and textural features of the brain. Our approach uses non-invasive fMRI with an ensemble of classifiers for the classification of normal controls and Alzheimer's patients. For performance evaluation, ten-fold cross validation is used. Individual feature sets and fusion of features have been investigated with ensemble classifiers for successful classification of Alzheimer's patients from normal controls. It is observed that fusion of features resulted in improved results for accuracy, specificity and sensitivity.

  9. Orbital magnetism in ensembles of ballistic billiards

    International Nuclear Information System (INIS)

    Ullmo, D.; Richter, K.; Jalabert, R.A.

    1993-01-01

    The magnetic response of ensembles of small two-dimensional structures at finite temperatures is calculated. Using semiclassical methods and numerical calculation it is demonstrated that only short classical trajectories are relevant. The magnetic susceptibility is enhanced in regular systems, where these trajectories appear in families. For ensembles of squares large paramagnetic susceptibility is obtained, in good agreement with recent measurements in the ballistic regime. (authors). 20 refs., 2 figs

  10. MDOT Pavement Management System : Prediction Models and Feedback System

    Science.gov (United States)

    2000-10-01

    As a primary component of a Pavement Management System (PMS), prediction models are crucial for one or more of the following analyses: maintenance planning, budgeting, life-cycle analysis, multi-year optimization of maintenance works program, and a...

  11. Nonequilibrium statistical mechanics ensemble method

    CERN Document Server

    Eu, Byung Chan

    1998-01-01

    In this monograph, nonequilibrium statistical mechanics is developed by means of ensemble methods on the basis of the Boltzmann equation, the generic Boltzmann equations for classical and quantum dilute gases, and a generalised Boltzmann equation for dense simple fluids. The theories are developed in forms parallel with the equilibrium Gibbs ensemble theory in a way fully consistent with the laws of thermodynamics. The generalised hydrodynamics equations are the integral part of the theory and describe the evolution of macroscopic processes in accordance with the laws of thermodynamics of systems far removed from equilibrium. Audience: This book will be of interest to researchers in the fields of statistical mechanics, condensed matter physics, gas dynamics, fluid dynamics, rheology, irreversible thermodynamics and nonequilibrium phenomena.

  12. Online sequential condition prediction method of natural circulation systems based on EOS-ELM and phase space reconstruction

    International Nuclear Information System (INIS)

    Chen, Hanying; Gao, Puzhen; Tan, Sichao; Tang, Jiguo; Yuan, Hongsheng

    2017-01-01

    Highlights: • An online condition prediction method for natural circulation systems in NPP was proposed based on EOS-ELM. • The proposed online prediction method was validated using experimental data. • The training of the proposed method is very fast. • The proposed method can achieve good accuracy in a wide parameter range. -- Abstract: Natural circulation design is widely used in the passive safety systems of advanced nuclear power reactors. Irregular and chaotic flow oscillations are often observed in boiling natural circulation systems, so it is difficult for operators to monitor and predict the condition of these systems. An online condition forecasting method for natural circulation systems is proposed in this study as an assisting technique for plant operators. The proposed prediction approach was developed based on Ensemble of Online Sequential Extreme Learning Machine (EOS-ELM) and phase space reconstruction. Online Sequential Extreme Learning Machine (OS-ELM) is an online sequential learning neural network algorithm and EOS-ELM is the ensemble method of it. The proposed condition prediction method can be initiated by a small chunk of monitoring data and it can be updated by newly arrived data at very fast speed during the online prediction. Simulation experiments were conducted on the data of two natural circulation loops to validate the performance of the proposed method. The simulation results show that the proposed prediction model can successfully recognize different types of flow oscillations and accurately forecast the trend of monitored plant variables. The influence of the number of hidden nodes and neural network inputs on prediction performance was studied and the proposed model can achieve good accuracy in a wide parameter range. Moreover, the comparison results show that the proposed condition prediction method has much faster online learning speed and better prediction accuracy than a conventional neural network model.

  13. Ocean-Atmosphere Coupling Processes Affecting Predictability in the Climate System

    Science.gov (United States)

    Miller, A. J.; Subramanian, A. C.; Seo, H.; Eliashiv, J. D.

    2017-12-01

    Predictions of the ocean and atmosphere are often sensitive to coupling at the air-sea interface in ways that depend on the temporal and spatial scales of the target fields. We will discuss several aspects of these types of coupled interactions including oceanic and atmospheric forecast applications. For oceanic mesoscale eddies, the coupling can influence the energetics of the oceanic flow itself. For Madden-Julian Oscillation onset, the coupling timestep should resolve the diurnal cycle to properly raise time-mean SST and latent heat flux prior to deep convection. For Atmospheric River events, the evolving SST field can alter the trajectory and intensity of precipitation anomalies along the California coast. Improvements in predictions will also rely on identifying and alleviating sources of biases in the climate states of the coupled system. Surprisingly, forecast skill can also be improved by enhancing stochastic variability in the atmospheric component of coupled models as found in a multiscale ensemble modeling approach.

  14. Ensemble-Based Data Assimilation in Reservoir Characterization: A Review

    Directory of Open Access Journals (Sweden)

    Seungpil Jung

    2018-02-01

    This paper presents a review of ensemble-based data assimilation for strongly nonlinear problems on the characterization of heterogeneous reservoirs with different production histories. It concentrates on ensemble Kalman filter (EnKF) and ensemble smoother (ES) as representative frameworks, discusses their pros and cons, and investigates recent progress to overcome their drawbacks. The typical weaknesses of ensemble-based methods are non-Gaussian parameters, improper prior ensembles and finite population size. Three categories of approaches to mitigate these limitations are reviewed with recent accomplishments: improvement of Kalman gains, add-on of transformation functions, and independent evaluation of observed data. The data assimilation in heterogeneous reservoirs, applying the improved ensemble methods, is discussed on predicting unknown dynamic data in reservoir characterization.

  15. A retrospective streamflow ensemble forecast for an extreme hydrologic event: a case study of Hurricane Irene and on the Hudson River basin

    Science.gov (United States)

    Saleh, Firas; Ramaswamy, Venkatsundar; Georgas, Nickitas; Blumberg, Alan F.; Pullen, Julie

    2016-07-01

    This paper investigates the uncertainties in hourly streamflow ensemble forecasts for an extreme hydrological event using a hydrological model forced with short-range ensemble weather prediction models. A state-of-the-art, automated, short-term hydrologic prediction framework was implemented using GIS and a regional scale hydrological model (HEC-HMS). The hydrologic framework was applied to the Hudson River basin (~36,000 km²) in the United States using gridded precipitation data from the National Centers for Environmental Prediction (NCEP) North American Regional Reanalysis (NARR) and was validated against streamflow observations from the United States Geological Survey (USGS). Finally, 21 precipitation ensemble members of the latest Global Ensemble Forecast System (GEFS/R) were forced into HEC-HMS to generate a retrospective streamflow ensemble forecast for an extreme hydrological event, Hurricane Irene. The work shows that ensemble stream discharge forecasts provide improved predictions and useful information about associated uncertainties, thus improving the assessment of risks when compared with deterministic forecasts. The uncertainties in weather inputs may result in false warnings and missed river flooding events, reducing the potential to effectively mitigate flood damage. The findings demonstrate how errors in the ensemble median streamflow forecast and time of peak, as well as the ensemble spread (uncertainty) are reduced 48 h pre-event by utilizing the ensemble framework. The methodology and implications of this work benefit efforts of short-term streamflow forecasts at regional scales, notably regarding the peak timing of an extreme hydrologic event when combined with a flood threshold exceedance diagram. Although the modeling framework was implemented on the Hudson River basin, it is flexible and applicable in other parts of the world where atmospheric reanalysis products and streamflow data are available.

  16. On the contribution of local feedback mechanisms to the range of climate sensitivity in two GCM ensembles

    Energy Technology Data Exchange (ETDEWEB)

    Webb, M.J.; Senior, C.A.; Sexton, D.M.H.; Ingram, W.J.; Williams, K.D.; Ringer, M.A. [Hadley Centre for Climate Prediction and Research, Met Office, Exeter (United Kingdom); McAvaney, B.J.; Colman, R. [Bureau of Meteorology Research Centre (BMRC), Melbourne (Australia); Soden, B.J. [University of Miami, Rosenstiel School for Marine and Atmospheric Science, Miami, FL (United States); Gudgel, R.; Knutson, T. [Geophysical Fluid Dynamics Laboratory (GFDL), Princeton, NJ (United States); Emori, S.; Ogura, T. [National Institute for Environmental Studies (NIES), Tsukuba (Japan); Tsushima, Y. [Japan Agency for Marine-Earth Science and Technology, Frontier Research Center for Global Change (FRCGC), Kanagawa (Japan); Andronova, N. [University of Michigan, Department of Atmospheric, Oceanic and Space Sciences, Ann Arbor, MI (United States); Li, B. [University of Illinois at Urbana-Champaign (UIUC), Department of Atmospheric Sciences, Urbana, IL (United States); Musat, I.; Bony, S. [Institut Pierre Simon Laplace (IPSL), Paris (France); Taylor, K.E. [Program for Climate Model Diagnosis and Intercomparison (PCMDI), Livermore, CA (United States)

    2006-07-15

    Global and local feedback analysis techniques have been applied to two ensembles of mixed layer equilibrium CO2 doubling climate change experiments, from the CFMIP (Cloud Feedback Model Intercomparison Project) and QUMP (Quantifying Uncertainty in Model Predictions) projects. Neither of these new ensembles shows evidence of a statistically significant change in the ensemble mean or variance in global mean climate sensitivity when compared with the results from the mixed layer models quoted in the Third Assessment Report of the IPCC. Global mean feedback analysis of these two ensembles confirms the large contribution made by inter-model differences in cloud feedbacks to those in climate sensitivity in earlier studies; net cloud feedbacks are responsible for 66% of the inter-model variance in the total feedback in the CFMIP ensemble and 85% in the QUMP ensemble. The ensemble mean global feedback components are all statistically indistinguishable between the two ensembles, except for the clear-sky shortwave feedback which is stronger in the CFMIP ensemble. While ensemble variances of the shortwave cloud feedback and both clear-sky feedback terms are larger in CFMIP, there is considerable overlap in the cloud feedback ranges; QUMP spans 80% or more of the CFMIP ranges in longwave and shortwave cloud feedback. We introduce a local cloud feedback classification system which distinguishes different types of cloud feedbacks on the basis of the relative strengths of their longwave and shortwave components, and interpret these in terms of responses of different cloud types diagnosed by the International Satellite Cloud Climatology Project simulator. In the CFMIP ensemble, areas where low-top cloud changes constitute the largest cloud response are responsible for 59% of the contribution from cloud feedback to the variance in the total feedback. A similar figure is found for the QUMP ensemble. Areas of positive low cloud feedback (associated with reductions in low level

  17. Decadal Prediction Skill in the GEOS-5 Forecast System

    Science.gov (United States)

    Ham, Yoo-Geun; Rienecker, Michele M.; Suarez, Max J.; Vikhliaev, Yury; Zhao, Bin; Marshak, Jelena; Vernieres, Guillaume; Schubert, Siegfried D.

    2013-01-01

    A suite of decadal predictions has been conducted with the NASA Global Modeling and Assimilation Office's (GMAO's) GEOS-5 Atmosphere-Ocean general circulation model. The hindcasts are initialized every December 1st from 1959 to 2010, following the CMIP5 experimental protocol for decadal predictions. The initial conditions are from a multivariate ensemble optimal interpolation ocean and sea-ice reanalysis, and from GMAO's atmospheric reanalysis, the Modern-Era Retrospective analysis for Research and Applications. The mean forecast skill of a three-member ensemble is compared to that of an experiment without initialization but also forced with observed greenhouse gases. The results show that initialization increases the forecast skill of North Atlantic sea surface temperature compared to the uninitialized runs, with the increase in skill maintained for almost a decade over the subtropical and mid-latitude Atlantic. On the other hand, the initialization reduces the skill in predicting the warming trend over some regions outside the Atlantic. The annual-mean Atlantic meridional overturning circulation index, which is defined here as the maximum of the zonally-integrated overturning stream function at mid-latitude, is predictable up to a 4-year lead time, consistent with the predictable signal in upper ocean heat content over the North Atlantic. While the 6- to 9-year forecast skill measured by mean squared skill score shows 50 percent improvement in the upper ocean heat content over the subtropical and mid-latitude Atlantic, prediction skill is relatively low in the sub-polar gyre. This low skill is due in part to features in the spatial pattern of the dominant simulated decadal mode in upper ocean heat content over this region that differ from observations. An analysis of the large-scale temperature budget shows that this is the result of a model bias, implying that realistic simulation of the climatological fields is crucial for skillful decadal forecasts.

  18. Multi-model analysis in hydrological prediction

    Science.gov (United States)

    Lanthier, M.; Arsenault, R.; Brissette, F.

    2017-12-01

    Hydrologic modelling, by nature, is a simplification of the real-world hydrologic system. Ensemble hydrological predictions obtained from such models therefore do not present the full range of possible streamflow outcomes, producing ensembles which demonstrate errors in variance such as under-dispersion. Past studies show that lumped models used in prediction mode can return satisfactory results, especially when there is not enough information available on the watershed to run a distributed model. But all lumped models greatly simplify the complex processes of the hydrologic cycle. To generate more spread in the hydrologic ensemble predictions, multi-model ensembles have been considered. In this study, the aim is to propose and analyse a method that gives an ensemble streamflow prediction that properly represents the forecast probabilities and reduces ensemble bias. To achieve this, three simple lumped models are used to generate an ensemble. These will also be combined using multi-model averaging techniques, which generally generate a more accurate hydrograph than the best of the individual models in simulation mode. This new predictive combined hydrograph is added to the ensemble, thus creating a large ensemble which may improve the variability while also improving the ensemble mean bias. The quality of the predictions is then assessed on different periods: 2 weeks, 1 month, 3 months and 6 months using a PIT Histogram of the percentiles of the real observation volumes with respect to the volumes of the ensemble members. Initially, the models were run using historical weather data to generate synthetic flows. This worked for individual models, but not for the multi-model and for the large ensemble. Consequently, by performing data assimilation at each prediction period and thus adjusting the initial states of the models, the PIT Histogram could be constructed using the observed flows while allowing the use of the multi-model predictions. The under-dispersion has been
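
    The diagnostic used in this abstract can be illustrated with a rank histogram, the discrete counterpart of a PIT histogram for a finite ensemble. The sketch below is an assumption about how such a check might be coded, not the authors' implementation; the function name and the toy volumes are hypothetical, and ties between members and observations are ignored for simplicity.

```python
def pit_histogram(ensembles, observations):
    """Count the rank of each observation within its ensemble.

    ensembles: list of lists, one list of member values per forecast case.
    observations: list of observed values, one per forecast case.
    Returns counts for ranks 0..n_members (n_members + 1 bins).
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        if len(members) != n_members:
            raise ValueError("all ensembles must have the same size")
        # Rank = number of members strictly below the observation.
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts


# Hypothetical forecast volumes (3 members) for four prediction periods.
ensembles = [[1.0, 2.0, 3.0]] * 4
# Hypothetical observed volumes, all falling outside the ensemble range.
observations = [0.5, 3.5, 0.2, 3.9]

counts = pit_histogram(ensembles, observations)
print(counts)  # [2, 0, 0, 2]
```

    A flat histogram suggests a well-calibrated ensemble, while the U shape in this toy case (observations piling up in the outer bins) is the signature of the under-dispersion the abstract describes.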

  19. 'Lazy' quantum ensembles

    International Nuclear Information System (INIS)

    Parfionov, George; Zapatrin, Roman

    2006-01-01

    We compare different strategies aimed at preparing an ensemble with a given density matrix ρ. Preparing the ensemble of eigenstates of ρ with appropriate probabilities can be treated as a 'generous' strategy: it provides maximal accessible information about the state. The other extreme is the so-called 'Scrooge' ensemble, which is the most stingy in sharing the information. We introduce 'lazy' ensembles, which require minimal effort to prepare the density matrix by selecting pure states with respect to a completely random choice. We consider two parties, Alice and Bob, playing a kind of game. Bob wishes to guess which pure state is prepared by Alice. His null hypothesis, based on the lack of any information about Alice's intention, is that Alice prepares any pure state with equal probability. Then, the average quantum state measured by Bob turns out to be ρ, and he has to make a new hypothesis about Alice's intention based solely on the information that the observed density matrix is ρ. The arising 'lazy' ensemble is shown to be the alternative hypothesis which minimizes type I error.

  20. The semantic similarity ensemble

    Directory of Open Access Journals (Sweden)

    Andrea Ballatore

    2013-12-01

    Computational measures of semantic similarity between geographic terms provide valuable support across geographic information retrieval, data mining, and information integration. To date, a wide variety of approaches to geo-semantic similarity have been devised. A judgment of similarity is not intrinsically right or wrong, but obtains a certain degree of cognitive plausibility, depending on how closely it mimics human behavior. Thus selecting the most appropriate measure for a specific task is a significant challenge. To address this issue, we make an analogy between computational similarity measures and soliciting domain expert opinions, which incorporate a subjective set of beliefs, perceptions, hypotheses, and epistemic biases. Following this analogy, we define the semantic similarity ensemble (SSE) as a composition of different similarity measures, acting as a panel of experts having to reach a decision on the semantic similarity of a set of geographic terms. The approach is evaluated in comparison to human judgments, and results indicate that an SSE performs better than the average of its parts. Although the best member tends to outperform the ensemble, all ensembles outperform the average performance of each ensemble's member. Hence, in contexts where the best measure is unknown, the ensemble provides a more cognitively plausible approach.

  1. Aircraft noise prediction program theoretical manual: Rotorcraft System Noise Prediction System (ROTONET), part 4

    Science.gov (United States)

    Weir, Donald S.; Jumper, Stephen J.; Burley, Casey L.; Golub, Robert A.

    1995-01-01

    This document describes the theoretical methods used in the rotorcraft noise prediction system (ROTONET), which is a part of the NASA Aircraft Noise Prediction Program (ANOPP). The ANOPP code consists of an executive, database manager, and prediction modules for jet engine, propeller, and rotor noise. The ROTONET subsystem contains modules for the prediction of rotor airloads and performance with momentum theory and prescribed wake aerodynamics, rotor tone noise with compact chordwise and full-surface solutions to the Ffowcs-Williams-Hawkings equations, semiempirical airfoil broadband noise, and turbulence ingestion broadband noise. Flight dynamics, atmosphere propagation, and noise metric calculations are covered in NASA TM-83199, Parts 1, 2, and 3.

  2. The Earth System Prediction Suite: Toward a Coordinated U.S. Modeling Capability

    Science.gov (United States)

    Theurich, Gerhard; DeLuca, C.; Campbell, T.; Liu, F.; Saint, K.; Vertenstein, M.; Chen, J.; Oehmke, R.; Doyle, J.; Whitcomb, T.

    2016-01-01

    The Earth System Prediction Suite (ESPS) is a collection of flagship U.S. weather and climate models and model components that are being instrumented to conform to interoperability conventions, documented to follow metadata standards, and made available either under open source terms or to credentialed users. The ESPS represents a culmination of efforts to create a common Earth system model architecture, and the advent of increasingly coordinated model development activities in the U.S. ESPS component interfaces are based on the Earth System Modeling Framework (ESMF), community-developed software for building and coupling models, and the National Unified Operational Prediction Capability (NUOPC) Layer, a set of ESMF-based component templates and interoperability conventions. This shared infrastructure simplifies the process of model coupling by guaranteeing that components conform to a set of technical and semantic behaviors. The ESPS encourages distributed, multi-agency development of coupled modeling systems, controlled experimentation and testing, and exploration of novel model configurations, such as those motivated by research involving managed and interactive ensembles. ESPS codes include the Navy Global Environmental Model (NavGEM), HYbrid Coordinate Ocean Model (HYCOM), and Coupled Ocean Atmosphere Mesoscale Prediction System (COAMPS); the NOAA Environmental Modeling System (NEMS) and the Modular Ocean Model (MOM); the Community Earth System Model (CESM); and the NASA ModelE climate model and GEOS-5 atmospheric general circulation model.

  3. Hybrid vs Adaptive Ensemble Kalman Filtering for Storm Surge Forecasting

    Science.gov (United States)

    Altaf, M. U.; Raboudi, N.; Gharamti, M. E.; Dawson, C.; McCabe, M. F.; Hoteit, I.

    2014-12-01

    Recent storm surge events due to hurricanes in the Gulf of Mexico have motivated efforts to accurately forecast water levels. Toward this goal, a parallel architecture has been implemented based on a high-resolution storm surge model, ADCIRC. However, the accuracy of the model notably depends on the quality and the recentness of the input data (mainly winds and bathymetry), model parameters (e.g. wind and bottom drag coefficients), and the resolution of the model grid. Given all these uncertainties in the system, the challenge is to build an efficient prediction system capable of providing accurate forecasts far enough ahead of time for the authorities to evacuate the areas at risk. We have developed an ensemble-based data assimilation system to frequently assimilate available data into the ADCIRC model in order to improve the accuracy of the model. In this contribution we study and analyze the performance of different ensemble Kalman filter methodologies for efficient short-range storm surge forecasting, the aim being to produce the most accurate forecasts at the lowest possible computing time. Using Hurricane Ike meteorological data to force the ADCIRC model over a domain including the Gulf of Mexico coastline, we implement and compare the forecasts of the standard EnKF, the hybrid EnKF and an adaptive EnKF. The last two schemes have been introduced as efficient tools for enhancing the behavior of the EnKF when implemented with small ensembles by exploiting information from a static background covariance matrix. Covariance inflation and localization are implemented in all these filters. Our results suggest that both the hybrid and the adaptive approach provide significantly better forecasts than those resulting from the standard EnKF, even when implemented with much smaller ensembles.

  4. Incipient-signature identification of mechanical anomalies in a ship-borne satellite antenna system using an ensemble multiwavelet

    International Nuclear Information System (INIS)

    He, Shuilong; Zi, Yanyang; Chen, Jinglong; Chen, Binqiang; He, Zhengjia; Zhao, Chenlu; Yuan, Jing

    2014-01-01

    The instrumented tracking and telemetry ship with a ship-borne satellite antenna (SSA) is the critical device to ensure high quality of space exploration work. To effectively detect mechanical anomalies that can lead to unexpected downtime of the SSA, an ensemble multiwavelet (EM) is presented for identifying the anomaly-related incipient-signatures within the measured dynamic signals. Rather than using a predetermined basis as in a conventional multiwavelet, an EM optimizes the matching basis which satisfactorily adapts to the anomaly-related incipient-signatures. The construction technique of an EM is based on the conjunction of a two-scale similarity transform (TST) and lifting scheme (LS). For the technique above, the TST improves the regularity by increasing the approximation order of multiscaling functions, while subsequently the LS enhances the smoothness and localizability via utilizing the vanishing moment of multiwavelet functions. Moreover, combining the Hilbert transform with EM decomposition, we identify the incipient-signatures induced by the mechanical anomalies from the measured dynamic signals. A numerical simulation and two successful applications of diagnosis cases (a planetary gearbox and a roller bearing) demonstrate that the proposed technique is capable of dealing with the challenging incipient-signature identification task even though spectral complexity, as well as the strong amplitude/frequency modulation effect, is present in the dynamic signals.

  5. Hybrid neural intelligent system to predict business failure in small-to-medium-size enterprises.

    Science.gov (United States)

    Borrajo, M Lourdes; Baruque, Bruno; Corchado, Emilio; Bajo, Javier; Corchado, Juan M

    2011-08-01

    In recent years there has been a growing need to develop innovative tools that can help small and medium-sized enterprises predict business failure as well as financial crisis. In this study we present a novel hybrid intelligent system aimed at monitoring the modus operandi of the companies and predicting possible failures. This system is implemented by means of a neural-based multi-agent system that models the different actors of the companies as agents. The core of the multi-agent system is a type of agent that incorporates a case-based reasoning system and automates the business control process and failure prediction. The stages of the case-based reasoning system are implemented by means of web services: the retrieval stage uses an innovative method based on weighted voting summarization of ensembles of self-organizing maps, and the reuse stage is implemented by means of a radial basis function neural network. An initial prototype was developed, and the results obtained for small and medium enterprises in a real scenario are presented.

  6. ECMWF seasonal forecast system 3 and its prediction of sea surface temperature

    Energy Technology Data Exchange (ETDEWEB)

    Stockdale, Timothy N.; Anderson, David L.T.; Balmaseda, Magdalena A.; Ferranti, Laura; Mogensen, Kristian; Palmer, Timothy N.; Molteni, Franco; Vitart, Frederic [ECMWF, Reading (United Kingdom); Doblas-Reyes, Francisco [ECMWF, Reading (United Kingdom); Institut Catala de Ciencies del Clima (IC3), Barcelona (Spain)

    2011-08-15

    The latest operational version of the ECMWF seasonal forecasting system is described. It shows noticeably improved skill for sea surface temperature (SST) prediction compared with previous versions, particularly with respect to El Nino related variability. Substantial skill is shown for lead times up to 1 year, although at this range the spread in the ensemble forecast implies a loss of predictability large enough to account for most of the forecast error variance, suggesting only moderate scope for improving long range El Nino forecasts. At shorter ranges, particularly 3-6 months, skill is still substantially below the model-estimated predictability limit. SST forecast skill is higher for more recent periods than earlier ones. Analysis shows that although various factors can affect scores in particular periods, the improvement from 1994 onwards seems to be robust, and is most plausibly due to improvements in the observing system made at that time. The improvement in forecast skill is most evident for 3-month forecasts starting in February, where predictions of NINO3.4 SST from 1994 to present have been almost without fault. It is argued that in situations where the impact of model error is small, the value of improved observational data can be seen most clearly. Significant skill is also shown in the equatorial Indian Ocean, although predictive skill in parts of the tropical Atlantic is relatively poor. SST forecast errors can be especially high in the Southern Ocean.

  7. Ensemble cloud-resolving modelling of a historic back-building mesoscale convective system over Liguria: the San Fruttuoso case of 1915

    Science.gov (United States)

    Parodi, Antonio; Ferraris, Luca; Gallus, William; Maugeri, Maurizio; Molini, Luca; Siccardi, Franco; Boni, Giorgio

    2017-05-01

    Highly localized and persistent back-building mesoscale convective systems represent one of the most dangerous flash-flood-producing storms in the north-western Mediterranean area. Substantial warming of the Mediterranean Sea in recent decades raises concerns over possible increases in frequency or intensity of these types of events as increased atmospheric temperatures generally support increases in water vapour content. However, analyses of the historical record do not provide a univocal answer, but these are likely affected by a lack of detailed observations for older events. In the present study, 20th Century Reanalysis Project initial and boundary condition data in ensemble mode are used to address the feasibility of performing cloud-resolving simulations with 1 km horizontal grid spacing of a historic extreme event that occurred over Liguria: the San Fruttuoso case of 1915. The proposed approach focuses on the ensemble Weather Research and Forecasting (WRF) model runs that show strong convergence over the Ligurian Sea (17 out of 56 members) as these runs are the ones most likely to best simulate the event. It is found that these WRF runs generally do show wind and precipitation fields that are consistent with the occurrence of highly localized and persistent back-building mesoscale convective systems, although precipitation peak amounts are underestimated. Systematic small north-westward position errors with regard to the heaviest rain and strongest convergence areas imply that the reanalysis members may not be adequately representing the amount of cool air over the Po Plain outflowing into the Ligurian Sea through the Apennines gap. Regarding the role of historical data sources, this study shows that in addition to reanalysis products, unconventional data, such as historical meteorological bulletins, newspapers, and even photographs, can be very valuable sources of knowledge in the reconstruction of past extreme events.

  8. Supersymmetry applied to the spectrum edge of random matrix ensembles

    International Nuclear Information System (INIS)

    Andreev, A.V.; Simons, B.D.; Taniguchi, N.

    1994-01-01

    A new matrix ensemble has recently been proposed to describe the transport properties in mesoscopic quantum wires. Both analytical and numerical studies have shown that the ensemble of Laguerre or of chiral random matrices provides a good description of scattering properties in this class of systems. Until now only conventional methods of random matrix theory have been used to study statistical properties within this ensemble. We demonstrate that the supersymmetry method, already employed in the study of Dyson ensembles, can be extended to treat this class of random matrix ensembles. In developing this approach we both investigate new statistical measures and verify known ones. Although we focus on ensembles in which T-invariance is violated, our approach lays the foundation for future studies of T-invariant systems.

  9. On evaluation of ensemble precipitation forecasts with observation-based ensembles

    Directory of Open Access Journals (Sweden)

    S. Jaun

    2007-04-01

    Spatial interpolation of precipitation data is uncertain. How important is this uncertainty and how can it be considered in evaluation of high-resolution probabilistic precipitation forecasts? These questions are discussed by experimental evaluation of the COSMO consortium's limited-area ensemble prediction system COSMO-LEPS. The applied performance measure is the often used Brier skill score (BSS). The observational references in the evaluation are (a) analyzed rain gauge data by ordinary Kriging and (b) ensembles of interpolated rain gauge data by stochastic simulation. This permits the consideration of either a deterministic reference (the event is observed or not with 100% certainty) or a probabilistic reference that makes allowance for uncertainties in spatial averaging. The evaluation experiments show that the evaluation uncertainties are substantial even for the large area (41 300 km²) of Switzerland with a mean rain gauge distance as good as 7 km: the one- to three-day precipitation forecasts have skill decreasing with forecast lead time, but the one- and two-day forecast performances do not differ significantly.
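
    As an illustration of the Brier skill score named in this abstract, the following is a minimal sketch for the deterministic-reference case (the event is either observed or not). It uses the standard definition BSS = 1 - BS / BS_ref, with the sample climatology as the default reference forecast. The function names, the threshold-exceedance helper and the choice of reference are assumptions made for illustration; they are not taken from the COSMO-LEPS evaluation itself.

```python
def ensemble_probability(members, threshold):
    """Forecast probability as the fraction of members exceeding a threshold."""
    if not members:
        raise ValueError("ensemble must not be empty")
    return sum(1 for m in members if m > threshold) / len(members)


def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)


def brier_skill_score(probabilities, outcomes, reference=None):
    """BSS = 1 - BS / BS_ref; positive values beat the reference forecast.

    If no reference forecast is given, the observed base rate of the sample
    (sample climatology) is used. This default is an assumption of the sketch.
    """
    if reference is None:
        base_rate = sum(outcomes) / len(outcomes)
        reference = [base_rate] * len(outcomes)
    bs_ref = brier_score(reference, outcomes)
    if bs_ref == 0:
        raise ValueError("reference Brier score is zero; BSS is undefined")
    return 1.0 - brier_score(probabilities, outcomes) / bs_ref
```

    A perfect forecast gives BSS = 1, a forecast no better than the reference gives 0, and negative values indicate a forecast worse than the reference.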

  10. Using ensemble forecasting for wind power

    Energy Technology Data Exchange (ETDEWEB)

    Giebel, G.; Landberg, L.; Badger, J. [Risoe National Lab., Roskilde (Denmark); Sattler, K.

    2003-07-01

    Short-term prediction of wind power has a long tradition in Denmark. It is an essential tool for the operators to keep the grid from becoming unstable in a region like Jutland, where more than 27% of the electricity consumption comes from wind power. This means that the minimum load is already lower than the maximum production from wind energy alone. Danish utilities have therefore used short-term prediction of wind energy since the mid-1990s. However, the accuracy is still far from sufficient in the eyes of the utilities (who are used to load forecasts accurate to within 5% on a one-week horizon). The Ensemble project tries to alleviate the dependency of the forecast quality on one model by using multiple models, and will also investigate the possibilities of using the model spread of multiple models or of dedicated ensemble runs for a prediction of the uncertainty of the forecast. Usually, short-term forecasting works (especially for horizons beyond 6 hours) by gathering input from a Numerical Weather Prediction (NWP) model. This input data is used together with online data in statistical models (this is the case e.g. in Zephyr/WPPT) to yield the output of the wind farms or of a whole region for the next 48 hours (only limited by the NWP model horizon). For the accuracy of the final production forecast, the accuracy of the NWP prediction is paramount. While many efforts are underway to increase the accuracy of the NWP forecasts themselves (which ultimately are limited by the amount of computing power available, the lack of a tight observational network on the Atlantic and limited physics modelling), another approach is to use ensembles of different models or different model runs. This can be either an ensemble of different models' output for the same area, using different data assimilation schemes and different model physics, or a dedicated ensemble run by a large institution, where the same model is run with slight variations in initial conditions and

  11. Monthly ENSO Forecast Skill and Lagged Ensemble Size

    Science.gov (United States)

    Trenary, L.; DelSole, T.; Tippett, M. K.; Pegion, K.

    2018-04-01

    The mean square error (MSE) of a lagged ensemble of monthly forecasts of the Niño 3.4 index from the Climate Forecast System (CFSv2) is examined with respect to ensemble size and configuration. Although the real-time forecast is initialized 4 times per day, it is possible to infer the MSE for arbitrary initialization frequency and for burst ensembles by fitting error covariances to a parametric model and then extrapolating to arbitrary ensemble size and initialization frequency. Applying this method to real-time forecasts, we find that the MSE consistently reaches a minimum for a lagged ensemble size between one and eight days, when four initializations per day are included. This ensemble size is consistent with the 8-10 day lagged ensemble configuration used operationally. Interestingly, the skill of both ensemble configurations is close to the estimated skill of the infinite ensemble. The skill of the weighted, lagged, and burst ensembles is found to be comparable. Certain unphysical features of the estimated error growth were tracked down to problems with the climatology and data discontinuities.

  12. Ensemble forecasting of potential habitat for three invasive fishes

    Science.gov (United States)

    Poulos, Helen M.; Chernoff, Barry; Fuller, Pam L.; Butman, David

    2012-01-01

    Aquatic invasive species pose major ecological and economic threats to aquatic ecosystems worldwide via displacement, predation, or hybridization with native species and the alteration of aquatic habitats and hydrologic cycles. Modeling the habitat suitability of alien aquatic species through spatially explicit mapping is an increasingly important risk assessment tool. Habitat modeling also facilitates identification of key environmental variables influencing invasive species distributions. We compared four modeling methods to predict the potential continental United States distributions of northern snakehead Channa argus (Cantor, 1842), round goby Neogobius melanostomus (Pallas, 1814), and silver carp Hypophthalmichthys molitrix (Valenciennes, 1844) using maximum entropy (Maxent), the genetic algorithm for rule set production (GARP), DOMAIN, and support vector machines (SVM). We used inventory records from the USGS Nonindigenous Aquatic Species Database and a geographic information system of 20 climatic and environmental variables to generate individual and ensemble distribution maps for each species. The ensemble maps from our study performed as well as or better than all of the individual models except Maxent. The ensemble and Maxent models produced significantly higher accuracy individual maps than GARP, one-class SVMs, or DOMAIN. The key environmental predictor variables in the individual models were consistent with the tolerances of each species. Results from this study provide insights into which locations and environmental conditions may promote the future spread of invasive fish in the US.

  13. Thermodynamics and kinetics of a molecular motor ensemble.

    Science.gov (United States)

    Baker, J E; Thomas, D D

    2000-10-01

    If, contrary to conventional models of muscle, it is assumed that molecular forces equilibrate among rather than within molecular motors, an equation of state and an expression for energy output can be obtained for a near-equilibrium, coworking ensemble of molecular motors. These equations predict clear, testable relationships between motor structure, motor biochemistry, and ensemble motor function, and we discuss these relationships in the context of various experimental studies. In this model, net work by molecular motors is performed with the relaxation of a near-equilibrium intermediate step in a motor-catalyzed reaction. The free energy available for work is localized to this step, and the rate at which this free energy is transferred to work is accelerated by the free energy of a motor-catalyzed reaction. This thermodynamic model implicitly deals with a motile cell system as a dynamic network (not a rigid lattice) of molecular motors within which the mechanochemistry of one motor influences and is influenced by the mechanochemistry of other motors in the ensemble.

  14. Ensemble of ground subsidence hazard maps using fuzzy logic

    Science.gov (United States)

    Park, Inhye; Lee, Jiyeong; Saro, Lee

    2014-06-01

    Hazard maps of ground subsidence around abandoned underground coal mines (AUCMs) in Samcheok, Korea, were constructed using fuzzy ensemble techniques and a geographical information system (GIS). To evaluate the factors related to ground subsidence, a spatial database was constructed from topographic, geologic, mine tunnel, land use, groundwater, and ground subsidence maps. Spatial data, topography, geology, and various ground-engineering data for the subsidence area were collected and compiled in a database for mapping ground-subsidence hazard (GSH). The subsidence area was randomly split 70/30 for training and validation of the models. The relationships between the detected ground-subsidence area and the factors were identified and quantified by frequency ratio (FR), logistic regression (LR) and artificial neural network (ANN) models. The relationships were used as factor ratings in the overlay analysis to create ground-subsidence hazard indexes and maps. The three GSH maps were then used as new input factors and integrated using fuzzy-ensemble methods to make better hazard maps. All of the hazard maps were validated by comparison with known subsidence areas that were not used directly in the analysis. As a result, the ensemble model was found to be more effective in terms of prediction accuracy than the individual models.
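
    The abstract does not state which fuzzy operators were used to integrate the three hazard maps. As one possible reading, the sketch below combines per-cell hazard indexes with the fuzzy gamma operator, a common choice in GIS overlay analysis. It assumes that each map has already been rescaled to membership values in [0, 1] and that all maps share the same grid; the function names and the default gamma of 0.9 are hypothetical.

```python
from math import prod


def fuzzy_gamma(memberships, gamma=0.9):
    """Fuzzy gamma operator: (algebraic sum)^gamma * (algebraic product)^(1 - gamma).

    gamma = 0 reduces to the fuzzy algebraic product (most conservative),
    gamma = 1 to the fuzzy algebraic sum (most permissive).
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if any(not 0.0 <= m <= 1.0 for m in memberships):
        raise ValueError("membership values must lie in [0, 1]")
    algebraic_product = prod(memberships)
    algebraic_sum = 1.0 - prod(1.0 - m for m in memberships)
    return (algebraic_sum ** gamma) * (algebraic_product ** (1.0 - gamma))


def combine_maps(maps, gamma=0.9):
    """Combine same-shaped 2-D membership grids (lists of rows) cell by cell."""
    return [
        [fuzzy_gamma(cell_values, gamma) for cell_values in zip(*rows)]
        for rows in zip(*maps)
    ]
```

    Calling combine_maps with three grids (for example, rescaled FR, LR and ANN hazard indexes) would return one combined grid of the same shape.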

  15. Extension of the GHJW theorem for operator ensembles

    International Nuclear Information System (INIS)

    Choi, Jeong Woon; Hong, Dowon; Chang, Ku-Young; Chi, Dong Pyo; Lee, Soojoon

    2011-01-01

    The Gisin-Hughston-Jozsa-Wootters theorem plays an important role in analyzing various theories about quantum information, quantum communication, and quantum cryptography. It means that any purifications on the extended system which yield indistinguishable state ensembles on their subsystem should have a specific local unitary relation. In this Letter, we show that the local relation is also established even when the indistinguishability of state ensembles is extended to that of operator ensembles.

  16. Multilevel ensemble Kalman filtering

    KAUST Repository

    Hoel, Haakon

    2016-01-08

    The ensemble Kalman filter (EnKF) is a sequential filtering method that uses an ensemble of particle paths to estimate the means and covariances required by the Kalman filter by the use of sample moments, i.e., the Monte Carlo method. EnKF is often both robust and efficient, but its performance may suffer in settings where the computational cost of accurate simulations of particles is high. The multilevel Monte Carlo method (MLMC) is an extension of classical Monte Carlo methods which by sampling stochastic realizations on a hierarchy of resolutions may reduce the computational cost of moment approximations by orders of magnitude. In this work we have combined the ideas of MLMC and EnKF to construct the multilevel ensemble Kalman filter (MLEnKF) for the setting of finite dimensional state and observation spaces. The main idea of this method is to compute particle paths on a hierarchy of resolutions and to apply multilevel estimators on the ensemble hierarchy of particles to compute Kalman filter means and covariances. Theoretical results and a numerical study of the performance gains of MLEnKF over EnKF will be presented. Some ideas on the extension of MLEnKF to settings with infinite dimensional state spaces will also be presented.
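
    To make the "sample moments" idea concrete, here is a minimal sketch of a single-level EnKF analysis step for a scalar state that is observed directly. It uses the perturbed-observation variant, in which each member assimilates its own noisy copy of the observation. This is the plain EnKF only, not the multilevel estimator of the abstract; the function name, the scalar setting and the identity observation operator are assumptions made for illustration.

```python
import random
from statistics import mean, variance


def enkf_update(prior, observation, obs_variance, rng):
    """One stochastic EnKF analysis step for a directly observed scalar state.

    prior        -- list of ensemble members (at least two floats)
    observation  -- the measured value
    obs_variance -- variance of the observation error
    rng          -- random.Random instance used to perturb the observation
    """
    # The sample variance of the ensemble stands in for the forecast covariance.
    prior_variance = variance(prior)
    # Scalar Kalman gain: weight given to the observation.
    gain = prior_variance / (prior_variance + obs_variance)
    obs_std = obs_variance ** 0.5
    # Each member is nudged toward its own perturbed copy of the observation.
    return [
        x + gain * (observation + rng.gauss(0.0, obs_std) - x)
        for x in prior
    ]
```

    With a prior ensemble drawn from N(0, 1), an observation of 2.0 and unit observation variance, the updated ensemble should have a mean near 1.0 and a variance near 0.5, matching the exact Kalman update up to sampling error.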

  17. Neural Network Ensembles

    DEFF Research Database (Denmark)

    Hansen, Lars Kai; Salamon, Peter

    1990-01-01

    We propose several means for improving the performance and training of neural networks for classification. We use crossvalidation as a tool for optimizing network parameters and architecture. We show further that the remaining generalization error can be reduced by invoking ensembles of similar...... networks....

  18. Multilevel ensemble Kalman filtering

    KAUST Repository

    Hoel, Haakon; Chernov, Alexey; Law, Kody; Nobile, Fabio; Tempone, Raul

    2016-01-01

    The ensemble Kalman filter (EnKF) is a sequential filtering method that uses an ensemble of particle paths to estimate the means and covariances required by the Kalman filter by the use of sample moments, i.e., the Monte Carlo method. EnKF is often both robust and efficient, but its performance may suffer in settings where the computational cost of accurate simulations of particles is high. The multilevel Monte Carlo method (MLMC) is an extension of classical Monte Carlo methods which by sampling stochastic realizations on a hierarchy of resolutions may reduce the computational cost of moment approximations by orders of magnitude. In this work we have combined the ideas of MLMC and EnKF to construct the multilevel ensemble Kalman filter (MLEnKF) for the setting of finite dimensional state and observation spaces. The main idea of this method is to compute particle paths on a hierarchy of resolutions and to apply multilevel estimators on the ensemble hierarchy of particles to compute Kalman filter means and covariances. Theoretical results and a numerical study of the performance gains of MLEnKF over EnKF will be presented. Some ideas on the extension of MLEnKF to settings with infinite dimensional state spaces will also be presented.

  19. Bayesian energy landscape tilting: towards concordant models of molecular ensembles.

    Science.gov (United States)

    Beauchamp, Kyle A; Pande, Vijay S; Das, Rhiju

    2014-03-18

    Predicting biological structure has remained challenging for systems such as disordered proteins that take on myriad conformations. Hybrid simulation/experiment strategies have been undermined by difficulties in evaluating errors from computational model inaccuracies and data uncertainties. Building on recent proposals from maximum entropy theory and nonequilibrium thermodynamics, we address these issues through a Bayesian energy landscape tilting (BELT) scheme for computing Bayesian hyperensembles over conformational ensembles. BELT uses Markov chain Monte Carlo to directly sample maximum-entropy conformational ensembles consistent with a set of input experimental observables. To test this framework, we apply BELT to model trialanine, starting from disagreeing simulations with the force fields ff96, ff99, ff99sbnmr-ildn, CHARMM27, and OPLS-AA. BELT incorporation of limited chemical shift and ³J measurements gives convergent values of the peptide's α, β, and PPII conformational populations in all cases. As a test of predictive power, all five BELT hyperensembles recover set-aside measurements not used in the fitting and report accurate errors, even when starting from highly inaccurate simulations. BELT's principled framework thus enables practical predictions for complex biomolecular systems from discordant simulations and sparse data. Copyright © 2014 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  20. Statistical ensembles in quantum mechanics

    International Nuclear Information System (INIS)

    Blokhintsev, D.

    1976-01-01

    The interpretation of quantum mechanics presented in this paper is based on the concept of quantum ensembles. This concept differs essentially from the canonical one in that the interference of the observer with the state of a microscopic system is of no greater importance than in any other field of physics. Owing to this fact, the laws established by quantum mechanics are no less objective in character than the laws governing classical statistical mechanics. The paradoxical nature of some statements of quantum mechanics, which results from the interpretation of the wave function as the observer's notebook, greatly stimulated the development of the idea presented.

  1. 3-D visualization of ensemble weather forecasts - Part 2: Forecasting warm conveyor belt situations for aircraft-based field campaigns

    Science.gov (United States)

    Rautenhaus, M.; Grams, C. M.; Schäfler, A.; Westermann, R.

    2015-02-01

    We present the application of interactive 3-D visualization of ensemble weather predictions to forecasting warm conveyor belt situations during aircraft-based atmospheric research campaigns. Motivated by forecast requirements of the T-NAWDEX-Falcon 2012 campaign, a method to predict 3-D probabilities of the spatial occurrence of warm conveyor belts has been developed. Probabilities are derived from Lagrangian particle trajectories computed on the forecast wind fields of the ECMWF ensemble prediction system. Integration of the method into the 3-D ensemble visualization tool Met.3D, introduced in the first part of this study, facilitates interactive visualization of WCB features and derived probabilities in the context of the ECMWF ensemble forecast. We investigate the sensitivity of the method with respect to trajectory seeding and forecast wind field resolution. Furthermore, we propose a visual analysis method to quantitatively analyse the contribution of ensemble members to a probability region and, thus, to assist the forecaster in interpreting the obtained probabilities. A case study, revisiting a forecast case from T-NAWDEX-Falcon, illustrates the practical application of Met.3D and demonstrates the use of 3-D and uncertainty visualization for weather forecasting and for planning flight routes in the medium forecast range (three to seven days before take-off).

  2. Prediction strategies in a TV recommender system - Method and experiments

    NARCIS (Netherlands)

    van Setten, M.J.; Veenstra, M.; van Dijk, Elisabeth M.A.G.; Nijholt, Antinus; Isaísas, P.; Karmakar, N.

    2003-01-01

    Predicting the interests of a user in information is an important process in personalized information systems. In this paper, we present a way to create prediction engines that allow prediction techniques to be easily combined into prediction strategies. Prediction strategies choose one or a

  3. Bioactive focus in conformational ensembles: a pluralistic approach

    Science.gov (United States)

    Habgood, Matthew

    2017-12-01

    Computational generation of conformational ensembles is key to contemporary drug design. Selecting the members of the ensemble that will approximate the conformation most likely to bind to a desired target (the bioactive conformation) is difficult, given that the potential energy usually used to generate and rank the ensemble is a notoriously poor discriminator between bioactive and non-bioactive conformations. In this study an approach to generating a focused ensemble is proposed in which each conformation is assigned multiple rankings based not just on potential energy but also on solvation energy, hydrophobic or hydrophilic interaction energy, radius of gyration, and on a statistical potential derived from Cambridge Structural Database data. The best ranked structures derived from each system are then assembled into a new ensemble that is shown to be better focused on bioactive conformations. This pluralistic approach is tested on ensembles generated by the Molecular Operating Environment's Low Mode Molecular Dynamics module, and by the Cambridge Crystallographic Data Centre's conformation generator software.
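
    The selection step described in this abstract (rank every conformation under several criteria, then pool the best-ranked structures from each ranking) can be sketched as follows. This is an illustrative reading, not the paper's code: the field names, the toy scores, and the assumption that a lower score is better under every criterion are all hypothetical.

```python
def focused_ensemble(conformers, score_keys, top_n):
    """Pool the top_n conformers under each score, keeping first-seen order.

    Assumes a lower score means a better rank for every criterion.
    """
    selected = []
    seen_ids = set()
    for key in score_keys:
        ranked = sorted(conformers, key=lambda c: c[key])
        for conf in ranked[:top_n]:
            if conf["id"] not in seen_ids:
                seen_ids.add(conf["id"])
                selected.append(conf)
    return selected

# Hypothetical toy scores for four conformers
conformers = [
    {"id": "c1", "potential": 1.0, "solvation": 9.0, "rgyr": 5.0},
    {"id": "c2", "potential": 2.0, "solvation": 1.0, "rgyr": 6.0},
    {"id": "c3", "potential": 3.0, "solvation": 5.0, "rgyr": 2.0},
    {"id": "c4", "potential": 4.0, "solvation": 6.0, "rgyr": 7.0},
]
picked = focused_ensemble(conformers, ["potential", "solvation", "rgyr"], top_n=1)
picked_ids = [c["id"] for c in picked]
```

    In the toy data each criterion favours a different conformer, so the pooled ensemble contains three structures, whereas ranking on potential energy alone would keep only one.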

  4. Evaluation of quantitative precipitation forecasts by TIGGE ensembles for south China during the presummer rainy season

    Science.gov (United States)

    Huang, Ling; Luo, Yali

    2017-08-01

    Based on The Observing System Research and Predictability Experiment Interactive Grand Global Ensemble (TIGGE) data set, this study evaluates the ability of global ensemble prediction systems (EPSs) from the European Centre for Medium-Range Weather Forecasts (ECMWF), U.S. National Centers for Environmental Prediction, Japan Meteorological Agency (JMA), Korean Meteorological Administration, and China Meteorological Administration (CMA) to predict presummer rainy season (April-June) precipitation in south China. Evaluation of 5 day forecasts in three seasons (2013-2015) demonstrates the higher skill of probability matching forecasts compared to simple ensemble mean forecasts and shows that the deterministic forecast is a close second. The EPSs overestimate light-to-heavy rainfall (0.1 to 30 mm/12 h) and underestimate heavier rainfall (>30 mm/12 h), with JMA being the worst. By analyzing the synoptic situations predicted by the identified more skillful (ECMWF) and less skillful (JMA and CMA) EPSs and the ensemble sensitivity for four representative cases of torrential rainfall, the transport of warm-moist air into south China by the low-level southwesterly flow, upstream of the torrential rainfall regions, is found to be a key synoptic factor that controls the quantitative precipitation forecast. The results also suggest that prediction of locally produced torrential rainfall is more challenging than prediction of more extensively distributed torrential rainfall. A slight improvement in the performance is obtained by shortening the forecast lead time from 30-36 h to 18-24 h to 6-12 h for the cases with large-scale forcing, but not for the locally produced cases.
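
    The difference between a simple ensemble mean and a probability-matched forecast can be sketched as below. The abstract does not give implementation details, so this follows the commonly used recipe: keep the spatial pattern of the ensemble mean, but take the amplitudes from the pooled distribution of all member values, thinned to one value per grid point. The choice of which value to keep from each group, and the toy rainfall fields, are assumptions.

```python
def probability_matched_mean(members):
    """members: one flattened precipitation field (list of floats) per member.

    Returns (ensemble_mean, probability_matched_field).
    """
    n_members = len(members)
    n_points = len(members[0])
    ens_mean = [sum(m[i] for m in members) / n_members for i in range(n_points)]
    # Pool every value from every member, wettest first.
    pooled = sorted((v for m in members for v in m), reverse=True)
    # Thin to one value per grid point by keeping every n_members-th value.
    thinned = pooled[::n_members]
    # Grid points ordered from wettest to driest ensemble mean.
    order = sorted(range(n_points), key=lambda i: ens_mean[i], reverse=True)
    matched = [0.0] * n_points
    for rank, idx in enumerate(order):
        matched[idx] = thinned[rank]
    return ens_mean, matched

# Hypothetical 12 h rainfall (mm) at four grid points from three members
members = [
    [0, 5, 30, 2],
    [1, 30, 6, 0],
    [0, 8, 12, 30],
]
ens_mean, matched = probability_matched_mean(members)
```

    Each toy member places its 30 mm maximum at a different grid point, so averaging smears the peak down to 16 mm. The matched field keeps the mean's ranking of grid points but restores a 30 mm maximum.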

  5. Multicomponent ensemble models to forecast induced seismicity

    Science.gov (United States)

    Király-Proag, E.; Gischig, V.; Zechar, J. D.; Wiemer, S.

    2018-01-01

    In recent years, human-induced seismicity has become an increasingly relevant topic due to its economic and social implications. Several models and approaches have been developed to explain underlying physical processes or forecast induced seismicity. They range from simple statistical models to coupled numerical models incorporating complex physics. We advocate the need for forecast testing as currently the best method for ascertaining whether models can reasonably account for key governing physical processes. Moreover, operational forecast models are of great interest to help on-site decision-making in projects entailing induced earthquakes. We previously introduced a standardized framework following the guidelines of the Collaboratory for the Study of Earthquake Predictability, the Induced Seismicity Test Bench, to test, validate, and rank induced seismicity models. In this study, we describe how to construct multicomponent ensemble models based on Bayesian weightings that deliver more accurate forecasts than individual models in the case of the Basel 2006 and Soultz-sous-Forêts 2004 enhanced geothermal stimulation projects. For this, we examine five calibrated variants of two significantly different model groups: (1) Shapiro and Smoothed Seismicity, based on the seismogenic index, simple modified Omori-law-type seismicity decay, and temporally weighted smoothed seismicity; (2) Hydraulics and Seismicity, based on numerically modelled pore pressure evolution that triggers seismicity using the Mohr-Coulomb failure criterion. We also demonstrate how the individual and ensemble models would perform as part of an operational Adaptive Traffic Light System. Investigating seismicity forecasts based on a range of potential injection scenarios, we use forecast periods of different durations to compute the occurrence probabilities of seismic events M ≥ 3. We show that in the case of the Basel 2006 geothermal stimulation the models forecast hazardous levels
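
    One way to read "ensemble models based on Bayesian weightings" is that each component model receives a weight proportional to its prior times the likelihood of past observations under its forecasts. The sketch below assumes a Poisson likelihood for event counts per forecast period; the abstract does not state the likelihood or the weighting formula, and the rates, counts, and function names are hypothetical.

```python
import math

def poisson_log_likelihood(forecast_rates, observed_counts):
    """Sum of log Poisson probabilities of the observed counts."""
    return sum(n * math.log(r) - r - math.lgamma(n + 1)
               for r, n in zip(forecast_rates, observed_counts))

def bayesian_weights(log_likelihoods, priors=None):
    """Posterior model weights: prior times likelihood, normalised to sum to 1."""
    k = len(log_likelihoods)
    priors = priors or [1.0 / k] * k
    shift = max(log_likelihoods)  # avoid underflow when exponentiating
    unnorm = [p * math.exp(ll - shift) for p, ll in zip(priors, log_likelihoods)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical event counts in three past forecast periods
observed = [4, 2, 1]
past_forecasts = {
    "model_a": [4.0, 2.0, 1.0],  # tracks the decay in activity
    "model_b": [1.0, 1.0, 1.0],  # constant-rate forecast
}
log_liks = [poisson_log_likelihood(f, observed) for f in past_forecasts.values()]
weights = bayesian_weights(log_liks)

# Weighted ensemble forecast for the next period (hypothetical rates)
next_rates = [0.5, 1.0]
ensemble_rate = sum(w * r for w, r in zip(weights, next_rates))
```

    The model whose past forecasts match the observed counts more closely receives the larger weight, and the ensemble rate always lies between the component forecasts.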

  6. Wave Extremes in the Northeast Atlantic from Ensemble Forecasts

    Science.gov (United States)

    Breivik, Øyvind; Aarnes, Ole Johan; Bidlot, Jean-Raymond; Carrasco, Ana; Saetra, Øyvind

    2013-10-01

    A method for estimating return values from ensembles of forecasts at advanced lead times is presented. Return values of significant wave height in the North-East Atlantic, the Norwegian Sea and the North Sea are computed from archived +240-h forecasts of the ECMWF ensemble prediction system (EPS) from 1999 to 2009. We make three assumptions: First, each forecast is representative of a six-hour interval and collectively the data set is then comparable to a time period of 226 years. Second, the model climate matches the observed distribution, which we confirm by comparing with buoy data. Third, the ensemble members are sufficiently uncorrelated to be considered independent realizations of the model climate. We find anomaly correlations of 0.20, but peak events (>P97) are entirely uncorrelated. By comparing return values from individual members with return values of subsamples of the data set we also find that the estimates follow the same distribution and appear unaffected by correlations in the ensemble. The annual mean and variance over the 11-year archived period exhibit no significant departures from stationarity compared with a recent reforecast, i.e., there is no spurious trend due to model upgrades. EPS yields significantly higher return values than ERA-40 and ERA-Interim and is in good agreement with the high-resolution hindcast NORA10, except in the lee of unresolved islands where EPS overestimates and in enclosed seas where it is biased low. Confidence intervals are half the width of those found for ERA-Interim due to the magnitude of the data set.
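
    The first assumption in this abstract is an arithmetic conversion between a number of forecasts and an equivalent record length. The sketch below works backwards from the stated 226 years to the forecast count it implies, assuming a 365.25-day year. The abstract does not state the forecast count itself, so the resulting figure is derived here rather than quoted.

```python
HOURS_PER_FORECAST = 6          # each forecast stands for a six-hour interval
HOURS_PER_YEAR = 365.25 * 24    # assumed year length

def equivalent_years(n_forecasts):
    """Record length in years represented by n_forecasts six-hour samples."""
    return n_forecasts * HOURS_PER_FORECAST / HOURS_PER_YEAR

def forecasts_for_years(years):
    """Number of six-hour samples needed to span the given number of years."""
    return years * HOURS_PER_YEAR / HOURS_PER_FORECAST

implied_forecasts = forecasts_for_years(226)
```

    Under these assumptions a 226-year equivalent record corresponds to roughly 330,000 individual +240-h forecasts.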

  7. Ensemble streamflow assimilation with the National Water Model.

    Science.gov (United States)

    Rafieeinasab, A.; McCreight, J. L.; Noh, S.; Seo, D. J.; Gochis, D.

    2017-12-01

    Through case studies of flooding across the US, we compare the performance of the National Water Model (NWM) data assimilation (DA) scheme to that of a newly implemented ensemble Kalman filter approach. The NOAA NWM is an operational implementation of the community WRF-Hydro modeling system. Since August 2016, NWM forecasts of distributed hydrologic states and fluxes (including soil moisture, snowpack, ET, and ponded water) over the contiguous United States have been publicly disseminated by the National Centers for Environmental Prediction (NCEP). The NWM also provides streamflow forecasts at more than 2.7 million river reaches up to 30 days in advance. It employs a nudging scheme to assimilate more than 6,000 USGS streamflow observations and provide initial conditions for its forecasts. A problem with nudging is that the forecasts quickly relax back to the open-loop bias. This has been partially addressed by an experimental bias correction approach, which was found to have issues with phase errors during flooding events. In this work, we present an ensemble streamflow data assimilation approach combining new channel-only capabilities of the NWM and HydroDART (a coupling of the offline WRF-Hydro model and NCAR's Data Assimilation Research Testbed; DART). Our approach focuses on the single model state of discharge and incorporates error distributions on channel influxes (overland and groundwater) in the assimilation via an ensemble Kalman filter (EnKF). To avoid the filter degeneracy associated with a limited number of ensemble members at large scale, DART's covariance inflation (Anderson, 2009) and localization capabilities are implemented and evaluated. The current NWM data assimilation scheme is compared to preliminary results from the EnKF application for several flooding case studies across the US.
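
    For readers unfamiliar with the technique, a single ensemble Kalman filter analysis step for one scalar state (discharge at one reach) with multiplicative inflation might look like the sketch below. It is a generic textbook-style stochastic EnKF with perturbed observations, not the HydroDART or NWM implementation. Localization is not shown because it has no meaning for a single scalar, and the function name and all numbers are hypothetical.

```python
import random
import statistics

def enkf_update(ensemble, obs, obs_var, inflation=1.0, rng=random):
    """One stochastic EnKF analysis step for a directly observed scalar state."""
    mean = statistics.fmean(ensemble)
    # Multiplicative inflation: scale deviations from the mean, which scales
    # the ensemble variance by inflation**2.
    inflated = [mean + inflation * (x - mean) for x in ensemble]
    var = statistics.variance(inflated)
    gain = var / (var + obs_var)  # Kalman gain for an identity observation operator
    # Each member assimilates its own perturbed copy of the observation.
    return [x + gain * (obs + rng.gauss(0.0, obs_var ** 0.5) - x) for x in inflated]

# Hypothetical prior discharge ensemble (m^3/s) biased low against a gauge reading
rng = random.Random(0)
prior = [rng.gauss(100.0, 20.0) for _ in range(200)]
posterior = enkf_update(prior, obs=140.0, obs_var=25.0, inflation=1.05, rng=rng)
```

    With a prior spread much larger than the observation error, the gain is close to one, so the posterior ensemble moves most of the way toward the gauge value and its spread shrinks accordingly.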

  8. Natural systems prediction of radionuclide migration

    International Nuclear Information System (INIS)

    Ewing, R.C.

    1991-01-01

    This paper reviews the application (and limitations) of data from natural systems to the verification of performance assessments, particularly as they apply to the evaluation of the long-term performance of waste forms, backfill, canister materials, and finally, the integrity of the repository itself. Two specific examples, the corrosion of borosilicate glass and the formation of alteration products of spent fuel, will be discussed. In both cases, inferences are of three types: 1) directly applicable data (i.e. radiation effects, stable phase assemblages); 2) inferences based on the analogous behaviour of the natural and repository systems (e.g. long-term corrosion rate); 3) specific identification of new phenomena that could not have been anticipated from the short term laboratory data (i.e. new mechanisms for the retention or release of radionuclides). The latter can only be derived from the observation of natural systems. Finally, specific attention will be paid to the limitations in the use of natural systems, particularly as the spatial and temporal scales expand, and to the inherent limitations of prediction and verification. (J.P.N.)

  9. Rotating Machinery Predictive Maintenance Through Expert System

    Directory of Open Access Journals (Sweden)

    M. Sarath Kumar

    2000-01-01

    Modern rotating machines such as turbomachines either produce or absorb huge amounts of power. Common applications include steam turbine-generator and gas turbine-compressor-generator trains, which produce power, and machines such as pumps, centrifugal compressors, motors, generators, and machine tool spindles, which are used in industrial applications. Condition-based maintenance of rotating machinery is a common practice in which the machine's condition is monitored constantly so that timely maintenance can be done. Since modern machines are complex and the amount of data to be interpreted is huge, precise and fast methods are needed to arrive at the best recommendations for preventing catastrophic failure and prolonging the life of the equipment. In the present work, the condition of a rotating machine (an electrical rotor) is predicted from the vibration characteristics of a rotor-bearing system using an off-line expert system. The analysis of the problem is carried out in an Object Oriented Programming (OOP) framework using the finite element method. The expert system, which is also developed in an OOP paradigm, reports the type of malfunction together with suggestions and recommendations. The system is implemented in C++.

  10. System for prediction of environmental emergency dose information network system

    International Nuclear Information System (INIS)

    Misawa, Makoto; Nagamori, Fumio

    2009-01-01

    When an accident at a nuclear facility risks releasing a large amount of radioactivity, the resulting environmental emergency should be predicted rapidly and reported immediately. The SPEEDI network system was completed for this purpose and is now operated by the Nuclear Safety Technology Center (NUSTEC) under commission from the Ministry of Education, Culture, Sports, Science and Technology, Japan. Fujitsu has contributed to this project by developing the principal parts of the network, introducing the necessary servers, and keeping the network in good condition through construction of the system followed by its continuous operation and maintenance. Real-time prediction of atmospheric diffusion of radionuclides for nuclear accidents anywhere in the world is now available, with experimental verification of the real-time emergency response system. With improvement of the worldwide version of the SPEEDI network system, simultaneous prediction of accidental discharges of radionuclides for multiple domains, and its evaluation, is possible. (S. Ohno)

  11. STUDENT PREDICTION SYSTEM FOR PLACEMENT TRAINING USING FUZZY INFERENCE SYSTEM

    Directory of Open Access Journals (Sweden)

    Ravi Kumar Rathore

    2017-04-01

    The proposed student prediction system is an approach for differentiating student data on the basis of student performance. Managing placement and training records in a large organization is difficult because the number of students is high; under such conditions, differentiating and classifying students into categories becomes tedious. The proposed fuzzy inference system classifies student data with ease and should be helpful to many educational organizations. Many classification algorithms and statistics-based techniques are available for classifying student data sets in education. In this paper, a fuzzy inference system is applied to predict student performance, which helps to identify how students are performing and provides an opportunity to improve that performance. As an example, we classify a student data set into placement and non-placement classes.

  12. Embedded random matrix ensembles in quantum physics

    CERN Document Server

    Kota, V K B

    2014-01-01

    Although used with increasing frequency in many branches of physics, random matrix ensembles are not always sufficiently specific to account for important features of the physical system at hand. One refinement which retains the basic stochastic approach but allows for such features consists in the use of embedded ensembles. The present text is an exhaustive introduction to and survey of this important field. Starting with an easy-to-read introduction to general random matrix theory, the text then develops the necessary concepts from the beginning, accompanying the reader to the frontiers of present-day research. With some notable exceptions, to date these ensembles have primarily been applied in nuclear spectroscopy. A characteristic example is the use of a random two-body interaction in the framework of the nuclear shell model. Yet, topics in atomic physics, mesoscopic physics, quantum information science and statistical mechanics of isolated finite quantum systems can also be addressed using these ensemb...

  13. Implementation and testing of the travel time prediction system (TIPS)

    OpenAIRE

    PANT, Prahlad D; UNIVERSITY OF CINCINNATI

    2001-01-01

    Final research report. The Travel Time Prediction System (TIPS) is a portable automated system for predicting and displaying travel time for motorists in advance of and through freeway construction work zones, on a real-time basis

  14. Development and verification of a new wind speed forecasting system using an ensemble Kalman filter data assimilation technique in a fully coupled hydrologic and atmospheric model

    Science.gov (United States)

    Williams, John L.; Maxwell, Reed M.; Monache, Luca Delle

    2013-12-01

    Wind power is rapidly gaining prominence as a major source of renewable energy. Harnessing this promising energy source is challenging because of the chaotic nature of wind and its inherently intermittent nature. Accurate forecasting tools are critical to support the integration of wind energy into power grids and to maximize its imp